DPDK patches and discussions
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Loftus, Ciara" <ciara.loftus@intel.com>
Cc: "Hore, Soumyadeep" <soumyadeep.hore@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Singh, Aman Deep" <aman.deep.singh@intel.com>,
	"Subbarao, Manoj Kumar" <manoj.kumar.subbarao@intel.com>
Subject: Re: [PATCH v4 3/6] net/intel: add TxPP Support for E830
Date: Thu, 12 Jun 2025 17:44:19 +0100	[thread overview]
Message-ID: <aEsD4z-6QPFGkh6t@bricha3-mobl1.ger.corp.intel.com> (raw)
In-Reply-To: <DM3PPF7D18F34A1DCBEBCDE7C91289CCB5A8E74A@DM3PPF7D18F34A1.namprd11.prod.outlook.com>

On Thu, Jun 12, 2025 at 11:31:30AM +0100, Loftus, Ciara wrote:
> > Subject: [PATCH v4 3/6] net/intel: add TxPP Support for E830
> > 
> > Add support for Tx Time based queues. This is used to schedule
> > packets based on Tx timestamp.
> > 
> > Signed-off-by: Soumyadeep Hore <soumyadeep.hore@intel.com>

Hi Soumyadeep,

More comments inline below.

/Bruce

> > ---
> >  drivers/net/intel/common/tx.h              |  14 ++
> >  drivers/net/intel/ice/base/ice_lan_tx_rx.h |   4 +
> >  drivers/net/intel/ice/ice_ethdev.c         |   3 +-
> >  drivers/net/intel/ice/ice_ethdev.h         |  11 +
> >  drivers/net/intel/ice/ice_rxtx.c           | 229 ++++++++++++++++++++-
> >  drivers/net/intel/ice/ice_rxtx.h           |   9 +
> >  6 files changed, 261 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/net/intel/common/tx.h b/drivers/net/intel/common/tx.h
> > index b0a68bae44..1fbbe40cce 100644
> > --- a/drivers/net/intel/common/tx.h
> > +++ b/drivers/net/intel/common/tx.h
> > @@ -30,6 +30,19 @@ struct ci_tx_entry_vec {
> > 
> >  typedef void (*ice_tx_release_mbufs_t)(struct ci_tx_queue *txq);
> > 
> > +/**
> > + * Structure associated with Tx Time based queue
> > + */
> > +struct txtime {
> > +	volatile struct ice_ts_desc	*ice_ts_ring;  /* Tx time ring virtual address */
> > +	uint16_t nb_ts_desc;	/* number of Tx Time descriptors */
> > +	uint16_t ts_tail;  /* current value of tail register */
> > +	int ts_offset;	/* dynamic mbuf Tx timestamp field offset */
> > +	uint64_t ts_flag;	/* dynamic mbuf Tx timestamp flag */
> > +	rte_iova_t ts_ring_dma;	/* TX time ring DMA address */
> > +	const struct rte_memzone *ts_mz;
> > +};
> 
> I recommend taking this out of the common file since it is only used by one Intel driver. If support is added for another we can move it at that point.
> 

Agree, until this is used by multiple drivers I'd keep it in the
ice-specific files. If here, it needs a "ci_" prefix.

Some other comments on the struct:
* you have mixed tabs and spaces in the definition and comments. There is
  no need for a tab before "*ice_ts_ring", it should be a single space
  only. The whitespace before the comments should probably be just spaces
  too, so you can line them up correctly.
* The ts_ring_dma field is unnecessary I think. Doing a quick grep of the
  code shows it being used only once, when setting up the queue context.
  For that scenario, it's as easy to get the dma address from the ts_mz
  pointer instead.
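On that last point, a quick sketch of what I mean (types are stubbed out here so it compiles standalone; the real definitions live in rte_memzone.h and the driver headers, and the ICE_TX_CMPLTNQ_CTX_BASE_S value below is just an assumption for the demo):

```c
/* Sketch only: read the DMA address from the memzone instead of caching
 * it in a separate ts_ring_dma field. Stub types stand in for the real
 * DPDK/driver definitions. */
#include <assert.h>
#include <stdint.h>

typedef uint64_t rte_iova_t;
struct rte_memzone { void *addr; rte_iova_t iova; };   /* stub */
struct txtime { const struct rte_memzone *ts_mz; };    /* trimmed stub */

#define ICE_TX_CMPLTNQ_CTX_BASE_S 7   /* assumed value, demo only */

/* Equivalent of: txtime_ctx->base = txq->tsq.ts_ring_dma >> ICE_TX_CMPLTNQ_CTX_BASE_S */
static uint64_t
ts_ctx_base(const struct txtime *tsq)
{
	return tsq->ts_mz->iova >> ICE_TX_CMPLTNQ_CTX_BASE_S;
}
```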

> > +
> >  struct ci_tx_queue {
> >  	union { /* TX ring virtual address */
> >  		volatile struct i40e_tx_desc *i40e_tx_ring;
> > @@ -77,6 +90,7 @@ struct ci_tx_queue {
> >  	union {
> >  		struct { /* ICE driver specific values */
> >  			uint32_t q_teid; /* TX schedule node id. */
> > +			struct txtime tsq; /* Tx Time based queue */
> >  		};
> >  		struct { /* I40E driver specific values */
> >  			uint8_t dcb_tc;
> > diff --git a/drivers/net/intel/ice/base/ice_lan_tx_rx.h b/drivers/net/intel/ice/base/ice_lan_tx_rx.h
> > index f92382346f..cd02d2b4eb 100644
> > --- a/drivers/net/intel/ice/base/ice_lan_tx_rx.h
> > +++ b/drivers/net/intel/ice/base/ice_lan_tx_rx.h
> > @@ -1278,6 +1278,8 @@ struct ice_ts_desc {
> >  #define ICE_TXTIME_MAX_QUEUE		2047
> >  #define ICE_SET_TXTIME_MAX_Q_AMOUNT	127
> >  #define ICE_OP_TXTIME_MAX_Q_AMOUNT	2047
> > +#define ICE_TXTIME_FETCH_TS_DESC_DFLT	8
> > +#define ICE_TXTIME_FETCH_PROFILE_CNT	16
> >  /* Tx Time queue context data
> >   *
> >   * The sizes of the variables may be larger than needed due to crossing byte
> > @@ -1303,10 +1305,12 @@ struct ice_txtime_ctx {
> >  	u8 drbell_mode_32;
> >  #define ICE_TXTIME_CTX_DRBELL_MODE_32	1
> >  	u8 ts_res;
> > +#define ICE_TXTIME_CTX_RESOLUTION_128NS 7
> >  	u8 ts_round_type;
> >  	u8 ts_pacing_slot;
> >  	u8 merging_ena;
> >  	u8 ts_fetch_prof_id;
> > +#define ICE_TXTIME_CTX_FETCH_PROF_ID_0 0
> >  	u8 ts_fetch_cache_line_aln_thld;
> >  	u8 tx_pipe_delay_mode;
> >  	u8 int_q_state; /* width not needed - internal - DO NOT WRITE!!! */
> > diff --git a/drivers/net/intel/ice/ice_ethdev.c b/drivers/net/intel/ice/ice_ethdev.c
> > index 9478ba92df..3af9f6ba38 100644
> > --- a/drivers/net/intel/ice/ice_ethdev.c
> > +++ b/drivers/net/intel/ice/ice_ethdev.c
> > @@ -4139,7 +4139,8 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
> >  			RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO |
> >  			RTE_ETH_TX_OFFLOAD_GRE_TNL_TSO |
> >  			RTE_ETH_TX_OFFLOAD_IPIP_TNL_TSO |
> > -			RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO;
> > +			RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO |
> > +			RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP;
> >  		dev_info->flow_type_rss_offloads |= ICE_RSS_OFFLOAD_ALL;
> >  	}
> > 
> > diff --git a/drivers/net/intel/ice/ice_ethdev.h b/drivers/net/intel/ice/ice_ethdev.h
> > index bfe093afca..f39c4b0471 100644
> > --- a/drivers/net/intel/ice/ice_ethdev.h
> > +++ b/drivers/net/intel/ice/ice_ethdev.h
> > @@ -17,6 +17,17 @@
> >  #include "base/ice_flow.h"
> >  #include "base/ice_sched.h"
> > 
> > +#define FIELD_GET(_mask, _reg) \
> > +	(__extension__ ({ \
> > +		typeof(_mask) _x = (_mask); \
> > +		(typeof(_x))(((_reg) & (_x)) >> rte_bsf32(_x)); \
> > +	}))
> > +#define FIELD_PREP(_mask, _val) \
> > +	(__extension__ ({ \
> > +		typeof(_mask) _x = (_mask); \
> > +		((typeof(_x))(_val) << rte_bsf32(_x)) & (_x); \
> > +	}))
> > +
> 
> I suggest moving these macros to ice_rxtx.h.
> 
> 
> >  #define ICE_ADMINQ_LEN               32
> >  #define ICE_SBIOQ_LEN                32
> >  #define ICE_MAILBOXQ_LEN             32
> > diff --git a/drivers/net/intel/ice/ice_rxtx.c b/drivers/net/intel/ice/ice_rxtx.c
> > index ba1435b9de..94d42ddf06 100644
> > --- a/drivers/net/intel/ice/ice_rxtx.c
> > +++ b/drivers/net/intel/ice/ice_rxtx.c
> > @@ -740,6 +740,50 @@ ice_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id)
> >  	return 0;
> >  }
> > 
> > +/**
> > + * ice_setup_txtime_ctx - setup a struct ice_txtime_ctx instance
> > + * @txq: The queue on which tstamp ring to configure
> > + * @txtime_ctx: Pointer to the Tx time queue context structure to be initialized
> > + * @txtime_ena: Tx time enable flag, set to true if Tx time should be enabled
> > + */
> > +static int
> > +ice_setup_txtime_ctx(struct ci_tx_queue *txq,
> > +		     struct ice_txtime_ctx *txtime_ctx, bool txtime_ena)
> > +{
> > +	struct ice_vsi *vsi = txq->ice_vsi;
> > +	struct ice_hw *hw = ICE_VSI_TO_HW(vsi);
> > +
> > +	txtime_ctx->base = txq->tsq.ts_ring_dma >> ICE_TX_CMPLTNQ_CTX_BASE_S;
> > +
> > +	/* Tx time Queue Length */
> > +	txtime_ctx->qlen = txq->tsq.nb_ts_desc;
> > +
> > +	if (txtime_ena)
> > +		txtime_ctx->txtime_ena_q = 1;
> > +
> > +	/* PF number */
> > +	txtime_ctx->pf_num = hw->pf_id;
> > +
> > +	switch (vsi->type) {
> > +	case ICE_VSI_PF:
> > +		txtime_ctx->vmvf_type = ICE_TLAN_CTX_VMVF_TYPE_PF;
> > +		break;
> > +	default:
> > +		PMD_DRV_LOG(ERR, "Unable to set VMVF type for VSI type %d",
> > +				vsi->type);
> > +		return -EINVAL;
> > +	}
> > +
> > +	/* make sure the context is associated with the right VSI */
> > +	txtime_ctx->src_vsi = vsi->vsi_id;
> > +
> > +	txtime_ctx->ts_res = ICE_TXTIME_CTX_RESOLUTION_128NS;
> > +	txtime_ctx->drbell_mode_32 = ICE_TXTIME_CTX_DRBELL_MODE_32;
> > +	txtime_ctx->ts_fetch_prof_id = ICE_TXTIME_CTX_FETCH_PROF_ID_0;
> > +
> > +	return 0;
> > +}
> > +
> >  int
> >  ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
> >  {
> > @@ -799,11 +843,6 @@ ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
> >  	ice_set_ctx(hw, (uint8_t *)&tx_ctx, txq_elem->txqs[0].txq_ctx,
> >  		    ice_tlan_ctx_info);
> > 
> > -	txq->qtx_tail = hw->hw_addr + QTX_COMM_DBELL(txq->reg_idx);
> > -
> > -	/* Init the Tx tail register*/
> > -	ICE_PCI_REG_WRITE(txq->qtx_tail, 0);
> > -
> >  	/* Fix me, we assume TC always 0 here */
> >  	err = ice_ena_vsi_txq(hw->port_info, vsi->idx, 0, tx_queue_id, 1,
> >  			txq_elem, buf_len, NULL);
> > @@ -826,6 +865,39 @@ ice_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id)
> >  	/* record what kind of descriptor cleanup we need on teardown */
> >  	txq->vector_tx = ad->tx_vec_allowed;
> > 
> > +	if (txq->tsq.ts_flag > 0) {
> > +		struct ice_aqc_set_txtime_qgrp *ts_elem;
> > +		u8 ts_buf_len = ice_struct_size(ts_elem, txtimeqs, 1);
> > +		struct ice_txtime_ctx txtime_ctx = { 0 };
> > +
> > +		ts_elem = ice_malloc(hw, ts_buf_len);
> > +		ice_setup_txtime_ctx(txq, &txtime_ctx,
> > +						true);
> 
> Nit: this does not need to be broken into two lines.
> 
> > +		ice_set_ctx(hw, (u8 *)&txtime_ctx,
> > +				ts_elem->txtimeqs[0].txtime_ctx,
> > +				ice_txtime_ctx_info);
> > +
> > +		txq->qtx_tail = hw->hw_addr +
> > +				E830_GLQTX_TXTIME_DBELL_LSB(txq->reg_idx);
> > +

Another nit: this doesn't need breaking into two lines either.

> > +		/* Init the Tx time tail register*/
> > +		ICE_PCI_REG_WRITE(txq->qtx_tail, 0);
> > +
> > +		err = ice_aq_set_txtimeq(hw, txq->reg_idx, 1, ts_elem,
> > +				ts_buf_len, NULL);
> > +		rte_free(ts_elem);
> > +		if (err) {
> > +			PMD_DRV_LOG(ERR, "Failed to set Tx Time queue context, error: %d", err);
> > +			rte_free(txq_elem);
> > +			return err;
> > +		}
> > +	} else {
> > +		txq->qtx_tail = hw->hw_addr + QTX_COMM_DBELL(txq->reg_idx);
> > +
> > +		/* Init the Tx tail register*/
> > +		ICE_PCI_REG_WRITE(txq->qtx_tail, 0);

Can this init of the tail register be put outside the if statement to save
duplicating it? Does it work OK if the ice_aq_set_txtimeq() call happens
before we init the tail?

> > +	}
> > +
> >  	dev->data->tx_queue_state[tx_queue_id] =
> > RTE_ETH_QUEUE_STATE_STARTED;
> > 
> >  	rte_free(txq_elem);
> > @@ -1046,6 +1118,20 @@ ice_reset_tx_queue(struct ci_tx_queue *txq)
> > 
> >  	txq->last_desc_cleaned = (uint16_t)(txq->nb_tx_desc - 1);
> >  	txq->nb_tx_free = (uint16_t)(txq->nb_tx_desc - 1);
> > +
> > +	if (txq->tsq.ts_flag > 0) {
> > +		size = sizeof(struct ice_ts_desc) * txq->tsq.nb_ts_desc;
> > +		for (i = 0; i < size; i++)
> > +			((volatile char *)txq->tsq.ice_ts_ring)[i] = 0;
> > +
> > +		for (i = 0; i < txq->tsq.nb_ts_desc; i++) {
> > +			volatile struct ice_ts_desc *tsd =
> > +							&txq->tsq.ice_ts_ring[i];
> > +			tsd->tx_desc_idx_tstamp = 0;
> > +		}
> > +

Both these loops are zeroing the ring. Just use one memset, which should
reduce the if block nicely e.g.:

  if (txq->tsq.ts_flag != 0) {
  	size = sizeof(txq->tsq.ice_ts_ring[0]) * txq->tsq.nb_ts_desc;
  	memset(txq->tsq.ice_ts_ring, 0, size);
  	txq->tsq.ts_tail = 0;
  }


> > +		txq->tsq.ts_tail = 0;
> > +	}
> >  }
> > 
> >  int
> > @@ -1080,6 +1166,19 @@ ice_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id)
> >  	q_ids[0] = txq->reg_idx;
> >  	q_teids[0] = txq->q_teid;
> > 
> > +	if (txq->tsq.ts_flag > 0) {
> > +		struct ice_aqc_ena_dis_txtime_qgrp txtime_pg;
> > +		dev->dev_ops->timesync_disable(dev);
> > +		status = ice_aq_ena_dis_txtimeq(hw, q_ids[0], 1, 0,
> > +					     &txtime_pg, NULL);

nit: Can be a single line. We can use up to 100 chars per line to avoid splits.

> > +		if (status != ICE_SUCCESS) {
> > +			PMD_DRV_LOG(DEBUG, "Failed to disable Tx time queue");
> > +			return -EINVAL;
> > +		}
> > +		txq->tsq.ts_flag = 0;
> > +		txq->tsq.ts_offset = -1;
> > +	}
> > +

Resetting ts_flag and ts_offset on stop looks wrong to me. The
"tx_queue_start" function looks at ts_flag to see if it should enable the
txtime queue. That means that calling stop and start on the queue will
cause the queue to lose the configured time functionality.

> >  	/* Fix me, we assume TC always 0 here */
> >  	status = ice_dis_vsi_txq(hw->port_info, vsi->idx, 0, 1, &q_handle,
> >  				q_ids, q_teids, ICE_NO_RESET, 0, NULL);
> > @@ -1166,6 +1265,7 @@ ice_rx_queue_setup(struct rte_eth_dev *dev,
> >  		   struct rte_mempool *mp)
> >  {
> >  	struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> > +	struct ice_hw *hw = ICE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >  	struct ice_adapter *ad =
> >  		ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> >  	struct ice_vsi *vsi = pf->main_vsi;
> > @@ -1249,7 +1349,7 @@ ice_rx_queue_setup(struct rte_eth_dev *dev,
> >  	rxq->xtr_field_offs = ad->devargs.xtr_field_offs;
> > 
> >  	/* Allocate the maximum number of RX ring hardware descriptor. */
> > -	len = ICE_MAX_RING_DESC;
> > +	len = ICE_MAX_NUM_DESC_BY_MAC(hw);
> > 
> >  	/**
> >  	 * Allocating a little more memory because vectorized/bulk_alloc Rx
> > @@ -1337,6 +1437,37 @@ ice_rx_queue_release(void *rxq)
> >  	rte_free(q);
> >  }
> > 
> > +/**
> > + * ice_calc_ts_ring_count - Calculate the number of timestamp descriptors
> > + * @hw: pointer to the hardware structure
> > + * @tx_desc_count: number of Tx descriptors in the ring
> > + *
> > + * Return: the number of timestamp descriptors
> > + */
> > +static uint16_t
> > +ice_calc_ts_ring_count(struct ice_hw *hw, u16 tx_desc_count)
> > +{
> > +	u16 prof = ICE_TXTIME_CTX_FETCH_PROF_ID_0;
> > +	u16 max_fetch_desc = 0;
> > +	u16 fetch;
> > +	u32 reg;
> > +	u16 i;
> > +
> > +	for (i = 0; i < ICE_TXTIME_FETCH_PROFILE_CNT; i++) {
> > +		reg = rd32(hw, E830_GLTXTIME_FETCH_PROFILE(prof, 0));
> > +		fetch = FIELD_GET(E830_GLTXTIME_FETCH_PROFILE_FETCH_TS_DESC_M,
> > +				  reg);
> > +		max_fetch_desc = max(fetch, max_fetch_desc);
> > +	}
> > +
> > +	if (!max_fetch_desc)
> > +		max_fetch_desc = ICE_TXTIME_FETCH_TS_DESC_DFLT;
> > +
> > +	max_fetch_desc = RTE_ALIGN(max_fetch_desc, ICE_REQ_DESC_MULTIPLE);
> > +
> > +	return tx_desc_count + max_fetch_desc;
> > +}
> > +
> >  int
> >  ice_tx_queue_setup(struct rte_eth_dev *dev,
> >  		   uint16_t queue_idx,
> > @@ -1345,6 +1476,7 @@ ice_tx_queue_setup(struct rte_eth_dev *dev,
> >  		   const struct rte_eth_txconf *tx_conf)
> >  {
> >  	struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> > +	struct ice_hw *hw = ICE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >  	struct ice_vsi *vsi = pf->main_vsi;
> >  	struct ci_tx_queue *txq;
> >  	const struct rte_memzone *tz;
> > @@ -1469,7 +1601,8 @@ ice_tx_queue_setup(struct rte_eth_dev *dev,
> >  	}
> > 
> >  	/* Allocate TX hardware ring descriptors. */
> > -	ring_size = sizeof(struct ice_tx_desc) * ICE_MAX_RING_DESC;
> > +	ring_size = sizeof(struct ice_tx_desc) *
> > +				ICE_MAX_NUM_DESC_BY_MAC(hw);
> >  	ring_size = RTE_ALIGN(ring_size, ICE_DMA_MEM_ALIGN);
> >  	tz = rte_eth_dma_zone_reserve(dev, "ice_tx_ring", queue_idx,
> >  				      ring_size, ICE_RING_BASE_ALIGN,
> > @@ -1507,6 +1640,42 @@ ice_tx_queue_setup(struct rte_eth_dev *dev,
> >  		return -ENOMEM;
> >  	}
> > 
> > +	if (vsi->type == ICE_VSI_PF &&
> > +		(offloads & RTE_ETH_TX_OFFLOAD_SEND_ON_TIMESTAMP) &&
> > +		txq->tsq.ts_offset == 0 && hw->phy_model == ICE_PHY_E830)
> 
> Perhaps we should log an error if the user tries to enable this offload on an unsupported device. At the moment we silently fail.
> 

Also, indentation on this "if" condition is misleading. It's indented by
just one extra tab, which means that it aligns with the body of the block.

> > {
> > +		int ret =
> > +			rte_mbuf_dyn_tx_timestamp_register(&txq->tsq.ts_offset,
> > +							 &txq->tsq.ts_flag);
> > +		if (ret) {
> > +			PMD_INIT_LOG(ERR, "Cannot register Tx mbuf field/flag "
> > +							"for timestamp");
> > +			return -EINVAL;
> > +		}
> > +		dev->dev_ops->timesync_enable(dev);
> > +
> > +		ring_size = sizeof(struct ice_ts_desc) *
> > +					ICE_MAX_NUM_DESC_BY_MAC(hw);
> > +		ring_size = RTE_ALIGN(ring_size, ICE_DMA_MEM_ALIGN);
> > +		const struct rte_memzone *ts_z =
> > +					rte_eth_dma_zone_reserve(dev, "ice_tstamp_ring",
> > +					queue_idx, ring_size, ICE_RING_BASE_ALIGN,
> > +				    socket_id);
> > +		if (!ts_z) {
> > +			ice_tx_queue_release(txq);
> > +			PMD_INIT_LOG(ERR, "Failed to reserve DMA memory "
> > +							"for TX timestamp");
> > +			return -ENOMEM;
> > +		}
> > +		txq->tsq.ts_mz = ts_z;
> > +		txq->tsq.ice_ts_ring = ts_z->addr;
> > +		txq->tsq.ts_ring_dma = ts_z->iova;
> > +		txq->tsq.nb_ts_desc =
> > +				ice_calc_ts_ring_count(ICE_VSI_TO_HW(vsi),
> > +							txq->nb_tx_desc);

Nit: these 3 lines can all be squashed into one (99 chars long < 100!)

More seriously:
With current ring size values and alignment, I don't think it can happen,
but this assignment is unsafe and with different values could cause buffer
overflows when you clear the ring. Here is how things stand now:

* The ring size allocated is based on ICE_MAX_NUM_DESC_BY_MAC(). If alignment
  does not increase that max value, then the allocated ring size for the
  timestamp ring will be exactly ICE_MAX_NUM_DESC_BY_MAC().
* ice_calc_ts_ring_count() always returns a value >= its input, which
  means that if the user configures the tx_ring size to be the max value,
  then nb_ts_desc will be set to a value > ICE_MAX_NUM_DESC_BY_MAC().

On a semi-related note - is there a particular reason why the code is
aligning the memory size allocation for the timestamp ring to be a multiple of
4k? I can understand why the start of the area gets aligned (to 128 bytes),
but not sure why we need a size that is 4k aligned. Can you explain?

If the ring alignment is unnecessary and is dropped, I think we could
definitely have overflow if the user configured a Tx ring to the max size.
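To make that concrete, here is a quick standalone arithmetic check of the scenario (constants copied from the patch; ALIGN_UP is a local stand-in for RTE_ALIGN, and this deliberately ignores the 4k size alignment, which is the part I'm suggesting may be unnecessary):

```c
/* Numeric check: with the default fetch profile, the timestamp ring needs
 * more descriptors than ICE_MAX_NUM_DESC_BY_MAC() when the Tx ring is
 * configured to the E830 maximum. */
#include <assert.h>
#include <stdint.h>

#define ICE_MAX_RING_DESC_E830		8096
#define ICE_TXTIME_FETCH_TS_DESC_DFLT	8
#define ICE_REQ_DESC_MULTIPLE		32
#define ALIGN_UP(v, a)			(((v) + (a) - 1) / (a) * (a))  /* RTE_ALIGN stand-in */

static uint16_t
calc_ts_ring_count(uint16_t tx_desc_count)
{
	/* mirrors ice_calc_ts_ring_count() when no fetch profile is programmed */
	uint16_t fetch = ALIGN_UP(ICE_TXTIME_FETCH_TS_DESC_DFLT, ICE_REQ_DESC_MULTIPLE);
	return tx_desc_count + fetch;
}
```

So a max-size Tx ring would need 8128 timestamp descriptors against an allocation sized for 8096, unless the 4k size rounding happens to absorb the difference.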

> > +	} else {
> > +		txq->tsq.ice_ts_ring = NULL;
> > +	}
> > +
> >  	ice_reset_tx_queue(txq);
> >  	txq->q_set = true;
> >  	dev->data->tx_queues[queue_idx] = txq;
> > @@ -1539,6 +1708,8 @@ ice_tx_queue_release(void *txq)
> > 
> >  	ci_txq_release_all_mbufs(q, false);
> >  	rte_free(q->sw_ring);
> > +	if (q->tsq.ts_mz)
> > +		rte_memzone_free(q->tsq.ts_mz);
> >  	rte_memzone_free(q->mz);
> >  	rte_free(q);
> >  }
> > @@ -2961,6 +3132,7 @@ ice_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> >  	struct rte_mbuf *m_seg;
> >  	uint32_t cd_tunneling_params;
> >  	uint16_t tx_id;
> > +	uint16_t ts_id = -1;
> >  	uint16_t nb_tx;
> >  	uint16_t nb_used;
> >  	uint16_t nb_ctx;
> > @@ -2979,6 +3151,9 @@ ice_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> >  	tx_id = txq->tx_tail;
> >  	txe = &sw_ring[tx_id];
> > 
> > +	if (txq->tsq.ts_flag > 0)
> > +		ts_id = txq->tsq.ts_tail;
> > +
> >  	/* Check if the descriptor ring needs to be cleaned. */
> >  	if (txq->nb_tx_free < txq->tx_free_thresh)
> >  		(void)ice_xmit_cleanup(txq);
> > @@ -3166,10 +3341,48 @@ ice_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> >  		txd->cmd_type_offset_bsz |=
> >  			rte_cpu_to_le_64(((uint64_t)td_cmd) <<
> >  					 ICE_TXD_QW1_CMD_S);
> > +
> > +		if (txq->tsq.ts_flag > 0) {
> > +			uint64_t txtime = *RTE_MBUF_DYNFIELD(tx_pkt,
> > +					txq->tsq.ts_offset, uint64_t *);
> > +			uint32_t tstamp = (uint32_t)(txtime % NS_PER_S) >>
> > +					ICE_TXTIME_CTX_RESOLUTION_128NS;
> > +			if (tx_id == 0)
> > +				txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
> > +					rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
> > +						txq->nb_tx_desc) | FIELD_PREP(ICE_TXTIME_STAMP_M,
> > +						tstamp));
> > +			else
> > +				txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
> > +					rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
> > +						tx_id) | FIELD_PREP(ICE_TXTIME_STAMP_M, tstamp));
> > +			ts_id++;
> > +			/* Handling MDD issue causing Tx Hang */

Please provide more detail here. My reading is that on timestamp ring
wrap-around we need to have at least "fetch-size" descriptors populated in
the ring, so duplicate the last valid descriptor at the start of the ring
"fetch-size" times. Is that an accurate summary?

> > +			if (ts_id == txq->tsq.nb_ts_desc) {
> > +				uint16_t fetch = txq->tsq.nb_ts_desc - txq->nb_tx_desc;
> > +				ts_id = 0;
> > +				for (; ts_id < fetch; ts_id++) {
> > +					if (tx_id == 0)
> > +						txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
> > +							rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
> > +								txq->nb_tx_desc) | FIELD_PREP(ICE_TXTIME_STAMP_M,
> > +								tstamp));
> > +					else
> > +						txq->tsq.ice_ts_ring[ts_id].tx_desc_idx_tstamp =
> > +							rte_cpu_to_le_32(FIELD_PREP(ICE_TXTIME_TX_DESC_IDX_M,
> > +								tx_id) | FIELD_PREP(ICE_TXTIME_STAMP_M, tstamp));
> > +				}
> > +			}
> > +		}
> >  	}
> >  end_of_tx:
> >  	/* update Tail register */
> > -	ICE_PCI_REG_WRITE(txq->qtx_tail, tx_id);
> > +	if (txq->tsq.ts_flag > 0) {
> > +		ICE_PCI_REG_WRITE(txq->qtx_tail, ts_id);
> > +		txq->tsq.ts_tail = ts_id;
> > +	} else {
> > +		ICE_PCI_REG_WRITE(txq->qtx_tail, tx_id);
> > +	}
> >  	txq->tx_tail = tx_id;
> > 
> >  	return nb_tx;
> > diff --git a/drivers/net/intel/ice/ice_rxtx.h b/drivers/net/intel/ice/ice_rxtx.h
> > index 500d630679..a9e8b5c5e9 100644
> > --- a/drivers/net/intel/ice/ice_rxtx.h
> > +++ b/drivers/net/intel/ice/ice_rxtx.h
> > @@ -11,9 +11,18 @@
> >  #define ICE_ALIGN_RING_DESC  32
> >  #define ICE_MIN_RING_DESC    64
> >  #define ICE_MAX_RING_DESC    (8192 - 32)
> > +#define ICE_MAX_RING_DESC_E830	  8096
> > +#define ICE_MAX_NUM_DESC_BY_MAC(hw) ((hw)->phy_model == \
> > +					ICE_PHY_E830 ? \
> > +				    ICE_MAX_RING_DESC_E830 : \
> > +				    ICE_MAX_RING_DESC)
> >  #define ICE_DMA_MEM_ALIGN    4096
> >  #define ICE_RING_BASE_ALIGN  128
> > 
> > +#define ICE_TXTIME_TX_DESC_IDX_M	RTE_GENMASK32(12, 0)
> > +#define ICE_TXTIME_STAMP_M		RTE_GENMASK32(31, 13)
> > +#define ICE_REQ_DESC_MULTIPLE	32
> > +
> >  #define ICE_RX_MAX_BURST 32
> >  #define ICE_TX_MAX_BURST 32
> > 
> > --
> > 2.47.1
> 
