From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 58A9EA04AA;
	Mon,  7 Sep 2020 23:43:27 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 8525B1C1C2;
	Mon,  7 Sep 2020 23:41:03 +0200 (CEST)
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id E10FA1C13A
 for <dev@dpdk.org>; Mon,  7 Sep 2020 23:40:50 +0200 (CEST)
IronPort-SDR: zGOZe/P/Qy9vKBfDY6GqqM5/NZ8U1fRBVhmM0/zGdaQuB3BDkg6NQn6XJIVvDkTuIIP63Ctyml
 PnPJqEEKkXYA==
X-IronPort-AV: E=McAfee;i="6000,8403,9737"; a="176099076"
X-IronPort-AV: E=Sophos;i="5.76,403,1592895600"; d="scan'208";a="176099076"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
 by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 07 Sep 2020 14:40:50 -0700
IronPort-SDR: YnVOTnP5gfg6NlwpLRzQUO3iDz3YPPvAi9FX0FbGGnD/nzRxB2Yc8GaF9oWg6LBv7GaFO+Fd9M
 o8Q8LpQlo4SA==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.76,403,1592895600"; d="scan'208";a="336190164"
Received: from silpixa00400573.ir.intel.com (HELO
 silpixa00400573.ger.corp.intel.com) ([10.237.223.107])
 by fmsmga002.fm.intel.com with ESMTP; 07 Sep 2020 14:40:49 -0700
From: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
To: dev@dpdk.org
Date: Mon,  7 Sep 2020 22:40:08 +0100
Message-Id: <20200907214032.95052-18-cristian.dumitrescu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20200907214032.95052-1-cristian.dumitrescu@intel.com>
References: <20200826151445.51500-2-cristian.dumitrescu@intel.com>
 <20200907214032.95052-1-cristian.dumitrescu@intel.com>
Subject: [dpdk-dev] [PATCH v2 17/41] pipeline: introduce SWX cksub
	instruction
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

The cksub (i.e. checksum subtract) instruction subtracts a header field
from a 16-bit checksum header field in 1's complement arithmetic. It is
used to update the 1's complement checksum used by protocols such as
IPv4, TCP or UDP.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
---
 lib/librte_pipeline/rte_swx_pipeline.c | 116 +++++++++++++++++++++++++
 1 file changed, 116 insertions(+)
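
Note for reviewers: the arithmetic in instr_alu_cksub_field_exec() can be
tried outside the pipeline with the standalone sketch below. It is only an
illustration under stated assumptions: cksub16() is a hypothetical helper
name that does not exist in the patch, the operands are passed as plain
integers rather than read from the thread structs, and byte order of the
header fields is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): remove the low n_bits of src
 * from the 16-bit checksum value dst, using 1's complement subtraction.
 */
static uint16_t
cksub16(uint16_t dst, uint64_t src, unsigned int n_bits)
{
	uint64_t r;

	/* Assumption: a source field is between 1 and 64 bits wide. */
	assert(n_bits >= 1 && n_bits <= 64);
	src &= UINT64_MAX >> (64 - n_bits);

	/* The checksum stores the complement of the 1's complement sum. */
	r = dst;
	r = ~r & 0xFFFF;

	/* Add a multiple of the 0xFFFF modulus so that the subtraction of
	 * the (at most 33-bit) folded source cannot underflow.
	 */
	r += 0xFFFF00000ULL;
	r -= (src >> 32) + (src & 0xFFFFFFFF);

	/* Fold the 36-bit intermediate value down to 16 bits. */
	r = (r & 0xFFFF) + (r >> 16);
	r = (r & 0xFFFF) + (r >> 16);
	r = (r & 0xFFFF) + (r >> 16);

	/* Complement again; a zero result is stored as 0xFFFF. */
	r = ~r & 0xFFFF;
	return (uint16_t)(r ? r : 0xFFFF);
}
```

With these definitions, cksub16(0x1234, 0x0001, 16) yields 0x1235 and
cksub16(0x0000, 0x00010002, 32) yields 0x0003, since a 32-bit source is
first folded into the sum of its two 16-bit halves.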

diff --git a/lib/librte_pipeline/rte_swx_pipeline.c b/lib/librte_pipeline/rte_swx_pipeline.c
index 96e6c98aa..364c7d75a 100644
--- a/lib/librte_pipeline/rte_swx_pipeline.c
+++ b/lib/librte_pipeline/rte_swx_pipeline.c
@@ -297,6 +297,12 @@ enum instruction_type {
 	INSTR_ALU_CKADD_FIELD,    /* src = H */
 	INSTR_ALU_CKADD_STRUCT20, /* src = h.header, with sizeof(header) = 20 */
 	INSTR_ALU_CKADD_STRUCT,   /* src = h.hdeader, with any sizeof(header) */
+
+	/* cksub dst src
+	 * dst = dst '- src
+	 * dst = H, src = H
+	 */
+	INSTR_ALU_CKSUB_FIELD,
 };
 
 struct instr_operand {
@@ -3034,6 +3040,36 @@ instr_alu_ckadd_translate(struct rte_swx_pipeline *p,
 	return 0;
 }
 
+static int
+instr_alu_cksub_translate(struct rte_swx_pipeline *p,
+			  struct action *action __rte_unused,
+			  char **tokens,
+			  int n_tokens,
+			  struct instruction *instr,
+			  struct instruction_data *data __rte_unused)
+{
+	char *dst = tokens[1], *src = tokens[2];
+	struct header *hdst, *hsrc;
+	struct field *fdst, *fsrc;
+
+	CHECK(n_tokens == 3, EINVAL);
+
+	fdst = header_field_parse(p, dst, &hdst);
+	CHECK(fdst && (fdst->n_bits == 16), EINVAL);
+
+	fsrc = header_field_parse(p, src, &hsrc);
+	CHECK(fsrc, EINVAL);
+
+	instr->type = INSTR_ALU_CKSUB_FIELD;
+	instr->alu.dst.struct_id = (uint8_t)hdst->struct_id;
+	instr->alu.dst.n_bits = fdst->n_bits;
+	instr->alu.dst.offset = fdst->offset / 8;
+	instr->alu.src.struct_id = (uint8_t)hsrc->struct_id;
+	instr->alu.src.n_bits = fsrc->n_bits;
+	instr->alu.src.offset = fsrc->offset / 8;
+	return 0;
+}
+
 static inline void
 instr_alu_add_exec(struct rte_swx_pipeline *p)
 {
@@ -3273,6 +3309,77 @@ instr_alu_ckadd_field_exec(struct rte_swx_pipeline *p)
 	thread_ip_inc(p);
 }
 
+static inline void
+instr_alu_cksub_field_exec(struct rte_swx_pipeline *p)
+{
+	struct thread *t = &p->threads[p->thread_id];
+	struct instruction *ip = t->ip;
+	uint8_t *dst_struct, *src_struct;
+	uint16_t *dst16_ptr, dst;
+	uint64_t *src64_ptr, src64, src64_mask, src;
+	uint64_t r;
+
+	TRACE("[Thread %2u] cksub (field)\n", p->thread_id);
+
+	/* Structs. */
+	dst_struct = t->structs[ip->alu.dst.struct_id];
+	dst16_ptr = (uint16_t *)&dst_struct[ip->alu.dst.offset];
+	dst = *dst16_ptr;
+
+	src_struct = t->structs[ip->alu.src.struct_id];
+	src64_ptr = (uint64_t *)&src_struct[ip->alu.src.offset];
+	src64 = *src64_ptr;
+	src64_mask = UINT64_MAX >> (64 - ip->alu.src.n_bits);
+	src = src64 & src64_mask;
+
+	r = dst;
+	r = ~r & 0xFFFF;
+
+	/* Subtraction in 1's complement arithmetic (i.e. a '- b) is the same as
+	 * the following sequence of operations in 2's complement arithmetic:
+	 *    a '- b = (a - b) % 0xFFFF.
+	 *
+	 * In order to prevent an underflow for the below subtraction, in which
+	 * a 33-bit number (the subtrahend) is taken out of a 16-bit number (the
+	 * minuend), we first add a multiple of the 0xFFFF modulus to the
+	 * minuend. The number we add to the minuend needs to be a 34-bit number
+	 * or higher, so for readability reasons we picked the 36-bit multiple.
+	 * We are effectively turning the 16-bit minuend into a 36-bit number:
+	 *    (a - b) % 0xFFFF = (a + 0xFFFF00000 - b) % 0xFFFF.
+	 */
+	r += 0xFFFF00000ULL; /* The output r is a 36-bit number. */
+
+	/* A 33-bit number is subtracted from a 36-bit number (the input r). The
+	 * result (the output r) is a 36-bit number.
+	 */
+	r -= (src >> 32) + (src & 0xFFFFFFFF);
+
+	/* The first input is a 16-bit number. The second input is a 20-bit
+	 * number. Their sum is a 21-bit number.
+	 */
+	r = (r & 0xFFFF) + (r >> 16);
+
+	/* The first input is a 16-bit number (0 .. 0xFFFF). The second input is
+	 * a 5-bit number (0 .. 31). The sum is a 17-bit number (0 .. 0x1001E).
+	 */
+	r = (r & 0xFFFF) + (r >> 16);
+
+	/* When the input r is (0 .. 0xFFFF), the output r is equal to the input
+	 * r, so the output is (0 .. 0xFFFF). When the input r is (0x10000 ..
+	 * 0x1001E), the output r is (1 .. 31). So no carry bit can be
+	 * generated, therefore the output r is always a 16-bit number.
+	 */
+	r = (r & 0xFFFF) + (r >> 16);
+
+	r = ~r & 0xFFFF;
+	r = r ? r : 0xFFFF;
+
+	*dst16_ptr = (uint16_t)r;
+
+	/* Thread. */
+	thread_ip_inc(p);
+}
+
 static inline void
 instr_alu_ckadd_struct20_exec(struct rte_swx_pipeline *p)
 {
@@ -3502,6 +3609,14 @@ instr_translate(struct rte_swx_pipeline *p,
 						 instr,
 						 data);
 
+	if (!strcmp(tokens[tpos], "cksub"))
+		return instr_alu_cksub_translate(p,
+						 action,
+						 &tokens[tpos],
+						 n_tokens - tpos,
+						 instr,
+						 data);
+
 	CHECK(0, EINVAL);
 }
 
@@ -3677,6 +3792,7 @@ static instr_exec_t instruction_table[] = {
 	[INSTR_ALU_CKADD_FIELD] = instr_alu_ckadd_field_exec,
 	[INSTR_ALU_CKADD_STRUCT] = instr_alu_ckadd_struct_exec,
 	[INSTR_ALU_CKADD_STRUCT20] = instr_alu_ckadd_struct20_exec,
+	[INSTR_ALU_CKSUB_FIELD] = instr_alu_cksub_field_exec,
 };
 
 static inline void
-- 
2.17.1