From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B993CA04B5; Thu, 1 Oct 2020 12:27:53 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 422D91DB06; Thu, 1 Oct 2020 12:20:46 +0200 (CEST) Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by dpdk.org (Postfix) with ESMTP id D9E0E1DB79 for ; Thu, 1 Oct 2020 12:20:38 +0200 (CEST) IronPort-SDR: q2wWhLsT/yfps3VGAHo7LAfdy/g8+Fu+WeKChX9I167f3nyomrEtdM0LKF5VthHMhPUV/RcOXm H/rn5B9kqftg== X-IronPort-AV: E=McAfee;i="6000,8403,9760"; a="224297097" X-IronPort-AV: E=Sophos;i="5.77,323,1596524400"; d="scan'208";a="224297097" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Oct 2020 03:20:37 -0700 IronPort-SDR: D7tJneMhNl82RduACizwxX+GrXKqtyBKFQJEU88Z7HkuGpZBmg5wY0KrxWGmcCqogpJ84PlrSV D4jOEF/AwMiw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,323,1596524400"; d="scan'208";a="515443409" Received: from silpixa00400573.ir.intel.com (HELO silpixa00400573.ger.corp.intel.com) ([10.237.223.107]) by fmsmga005.fm.intel.com with ESMTP; 01 Oct 2020 03:20:36 -0700 From: Cristian Dumitrescu To: dev@dpdk.org Cc: thomas@monjalon.net, david.marchand@redhat.com Date: Thu, 1 Oct 2020 11:19:45 +0100 Message-Id: <20201001102010.36861-18-cristian.dumitrescu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20201001102010.36861-1-cristian.dumitrescu@intel.com> References: <20200930063416.68428-2-cristian.dumitrescu@intel.com> <20201001102010.36861-1-cristian.dumitrescu@intel.com> Subject: [dpdk-dev] [PATCH v7 17/42] pipeline: introduce SWX cksub instruction X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" The cksub (i.e. checksum subtract) instruction is used to update the 1's complement sum commonly used by protocols such as IPv4, TCP or UDP. Signed-off-by: Cristian Dumitrescu --- lib/librte_pipeline/rte_swx_pipeline.c | 116 +++++++++++++++++++++++++ 1 file changed, 116 insertions(+) diff --git a/lib/librte_pipeline/rte_swx_pipeline.c b/lib/librte_pipeline/rte_swx_pipeline.c index 96e6c98aa..364c7d75a 100644 --- a/lib/librte_pipeline/rte_swx_pipeline.c +++ b/lib/librte_pipeline/rte_swx_pipeline.c @@ -297,6 +297,12 @@ enum instruction_type { INSTR_ALU_CKADD_FIELD, /* src = H */ INSTR_ALU_CKADD_STRUCT20, /* src = h.header, with sizeof(header) = 20 */ INSTR_ALU_CKADD_STRUCT, /* src = h.hdeader, with any sizeof(header) */ + + /* cksub dst src + * dst = dst '- src + * dst = H, src = H + */ + INSTR_ALU_CKSUB_FIELD, }; struct instr_operand { @@ -3034,6 +3040,36 @@ instr_alu_ckadd_translate(struct rte_swx_pipeline *p, return 0; } +static int +instr_alu_cksub_translate(struct rte_swx_pipeline *p, + struct action *action __rte_unused, + char **tokens, + int n_tokens, + struct instruction *instr, + struct instruction_data *data __rte_unused) +{ + char *dst = tokens[1], *src = tokens[2]; + struct header *hdst, *hsrc; + struct field *fdst, *fsrc; + + CHECK(n_tokens == 3, EINVAL); + + fdst = header_field_parse(p, dst, &hdst); + CHECK(fdst && (fdst->n_bits == 16), EINVAL); + + fsrc = header_field_parse(p, src, &hsrc); + CHECK(fsrc, EINVAL); + + instr->type = INSTR_ALU_CKSUB_FIELD; + instr->alu.dst.struct_id = (uint8_t)hdst->struct_id; + instr->alu.dst.n_bits = fdst->n_bits; + instr->alu.dst.offset = fdst->offset / 8; + instr->alu.src.struct_id = (uint8_t)hsrc->struct_id; + instr->alu.src.n_bits = fsrc->n_bits; + instr->alu.src.offset = fsrc->offset / 8; + return 0; +} + static inline void instr_alu_add_exec(struct rte_swx_pipeline *p) { @@ -3273,6 +3309,77 @@ instr_alu_ckadd_field_exec(struct rte_swx_pipeline *p) thread_ip_inc(p); } +static inline void +instr_alu_cksub_field_exec(struct rte_swx_pipeline *p) +{ + struct thread *t = &p->threads[p->thread_id]; + struct instruction *ip = t->ip; + uint8_t *dst_struct, *src_struct; + uint16_t *dst16_ptr, dst; + uint64_t *src64_ptr, src64, src64_mask, src; + uint64_t r; + + TRACE("[Thread %2u] cksub (field)\n", p->thread_id); + + /* Structs. */ + dst_struct = t->structs[ip->alu.dst.struct_id]; + dst16_ptr = (uint16_t *)&dst_struct[ip->alu.dst.offset]; + dst = *dst16_ptr; + + src_struct = t->structs[ip->alu.src.struct_id]; + src64_ptr = (uint64_t *)&src_struct[ip->alu.src.offset]; + src64 = *src64_ptr; + src64_mask = UINT64_MAX >> (64 - ip->alu.src.n_bits); + src = src64 & src64_mask; + + r = dst; + r = ~r & 0xFFFF; + + /* Subtraction in 1's complement arithmetic (i.e. a '- b) is the same as + * the following sequence of operations in 2's complement arithmetic: + * a '- b = (a - b) % 0xFFFF. + * + * In order to prevent an underflow for the below subtraction, in which + * a 33-bit number (the subtrahend) is taken out of a 16-bit number (the + * minuend), we first add a multiple of the 0xFFFF modulus to the + * minuend. The number we add to the minuend needs to be a 34-bit number + * or higher, so for readability reasons we picked the 36-bit multiple. + * We are effectively turning the 16-bit minuend into a 36-bit number: + * (a - b) % 0xFFFF = (a + 0xFFFF00000 - b) % 0xFFFF. + */ + r += 0xFFFF00000ULL; /* The output r is a 36-bit number. */ + + /* A 33-bit number is subtracted from a 36-bit number (the input r). The + * result (the output r) is a 36-bit number. + */ + r -= (src >> 32) + (src & 0xFFFFFFFF); + + /* The first input is a 16-bit number. The second input is a 20-bit + * number. Their sum is a 21-bit number. + */ + r = (r & 0xFFFF) + (r >> 16); + + /* The first input is a 16-bit number (0 .. 0xFFFF). The second input is + * a 5-bit number (0 .. 31). The sum is a 17-bit number (0 .. 0x1001E). + */ + r = (r & 0xFFFF) + (r >> 16); + + /* When the input r is (0 .. 0xFFFF), the output r is equal to the input + * r, so the output is (0 .. 0xFFFF). When the input r is (0x10000 .. + * 0x1001E), the output r is (0 .. 31). So no carry bit can be + * generated, therefore the output r is always a 16-bit number. + */ + r = (r & 0xFFFF) + (r >> 16); + + r = ~r & 0xFFFF; + r = r ? r : 0xFFFF; + + *dst16_ptr = (uint16_t)r; + + /* Thread. */ + thread_ip_inc(p); +} + static inline void instr_alu_ckadd_struct20_exec(struct rte_swx_pipeline *p) { @@ -3502,6 +3609,14 @@ instr_translate(struct rte_swx_pipeline *p, instr, data); + if (!strcmp(tokens[tpos], "cksub")) + return instr_alu_cksub_translate(p, + action, + &tokens[tpos], + n_tokens - tpos, + instr, + data); + CHECK(0, EINVAL); } @@ -3677,6 +3792,7 @@ static instr_exec_t instruction_table[] = { [INSTR_ALU_CKADD_FIELD] = instr_alu_ckadd_field_exec, [INSTR_ALU_CKADD_STRUCT] = instr_alu_ckadd_struct_exec, [INSTR_ALU_CKADD_STRUCT20] = instr_alu_ckadd_struct20_exec, + [INSTR_ALU_CKSUB_FIELD] = instr_alu_cksub_field_exec, }; static inline void -- 2.17.1