From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ray Kinsella <mdr@ashroe.eu>
To: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran,
 Nithin Kumar Dabilpuram
Cc: "dev@dpdk.org", "thomas@monjalon.net", "david.marchand@redhat.com",
 "mattias.ronnblom@ericsson.com", Kiran Kumar Kokkilagadda
References: <20200318213551.3489504-1-jerinj@marvell.com>
 <20200318213551.3489504-21-jerinj@marvell.com>
 <02c4c25a-83ba-dac5-20e6-7b140cbcb4f1@ashroe.eu>
 <5a99e696-3853-5782-0a4c-0debcc74faa8@ashroe.eu>
Message-ID: <20a3cb35-d57b-4799-f084-919f3f55da6f@ashroe.eu>
Date: Fri, 20 Mar 2020 09:14:19 +0000
Subject: Re: [dpdk-dev] [EXT] Re: [PATCH v1 20/26] node: ipv4 lookup for x86

On 19/03/2020 16:13, Pavan Nikhilesh Bhagavatula wrote:
>
>
>> -----Original Message-----
>> From: Ray Kinsella
>> Sent: Thursday, March 19, 2020 9:21 PM
>> To: Pavan Nikhilesh Bhagavatula; Jerin Jacob Kollanukkaran;
>> Nithin Kumar Dabilpuram
>> Cc: dev@dpdk.org; thomas@monjalon.net; david.marchand@redhat.com;
>> mattias.ronnblom@ericsson.com; Kiran Kumar Kokkilagadda
>> Subject: Re: [EXT] Re: [dpdk-dev] [PATCH v1 20/26] node: ipv4 lookup
>> for x86
>>
>>
>>
>> On 19/03/2020 14:22, Pavan Nikhilesh Bhagavatula wrote:
>>>> On 18/03/2020 21:35, jerinj@marvell.com wrote:
>>>>> From: Pavan Nikhilesh
>>>>>
>>>>> Add an IPv4 lookup process function for the ip4_lookup
>>>>> rte_node. This node performs an LPM lookup on every packet
>>>>> received, using the x86_64 vector-capable RTE_LPM API, and
>>>>> forwards it to the next node identified by the lookup result.
>>>>>
>>>>> Signed-off-by: Pavan Nikhilesh
>>>>> Signed-off-by: Nithin Dabilpuram
>>>>> Signed-off-by: Kiran Kumar K
>>>>> ---
>>>>>  lib/librte_node/ip4_lookup.c | 245 +++++++++++++++++++++++++++++++++++
>>>>>  1 file changed, 245 insertions(+)
>>>>>
>>>>> diff --git a/lib/librte_node/ip4_lookup.c b/lib/librte_node/ip4_lookup.c
>>>>> index d7fcd1158..c003e9c91 100644
>>>>> --- a/lib/librte_node/ip4_lookup.c
>>>>> +++ b/lib/librte_node/ip4_lookup.c
>>>>> @@ -264,6 +264,251 @@ ip4_lookup_node_process(struct rte_graph *graph, struct rte_node *node,
>>>>>  	return nb_objs;
>>>>>  }
>>>>>
>>>>> +#elif defined(RTE_ARCH_X86)
>>>>> +
>>>>> +/* X86 SSE */
>>>>> +static uint16_t
>>>>> +ip4_lookup_node_process(struct rte_graph *graph, struct rte_node *node,
>>>>> +			void **objs, uint16_t nb_objs)
>>>>> +{
>>>>> +	struct rte_mbuf *mbuf0, *mbuf1, *mbuf2, *mbuf3, **pkts;
>>>>> +	rte_edge_t next0, next1, next2, next3, next_index;
>>>>> +	struct rte_ipv4_hdr *ipv4_hdr;
>>>>> +	struct rte_ether_hdr *eth_hdr;
>>>>> +	uint32_t ip0, ip1, ip2, ip3;
>>>>> +	void **to_next, **from;
>>>>> +	uint16_t last_spec = 0;
>>>>> +	uint16_t n_left_from;
>>>>> +	struct rte_lpm *lpm;
>>>>> +	uint16_t held = 0;
>>>>> +	uint32_t drop_nh;
>>>>> +	rte_xmm_t dst;
>>>>> +	__m128i dip; /* SSE register */
>>>>> +	int rc, i;
>>>>> +
>>>>> +	/* Speculative next */
>>>>> +	next_index = RTE_NODE_IP4_LOOKUP_NEXT_REWRITE;
>>>>> +	/* Drop node */
>>>>> +	drop_nh = ((uint32_t)RTE_NODE_IP4_LOOKUP_NEXT_PKT_DROP) << 16;
>>>>> +
>>>>> +	/* Get socket specific LPM from ctx */
>>>>> +	lpm = *((struct rte_lpm **)node->ctx);
>>>>> +
>>>>> +	pkts = (struct rte_mbuf **)objs;
>>>>> +	from = objs;
>>>>> +	n_left_from = nb_objs;
>>>>
>>>> I doubt this initial prefetch of the first 4 packets has any benefit.
>>>
>>> Ack will remove in v2 for x86.
>>>
>>>>
>>>>> +	if (n_left_from >= 4) {
>>>>> +		for (i = 0; i < 4; i++) {
>>>>> +			rte_prefetch0(rte_pktmbuf_mtod(pkts[i],
>>>>> +					struct rte_ether_hdr *) + 1);
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	/* Get stream for the speculated next node */
>>>>> +	to_next = rte_node_next_stream_get(graph, node, next_index, nb_objs);
>>>>
>>>> Suggest you don't reuse the hand-unrolling optimization from FD.io VPP.
>>>> I have never found any performance benefit from it, and it makes
>>>> the code unnecessarily verbose.
>>>>
>>>
>>> How would we take the benefit of rte_lpm_lookupx4 without unrolling the loop?
>>> Also, in future, if we use rte_rib and fib with a CPU supporting wider SIMD we
>>> might need to unroll further (AVX 256 and 512; currently rte_lpm_lookupx4 uses
>>> only 128 bit since it only uses the SSE extension).
>>
>> Let the compiler do it for you, using a constant vector length:
>> for (int i = 0; i < 4; ++i) { ... }
>>
>
> Ok, I think I misunderstood the previous comment.
> It was only for the prefetches in the loop, right?

No, it was for all the needless repetition. Hand-unrolling loops serves no
purpose but to add verbosity.
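To make that concrete, roughly the following (an untested sketch only; it
reuses the rte_node_mbuf_priv1() layout and the rte_lpm_lookupx4() call from
the patch, replaces mbuf0..mbuf3/ip0..ip3 with small arrays, and elides the
prefetches and the next0..next3 speculation handling to keep it short):

	while (n_left_from >= 4) {
		struct rte_mbuf *mbuf[4];
		uint32_t ip[4];

		/* Constant trip count - the compiler can unroll this
		 * itself; no mbuf0..mbuf3 repetition needed.
		 */
		for (i = 0; i < 4; i++) {
			mbuf[i] = pkts[i];
			eth_hdr = rte_pktmbuf_mtod(mbuf[i],
					struct rte_ether_hdr *);
			ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
			ip[i] = ipv4_hdr->dst_addr;
			/* Extract cksum, ttl while the header is hot */
			rte_node_mbuf_priv1(mbuf[i])->cksum =
					ipv4_hdr->hdr_checksum;
			rte_node_mbuf_priv1(mbuf[i])->ttl =
					ipv4_hdr->time_to_live;
		}

		/* The x4 vector lookup is kept exactly as in the patch */
		const __m128i bswap_mask = _mm_set_epi8(
			12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7, 0, 1, 2, 3);
		dip = _mm_shuffle_epi8(_mm_set_epi32(ip[3], ip[2], ip[1], ip[0]),
				       bswap_mask);
		rte_lpm_lookupx4(lpm, dip, dst.u32, drop_nh);

		for (i = 0; i < 4; i++)
			rte_node_mbuf_priv1(mbuf[i])->nh = dst.u32[i] & 0xFFFF;

		pkts += 4;
		n_left_from -= 4;
	}

You keep the benefit of the x4 lookup, at a quarter of the line count.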
>
>>>
>>>>
>>>>> +	while (n_left_from >= 4) {
>>>>> +		/* Prefetch next-next mbufs */
>>>>> +		if (likely(n_left_from >= 11)) {
>>>>> +			rte_prefetch0(pkts[8]);
>>>>> +			rte_prefetch0(pkts[9]);
>>>>> +			rte_prefetch0(pkts[10]);
>>>>> +			rte_prefetch0(pkts[11]);
>>>>> +		}
>>>>> +
>>>>> +		/* Prefetch next mbuf data */
>>>>> +		if (likely(n_left_from >= 7)) {
>>>>> +			rte_prefetch0(rte_pktmbuf_mtod(pkts[4],
>>>>> +					struct rte_ether_hdr *) + 1);
>>>>> +			rte_prefetch0(rte_pktmbuf_mtod(pkts[5],
>>>>> +					struct rte_ether_hdr *) + 1);
>>>>> +			rte_prefetch0(rte_pktmbuf_mtod(pkts[6],
>>>>> +					struct rte_ether_hdr *) + 1);
>>>>> +			rte_prefetch0(rte_pktmbuf_mtod(pkts[7],
>>>>> +					struct rte_ether_hdr *) + 1);
>>>>> +		}
>>>>> +
>>>>> +		mbuf0 = pkts[0];
>>>>> +		mbuf1 = pkts[1];
>>>>> +		mbuf2 = pkts[2];
>>>>> +		mbuf3 = pkts[3];
>>>>> +
>>>>> +		pkts += 4;
>>>>> +		n_left_from -= 4;
>>>>> +
>>>>> +		/* Extract DIP of mbuf0 */
>>>>> +		eth_hdr = rte_pktmbuf_mtod(mbuf0, struct rte_ether_hdr *);
>>>>> +		ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
>>>>> +		ip0 = ipv4_hdr->dst_addr;
>>>>> +		/* Extract cksum, ttl as ipv4 hdr is in cache */
>>>>> +		rte_node_mbuf_priv1(mbuf0)->cksum = ipv4_hdr->hdr_checksum;
>>>>> +		rte_node_mbuf_priv1(mbuf0)->ttl = ipv4_hdr->time_to_live;
>>>>> +
>>>>> +		/* Extract DIP of mbuf1 */
>>>>> +		eth_hdr = rte_pktmbuf_mtod(mbuf1, struct rte_ether_hdr *);
>>>>> +		ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
>>>>> +		ip1 = ipv4_hdr->dst_addr;
>>>>> +		/* Extract cksum, ttl as ipv4 hdr is in cache */
>>>>> +		rte_node_mbuf_priv1(mbuf1)->cksum = ipv4_hdr->hdr_checksum;
>>>>> +		rte_node_mbuf_priv1(mbuf1)->ttl = ipv4_hdr->time_to_live;
>>>>> +
>>>>> +		/* Extract DIP of mbuf2 */
>>>>> +		eth_hdr = rte_pktmbuf_mtod(mbuf2, struct rte_ether_hdr *);
>>>>> +		ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
>>>>> +		ip2 = ipv4_hdr->dst_addr;
>>>>> +		/* Extract cksum, ttl as ipv4 hdr is in cache */
>>>>> +		rte_node_mbuf_priv1(mbuf2)->cksum = ipv4_hdr->hdr_checksum;
>>>>> +		rte_node_mbuf_priv1(mbuf2)->ttl = ipv4_hdr->time_to_live;
>>>>> +
>>>>> +		/* Extract DIP of mbuf3 */
>>>>> +		eth_hdr = rte_pktmbuf_mtod(mbuf3, struct rte_ether_hdr *);
>>>>> +		ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
>>>>> +		ip3 = ipv4_hdr->dst_addr;
>>>>> +
>>>>> +		/* Prepare for lookup x4 */
>>>>> +		dip = _mm_set_epi32(ip3, ip2, ip1, ip0);
>>>>> +
>>>>> +		/* Byte swap 4 IPV4 addresses. */
>>>>> +		const __m128i bswap_mask = _mm_set_epi8(
>>>>> +			12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7, 0, 1, 2, 3);
>>>>> +		dip = _mm_shuffle_epi8(dip, bswap_mask);
>>>>> +
>>>>> +		/* Extract cksum, ttl as ipv4 hdr is in cache */
>>>>> +		rte_node_mbuf_priv1(mbuf3)->cksum = ipv4_hdr->hdr_checksum;
>>>>> +		rte_node_mbuf_priv1(mbuf3)->ttl = ipv4_hdr->time_to_live;
>>>>> +
>>>>> +		/* Perform LPM lookup to get NH and next node */
>>>>> +		rte_lpm_lookupx4(lpm, dip, dst.u32, drop_nh);
>>>>> +
>>>>> +		/* Extract next node id and NH */
>>>>> +		rte_node_mbuf_priv1(mbuf0)->nh = dst.u32[0] & 0xFFFF;
>>>>> +		next0 = (dst.u32[0] >> 16);
>>>>> +
>>>>> +		rte_node_mbuf_priv1(mbuf1)->nh = dst.u32[1] & 0xFFFF;
>>>>> +		next1 = (dst.u32[1] >> 16);
>>>>> +
>>>>> +		rte_node_mbuf_priv1(mbuf2)->nh = dst.u32[2] & 0xFFFF;
>>>>> +		next2 = (dst.u32[2] >> 16);
>>>>> +
>>>>> +		rte_node_mbuf_priv1(mbuf3)->nh = dst.u32[3] & 0xFFFF;
>>>>> +		next3 = (dst.u32[3] >> 16);
>>>>> +
>>>>> +		/* Enqueue four to next node */
>>>>> +		rte_edge_t fix_spec =
>>>>> +			(next_index ^ next0) | (next_index ^ next1) |
>>>>> +			(next_index ^ next2) | (next_index ^ next3);
>>>>> +
>>>>> +		if (unlikely(fix_spec)) {
>>>>> +			/* Copy things successfully speculated till now */
>>>>> +			rte_memcpy(to_next, from, last_spec * sizeof(from[0]));
>>>>> +			from += last_spec;
>>>>> +			to_next += last_spec;
>>>>> +			held += last_spec;
>>>>> +			last_spec = 0;
>>>>> +
>>>>> +			/* Next0 */
>>>>> +			if (next_index == next0) {
>>>>> +				to_next[0] = from[0];
>>>>> +				to_next++;
>>>>> +				held++;
>>>>> +			} else {
>>>>> +				rte_node_enqueue_x1(graph, node, next0,
>>>>> +						    from[0]);
>>>>> +			}
>>>>> +
>>>>> +			/* Next1 */
>>>>> +			if (next_index == next1) {
>>>>> +				to_next[0] = from[1];
>>>>> +				to_next++;
>>>>> +				held++;
>>>>> +			} else {
>>>>> +				rte_node_enqueue_x1(graph, node, next1,
>>>>> +						    from[1]);
>>>>> +			}
>>>>> +
>>>>> +			/* Next2 */
>>>>> +			if (next_index == next2) {
>>>>> +				to_next[0] = from[2];
>>>>> +				to_next++;
>>>>> +				held++;
>>>>> +			} else {
>>>>> +				rte_node_enqueue_x1(graph, node, next2,
>>>>> +						    from[2]);
>>>>> +			}
>>>>> +
>>>>> +			/* Next3 */
>>>>> +			if (next_index == next3) {
>>>>> +				to_next[0] = from[3];
>>>>> +				to_next++;
>>>>> +				held++;
>>>>> +			} else {
>>>>> +				rte_node_enqueue_x1(graph, node, next3,
>>>>> +						    from[3]);
>>>>> +			}
>>>>> +
>>>>> +			from += 4;
>>>>> +
>>>>> +		} else {
>>>>> +			last_spec += 4;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	while (n_left_from > 0) {
>>>>> +		uint32_t next_hop;
>>>>> +
>>>>> +		mbuf0 = pkts[0];
>>>>> +
>>>>> +		pkts += 1;
>>>>> +		n_left_from -= 1;
>>>>> +
>>>>> +		/* Extract DIP of mbuf0 */
>>>>> +		eth_hdr = rte_pktmbuf_mtod(mbuf0, struct rte_ether_hdr *);
>>>>> +		ipv4_hdr = (struct rte_ipv4_hdr *)(eth_hdr + 1);
>>>>> +		/* Extract cksum, ttl as ipv4 hdr is in cache */
>>>>> +		rte_node_mbuf_priv1(mbuf0)->cksum = ipv4_hdr->hdr_checksum;
>>>>> +		rte_node_mbuf_priv1(mbuf0)->ttl = ipv4_hdr->time_to_live;
>>>>> +
>>>>> +		rc = rte_lpm_lookup(lpm, rte_be_to_cpu_32(ipv4_hdr->dst_addr),
>>>>> +				    &next_hop);
>>>>> +		next_hop = (rc == 0) ? next_hop : drop_nh;
>>>>> +
>>>>> +		rte_node_mbuf_priv1(mbuf0)->nh = next_hop & 0xFFFF;
>>>>> +		next0 = (next_hop >> 16);
>>>>> +
>>>>> +		if (unlikely(next_index ^ next0)) {
>>>>> +			/* Copy things successfully speculated till now */
>>>>> +			rte_memcpy(to_next, from, last_spec * sizeof(from[0]));
>>>>> +			from += last_spec;
>>>>> +			to_next += last_spec;
>>>>> +			held += last_spec;
>>>>> +			last_spec = 0;
>>>>> +
>>>>> +			rte_node_enqueue_x1(graph, node, next0, from[0]);
>>>>> +			from += 1;
>>>>> +		} else {
>>>>> +			last_spec += 1;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	/* !!! Home run !!! */
>>>>> +	if (likely(last_spec == nb_objs)) {
>>>>> +		rte_node_next_stream_move(graph, node, next_index);
>>>>> +		return nb_objs;
>>>>> +	}
>>>>> +
>>>>> +	held += last_spec;
>>>>> +	/* Copy things successfully speculated till now */
>>>>> +	rte_memcpy(to_next, from, last_spec * sizeof(from[0]));
>>>>> +	rte_node_next_stream_put(graph, node, next_index, held);
>>>>> +
>>>>> +	return nb_objs;
>>>>> +}
>>>>> +
>>>>>  #else
>>>>>
>>>>>  static uint16_t
>>>>>