From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mx0b-0016ce01.pphosted.com (mx0b-0016ce01.pphosted.com [67.231.156.153])
	by dpdk.org (Postfix) with ESMTP id 0CD273787
	for ; Sun, 8 Nov 2015 20:40:29 +0100 (CET)
Received: from pps.filterd (m0085408.ppops.net [127.0.0.1])
	by mx0b-0016ce01.pphosted.com (8.15.0.59/8.15.0.59) with SMTP id tA8Jb3ho001784
	for ; Sun, 8 Nov 2015 11:40:29 -0800
Received: from avcashub1.qlogic.com (avcashub1.qlogic.com [198.70.193.115])
	by mx0b-0016ce01.pphosted.com with ESMTP id 1y1h5uhg4h-1
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT)
	for ; Sun, 08 Nov 2015 11:40:29 -0800
Received: from avluser05.qlc.com (10.1.113.115) by qlc.com (10.1.4.190)
	with Microsoft SMTP Server id 14.3.235.1; Sun, 8 Nov 2015 11:40:28 -0800
Received: (from hpatil@localhost)
	by avluser05.qlc.com (8.14.4/8.14.4/Submit) id tA8JeRTp003060;
	Sun, 8 Nov 2015 11:40:27 -0800
From:
To:
Date: Sun, 8 Nov 2015 11:39:56 -0800
Message-ID: <1447011596-2993-1-git-send-email-harish.patil@qlogic.com>
X-Mailer: git-send-email 1.7.10.3
MIME-Version: 1.0
Content-Type: text/plain
X-Proofpoint-Virus-Version: vendor=nai engine=5700 definitions=7979 signatures=670655
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0
	clxscore=1015 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0
	bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
	engine=8.0.1-1507310000 definitions=main-1511080368
Subject: [dpdk-dev] [PATCH] l3fwd: Fix l3fwd crash due to unaligned load/store intrinsics
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 08 Nov 2015 19:40:30 -0000

From: Harish Patil

l3fwd app expects PMDs to return packets whose L2 header is 16-byte
aligned due to usage of _mm_load_si128()/_mm_store_si128() intrinsics
in the app.
However, most protocol stacks expect a packet's IP/L3 header to be
aligned on a 16-byte boundary. With a 14-byte Ethernet header in front
of it, the L2 header then cannot also be 16-byte aligned, and the
aligned intrinsics fault on it. Based on the recommendations received
on dpdk-dev, change the l3fwd app to use
_mm_loadu_si128()/_mm_storeu_si128(), which do not require a 16-byte
aligned address, to prevent the crash.

We have tested this change and observed no performance impact.

Signed-off-by: Harish Patil
---
 examples/l3fwd/main.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 1f3e5c6..4b8b754 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -1220,14 +1220,14 @@ process_packet(struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	dst_ipv4 = rte_be_to_cpu_32(dst_ipv4);
 	dp = get_dst_port(qconf, pkt, dst_ipv4, portid);
 
-	te = _mm_load_si128((__m128i *)eth_hdr);
+	te = _mm_loadu_si128((__m128i *)eth_hdr);
 	ve = val_eth[dp];
 
 	dst_port[0] = dp;
 	rfc1812_process(ipv4_hdr, dst_port, pkt->packet_type);
 
 	te = _mm_blend_epi16(te, ve, MASK_ETH);
-	_mm_store_si128((__m128i *)eth_hdr, te);
+	_mm_storeu_si128((__m128i *)eth_hdr, te);
 }
 
 /*
@@ -1313,16 +1313,16 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	p[3] = rte_pktmbuf_mtod(pkt[3], __m128i *);
 
 	ve[0] = val_eth[dst_port[0]];
-	te[0] = _mm_load_si128(p[0]);
+	te[0] = _mm_loadu_si128(p[0]);
 
 	ve[1] = val_eth[dst_port[1]];
-	te[1] = _mm_load_si128(p[1]);
+	te[1] = _mm_loadu_si128(p[1]);
 
 	ve[2] = val_eth[dst_port[2]];
-	te[2] = _mm_load_si128(p[2]);
+	te[2] = _mm_loadu_si128(p[2]);
 
 	ve[3] = val_eth[dst_port[3]];
-	te[3] = _mm_load_si128(p[3]);
+	te[3] = _mm_loadu_si128(p[3]);
 
 	/* Update first 12 bytes, keep rest bytes intact. */
 	te[0] = _mm_blend_epi16(te[0], ve[0], MASK_ETH);
@@ -1330,10 +1330,10 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	te[2] = _mm_blend_epi16(te[2], ve[2], MASK_ETH);
 	te[3] = _mm_blend_epi16(te[3], ve[3], MASK_ETH);
 
-	_mm_store_si128(p[0], te[0]);
-	_mm_store_si128(p[1], te[1]);
-	_mm_store_si128(p[2], te[2]);
-	_mm_store_si128(p[3], te[3]);
+	_mm_storeu_si128(p[0], te[0]);
+	_mm_storeu_si128(p[1], te[1]);
+	_mm_storeu_si128(p[2], te[2]);
+	_mm_storeu_si128(p[3], te[3]);
 
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[0] + 1),
 		&dst_port[0], pkt[0]->packet_type);
-- 
1.8.3.1
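
For illustration only, and not part of the patch: a minimal standalone
sketch of the same load/blend/store sequence on a header that is
deliberately not 16-byte aligned. The names rewrite_macs_unaligned() and
demo_unaligned_rewrite() are hypothetical and do not exist in l3fwd. To
keep the sketch buildable with plain SSE2, it replaces
_mm_blend_epi16(te, ve, MASK_ETH) with an equivalent and/andnot/or mask,
assuming the blend selects the first 12 bytes from ve.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2: _mm_loadu_si128, _mm_storeu_si128 */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of l3fwd: overwrite the first 12 bytes
 * at hdr (destination and source MAC) with the first 12 bytes of
 * new_macs and leave bytes 12..15 untouched. hdr may have any
 * alignment, but 16 bytes starting at hdr must be readable and writable.
 */
static void rewrite_macs_unaligned(uint8_t *hdr, const uint8_t new_macs[16])
{
	/* All ones in the low 12 bytes, zero in the top 4 bytes. */
	const __m128i mask = _mm_set_epi32(0, -1, -1, -1);
	__m128i ve = _mm_loadu_si128((const __m128i *)new_macs);
	__m128i te = _mm_loadu_si128((const __m128i *)hdr);

	/* SSE2 stand-in for the blend: take ve where mask is set, te elsewhere. */
	te = _mm_or_si128(_mm_and_si128(mask, ve), _mm_andnot_si128(mask, te));
	_mm_storeu_si128((__m128i *)hdr, te);
}

/*
 * Place a fake header 2 bytes into a 16-byte aligned buffer, so the
 * header itself is misaligned, then rewrite its MAC bytes.
 * Returns the number of bytes that hold an unexpected value.
 */
static int demo_unaligned_rewrite(void)
{
	_Alignas(16) uint8_t buf[32];
	uint8_t macs[16];
	uint8_t *hdr = buf + 2;
	int bad = 0;
	int i;

	assert(((uintptr_t)hdr & 15) != 0); /* hdr is not 16-byte aligned */

	memset(buf, 0xAA, sizeof(buf));
	for (i = 0; i < 16; i++)
		macs[i] = (uint8_t)(i + 1);

	rewrite_macs_unaligned(hdr, macs);

	for (i = 0; i < 12; i++)   /* MAC bytes replaced */
		bad += (hdr[i] != macs[i]);
	for (i = 12; i < 16; i++)  /* rest of the 16-byte window kept */
		bad += (hdr[i] != 0xAA);
	bad += (buf[0] != 0xAA) + (buf[1] != 0xAA) + (buf[18] != 0xAA);
	return bad;
}
```

Calling demo_unaligned_rewrite() should return 0. Using _mm_load_si128()
or _mm_store_si128() on the same hdr pointer would violate their
16-byte alignment requirement, which is the crash this patch avoids.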