From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <hpatil@qlogic.com>
Received: from mx0b-0016ce01.pphosted.com (mx0b-0016ce01.pphosted.com
 [67.231.156.153]) by dpdk.org (Postfix) with ESMTP id 0CD273787
 for <dev@dpdk.org>; Sun,  8 Nov 2015 20:40:29 +0100 (CET)
Received: from pps.filterd (m0085408.ppops.net [127.0.0.1])
 by mx0b-0016ce01.pphosted.com (8.15.0.59/8.15.0.59) with SMTP id
 tA8Jb3ho001784 for <dev@dpdk.org>; Sun, 8 Nov 2015 11:40:29 -0800
Received: from avcashub1.qlogic.com (avcashub1.qlogic.com [198.70.193.115])
 by mx0b-0016ce01.pphosted.com with ESMTP id 1y1h5uhg4h-1
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT)
 for <dev@dpdk.org>; Sun, 08 Nov 2015 11:40:29 -0800
Received: from avluser05.qlc.com (10.1.113.115) by qlc.com (10.1.4.190) with
 Microsoft SMTP Server id 14.3.235.1; Sun, 8 Nov 2015 11:40:28 -0800
Received: (from hpatil@localhost)	by avluser05.qlc.com (8.14.4/8.14.4/Submit)
 id tA8JeRTp003060;	Sun, 8 Nov 2015 11:40:27 -0800
From: <harish.patil@qlogic.com>
To: <dev@dpdk.org>
Date: Sun, 8 Nov 2015 11:39:56 -0800
Message-ID: <1447011596-2993-1-git-send-email-harish.patil@qlogic.com>
X-Mailer: git-send-email 1.7.10.3
MIME-Version: 1.0
Content-Type: text/plain
X-Proofpoint-Virus-Version: vendor=nai engine=5700 definitions=7979
 signatures=670655
X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0
 clxscore=1015
 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0
 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1507310000
 definitions=main-1511080368
Subject: [dpdk-dev] [PATCH] l3fwd: Fix l3fwd crash due to unaligned
	load/store intrinsics
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Sun, 08 Nov 2015 19:40:30 -0000

From: Harish Patil <harish.patil@qlogic.com>

The l3fwd app expects PMDs to return packets whose L2 header is
16-byte aligned, because it uses the _mm_load_si128()/_mm_store_si128()
intrinsics on that header. However, most protocol stacks expect
packets whose IP/L3 header is aligned on a 16-byte boundary.

Based on the recommendations received on dpdk-dev, we are changing
the l3fwd app to use _mm_loadu_si128()/_mm_storeu_si128() so that the
address need not be 16-byte aligned, which prevents the crash.
Our testing showed no performance impact from this change.

Signed-off-by: Harish Patil <harish.patil@qlogic.com>
---
 examples/l3fwd/main.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)
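
Note for reviewers (not part of the patch): the following is a minimal
standalone sketch of the unaligned load/modify/store pattern the patch
switches to. It is an illustration under stated assumptions, not the l3fwd
code. The names rewrite_macs() and mask_bytes are hypothetical, and the
SSE4.1 _mm_blend_epi16()/MASK_ETH step used by l3fwd is replaced here by an
SSE2 and/andnot/or selection so the sketch builds on baseline x86-64.

```c
#include <assert.h>
#include <emmintrin.h>	/* SSE2: _mm_loadu_si128(), _mm_storeu_si128() */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: overwrite the first 12 bytes
 * (destination + source MAC) at hdr and keep bytes 12..15 intact.
 * hdr may sit at any address; 16 bytes must be readable and writable there.
 */
static void
rewrite_macs(uint8_t *hdr, const uint8_t new_macs[12])
{
	static const uint8_t mask_bytes[16] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
		0x00, 0x00, 0x00, 0x00
	};
	uint8_t ve_bytes[16] = { 0 };
	__m128i mask, te, ve;

	assert(hdr != NULL && new_macs != NULL);
	memcpy(ve_bytes, new_macs, 12);

	mask = _mm_loadu_si128((const __m128i *)mask_bytes);
	ve = _mm_loadu_si128((const __m128i *)ve_bytes);

	/* Unaligned load: valid even when hdr is not a multiple of 16. */
	te = _mm_loadu_si128((const __m128i *)hdr);

	/* Take new MAC bytes where mask is set, original bytes elsewhere. */
	te = _mm_or_si128(_mm_and_si128(mask, ve), _mm_andnot_si128(mask, te));

	/* Unaligned store back to the same address. */
	_mm_storeu_si128((__m128i *)hdr, te);
}
```

Calling rewrite_macs(buf + 2, macs) on a 16-byte aligned buf mimics a PMD
that aligns the L3 header: the 14-byte Ethernet header then starts at
offset 2. With _mm_load_si128()/_mm_store_si128() the same access can fault.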

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 1f3e5c6..4b8b754 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -1220,14 +1220,14 @@ process_packet(struct lcore_conf *qconf, struct rte_mbuf *pkt,
 	dst_ipv4 = rte_be_to_cpu_32(dst_ipv4);
 	dp = get_dst_port(qconf, pkt, dst_ipv4, portid);
 
-	te = _mm_load_si128((__m128i *)eth_hdr);
+	te = _mm_loadu_si128((__m128i *)eth_hdr);
 	ve = val_eth[dp];
 
 	dst_port[0] = dp;
 	rfc1812_process(ipv4_hdr, dst_port, pkt->packet_type);
 
 	te =  _mm_blend_epi16(te, ve, MASK_ETH);
-	_mm_store_si128((__m128i *)eth_hdr, te);
+	_mm_storeu_si128((__m128i *)eth_hdr, te);
 }
 
 /*
@@ -1313,16 +1313,16 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	p[3] = rte_pktmbuf_mtod(pkt[3], __m128i *);
 
 	ve[0] = val_eth[dst_port[0]];
-	te[0] = _mm_load_si128(p[0]);
+	te[0] = _mm_loadu_si128(p[0]);
 
 	ve[1] = val_eth[dst_port[1]];
-	te[1] = _mm_load_si128(p[1]);
+	te[1] = _mm_loadu_si128(p[1]);
 
 	ve[2] = val_eth[dst_port[2]];
-	te[2] = _mm_load_si128(p[2]);
+	te[2] = _mm_loadu_si128(p[2]);
 
 	ve[3] = val_eth[dst_port[3]];
-	te[3] = _mm_load_si128(p[3]);
+	te[3] = _mm_loadu_si128(p[3]);
 
 	/* Update first 12 bytes, keep rest bytes intact. */
 	te[0] =  _mm_blend_epi16(te[0], ve[0], MASK_ETH);
@@ -1330,10 +1330,10 @@ processx4_step3(struct rte_mbuf *pkt[FWDSTEP], uint16_t dst_port[FWDSTEP])
 	te[2] =  _mm_blend_epi16(te[2], ve[2], MASK_ETH);
 	te[3] =  _mm_blend_epi16(te[3], ve[3], MASK_ETH);
 
-	_mm_store_si128(p[0], te[0]);
-	_mm_store_si128(p[1], te[1]);
-	_mm_store_si128(p[2], te[2]);
-	_mm_store_si128(p[3], te[3]);
+	_mm_storeu_si128(p[0], te[0]);
+	_mm_storeu_si128(p[1], te[1]);
+	_mm_storeu_si128(p[2], te[2]);
+	_mm_storeu_si128(p[3], te[3]);
 
 	rfc1812_process((struct ipv4_hdr *)((struct ether_hdr *)p[0] + 1),
 		&dst_port[0], pkt[0]->packet_type);
-- 
1.8.3.1