Date: Fri, 17 Jan 2020 10:10:59 -0800
From: David Christensen
To: Honnappa Nagarahalli, Olivier Matz
Cc: sthemmin@microsoft.com, jerinj@marvell.com, bruce.richardson@intel.com,
 david.marchand@redhat.com, pbhagavatula@marvell.com,
 konstantin.ananyev@intel.com, yipeng1.wang@intel.com, dev@dpdk.org,
 Dharmik Thakkar, Ruifeng Wang, Gavin Hu, nd
Subject: Re: [dpdk-dev] [PATCH v9 2/6] lib/ring: apis to support configurable
 element size
References: <20190906190510.11146-1-honnappa.nagarahalli@arm.com>
 <20200116052511.8557-1-honnappa.nagarahalli@arm.com>
 <20200116052511.8557-3-honnappa.nagarahalli@arm.com>
 <20200117163417.GY22738@platinum>
>>> +static __rte_always_inline void
>>> +enqueue_elems_128(struct rte_ring *r, uint32_t prod_head,
>>> +		const void *obj_table, uint32_t n)
>>> +{
>>> +	unsigned int i;
>>> +	const uint32_t size = r->size;
>>> +	uint32_t idx = prod_head & r->mask;
>>> +	rte_int128_t *ring = (rte_int128_t *)&r[1];
>>> +	const rte_int128_t *obj = (const rte_int128_t *)obj_table;
>>> +	if (likely(idx + n < size)) {
>>> +		for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
>>> +			memcpy((void *)(ring + idx),
>>> +				(const void *)(obj + i), 32);
>>> +		switch (n & 0x1) {
>>> +		case 1:
>>> +			memcpy((void *)(ring + idx),
>>> +				(const void *)(obj + i), 16);
>>> +		}
>>> +	} else {
>>> +		for (i = 0; idx < size; i++, idx++)
>>> +			memcpy((void *)(ring + idx),
>>> +				(const void *)(obj + i), 16);
>>> +		/* Start at the beginning */
>>> +		for (idx = 0; i < n; i++, idx++)
>>> +			memcpy((void *)(ring + idx),
>>> +				(const void *)(obj + i), 16);
>>> +	}
>>> +}
>>> +
>>> +/* the actual enqueue of elements on the ring.
>>> + * Placed here since identical code needed in both
>>> + * single and multi producer enqueue functions.
>>> + */
>>> +static __rte_always_inline void
>>> +enqueue_elems(struct rte_ring *r, uint32_t prod_head,
>>> +		const void *obj_table, uint32_t esize, uint32_t num)
>>> +{
>>> +	/* 8B and 16B copies implemented individually to retain
>>> +	 * the current performance.
>>> +	 */
>>> +	if (esize == 8)
>>> +		enqueue_elems_64(r, prod_head, obj_table, num);
>>> +	else if (esize == 16)
>>> +		enqueue_elems_128(r, prod_head, obj_table, num);
>>> +	else {
>>> +		uint32_t idx, scale, nr_idx, nr_num, nr_size;
>>> +
>>> +		/* Normalize to uint32_t */
>>> +		scale = esize / sizeof(uint32_t);
>>> +		nr_num = num * scale;
>>> +		idx = prod_head & r->mask;
>>> +		nr_idx = idx * scale;
>>> +		nr_size = r->size * scale;
>>> +		enqueue_elems_32(r, nr_size, nr_idx, obj_table, nr_num);
>>> +	}
>>> +}
>>
>> Following Konstantin's comment on v7, enqueue_elems_128() was modified
>> to ensure it won't crash if the object is unaligned. Are we sure that
>> this same problem cannot also occur with 64b copies on all supported
>> architectures? (I mean a 64b access that is only aligned on 32b)
>
> Konstantin mentioned that the 64b load/store instructions on x86 can
> handle unaligned access. On aarch64, the load/store (non-atomic, which
> will be used in this case) can handle unaligned access.
>
> + David Christensen to comment for PPC

The vectorized version of memcpy for Power can handle unaligned access
as well.

Dave