From: Harsh Patel
Date: Thu, 22 Nov 2018 21:24:40 +0530
To: "Wiles, Keith"
Cc: Kyle Larose, users@dpdk.org
Subject: Re: [dpdk-users] Query on handling packets

Hi,

Thank you so much for the reply and for the solution.

We used the given code. We were amazed by the pointer arithmetic you used and got to learn something new. But we are still underperforming: the same bottleneck of ~2.5 Mbps is seen.

We also checked whether the raw socket was using any extra (logical) cores compared to DPDK. We found that the raw socket version has 2 logical threads running on 2 logical CPUs, whereas the DPDK version has 6 logical threads on 2 logical CPUs. We also ran the 6 threads on 4 logical CPUs and still see the same bottleneck.

We have updated our code (you can use the same links from the previous mail). It would be helpful if you could help us find what causes the bottleneck.
Thanks and Regards,
Harsh and Hrishikesh

On Mon, Nov 19, 2018, 19:19 Wiles, Keith wrote:

>
> > On Nov 17, 2018, at 4:05 PM, Kyle Larose wrote:
> >
> > On Sat, Nov 17, 2018 at 5:22 AM Harsh Patel wrote:
> >>
> >> Hello,
> >> Thanks a lot for going through the code and providing us with so much
> >> information.
> >> We removed all the memcpy/malloc from the data path as you suggested and
> > ...
> >> After removing this, we are able to see a performance gain but not as
> >> good as raw socket.
> >>
> >
> > You're using an unordered_map to map your buffer pointers back to the
> > mbufs. While it may not do a memcpy all the time, it will likely end up
> > doing a malloc arbitrarily when you insert or remove entries from the
> > map. If it needs to resize the table, it'll be even worse. You may want
> > to consider using librte_hash:
> > https://doc.dpdk.org/api/rte__hash_8h.html instead. Or, even better,
> > see if you can design the system to avoid needing to do a lookup like
> > this. Can you return a handle with the mbuf pointer and the data
> > together?
> >
> > You're also using floating point math where it's unnecessary (the
> > timing check). Just multiply the numerator by 1000000 prior to doing
> > the division. I doubt you'll overflow a uint64_t with that. It's not
> > as efficient as integer math, though I'm not sure offhand it'd cause a
> > major perf problem.
> >
> > One final thing: using a raw socket, the kernel will take over
> > transmitting and receiving to the NIC itself. That means it is free to
> > use multiple CPUs for the rx and tx. I notice that you only have one
> > rx/tx queue, meaning at most one CPU can send and receive packets.
> > When running your performance test with the raw socket, you may want
> > to see how busy the system is doing packet sends and receives. Is it
> > using more than one CPU's worth of processing? Is it using less, but
> > when combined with your main application's usage, the overall system
> > is still using more than one?
>
> Along with the floating point math, I would remove all floating point
> math and use the rte_rdtsc() function to work in cycles.
> Using something like:
>
>     uint64_t cur_tsc, next_tsc, timo = (rte_get_timer_hz() / 16);
>     /* One 16th of a second; use 2/4/8/16/32 (powers of two) to keep the
>        math a simple divide */
>
>     cur_tsc = rte_rdtsc();
>
>     next_tsc = cur_tsc + timo;  /* Now next_tsc is the next time to flush */
>
>     while (1) {
>             cur_tsc = rte_rdtsc();
>             if (cur_tsc >= next_tsc) {
>                     flush();
>                     next_tsc += timo;
>             }
>             /* Do other stuff */
>     }
>
> For the m_bufPktMap I would use the rte_hash, or do not use a hash at all
> by grabbing the buffer address and subtracting:
>
>     mbuf = (struct rte_mbuf *)RTE_PTR_SUB(buf,
>             sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
>
>
>     DpdkNetDevice::Write(uint8_t *buffer, size_t length)
>     {
>             struct rte_mbuf *pkt;
>             uint64_t cur_tsc;
>
>             pkt = (struct rte_mbuf *)RTE_PTR_SUB(buffer,
>                     sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
>
>             /* No need to test pkt, but buffer may be tested above, before
>                the math, to make sure it is not NULL */
>
>             pkt->pkt_len = length;
>             pkt->data_len = length;
>
>             rte_eth_tx_buffer(m_portId, 0, m_txBuffer, pkt);
>
>             cur_tsc = rte_rdtsc();
>
>             /* next_tsc is a private variable */
>             if (cur_tsc >= next_tsc) {
>                     /* hardcoded the queue id, should be fixed */
>                     rte_eth_tx_buffer_flush(m_portId, 0, m_txBuffer);
>                     /* timo is a fixed number of cycles to wait */
>                     next_tsc = cur_tsc + timo;
>             }
>             return length;
>     }
>
>     DpdkNetDevice::Read()
>     {
>             struct rte_mbuf *pkt;
>
>             if (m_rxBuffer->next == m_rxBuffer->length) {
>                     m_rxBuffer->next = 0;
>                     m_rxBuffer->length = rte_eth_rx_burst(m_portId, 0,
>                             m_rxBuffer->pkts, MAX_PKT_BURST);
>
>                     if (m_rxBuffer->length == 0)
>                             return std::make_pair(NULL, -1);
>             }
>
>             pkt = m_rxBuffer->pkts[m_rxBuffer->next++];
>
>             /* do not use rte_pktmbuf_read() as it does a copy for the
>                complete packet */
>
>             return std::make_pair(rte_pktmbuf_mtod(pkt, char *),
>                     pkt->pkt_len);
>     }
>
>     void
>     DpdkNetDevice::FreeBuf(uint8_t *buf)
>     {
>             struct rte_mbuf *pkt;
>
>             if (!buf)
>                     return;
>             pkt = (struct rte_mbuf *)RTE_PTR_SUB(buf,
>                     sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM);
>
>             rte_pktmbuf_free(pkt);
>     }
>
> When your code is done with the buffer, then convert the buffer address
> back to an rte_mbuf pointer and call rte_pktmbuf_free(pkt). This should
> eliminate the copy and the floating point code. Converting my C code to
> C++: priceless :-)
>
> Hopefully the buffer address passed is the original buffer address and has
> not been adjusted.
>
>
> Regards,
> Keith
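
For reference, Kyle's integer-math suggestion above can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the timing check currently divides a TSC cycle delta by the timer frequency in floating point, and the function name FlushIntervalElapsed, the prev_tsc argument, and the 1/16-second interval (borrowed from Keith's example) are all hypothetical.

    #include <stdint.h>
    #include <rte_cycles.h>

    /* True when at least 1/16 of a second of TSC cycles has elapsed since
     * *prev_tsc, using only integer math; updates *prev_tsc when it fires. */
    static inline bool
    FlushIntervalElapsed(uint64_t *prev_tsc)
    {
            uint64_t cur_tsc = rte_rdtsc();
            /* Scale the numerator to microseconds before dividing, as Kyle
             * suggests; the delta would have to span hours to overflow. */
            uint64_t elapsed_us =
                    (cur_tsc - *prev_tsc) * 1000000ULL / rte_get_timer_hz();

            if (elapsed_us < 62500)         /* 62500 us == 1/16 s */
                    return false;
            *prev_tsc = cur_tsc;
            return true;
    }

Likewise, a sketch of the librte_hash table Kyle points to, in case a buffer-to-mbuf lookup is still needed anywhere the RTE_PTR_SUB arithmetic cannot be applied. None of this is from the thread's code: the table name, entry count, and helper names (BufMapInit, BufMapAdd, BufMapRemove) are hypothetical.

    #include <rte_hash.h>
    #include <rte_jhash.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>

    /* Hypothetical replacement for the std::unordered_map (m_bufPktMap):
     * a fixed-size table created once, so per-packet inserts never
     * allocate or resize. */
    static struct rte_hash *g_bufToMbuf;

    static int
    BufMapInit(void)
    {
            struct rte_hash_parameters params = {};

            params.name = "buf2mbuf";
            params.entries = 2048;                  /* max outstanding buffers */
            params.key_len = sizeof(uint8_t *);     /* key is the pointer value */
            params.hash_func = rte_jhash;
            params.socket_id = rte_socket_id();

            g_bufToMbuf = rte_hash_create(&params);
            return g_bufToMbuf != NULL ? 0 : -1;
    }

    /* Remember which mbuf owns a data buffer handed out to the caller. */
    static inline int
    BufMapAdd(uint8_t *buf, struct rte_mbuf *pkt)
    {
            return rte_hash_add_key_data(g_bufToMbuf, &buf, pkt);
    }

    /* Recover (and forget) the owning mbuf when the buffer is released. */
    static inline struct rte_mbuf *
    BufMapRemove(uint8_t *buf)
    {
            void *pkt = NULL;

            if (rte_hash_lookup_data(g_bufToMbuf, &buf, &pkt) < 0)
                    return NULL;
            rte_hash_del_key(g_bufToMbuf, &buf);
            return static_cast<struct rte_mbuf *>(pkt);
    }

The point of creating the table once up front is that rte_hash_add_key_data() writes into pre-allocated buckets, so the per-packet path never calls malloc the way unordered_map insert/erase can; as Keith's reply shows, though, recovering the mbuf with pointer arithmetic avoids the lookup entirely.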