From: x-fn-spp@sl.ntt-tx.co.jp
To: spp@dpdk.org
Date: Tue, 16 Jan 2018 14:16:27 +0900
Message-Id: <201801160516.w0G5GgGH009170@imss03.silk.ntt-tx.co.jp>
In-Reply-To: <3e13a243-6c3f-d849-f2f4-67732e5a44cb@intel.com>
References: <3e13a243-6c3f-d849-f2f4-67732e5a44cb@intel.com>
X-Mailer: git-send-email 1.9.1
Subject: [spp] [PATCH 16/30] doc: add description for explanation section

From: Hiroyuki Nakamura

Add detailed explanation of the essential parts of spp_vf for developers.

Signed-off-by: Naoki Takada
---
 docs/spp_vf/spp_vf.md | 221 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 214 insertions(+), 7 deletions(-)

diff --git a/docs/spp_vf/spp_vf.md b/docs/spp_vf/spp_vf.md
index 786e712..19ea403 100644
--- a/docs/spp_vf/spp_vf.md
+++ b/docs/spp_vf/spp_vf.md
@@ -4,8 +4,8 @@ SPP_VF is a SR-IOV like network functionality using DPDK for NFV.
 
 ## Overview
 
-The application distributes incoming packets depends on MAC address
-similar to SR-IOV functionality.
+The application distributes incoming packets by referring to the virtual MAC
+address, similar to SR-IOV functionality.
 Network configuration is defined in JSON config file which is imported
 while launching the application.
 The configuration is able to change after initialization by sending
@@ -13,14 +13,14 @@ commnad from spp controller.
 
 SPP_VF is a multi-thread application. It consists of manager thread
 and forwarder threads.
-There are three types of forwarder for 1:1, 1:N and N:1.
+There are three types of forwarder, for 1:1, 1:N and N:1, as follows.
 
 * forward: 1:1
 * classifier_mac: 1:N (Destination is determined by MAC address)
 * merge: N:1
 
 This is an example of network configration, in which one classifier_mac,
-one merger and four forwarders are runnig in spp_vf process for two
+one merger and four forwarders are running in the SPP_VF process for two
 destinations of vhost interface.
 Incoming packets from rx on host1 are sent to each of vhosts on guest
 by looking MAC address in the packet..
@@ -95,9 +95,15 @@ file and return json_t object as a result.
 In spp_config_load_file(), configuration of classifier table and
 resource assignment of threads are loaded into config of spp.
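+
+To make the flow concrete, the following is a minimal, self-contained sketch
+of reading a JSON config with jansson's `json_load_file()` and walking a
+classifier table array. It is illustrative only: the function name
+`load_config_example()` and the keys `"classifier_table"` and `"mac"` are
+placeholders, not the actual spp_vf schema or implementation.
+
+  ```c
+  #include <stdio.h>
+  #include <jansson.h>
+
+  /* Illustrative only: load a JSON config and list classifier entries. */
+  static int
+  load_config_example(const char *path)
+  {
+  	json_error_t error;
+  	json_t *root = json_load_file(path, 0, &error);
+  	size_t i;
+
+  	if (root == NULL) {
+  		fprintf(stderr, "config error: %s (line %d)\n",
+  				error.text, error.line);
+  		return -1;
+  	}
+
+  	/* "classifier_table" and "mac" are placeholder key names. */
+  	json_t *table = json_object_get(root, "classifier_table");
+  	for (i = 0; i < json_array_size(table); i++) {
+  		const char *mac = json_string_value(
+  				json_object_get(json_array_get(table, i), "mac"));
+  		printf("entry %zu: mac=%s\n", i, mac ? mac : "(none)");
+  	}
+
+  	json_decref(root);
+  	return 0;
+  }
+  ```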
 
-After importing config, each of threads are launched.
+### Forwarding
+
+SPP_VF supports three types of forwarding: 1:1, 1:N and N:1.
+After importing the config, each forwarding thread is launched
+with `rte_eal_remote_launch()`.
 
   ```c
+  /* spp_vf.c */
+
   /* Start thread */
   unsigned int lcore_id = 0;
   RTE_LCORE_FOREACH_SLAVE(lcore_id) {
@@ -113,9 +119,210 @@ After importing config, each of threads are launched.
   	}
   ```
 
-### Forwarding
+`spp_classifier_mac_do()` is the 1:N forwarding function defined in
+`classifier_mac.c`.
+Destination configuration is managed as table-structured info;
+the `classifier_mac_info` and `classifier_mac_mng_info` structs are
+used for this purpose.
+
+TODO(yasufum) add desc for table structure and it's doubled for
+redundancy.
+
+  ```c
+  /* classifier_mac.c */
+
+  /* classifier information */
+  struct classifier_mac_info {
+  	struct rte_hash *classifier_table;
+  	int num_active_classified;
+  	int active_classifieds[RTE_MAX_ETHPORTS];
+  	int default_classified;
+  };
+
+  /* classifier management information */
+  struct classifier_mac_mng_info {
+  	struct classifier_mac_info info[NUM_CLASSIFIER_MAC_INFO];
+  	volatile int ref_index;
+  	volatile int upd_index;
+  	struct classified_data classified_data[RTE_MAX_ETHPORTS];
+  };
+  ```
+
+In `spp_classifier_mac_do()`, packets are received from the rx port and
+sent to their destinations with `classify_packet()`.
+`classifier_info` is an argument of `classify_packet()` and is used to
+decide the destinations.
+
+  ```c
+  /* classifier_mac.c */
+
+  while(likely(core_info->status == SPP_CORE_IDLE) ||
+  		likely(core_info->status == SPP_CORE_FORWARD)) {
+
+  	while(likely(core_info->status == SPP_CORE_FORWARD)) {
+  		/* change index of update side */
+  		change_update_index(classifier_mng_info, lcore_id);
+
+  		/* decide classifier infomation of the current cycle */
+  		classifier_info = classifier_mng_info->info +
+  				classifier_mng_info->ref_index;
+
+  		/* drain tx packets, if buffer is not filled for interval */
+  		cur_tsc = rte_rdtsc();
+  		if (unlikely(cur_tsc - prev_tsc > drain_tsc)) {
+  			for (i = 0; i < n_classified_data; i++) {
+  				if (unlikely(classified_data[i].num_pkt != 0)) {
+  					RTE_LOG(DEBUG, SPP_CLASSIFIER_MAC,
+  							"transimit packets (drain). "
+  							"index=%d, "
+  							"num_pkt=%hu, "
+  							"interval=%lu\n",
+  							i,
+  							classified_data[i].num_pkt,
+  							cur_tsc - prev_tsc);
+  					transmit_packet(&classified_data[i]);
+  				}
+  			}
+  			prev_tsc = cur_tsc;
+  		}
+
+  		/* retrieve packets */
+  		n_rx = rte_eth_rx_burst(core_info->rx_ports[0].dpdk_port, 0,
+  				rx_pkts, MAX_PKT_BURST);
+  		if (unlikely(n_rx == 0))
+  			continue;
+
+#ifdef SPP_RINGLATENCYSTATS_ENABLE
+  		if (core_info->rx_ports[0].if_type == RING)
+  			spp_ringlatencystats_calculate_latency(
+  					core_info->rx_ports[0].if_no, rx_pkts, n_rx);
+#endif
+
+  		/* classify and transmit (filled) */
+  		classify_packet(rx_pkts, n_rx, classifier_info, classified_data);
+  	}
+  }
+  ```
+
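+The doubled `classifier_mac_info` entries and the `ref_index`/`upd_index`
+fields above appear to form a small double buffer: the loop always reads
+the side pointed to by `ref_index`, so the other side can be rewritten on a
+configuration change and then published by swapping the indices.
+The following is a simplified sketch of that double-buffering idea only,
+with placeholder names, not the actual `change_update_index()` code.
+
+  ```c
+  /* Illustrative double-buffered table, not the real SPP structures. */
+  #define NUM_SIDES 2
+
+  struct table_side {
+  	int num_entries;
+  	int entries[64];
+  };
+
+  struct table_mng {
+  	struct table_side side[NUM_SIDES];
+  	volatile int ref_index;	/* side read by the forwarding loop */
+  	volatile int upd_index;	/* side written on config update */
+  };
+
+  /* Rewrite the unused side, then publish it by swapping the indices. */
+  static void
+  update_table(struct table_mng *mng, const struct table_side *new_side)
+  {
+  	int old_ref = mng->ref_index;
+
+  	mng->side[mng->upd_index] = *new_side;
+  	mng->ref_index = mng->upd_index;	/* readers switch to the new side */
+  	mng->upd_index = old_ref;		/* old side becomes writable */
+  }
+  ```
+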
+On the other hand, `spp_forward` handles 1:1 and N:1 (merge) forwarding
+and is defined in `spp_forward.c`.
+Source and destination ports are decided from `core_info` and given to
+`set_use_interface()`, in which the first argument is the destination
+info and the second one is the source.
+
+  ```c
+  /* spp_forward.c */
+
+  /* RX/TX Info setting */
+  rxtx_num = core_info->num_rx_port;
+  for (if_cnt = 0; if_cnt < rxtx_num; if_cnt++) {
+  	set_use_interface(&patch[if_cnt].rx,
+  			&core_info->rx_ports[if_cnt]);
+  	if (core_info->type == SPP_CONFIG_FORWARD) {
+  		/* FORWARD */
+  		set_use_interface(&patch[if_cnt].tx,
+  				&core_info->tx_ports[if_cnt]);
+  	} else {
+  		/* MERGE */
+  		set_use_interface(&patch[if_cnt].tx,
+  				&core_info->tx_ports[0]);
+  	}
+  }
+  ```
+
+After the ports are decided, forwarding is executed.
+
+  ```c
+  /* spp_forward.c */
+
+  int cnt, nb_rx, nb_tx, buf;
+  struct spp_core_port_info *rx;
+  struct spp_core_port_info *tx;
+  struct rte_mbuf *bufs[MAX_PKT_BURST];
+
+  while (likely(core_info->status == SPP_CORE_IDLE) ||
+  		likely(core_info->status == SPP_CORE_FORWARD)) {
+  	while (likely(core_info->status == SPP_CORE_FORWARD)) {
+  		for (cnt = 0; cnt < rxtx_num; cnt++) {
+  			rx = &patch[cnt].rx;
+  			tx = &patch[cnt].tx;
+
+  			/* Packet receive */
+  			nb_rx = rte_eth_rx_burst(rx->dpdk_port, 0, bufs, MAX_PKT_BURST);
+  			if (unlikely(nb_rx == 0)) {
+  				continue;
+  			}
+
+#ifdef SPP_RINGLATENCYSTATS_ENABLE
+  			if (rx->if_type == RING) {
+  				/* Receive port is RING */
+  				spp_ringlatencystats_calculate_latency(rx->if_no,
+  						bufs, nb_rx);
+  			}
+  			if (tx->if_type == RING) {
+  				/* Send port is RING */
+  				spp_ringlatencystats_add_time_stamp(tx->if_no,
+  						bufs, nb_rx);
+  			}
+#endif /* SPP_RINGLATENCYSTATS_ENABLE */
+
+  			/* Send packet */
+  			nb_tx = rte_eth_tx_burst(tx->dpdk_port, 0, bufs, nb_rx);
+
+  			/* Free any unsent packets. */
+  			if (unlikely(nb_tx < nb_rx)) {
+  				for (buf = nb_tx; buf < nb_rx; buf++) {
+  					rte_pktmbuf_free(bufs[buf]);
+  				}
+  			}
+  		}
+  	}
+  }
+  ```
+
+### L2 Multicast Support
 
-### Packet Cloning
+Multicast, which is used for example to resolve ARP requests, is also
+supported in SPP_VF.
+It is implemented as `handle_l2multicast_packet()`, which is called from
+`classify_packet()` for incoming multicast packets.
+
+  ```c
+  /* classify_packet() in classifier_mac.c */
+
+  /* L2 multicast(include broadcast) ? */
+  if (unlikely(is_multicast_ether_addr(&eth->d_addr))) {
+  	RTE_LOG(DEBUG, SPP_CLASSIFIER_MAC,
+  			"multicast mac address.\n");
+  	handle_l2multicast_packet(rx_pkts[i],
+  			classifier_info, classified_data);
+  	continue;
+  }
+  ```
+
+For distributing a multicast packet, it is cloned by increasing its
+reference count with `rte_mbuf_refcnt_update()` and pushed to every
+active destination.
+
+  ```c
+  /* classifier_mac.c */
+
+  /* handle L2 multicast(include broadcast) packet */
+  static inline void
+  handle_l2multicast_packet(struct rte_mbuf *pkt,
+  		struct classifier_mac_info *classifier_info,
+  		struct classified_data *classified_data)
+  {
+  	int i;
+
+  	if (unlikely(classifier_info->num_active_classified == 0)) {
+  		RTE_LOG(ERR, SPP_CLASSIFIER_MAC, "No mac address.(l2multicast packet)\n");
+  		rte_pktmbuf_free(pkt);
+  		return;
+  	}
+
+  	rte_mbuf_refcnt_update(pkt, classifier_info->num_active_classified);
+
+  	for (i = 0; i < classifier_info->num_active_classified; i++) {
+  		push_packet(pkt, classified_data +
+  				(long)classifier_info->active_classifieds[i]);
+  	}
+  }
+  ```
-- 
1.9.1