* [dpdk-dev] [PATCH 0/4] Link Bonding mode 6 support (ALB)
From: Michal Jastrzebski @ 2015-01-30 10:57 UTC
To: dev

This patchset adds support for link bonding mode 6 (ALB). It also changes
the ARP header structure (arp_hdr) definition and introduces a basic
example application. In the example, bonding configures each client's ARP
table so that packets from each client are received on a different slave.
Mode 6 uses a round-robin policy to assign a slave to each client IP
address.

Michal Jastrzebski (4):
  net: changed arp_hdr struct declaration.
  bond: added link bonding mode 6 implementation.
  bond: add debug info for mode 6 link bonding
  bond: added example application for link bonding mode 6.

 app/test-pmd/icmpecho.c                    |  27 +-
 config/common_linuxapp                     |   2 +-
 examples/bond/Makefile                     |  57 ++
 examples/bond/main.c                       | 790 ++++++++++++++++++++++++++++
 examples/bond/main.h                       |  46 ++
 lib/librte_net/rte_arp.h                   |  13 +-
 lib/librte_pmd_bond/Makefile               |   1 +
 lib/librte_pmd_bond/rte_eth_bond.h         |   9 +
 lib/librte_pmd_bond/rte_eth_bond_alb.c     | 251 +++++++++
 lib/librte_pmd_bond/rte_eth_bond_alb.h     | 109 ++++
 lib/librte_pmd_bond/rte_eth_bond_api.c     |   6 +
 lib/librte_pmd_bond/rte_eth_bond_args.c    |   1 +
 lib/librte_pmd_bond/rte_eth_bond_pmd.c     | 355 ++++++++++++-
 lib/librte_pmd_bond/rte_eth_bond_private.h |   2 +
 14 files changed, 1623 insertions(+), 46 deletions(-)
 create mode 100644 examples/bond/Makefile
 create mode 100644 examples/bond/main.c
 create mode 100644 examples/bond/main.h
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.c
 create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.h

--
1.7.9.5
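The round-robin assignment described in the cover letter can be sketched
roughly as follows. This is only an illustration under stated assumptions,
not the driver code: struct client_slot, struct rr_table, rr_table_init(),
ip_hash() and rr_assign() are hypothetical names, the table is keyed by an
XOR of the four client IP bytes (the same idea as simple_hash() in patch
2/4), and locking, MAC handling and VLAN headers are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 256          /* one slot per 8-bit hash value */
#define NO_SLAVE   UINT32_MAX   /* hypothetical "nothing assigned yet" marker */

/* Hypothetical, much smaller than the patch's struct client_data. */
struct client_slot {
	uint32_t ip;            /* client IPv4 address */
	uint8_t port;           /* slave port this client should send to */
	uint8_t in_use;         /* non-zero once the slot holds a client */
};

/* Hypothetical container for the client table and round-robin state. */
struct rr_table {
	struct client_slot slots[TABLE_SIZE];
	uint8_t active[8];      /* port ids of the active slaves */
	uint8_t active_count;
	uint32_t last_idx;      /* index into active[] used last time */
};

/* Clear the table and record which slave ports are active. */
static void
rr_table_init(struct rr_table *t, const uint8_t *ports, uint8_t count)
{
	assert(count > 0 && count <= sizeof(t->active));
	memset(t, 0, sizeof(*t));
	memcpy(t->active, ports, count);
	t->active_count = count;
	t->last_idx = NO_SLAVE;
}

/* XOR the four address bytes together to get a table index (0..255). */
static uint8_t
ip_hash(uint32_t ip)
{
	return (uint8_t)(ip ^ (ip >> 8) ^ (ip >> 16) ^ (ip >> 24));
}

/* Return the slave port for a client, assigning the next one on first sight. */
static uint8_t
rr_assign(struct rr_table *t, uint32_t client_ip)
{
	struct client_slot *s = &t->slots[ip_hash(client_ip)];

	if (!s->in_use || s->ip != client_ip) {
		/* NO_SLAVE + 1 wraps to 0, so the first client gets active[0]. */
		t->last_idx = (t->last_idx + 1) % t->active_count;
		s->ip = client_ip;
		s->port = t->active[t->last_idx];
		s->in_use = 1;
	}
	return s->port;
}
```

With two active slaves, successive new clients alternate between them, and
a client that is already in the table keeps its slave. Two clients whose
addresses hash to the same slot evict each other in this sketch, which is
the price of a fixed 256-entry table without chaining.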
* [dpdk-dev] [PATCH 1/4] net: changed arp_hdr struct declaration.
From: Michal Jastrzebski @ 2015-01-30 10:57 UTC
To: dev

Changed MAC address type from uint8_t[6] to struct ether_addr and IP
address type from uint8_t[4] to uint32_t. Also removed union from arp_hdr
struct. Updated test-pmd to match new arp_hdr version.

Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
---
 app/test-pmd/icmpecho.c  | 27 ++++++++++-----------------
 lib/librte_net/rte_arp.h | 13 ++++++-------
 2 files changed, 16 insertions(+), 24 deletions(-)

diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 08ea01d..010c5a9 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -371,18 +371,14 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
 				continue;
 			}
 			if (verbose_level > 0) {
-				memcpy(&eth_addr,
-				       arp_h->arp_data.arp_ip.arp_sha, 6);
+				ether_addr_copy(&arp_h->arp_data.arp_sha, &eth_addr);
 				ether_addr_dump(" sha=", &eth_addr);
-				memcpy(&ip_addr,
-				       arp_h->arp_data.arp_ip.arp_sip, 4);
+				ip_addr = arp_h->arp_data.arp_sip;
 				ipv4_addr_dump(" sip=", ip_addr);
 				printf("\n");
-				memcpy(&eth_addr,
-				       arp_h->arp_data.arp_ip.arp_tha, 6);
+				ether_addr_copy(&arp_h->arp_data.arp_tha, &eth_addr);
 				ether_addr_dump(" tha=", &eth_addr);
-				memcpy(&ip_addr,
-				       arp_h->arp_data.arp_ip.arp_tip, 4);
+				ip_addr = arp_h->arp_data.arp_tip;
 				ipv4_addr_dump(" tip=", ip_addr);
 				printf("\n");
 			}
@@ -402,17 +398,14 @@ reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
 					&eth_h->s_addr);
 
 			arp_h->arp_op = rte_cpu_to_be_16(ARP_OP_REPLY);
-			memcpy(&eth_addr, arp_h->arp_data.arp_ip.arp_tha, 6);
-			memcpy(arp_h->arp_data.arp_ip.arp_tha,
-			       arp_h->arp_data.arp_ip.arp_sha, 6);
-			memcpy(arp_h->arp_data.arp_ip.arp_sha,
-			       &eth_h->s_addr, 6);
+			ether_addr_copy(&arp_h->arp_data.arp_tha, &eth_addr);
+			ether_addr_copy(&arp_h->arp_data.arp_sha, &arp_h->arp_data.arp_tha);
+			ether_addr_copy(&eth_addr, &arp_h->arp_data.arp_sha);
 
 			/* Swap IP addresses in ARP payload */
-			memcpy(&ip_addr, arp_h->arp_data.arp_ip.arp_sip, 4);
-			memcpy(arp_h->arp_data.arp_ip.arp_sip,
-			       arp_h->arp_data.arp_ip.arp_tip, 4);
-			memcpy(arp_h->arp_data.arp_ip.arp_tip, &ip_addr, 4);
+			ip_addr = arp_h->arp_data.arp_sip;
+			arp_h->arp_data.arp_sip = arp_h->arp_data.arp_tip;
+			arp_h->arp_data.arp_tip = ip_addr;
 			pkts_burst[nb_replies++] = pkt;
 			continue;
 		}
diff --git a/lib/librte_net/rte_arp.h b/lib/librte_net/rte_arp.h
index c7b0e51..72108a1 100644
--- a/lib/librte_net/rte_arp.h
+++ b/lib/librte_net/rte_arp.h
@@ -39,6 +39,7 @@
  */
 
 #include <stdint.h>
+#include <rte_ether.h>
 
 #ifdef __cplusplus
 extern "C" {
@@ -48,10 +49,10 @@ extern "C" {
  * ARP header IPv4 payload.
  */
 struct arp_ipv4 {
-	uint8_t arp_sha[6]; /* sender hardware address */
-	uint8_t arp_sip[4]; /* sender IP address */
-	uint8_t arp_tha[6]; /* target hardware address */
-	uint8_t arp_tip[4]; /* target IP address */
+	struct ether_addr arp_sha; /* sender hardware address */
+	uint32_t arp_sip; /* sender IP address */
+	struct ether_addr arp_tha; /* target hardware address */
+	uint32_t arp_tip; /* target IP address */
 } __attribute__((__packed__));
 
 /**
@@ -72,9 +73,7 @@ struct arp_hdr {
 #define ARP_OP_INVREQUEST 8 /* request to identify peer */
 #define ARP_OP_INVREPLY 9 /* response identifying peer */
 
-	union {
-		struct arp_ipv4 arp_ip;
-	} arp_data;
+	struct arp_ipv4 arp_data;
 } __attribute__((__packed__));
 
 #ifdef __cplusplus
--
1.7.9.5
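The effect of the type change in this patch can be checked with a small
standalone sketch. It assumes nothing from DPDK: struct mac_addr is a
stand-in for struct ether_addr, and struct arp_ipv4_sketch and
arp_swap_endpoints() are hypothetical names used only here. The point is
that the packed layout still matches the 20-byte ARP payload for
Ethernet/IPv4, while the fields can now be copied by plain assignment
instead of memcpy() with hand-written lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for DPDK's struct ether_addr so the sketch builds without DPDK. */
struct mac_addr {
	uint8_t addr_bytes[6];
} __attribute__((__packed__));

/* Same field order and types as the reworked struct arp_ipv4. */
struct arp_ipv4_sketch {
	struct mac_addr arp_sha; /* sender hardware address */
	uint32_t arp_sip;        /* sender IP address (network byte order) */
	struct mac_addr arp_tha; /* target hardware address */
	uint32_t arp_tip;        /* target IP address (network byte order) */
} __attribute__((__packed__));

/* The packed layout must still match the 20-byte on-wire payload. */
_Static_assert(sizeof(struct arp_ipv4_sketch) == 20, "unexpected padding");
_Static_assert(offsetof(struct arp_ipv4_sketch, arp_sip) == 6, "sip offset");
_Static_assert(offsetof(struct arp_ipv4_sketch, arp_tha) == 10, "tha offset");
_Static_assert(offsetof(struct arp_ipv4_sketch, arp_tip) == 16, "tip offset");

/* Swap sender and target fields using assignment instead of memcpy(). */
static void
arp_swap_endpoints(struct arp_ipv4_sketch *p)
{
	struct mac_addr mac;
	uint32_t ip;

	assert(p != NULL);
	mac = p->arp_sha;
	ip = p->arp_sip;

	p->arp_sha = p->arp_tha;
	p->arp_sip = p->arp_tip;
	p->arp_tha = mac;
	p->arp_tip = ip;
}
```

The attribute syntax is the GCC/Clang one already used by rte_arp.h.
Without the packed attribute the uint32_t members would be aligned to four
bytes and the static assertions would fail, which is why the patch keeps
it on struct arp_ipv4.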
* [dpdk-dev] [PATCH 2/4] bond: added link bonding mode 6 implementation.
From: Michal Jastrzebski @ 2015-01-30 10:57 UTC
To: dev

This mode combines adaptive transmit load balancing (TLB) with receive
load balancing (RLB). In RLB the bonding driver intercepts ARP replies
sent by the local system and overwrites their source MAC address, so that
different peers send data to the server on different slave interfaces.
When the local system sends an ARP request, the driver saves the IP
information from it. When the ARP reply from that peer is received, the
peer's MAC is stored, one of the slave MACs is assigned, and an ARP reply
is sent to that peer.

Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com> --- lib/librte_pmd_bond/Makefile | 1 + lib/librte_pmd_bond/rte_eth_bond.h | 9 + lib/librte_pmd_bond/rte_eth_bond_alb.c | 251 ++++++++++++++++++++++++++++ lib/librte_pmd_bond/rte_eth_bond_alb.h | 109 ++++++++++++ lib/librte_pmd_bond/rte_eth_bond_api.c | 6 + lib/librte_pmd_bond/rte_eth_bond_args.c | 1 + lib/librte_pmd_bond/rte_eth_bond_pmd.c | 231 ++++++++++++++++++++++--- lib/librte_pmd_bond/rte_eth_bond_private.h | 2 + 8 files changed, 589 insertions(+), 21 deletions(-) create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.c create mode 100644 lib/librte_pmd_bond/rte_eth_bond_alb.h diff --git a/lib/librte_pmd_bond/Makefile b/lib/librte_pmd_bond/Makefile index cdff126..d111f0c 100644 --- a/lib/librte_pmd_bond/Makefile +++ b/lib/librte_pmd_bond/Makefile @@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_api.c SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_pmd.c SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_args.c SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_8023ad.c +SRCS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += rte_eth_bond_alb.c ifeq ($(CONFIG_RTE_MBUF_REFCNT),n) $(info WARNING: Link Bonding Broadcast mode is disabled because it needs MBUF_REFCNT.) diff --git a/lib/librte_pmd_bond/rte_eth_bond.h b/lib/librte_pmd_bond/rte_eth_bond.h index 7177983..13581cb 100644 --- a/lib/librte_pmd_bond/rte_eth_bond.h +++ b/lib/librte_pmd_bond/rte_eth_bond.h @@ -101,6 +101,15 @@ extern "C" { * This mode provides an adaptive transmit load balancing. It dynamically * changes the transmitting slave, according to the computed load. Statistics * are collected in 100ms intervals and scheduled every 10ms */ +#define BONDING_MODE_ALB (6) +/**< Adaptive Load Balancing (Mode 6) + * This mode includes adaptive TLB and receive load balancing (RLB). 
In RLB the + * bonding driver intercepts ARP replies send by local system and overwrites its + * source MAC address, so that different peers send data to the server on + * different slave interfaces. When local system sends ARP request, it saves IP + * information from it. When ARP reply from that peer is received, its MAC is + * stored, one of slave MACs assigned and ARP reply send to that peer. + */ /* Balance Mode Transmit Policies */ #define BALANCE_XMIT_POLICY_LAYER2 (0) diff --git a/lib/librte_pmd_bond/rte_eth_bond_alb.c b/lib/librte_pmd_bond/rte_eth_bond_alb.c new file mode 100644 index 0000000..449b2f8 --- /dev/null +++ b/lib/librte_pmd_bond/rte_eth_bond_alb.c @@ -0,0 +1,251 @@ +#include "rte_eth_bond_private.h" +#include "rte_eth_bond_alb.h" + +static inline uint8_t +simple_hash(uint8_t *hash_start, int hash_size) +{ + int i; + uint8_t hash; + + hash = 0; + for (i = 0; i < hash_size; ++i) + hash ^= hash_start[i]; + + return hash; +} + +static uint8_t +calculate_slave(struct bond_dev_private *internals) +{ + uint8_t idx; + + idx = (internals->mode6.last_slave + 1)%internals->active_slave_count; + return internals->active_slaves[idx]; +} + +int +bond_mode_alb_enable(struct rte_eth_dev *bond_dev) +{ + struct bond_dev_private *internals = bond_dev->data->dev_private; + struct client_data *hash_table = internals->mode6.client_table; + + uint16_t element_size; + char mem_name[RTE_ETH_NAME_MAX_LEN]; + int socket_id = bond_dev->pci_dev->numa_node; + + /* Fill hash table with initial values */ + memset(hash_table, 0, sizeof(struct client_data) * ALB_HASH_TABLE_SIZE); + + internals->mode6.last_slave = ALB_NULL_INDEX; + internals->mode6.ntt = 0; + + /* Initialize memory pool for ARP packets to send */ + if (internals->mode6.mempool == NULL) { + /* + * 256 is size of ETH header, ARP header and nested VLAN headers. + * The value is chosen to be cache aligned. 
+ */ + element_size = 256 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM; + snprintf(mem_name, sizeof(mem_name), "%s_MODE6", bond_dev->data->name); + internals->mode6.mempool = rte_mempool_create(mem_name, + 512 * RTE_MAX_ETHPORTS, + element_size, + RTE_MEMPOOL_CACHE_MAX_SIZE >= 32 ? + 32 : RTE_MEMPOOL_CACHE_MAX_SIZE, + sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init, + NULL, rte_pktmbuf_init, NULL, socket_id, 0); + + if (internals->mode6.mempool == NULL) { + RTE_LOG(ERR, PMD, "%s: Failed to initialize ALB mempool.\n", + bond_dev->data->name); + rte_panic( + "Failed to alocate memory pool ('%s')\n" "for bond device '%s'\n", + mem_name, bond_dev->data->name); + } + } + + return 0; +} + +void +bond_mode_alb_arp_recv(struct ether_hdr *eth_h, uint16_t offset, + struct bond_dev_private *internals) +{ + struct arp_hdr *arp; + + struct client_data *hash_table = internals->mode6.client_table; + struct client_data *client_info; + + uint8_t hash_index; + + arp = (struct arp_hdr *)((char *)(eth_h + 1) + offset); + + hash_index = simple_hash((uint8_t *)&arp->arp_data.arp_sip, + sizeof(uint32_t)); + client_info = &hash_table[hash_index]; + + if (arp->arp_op == rte_cpu_to_be_16(ARP_OP_REPLY)) { + /* + * We got reply for ARP Request send by the application. We need to + * update client table and issue sending update packet to that slave. 
+ */ + rte_spinlock_lock(&internals->mode6.lock); + if (client_info->in_use == 0 || + client_info->app_ip != arp->arp_data.arp_tip || + client_info->cli_ip != arp->arp_data.arp_sip) { + client_info->in_use = 1; + client_info->app_ip = arp->arp_data.arp_tip; + client_info->cli_ip = arp->arp_data.arp_sip; + ether_addr_copy(&arp->arp_data.arp_sha, &client_info->cli_mac); + client_info->slave_idx = calculate_slave(internals); + internals->mode6.last_slave = client_info->slave_idx; + rte_eth_macaddr_get(client_info->slave_idx, &client_info->app_mac); + ether_addr_copy(&client_info->app_mac, &arp->arp_data.arp_tha); + memcpy(client_info->vlan, eth_h + 1, offset); + } else if (!is_same_ether_addr(&client_info->cli_mac, + &arp->arp_data.arp_sha)) { + /* + * We received response to broadcast message and must update + * only client MAC. + */ + ether_addr_copy(&arp->arp_data.arp_sha, &client_info->cli_mac); + } + internals->mode6.ntt = 1; + rte_spinlock_unlock(&internals->mode6.lock); + } + /* ARP Requests are forwarded to the application with no changes */ +} + +uint8_t +bond_mode_alb_arp_xmit(struct ether_hdr *eth_h, uint16_t offset, + struct bond_dev_private *internals) +{ + struct arp_hdr *arp; + + struct client_data *hash_table = internals->mode6.client_table; + struct client_data *client_info; + + uint8_t hash_index; + + struct ether_addr bonding_mac; + + arp = (struct arp_hdr *)((char *)(eth_h + 1) + offset); + + /* + * Traffic with src MAC other than bonding should be sent on + * current primary port. 
+ */ + rte_eth_macaddr_get(internals->port_id, &bonding_mac); + if (!is_same_ether_addr(&bonding_mac, &arp->arp_data.arp_sha)) { + rte_eth_macaddr_get(internals->current_primary_port, + &arp->arp_data.arp_sha); + return internals->current_primary_port; + } + + hash_index = simple_hash((uint8_t *)&arp->arp_data.arp_tip, + sizeof(uint32_t)); + client_info = &hash_table[hash_index]; + + rte_spinlock_lock(&internals->mode6.lock); + if (arp->arp_op == rte_cpu_to_be_16(ARP_OP_REPLY)) { + if (client_info->in_use) { + if (client_info->app_ip == arp->arp_data.arp_sip && + client_info->cli_ip == arp->arp_data.arp_tip) { + /* Entry is already assigned to this client */ + if (!is_broadcast_ether_addr(&arp->arp_data.arp_tha)) { + ether_addr_copy(&arp->arp_data.arp_tha, + &client_info->cli_mac); + } + rte_eth_macaddr_get(client_info->slave_idx, &client_info->app_mac); + ether_addr_copy(&client_info->app_mac, &arp->arp_data.arp_sha); + memcpy(client_info->vlan, eth_h + 1, offset); + rte_spinlock_unlock(&internals->mode6.lock); + return client_info->slave_idx; + } + } + + /* Assign new slave to this client and update src mac in ARP */ + client_info->in_use = 1; + client_info->ntt = 0; + client_info->app_ip = arp->arp_data.arp_sip; + ether_addr_copy(&arp->arp_data.arp_tha, &client_info->cli_mac); + client_info->cli_ip = arp->arp_data.arp_tip; + client_info->slave_idx = calculate_slave(internals); + internals->mode6.last_slave = client_info->slave_idx; + rte_eth_macaddr_get(client_info->slave_idx, &client_info->app_mac); + ether_addr_copy(&client_info->app_mac, &arp->arp_data.arp_sha); + memcpy(client_info->vlan, eth_h + 1, offset); + rte_spinlock_unlock(&internals->mode6.lock); + return client_info->slave_idx; + } + + /* If packet is not ARP Reply, send it on current primary port. 
*/ + rte_spinlock_unlock(&internals->mode6.lock); + rte_eth_macaddr_get(internals->current_primary_port, + &arp->arp_data.arp_sha); + return internals->current_primary_port; +} + +uint8_t +bond_mode_alb_arp_upd(struct client_data *client_info, + struct rte_mbuf *pkt, struct bond_dev_private *internals) +{ + struct ether_hdr *eth_h; + struct arp_hdr *arp_h; + uint8_t slave_idx; + + rte_spinlock_lock(&internals->mode6.lock); + eth_h = rte_pktmbuf_mtod(pkt, struct ether_hdr *); + + ether_addr_copy(&client_info->app_mac, &eth_h->s_addr); + ether_addr_copy(&client_info->cli_mac, &eth_h->d_addr); + eth_h->ether_type = rte_cpu_to_be_16(ETHER_TYPE_ARP); + + arp_h = (struct arp_hdr *)((char *)eth_h + sizeof(struct ether_hdr) + + client_info->vlan_count * sizeof(struct vlan_hdr)); + + memcpy(eth_h + 1, client_info->vlan, + client_info->vlan_count * sizeof(struct vlan_hdr)); + + ether_addr_copy(&client_info->app_mac, &arp_h->arp_data.arp_sha); + arp_h->arp_data.arp_sip = client_info->app_ip; + ether_addr_copy(&client_info->cli_mac, &arp_h->arp_data.arp_tha); + arp_h->arp_data.arp_tip = client_info->cli_ip; + + arp_h->arp_hrd = rte_cpu_to_be_16(ARP_HRD_ETHER); + arp_h->arp_pro = rte_cpu_to_be_16(ETHER_TYPE_IPv4); + arp_h->arp_hln = ETHER_ADDR_LEN; + arp_h->arp_pln = sizeof(uint32_t); + arp_h->arp_op = rte_cpu_to_be_16(ARP_OP_REPLY); + + slave_idx = client_info->slave_idx; + rte_spinlock_unlock(&internals->mode6.lock); + + return slave_idx; +} + +void +bond_mode_alb_client_list_upd(struct rte_eth_dev *bond_dev) +{ + struct bond_dev_private *internals = bond_dev->data->dev_private; + struct client_data *client_info; + + int i; + /* If active slave count is 0, it's pointless to refresh alb table */ + if (internals->active_slave_count <= 0) + return; + + rte_spinlock_lock(&internals->mode6.lock); + internals->mode6.last_slave = ALB_NULL_INDEX; + + for (i = 0; i < ALB_HASH_TABLE_SIZE; i++) { + client_info = &internals->mode6.client_table[i]; + if (client_info->in_use) { +
client_info->slave_idx = calculate_slave(internals); + internals->mode6.last_slave = client_info->slave_idx; + rte_eth_macaddr_get(client_info->slave_idx, &client_info->app_mac); + internals->mode6.ntt = 1; + } + } + rte_spinlock_unlock(&internals->mode6.lock); +} diff --git a/lib/librte_pmd_bond/rte_eth_bond_alb.h b/lib/librte_pmd_bond/rte_eth_bond_alb.h new file mode 100644 index 0000000..0cfe942 --- /dev/null +++ b/lib/librte_pmd_bond/rte_eth_bond_alb.h @@ -0,0 +1,109 @@ +#ifndef RTE_ETH_BOND_ALB_H_ +#define RTE_ETH_BOND_ALB_H_ + +#include <rte_ether.h> +#include <rte_arp.h> + +#define ALB_HASH_TABLE_SIZE 256 +#define ALB_NULL_INDEX 0xFFFFFFFF + +struct client_data { + /** ARP data of single client */ + struct ether_addr app_mac; + /**< MAC address of application running DPDK */ + uint32_t app_ip; + /**< IP address of application running DPDK */ + struct ether_addr cli_mac; + /**< Client MAC address */ + uint32_t cli_ip; + /**< Client IP address */ + + uint8_t slave_idx; + /**< Index of slave on which we connect with that client */ + uint8_t in_use; + /**< Flag indicating if entry in client table is currently used */ + uint8_t ntt; + /**< Flag indicating if we need to send update to this client on next tx */ + + struct vlan_hdr vlan[2]; + /**< Content of vlan headers */ + uint8_t vlan_count; + /**< Number of nested vlan headers */ +}; + +struct mode_alb_private { + struct client_data client_table[ALB_HASH_TABLE_SIZE]; + /**< Hash table storing ARP data of every client connected */ + struct rte_mempool *mempool; + /**< Mempool for creating ARP update packets */ + uint8_t ntt; + /**< Flag indicating if we need to send update to any client on next tx */ + uint32_t last_slave; + /**< Index of last used slave in client table */ + rte_spinlock_t lock; +}; + +/** + * ALB mode initialization. + * + * @param bond_dev Pointer to bonding device. + * + * @return + * Error code - 0 on success. 
+ */ +int +bond_mode_alb_enable(struct rte_eth_dev *bond_dev); + +/** + * Function handles ARP packet reception. If received ARP request, it is + * forwarded to application without changes. If it is ARP reply, client table + * is updated. + * + * @param eth_h ETH header of received packet. + * @param offset Vlan header offset. + * @param internals Bonding data. + */ +void +bond_mode_alb_arp_recv(struct ether_hdr *eth_h, uint16_t offset, + struct bond_dev_private *internals); + +/** + * Function handles ARP packet transmission. It also decides on which slave + * send that packet. If packet is ARP Request, it is send on primary slave. + * If it is ARP Reply, it is send on slave stored in client table for that + * connection. On Reply function also updates data in client table. + * + * @param eth_h ETH header of transmitted packet. + * @param offset Vlan header offset. + * @param internals Bonding data. + * + * @return + * Index of slave on which packet should be sent. + */ +uint8_t +bond_mode_alb_arp_xmit(struct ether_hdr *eth_h, uint16_t offset, + struct bond_dev_private *internals); + +/** + * Function fills packet with ARP data from client_info. + * + * @param client_info Data of client to which packet is sent. + * @param pkt Pointer to packet which is sent. + * @param internals Bonding data. + * + * @return + * Index of slawe on which packet should be sent. + */ +uint8_t +bond_mode_alb_arp_upd(struct client_data *client_info, + struct rte_mbuf *pkt, struct bond_dev_private *internals); + +/** + * Function updates slave indexes of active connections. + * + * @param bond_dev Pointer to bonded device struct. 
+ */ +void +bond_mode_alb_client_list_upd(struct rte_eth_dev *bond_dev); + +#endif /* RTE_ETH_BOND_ALB_H_ */ diff --git a/lib/librte_pmd_bond/rte_eth_bond_api.c b/lib/librte_pmd_bond/rte_eth_bond_api.c index 4ab3267..92ef3ae 100644 --- a/lib/librte_pmd_bond/rte_eth_bond_api.c +++ b/lib/librte_pmd_bond/rte_eth_bond_api.c @@ -120,6 +120,9 @@ activate_slave(struct rte_eth_dev *eth_dev, uint8_t port_id) internals->active_slaves[internals->active_slave_count] = port_id; internals->active_slave_count++; + + if (internals->mode == BONDING_MODE_ALB) + bond_mode_alb_client_list_upd(eth_dev); } void @@ -152,6 +155,9 @@ deactivate_slave(struct rte_eth_dev *eth_dev, uint8_t port_id) if (eth_dev->data->dev_started && internals->mode == BONDING_MODE_8023AD) bond_mode_8023ad_start(eth_dev); + + if (internals->mode == BONDING_MODE_ALB) + bond_mode_alb_client_list_upd(eth_dev); } uint8_t diff --git a/lib/librte_pmd_bond/rte_eth_bond_args.c b/lib/librte_pmd_bond/rte_eth_bond_args.c index ca4de38..a3f7f55 100644 --- a/lib/librte_pmd_bond/rte_eth_bond_args.c +++ b/lib/librte_pmd_bond/rte_eth_bond_args.c @@ -175,6 +175,7 @@ bond_ethdev_parse_slave_mode_kvarg(const char *key __rte_unused, #endif case BONDING_MODE_8023AD: case BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING: + case BONDING_MODE_ALB: return 0; default: RTE_BOND_LOG(ERR, "Invalid slave mode value (%s) specified", value); diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c b/lib/librte_pmd_bond/rte_eth_bond_pmd.c index 8b80297..b0525cc 100644 --- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c +++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c @@ -56,6 +56,42 @@ /* Table for statistics in mode 5 TLB */ static uint64_t tlb_last_obytets[RTE_MAX_ETHPORTS]; +static inline size_t +get_vlan_offset(struct ether_hdr *eth_hdr) +{ + size_t vlan_offset = 0; + + /* Calculate VLAN offset */ + if (rte_cpu_to_be_16(ETHER_TYPE_VLAN) == eth_hdr->ether_type) { + struct vlan_hdr *vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1); + vlan_offset = sizeof(struct 
vlan_hdr); + + while (rte_cpu_to_be_16(ETHER_TYPE_VLAN) == + vlan_hdr->eth_proto) { + vlan_hdr = vlan_hdr + 1; + vlan_offset += sizeof(struct vlan_hdr); + } + } + return vlan_offset; +} + +static uint16_t +get_vlan_ethertype(struct ether_hdr *eth_hdr) +{ + if (rte_cpu_to_be_16(ETHER_TYPE_VLAN) == eth_hdr->ether_type) { + struct vlan_hdr *vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1); + + while (rte_cpu_to_be_16(ETHER_TYPE_VLAN) == + vlan_hdr->eth_proto) { + vlan_hdr = vlan_hdr + 1; + } + + return vlan_hdr->eth_proto; + } else { + return eth_hdr->ether_type; + } +} + static uint16_t bond_ethdev_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -173,6 +209,34 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs, } static uint16_t +bond_ethdev_rx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) +{ + struct bond_tx_queue *bd_tx_q = (struct bond_tx_queue *)queue; + struct bond_dev_private *internals = bd_tx_q->dev_private; + + struct ether_hdr *eth_h; + + uint16_t ether_type, offset; + uint16_t nb_recv_pkts; + + int i; + + nb_recv_pkts = bond_ethdev_rx_burst(queue, bufs, nb_pkts); + + for (i = 0; i < nb_recv_pkts; i++) { + eth_h = rte_pktmbuf_mtod(bufs[i], struct ether_hdr *); + offset = get_vlan_offset(eth_h); + ether_type = get_vlan_ethertype(eth_h); + + if (ether_type == rte_cpu_to_be_16(ETHER_TYPE_ARP)) { + bond_mode_alb_arp_recv(eth_h, offset, internals); + } + } + + return nb_recv_pkts; +} + +static uint16_t bond_ethdev_tx_burst_round_robin(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -281,25 +345,6 @@ ipv6_hash(struct ipv6_hdr *ipv6_hdr) (word_src_addr[3] ^ word_dst_addr[3]); } -static inline size_t -get_vlan_offset(struct ether_hdr *eth_hdr) -{ - size_t vlan_offset = 0; - - /* Calculate VLAN offset */ - if (rte_cpu_to_be_16(ETHER_TYPE_VLAN) == eth_hdr->ether_type) { - struct vlan_hdr *vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1); - vlan_offset = sizeof(struct vlan_hdr); - - while 
(rte_cpu_to_be_16(ETHER_TYPE_VLAN) == - vlan_hdr->eth_proto) { - vlan_hdr = vlan_hdr + 1; - vlan_offset += sizeof(struct vlan_hdr); - } - } - return vlan_offset; -} - uint16_t xmit_l2_hash(const struct rte_mbuf *buf, uint8_t slave_count) { @@ -525,6 +570,134 @@ bond_ethdev_tx_burst_tlb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) } static uint16_t +bond_ethdev_tx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) +{ + struct bond_tx_queue *bd_tx_q = (struct bond_tx_queue *)queue; + struct bond_dev_private *internals = bd_tx_q->dev_private; + + struct ether_hdr *eth_h; + uint16_t ether_type, offset; + + struct client_data *client_info; + + /* + * We create transmit buffers for every slave and one additional to send + * through tlb. In worst case every packet will be send on one port. + */ + struct rte_mbuf *slave_bufs[RTE_MAX_ETHPORTS + 1][nb_pkts]; + uint16_t slave_bufs_pkts[RTE_MAX_ETHPORTS + 1] = { 0 }; + + /* + * We create separate transmit buffers for update packets as they wont be + * counted in num_tx_total. 
+ */ + struct rte_mbuf *update_bufs[RTE_MAX_ETHPORTS][ALB_HASH_TABLE_SIZE]; + uint16_t update_bufs_pkts[RTE_MAX_ETHPORTS] = { 0 }; + + struct rte_mbuf *upd_pkt; + size_t pkt_size; + + uint16_t num_send, num_not_send = 0; + uint16_t num_tx_total = 0; + uint8_t slave_idx; + + int i, j; + + /* Search tx buffer for ARP packets and forward them to alb */ + for (i = 0; i < nb_pkts; i++) { + eth_h = rte_pktmbuf_mtod(bufs[i], struct ether_hdr *); + offset = get_vlan_offset(eth_h); + ether_type = get_vlan_ethertype(eth_h); + + if (ether_type == rte_cpu_to_be_16(ETHER_TYPE_ARP)) { + slave_idx = bond_mode_alb_arp_xmit(eth_h, offset, internals); + + /* Change src mac in eth header */ + rte_eth_macaddr_get(slave_idx, &eth_h->s_addr); + + /* Add packet to slave tx buffer */ + slave_bufs[slave_idx][slave_bufs_pkts[slave_idx]] = bufs[i]; + slave_bufs_pkts[slave_idx]++; + } else { + /* If packet is not ARP, send it with TLB policy */ + slave_bufs[RTE_MAX_ETHPORTS][slave_bufs_pkts[RTE_MAX_ETHPORTS]] = + bufs[i]; + slave_bufs_pkts[RTE_MAX_ETHPORTS]++; + } + } + + /* Update connected client ARP tables */ + if (internals->mode6.ntt) { + for (i = 0; i < ALB_HASH_TABLE_SIZE; i++) { + client_info = &internals->mode6.client_table[i]; + + if (client_info->in_use) { + /* Allocate new packet to send ARP update on current slave */ + upd_pkt = rte_pktmbuf_alloc(internals->mode6.mempool); + if (upd_pkt == NULL) { + RTE_LOG(ERR, PMD, "Failed to allocate ARP packet from pool\n"); + continue; + } + pkt_size = sizeof(struct ether_hdr) + sizeof(struct arp_hdr); + upd_pkt->data_len = pkt_size; + upd_pkt->pkt_len = pkt_size; + + slave_idx = bond_mode_alb_arp_upd(client_info, upd_pkt, + internals); + + /* Add packet to update tx buffer */ + update_bufs[slave_idx][update_bufs_pkts[slave_idx]] = upd_pkt; + update_bufs_pkts[slave_idx]++; + } + } + internals->mode6.ntt = 0; + } + + /* Send ARP packets on proper slaves */ + for (i = 0; i < RTE_MAX_ETHPORTS; i++) { + if (slave_bufs_pkts[i] > 0) { + num_send =
rte_eth_tx_burst(i, bd_tx_q->queue_id, + slave_bufs[i], slave_bufs_pkts[i]); + for (j = 0; j < slave_bufs_pkts[i] - num_send; j++) { + bufs[nb_pkts - 1 - num_not_send - j] = + slave_bufs[i][nb_pkts - 1 - j]; + } + + num_tx_total += num_send; + num_not_send += slave_bufs_pkts[i] - num_send; + } + } + + /* Send update packets on proper slaves */ + for (i = 0; i < RTE_MAX_ETHPORTS; i++) { + if (update_bufs_pkts[i] > 0) { + num_send = rte_eth_tx_burst(i, bd_tx_q->queue_id, update_bufs[i], + update_bufs_pkts[i]); + for (j = num_send; j < update_bufs_pkts[i]; j++) { + rte_pktmbuf_free(update_bufs[i][j]); + } + } + } + + /* Send non-ARP packets using tlb policy */ + if (slave_bufs_pkts[RTE_MAX_ETHPORTS] > 0) { + num_send = bond_ethdev_tx_burst_tlb(queue, + slave_bufs[RTE_MAX_ETHPORTS], + slave_bufs_pkts[RTE_MAX_ETHPORTS]); + + for (j = 0; j < slave_bufs_pkts[RTE_MAX_ETHPORTS]; j++) { + bufs[nb_pkts - 1 - num_not_send - j] = + slave_bufs[RTE_MAX_ETHPORTS][nb_pkts - 1 - j]; + } + + num_tx_total += num_send; + num_not_send += slave_bufs_pkts[RTE_MAX_ETHPORTS] - num_send; + } + + return num_tx_total; +} + +static uint16_t bond_ethdev_tx_burst_balance(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -852,6 +1025,7 @@ mac_address_slaves_update(struct rte_eth_dev *bonded_eth_dev) break; case BONDING_MODE_ACTIVE_BACKUP: case BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING: + case BONDING_MODE_ALB: default: for (i = 0; i < internals->slave_count; i++) { if (internals->slaves[i].port_id == @@ -917,6 +1091,13 @@ bond_ethdev_mode_set(struct rte_eth_dev *eth_dev, int mode) eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_tlb; eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_active_backup; break; + case BONDING_MODE_ALB: + if (bond_mode_alb_enable(eth_dev) != 0) + return -1; + + eth_dev->tx_pkt_burst = bond_ethdev_tx_burst_alb; + eth_dev->rx_pkt_burst = bond_ethdev_rx_burst_alb; + break; default: return -1; } @@ -1132,7 +1313,8 @@ bond_ethdev_start(struct rte_eth_dev *eth_dev) if 
(internals->mode == BONDING_MODE_8023AD) bond_mode_8023ad_start(eth_dev); - if (internals->mode == BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING) + if (internals->mode == BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING || + internals->mode == BONDING_MODE_ALB) bond_ethdev_update_tlb_slave_cb(internals); return 0; @@ -1164,7 +1346,8 @@ bond_ethdev_stop(struct rte_eth_dev *eth_dev) } } - if (internals->mode == BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING) { + if (internals->mode == BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING || + internals->mode == BONDING_MODE_ALB) { rte_eal_alarm_cancel(bond_ethdev_update_tlb_slave_cb, internals); } @@ -1362,8 +1545,12 @@ bond_ethdev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) { struct bond_dev_private *internals = dev->data->dev_private; struct rte_eth_stats slave_stats; + int i; + /* clear bonded stats before populating from slaves */ + memset(stats, 0, sizeof(*stats)); + for (i = 0; i < internals->slave_count; i++) { rte_eth_stats_get(internals->slaves[i].port_id, &slave_stats); @@ -1418,6 +1605,7 @@ bond_ethdev_promiscuous_enable(struct rte_eth_dev *eth_dev) /* Promiscuous mode is propagated only to primary slave */ case BONDING_MODE_ACTIVE_BACKUP: case BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING: + case BONDING_MODE_ALB: default: rte_eth_promiscuous_enable(internals->current_primary_port); } @@ -1447,6 +1635,7 @@ bond_ethdev_promiscuous_disable(struct rte_eth_dev *dev) /* Promiscuous mode is propagated only to primary slave */ case BONDING_MODE_ACTIVE_BACKUP: case BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING: + case BONDING_MODE_ALB: default: rte_eth_promiscuous_disable(internals->current_primary_port); } diff --git a/lib/librte_pmd_bond/rte_eth_bond_private.h b/lib/librte_pmd_bond/rte_eth_bond_private.h index e01e66b..e69e301 100644 --- a/lib/librte_pmd_bond/rte_eth_bond_private.h +++ b/lib/librte_pmd_bond/rte_eth_bond_private.h @@ -43,6 +43,7 @@ extern "C" { #include "rte_eth_bond.h" #include 
"rte_eth_bond_8023ad_private.h" +#include "rte_eth_bond_alb.h" #define PMD_BOND_SLAVE_PORT_KVARG ("slave") #define PMD_BOND_PRIMARY_SLAVE_KVARG ("primary") @@ -152,6 +153,7 @@ struct bond_dev_private { /**< Arary of bonded slaves details */ struct mode8023ad_private mode4; + struct mode_alb_private mode6; uint32_t rx_offload_capa; /** Rx offload capability */ uint32_t tx_offload_capa; /** Tx offload capability */ -- 1.7.9.5 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [dpdk-dev] [PATCH 3/4] bond: add debug info for mode 6 link bonding 2015-01-30 10:57 [dpdk-dev] [PATCH 0/4] Link Bonding mode 6 support (ALB) Michal Jastrzebski 2015-01-30 10:57 ` [dpdk-dev] [PATCH 1/4] net: changed arp_hdr struct declaration Michal Jastrzebski 2015-01-30 10:57 ` [dpdk-dev] [PATCH 2/4] bond: added link bonding mode 6 implementation Michal Jastrzebski @ 2015-01-30 10:57 ` Michal Jastrzebski 2015-01-30 11:09 ` Jastrzebski, MichalX K 2015-01-30 10:57 ` [dpdk-dev] [PATCH 4/4] bond: added example application for link bonding mode 6 Michal Jastrzebski 3 siblings, 1 reply; 7+ messages in thread From: Michal Jastrzebski @ 2015-01-30 10:57 UTC (permalink / raw) To: dev Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com> --- config/common_linuxapp | 2 +- lib/librte_pmd_bond/rte_eth_bond_pmd.c | 124 ++++++++++++++++++++++++++++++++ 2 files changed, 125 insertions(+), 1 deletion(-) diff --git a/config/common_linuxapp b/config/common_linuxapp index 2f9643b..1cc2d7e 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -220,7 +220,7 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n # Compile link bonding PMD library # CONFIG_RTE_LIBRTE_PMD_BOND=y - +CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB=n # # Compile software PMD backed by AF_PACKET sockets (Linux only) # diff --git a/lib/librte_pmd_bond/rte_eth_bond_pmd.c b/lib/librte_pmd_bond/rte_eth_bond_pmd.c index b0525cc..348c653 100644 --- a/lib/librte_pmd_bond/rte_eth_bond_pmd.c +++ b/lib/librte_pmd_bond/rte_eth_bond_pmd.c @@ -208,6 +208,78 @@ bond_ethdev_rx_burst_8023ad(void *queue, struct rte_mbuf **bufs, return num_rx_total; } +#ifdef RTE_LIBRTE_BOND_DEBUG_ALB +uint32_t burstnumberRX; +uint32_t burstnumberTX; + +static void +arp_op_name(uint16_t arp_op, char *buf) +{ + switch (arp_op) { + case ARP_OP_REQUEST: + snprintf(buf, sizeof("ARP Request"), "%s", "ARP Request"); + return; + case ARP_OP_REPLY: + snprintf(buf, sizeof("ARP Reply"), "%s", "ARP Reply"); + return; + case ARP_OP_REVREQUEST: + snprintf(buf, 
sizeof("Reverse ARP Request"), "%s", "Reverse ARP Request"); + return; + case ARP_OP_REVREPLY: + snprintf(buf, sizeof("Reverse ARP Reply"), "%s", "Reverse ARP Reply"); + return; + case ARP_OP_INVREQUEST: + snprintf(buf, sizeof("Peer Identify Request"), "%s", "Peer Identify Request"); + return; + case ARP_OP_INVREPLY: + snprintf(buf, sizeof("Peer Identify Reply"), "%s", "Peer Identify Reply"); + return; + default: + break; + } + snprintf(buf, sizeof("Unknown"), "%s", "Unknown"); + return; +} +#define MaxIPv4String 16 +static void +ipv4_addr_to_dot(uint32_t be_ipv4_addr, char *buf, uint8_t buf_size) +{ + uint32_t ipv4_addr; + + ipv4_addr = rte_be_to_cpu_32(be_ipv4_addr); + snprintf(buf, buf_size, "%d.%d.%d.%d", (ipv4_addr >> 24) & 0xFF, + (ipv4_addr >> 16) & 0xFF, (ipv4_addr >> 8) & 0xFF, + ipv4_addr & 0xFF); +} + +#define MODE6_DEBUG(info, src_ip, dst_ip, eth_h, arp_op, port, burstnumber) \ + RTE_LOG(DEBUG, PMD, info \ + "port:%d " \ + "SrcMAC:%02X:%02X:%02X:%02X:%02X:%02X " \ + "SrcIP:%s " \ + "DstMAC:%02X:%02X:%02X:%02X:%02X:%02X " \ + "DstIP:%s " \ + "%s " \ + "%d\n", \ + port, \ + eth_h->s_addr.addr_bytes[0], \ + eth_h->s_addr.addr_bytes[1], \ + eth_h->s_addr.addr_bytes[2], \ + eth_h->s_addr.addr_bytes[3], \ + eth_h->s_addr.addr_bytes[4], \ + eth_h->s_addr.addr_bytes[5], \ + src_ip, \ + eth_h->d_addr.addr_bytes[0], \ + eth_h->d_addr.addr_bytes[1], \ + eth_h->d_addr.addr_bytes[2], \ + eth_h->d_addr.addr_bytes[3], \ + eth_h->d_addr.addr_bytes[4], \ + eth_h->d_addr.addr_bytes[5], \ + dst_ip, \ + arp_op, \ + ++burstnumber) +#endif + static uint16_t bond_ethdev_rx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) { @@ -222,6 +294,13 @@ bond_ethdev_rx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) int i; nb_recv_pkts = bond_ethdev_rx_burst(queue, bufs, nb_pkts); +#ifdef RTE_LIBRTE_BOND_DEBUG_ALB + struct arp_hdr *arp_h; + struct ipv4_hdr *ipv4_h; + char src_ip[16]; + char dst_ip[16]; + char ArpOp[24]; +#endif for (i = 0; i < 
nb_recv_pkts; i++) { eth_h = rte_pktmbuf_mtod(bufs[i], struct ether_hdr *); @@ -229,8 +308,23 @@ bond_ethdev_rx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) ether_type = get_vlan_ethertype(eth_h); if (ether_type == rte_cpu_to_be_16(ETHER_TYPE_ARP)) { +#ifdef RTE_LIBRTE_BOND_DEBUG_ALB + arp_h = (struct arp_hdr *)((char *)(eth_h + 1) + offset); + ipv4_addr_to_dot(arp_h->arp_data.arp_sip, src_ip, MaxIPv4String); + ipv4_addr_to_dot(arp_h->arp_data.arp_tip, dst_ip, MaxIPv4String); + arp_op_name(rte_be_to_cpu_16(arp_h->arp_op), ArpOp); + MODE6_DEBUG("RX ARP:", src_ip, dst_ip, eth_h, ArpOp, bufs[i]->port, burstnumberRX); +#endif bond_mode_alb_arp_recv(eth_h, offset, internals); } +#ifdef RTE_LIBRTE_BOND_DEBUG_ALB + else if (ether_type == rte_cpu_to_be_16(ETHER_TYPE_IPv4)) { + ipv4_h = (struct ipv4_hdr *)((char *)(eth_h + 1) + offset); + ipv4_addr_to_dot(ipv4_h->src_addr, src_ip, MaxIPv4String); + ipv4_addr_to_dot(ipv4_h->dst_addr, dst_ip, MaxIPv4String); + MODE6_DEBUG("RX IPv4:", src_ip, dst_ip, eth_h, "", bufs[i]->port, burstnumberRX); + } +#endif } return nb_recv_pkts; @@ -653,6 +747,12 @@ bond_ethdev_tx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) internals->mode6.ntt = 0; } +#ifdef RTE_LIBRTE_BOND_DEBUG_ALB + struct arp_hdr *arp_h; + char src_ip[16]; + char dst_ip[16]; + char ArpOp[24]; +#endif /* Send ARP packets on proper slaves */ for (i = 0; i < RTE_MAX_ETHPORTS; i++) { if (slave_bufs_pkts[i] > 0) { @@ -665,6 +765,19 @@ bond_ethdev_tx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) num_tx_total += num_send; num_not_send += slave_bufs_pkts[i] - num_send; + +#ifdef RTE_LIBRTE_BOND_DEBUG_ALB + /* Print TX stats including update packets */ + for (j = 0; j < slave_bufs_pkts[i]; j++) { + eth_h = rte_pktmbuf_mtod(slave_bufs[i][j], struct ether_hdr *); + offset = get_vlan_offset(eth_h); + arp_h = (struct arp_hdr *)((char *)(eth_h + 1) + offset); + ipv4_addr_to_dot(arp_h->arp_data.arp_sip, src_ip, MaxIPv4String); + 
ipv4_addr_to_dot(arp_h->arp_data.arp_tip, dst_ip, MaxIPv4String); + arp_op_name(rte_be_to_cpu_16(arp_h->arp_op), ArpOp); + MODE6_DEBUG("TX ARP:", src_ip, dst_ip, eth_h, ArpOp, i, burstnumberTX); + } +#endif } } @@ -676,6 +789,17 @@ bond_ethdev_tx_burst_alb(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts) for (j = num_send; j < update_bufs_pkts[i]; j++) { rte_pktmbuf_free(update_bufs[i][j]); } +#ifdef RTE_LIBRTE_BOND_DEBUG_ALB + for (j = 0; j < update_bufs_pkts[i]; j++) { + eth_h = rte_pktmbuf_mtod(update_bufs[i][j], struct ether_hdr *); + offset = get_vlan_offset(eth_h); + arp_h = (struct arp_hdr *)((char *)(eth_h + 1) + offset); + ipv4_addr_to_dot(arp_h->arp_data.arp_sip, src_ip, MaxIPv4String); + ipv4_addr_to_dot(arp_h->arp_data.arp_tip, dst_ip, MaxIPv4String); + arp_op_name(rte_be_to_cpu_16(arp_h->arp_op), ArpOp); + MODE6_DEBUG("TX ARPupd:", src_ip, dst_ip, eth_h, ArpOp, i, burstnumberTX); + } +#endif } } -- 1.7.9.5 ^ permalink raw reply [flat|nested] 7+ messages in thread
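Note on the patch above: arp_op_name() bounds each snprintf() by the size of the string literal, not by the size of the caller's buffer, so it depends on the caller always passing a buffer of at least 24 bytes. A minimal standalone sketch of an alternative that returns a pointer to a constant string and needs no buffer at all is shown here. It is not part of the submitted patch; the SKETCH_ARP_OP_* constants and arp_op_to_str() are hypothetical names, with opcode values taken from the ARP, RARP and InARP assignments.

```c
#include <stdint.h>

/* Hypothetical local copies of the ARP opcode values (host byte order). */
enum {
	SKETCH_ARP_OP_REQUEST    = 1, /* ARP request */
	SKETCH_ARP_OP_REPLY      = 2, /* ARP reply */
	SKETCH_ARP_OP_REVREQUEST = 3, /* reverse ARP request */
	SKETCH_ARP_OP_REVREPLY   = 4, /* reverse ARP reply */
	SKETCH_ARP_OP_INVREQUEST = 8, /* inverse ARP: peer identify request */
	SKETCH_ARP_OP_INVREPLY   = 9  /* inverse ARP: peer identify reply */
};

/*
 * Map an ARP opcode (already converted to host byte order) to a
 * human-readable name. Returning a constant string avoids copying into
 * a caller-supplied buffer, so there is no size to get wrong.
 */
static const char *
arp_op_to_str(uint16_t arp_op)
{
	switch (arp_op) {
	case SKETCH_ARP_OP_REQUEST:
		return "ARP Request";
	case SKETCH_ARP_OP_REPLY:
		return "ARP Reply";
	case SKETCH_ARP_OP_REVREQUEST:
		return "Reverse ARP Request";
	case SKETCH_ARP_OP_REVREPLY:
		return "Reverse ARP Reply";
	case SKETCH_ARP_OP_INVREQUEST:
		return "Peer Identify Request";
	case SKETCH_ARP_OP_INVREPLY:
		return "Peer Identify Reply";
	default:
		return "Unknown";
	}
}
```

In the debug path this would be called with the opcode converted from network byte order first, as the patch already does with rte_be_to_cpu_16().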
* Re: [dpdk-dev] [PATCH 3/4] bond: add debug info for mode 6 link bonding 2015-01-30 10:57 ` [dpdk-dev] [PATCH 3/4] bond: add debug info for mode 6 link bonding Michal Jastrzebski @ 2015-01-30 11:09 ` Jastrzebski, MichalX K 0 siblings, 0 replies; 7+ messages in thread From: Jastrzebski, MichalX K @ 2015-01-30 11:09 UTC (permalink / raw) To: dev > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Michal Jastrzebski > Sent: Friday, January 30, 2015 11:58 AM > To: dev@dpdk.org > Subject: [dpdk-dev] [PATCH 3/4] bond: add debug info for mode 6 link > bonding > > [full quote of the patch snipped; it is identical to the PATCH 3/4 message above] This patch adds some debug information for link bonding mode 6. It prints basic information about ARP packets on RX and TX: MAC addresses, IP addresses, ARP packet type, and a running packet number. ^ permalink raw reply [flat|nested] 7+ messages in thread
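To illustrate the kind of line the debug output described above produces, here is a minimal standalone sketch, not the patch's code. It assumes POSIX ntohl() in place of rte_be_to_cpu_32(), plain snprintf() in place of RTE_LOG(), and hypothetical helper names (ipv4_to_dot, mac_to_str, format_debug_line). Unlike the patch, it uses unsigned conversions for unsigned values.

```c
#include <arpa/inet.h> /* ntohl, htonl */
#include <stdint.h>
#include <stdio.h>

#define SKETCH_IPV4_STR_LEN 16 /* "255.255.255.255" plus NUL */
#define SKETCH_MAC_STR_LEN  18 /* "FF:FF:FF:FF:FF:FF" plus NUL */

/* Format an IPv4 address given in network byte order as dotted decimal. */
static void
ipv4_to_dot(uint32_t be_addr, char *buf, size_t buf_size)
{
	uint32_t a = ntohl(be_addr);

	snprintf(buf, buf_size, "%u.%u.%u.%u",
		(unsigned)((a >> 24) & 0xFF), (unsigned)((a >> 16) & 0xFF),
		(unsigned)((a >> 8) & 0xFF), (unsigned)(a & 0xFF));
}

/* Format a 6-byte MAC address as upper-case colon-separated hex. */
static void
mac_to_str(const uint8_t mac[6], char *buf, size_t buf_size)
{
	snprintf(buf, buf_size, "%02X:%02X:%02X:%02X:%02X:%02X",
		(unsigned)mac[0], (unsigned)mac[1], (unsigned)mac[2],
		(unsigned)mac[3], (unsigned)mac[4], (unsigned)mac[5]);
}

/*
 * Build one debug line in the same field order as the patch's macro:
 * tag, port, source MAC/IP, destination MAC/IP, opcode name, counter.
 * The counter is pre-incremented, so the first line carries the value 1.
 * Returns the snprintf() result (length that was or would be written).
 */
static int
format_debug_line(char *out, size_t out_size, const char *tag,
		unsigned port, const uint8_t src_mac[6], uint32_t be_src_ip,
		const uint8_t dst_mac[6], uint32_t be_dst_ip,
		const char *op_name, uint32_t *counter)
{
	char smac[SKETCH_MAC_STR_LEN], dmac[SKETCH_MAC_STR_LEN];
	char sip[SKETCH_IPV4_STR_LEN], dip[SKETCH_IPV4_STR_LEN];

	mac_to_str(src_mac, smac, sizeof(smac));
	mac_to_str(dst_mac, dmac, sizeof(dmac));
	ipv4_to_dot(be_src_ip, sip, sizeof(sip));
	ipv4_to_dot(be_dst_ip, dip, sizeof(dip));

	return snprintf(out, out_size,
		"%sport:%u SrcMAC:%s SrcIP:%s DstMAC:%s DstIP:%s %s %u",
		tag, port, smac, sip, dmac, dip, op_name,
		(unsigned)++(*counter));
}
```

Keeping separate counters for RX and TX, as the patch does with burstnumberRX and burstnumberTX, only requires passing a different counter pointer per direction.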
* [dpdk-dev] [PATCH 4/4] bond: added example application for link bonding mode 6. 2015-01-30 10:57 [dpdk-dev] [PATCH 0/4] Link Bonding mode 6 support (ALB) Michal Jastrzebski ` (2 preceding siblings ...) 2015-01-30 10:57 ` [dpdk-dev] [PATCH 3/4] bond: add debug info for mode 6 link bonding Michal Jastrzebski @ 2015-01-30 10:57 ` Michal Jastrzebski 2015-01-30 11:27 ` Jastrzebski, MichalX K 3 siblings, 1 reply; 7+ messages in thread From: Michal Jastrzebski @ 2015-01-30 10:57 UTC (permalink / raw) To: dev Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com> Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com> --- examples/bond/Makefile | 57 ++++ examples/bond/main.c | 790 ++++++++++++++++++++++++++++++++++++++++++++++++ examples/bond/main.h | 46 +++ 3 files changed, 893 insertions(+) create mode 100644 examples/bond/Makefile create mode 100644 examples/bond/main.c create mode 100644 examples/bond/main.h diff --git a/examples/bond/Makefile b/examples/bond/Makefile new file mode 100644 index 0000000..9262249 --- /dev/null +++ b/examples/bond/Makefile @@ -0,0 +1,57 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +ifeq ($(RTE_SDK),) +$(error "Please define RTE_SDK environment variable") +endif + +# Default target, can be overriden by command line or environment +RTE_TARGET ?= x86_64-native-linuxapp-gcc + +include $(RTE_SDK)/mk/rte.vars.mk + +# binary name +APP = bond_app + +# all source are stored in SRCS-y +SRCS-y := main.c + +CFLAGS += $(WERROR_FLAGS) + +# workaround for a gcc bug with noreturn attribute +# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603 +ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y) +CFLAGS_main.o += -Wno-return-type +endif + +CFLAGS += -O3 + +include $(RTE_SDK)/mk/rte.extapp.mk diff --git a/examples/bond/main.c b/examples/bond/main.c new file mode 100644 index 0000000..57cc672 --- /dev/null +++ b/examples/bond/main.c @@ -0,0 +1,790 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include <stdint.h> +#include <sys/queue.h> +#include <stdlib.h> +#include <string.h> +#include <stdio.h> +#include <assert.h> +#include <errno.h> +#include <signal.h> +#include <stdarg.h> +#include <inttypes.h> +#include <getopt.h> +#include <termios.h> +#include <unistd.h> +#include <pthread.h> + +#include <rte_common.h> +#include <rte_log.h> +#include <rte_memory.h> +#include <rte_memcpy.h> +#include <rte_memzone.h> +#include <rte_tailq.h> +#include <rte_eal.h> +#include <rte_per_lcore.h> +#include <rte_launch.h> +#include <rte_atomic.h> +#include <rte_cycles.h> +#include <rte_prefetch.h> +#include <rte_lcore.h> +#include <rte_per_lcore.h> +#include <rte_branch_prediction.h> +#include <rte_interrupts.h> +#include <rte_pci.h> +#include <rte_random.h> +#include <rte_debug.h> +#include <rte_ether.h> +#include <rte_ethdev.h> +#include <rte_ring.h> +#include <rte_log.h> +#include <rte_mempool.h> +#include <rte_mbuf.h> +#include <rte_memcpy.h> +#include <rte_ip.h> +#include <rte_tcp.h> +#include <rte_arp.h> +#include <rte_spinlock.h> + +#include <cmdline_rdline.h> +#include <cmdline_parse.h> +#include <cmdline_parse_num.h> +#include <cmdline_parse_string.h> +#include <cmdline_parse_ipaddr.h> +#include <cmdline_parse_etheraddr.h> +#include <cmdline_socket.h> +#include <cmdline.h> + +#include "main.h" + +#include <rte_devargs.h> + + +#include "rte_byteorder.h" +#include "rte_cpuflags.h" +#include "rte_eth_bond.h" + +#define RTE_LOGTYPE_DCB RTE_LOGTYPE_USER1 + +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) +#define NB_MBUF (1024*8) + +#define MAX_PKT_BURST 32 +#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */ +#define BURST_RX_INTERVAL_NS (10) /* RX poll interval ~100ns */ + +/* + * RX and TX Prefetch, Host, and Write-back threshold values should be + * carefully set for optimal performance. Consult the network + * controller's datasheet and supporting DPDK documentation for guidance + * on how these parameters should be set. 
+ */ +#define RX_PTHRESH 8 /**< Default values of RX prefetch threshold reg. */ +#define RX_HTHRESH 8 /**< Default values of RX host threshold reg. */ +#define RX_WTHRESH 4 /**< Default values of RX write-back threshold reg. */ +#define RX_FTHRESH (MAX_PKT_BURST * 2)/**< Default values of RX free threshold reg. */ + +/* + * These default values are optimized for use with the Intel(R) 82599 10 GbE + * Controller and the DPDK ixgbe PMD. Consider using other values for other + * network controllers and/or network drivers. + */ +#define TX_PTHRESH 36 /**< Default values of TX prefetch threshold reg. */ +#define TX_HTHRESH 0 /**< Default values of TX host threshold reg. */ +#define TX_WTHRESH 0 /**< Default values of TX write-back threshold reg. */ + +/* + * Configurable number of RX/TX ring descriptors + */ +#define RTE_RX_DESC_DEFAULT 128 +#define RTE_TX_DESC_DEFAULT 512 + +#define BOND_IP_1 7 +#define BOND_IP_2 0 +#define BOND_IP_3 0 +#define BOND_IP_4 10 + +/* not defined under linux */ +#ifndef NIPQUAD +#define NIPQUAD_FMT "%u.%u.%u.%u" +#define NIPQUAD(addr) \ + (unsigned)((unsigned char *)&addr)[0], \ + (unsigned)((unsigned char *)&addr)[1], \ + (unsigned)((unsigned char *)&addr)[2], \ + (unsigned)((unsigned char *)&addr)[3] +#endif + +#define MAX_PORTS 4 +#define PRINT_MAC(addr) printf("%02"PRIx8":%02"PRIx8":%02"PRIx8 \ + ":%02"PRIx8":%02"PRIx8":%02"PRIx8, \ + addr.addr_bytes[0], addr.addr_bytes[1], addr.addr_bytes[2], \ + addr.addr_bytes[3], addr.addr_bytes[4], addr.addr_bytes[5]) + +uint8_t slaves[RTE_MAX_ETHPORTS]; +uint8_t slaves_count; + +static uint8_t BOND_PORT = 0xff; + +static struct rte_mempool *mbuf_pool; + +/* + * RX and TX Prefetch, Host, and Write-back threshold values should be + * carefully set for optimal performance. Consult the network + * controller's datasheet and supporting DPDK documentation for guidance + * on how these parameters should be set. + */ +/* Default configuration for rx and tx thresholds etc. 
*/ +static const struct rte_eth_rxconf rx_conf_default = { + .rx_thresh = { + .pthresh = RX_PTHRESH, + .hthresh = RX_HTHRESH, + .wthresh = RX_WTHRESH, + + }, + .rx_free_thresh = RX_FTHRESH, +}; + +/* + * These default values are optimized for use with the Intel(R) 82599 10 GbE + * Controller and the DPDK ixgbe PMD. Consider using other values for other + * network controllers and/or network drivers. + */ +static const struct rte_eth_txconf tx_conf_default = { + .tx_thresh = { + .pthresh = TX_PTHRESH, + .hthresh = TX_HTHRESH, + .wthresh = TX_WTHRESH, + }, + .tx_free_thresh = 0, /* Use PMD default values */ + .tx_rs_thresh = 0, /* Use PMD default values */ + .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS | ETH_TXQ_FLAGS_NOOFFLOADS, +}; + +static struct rte_eth_conf port_conf = { + .rxmode = { + .mq_mode = ETH_MQ_RX_NONE, + .max_rx_pkt_len = ETHER_MAX_LEN, + .split_hdr_size = 0, + .header_split = 0, /**< Header Split disabled */ + .hw_ip_checksum = 0, /**< IP checksum offload enabled */ + .hw_vlan_filter = 0, /**< VLAN filtering disabled */ + .jumbo_frame = 0, /**< Jumbo Frame Support disabled */ + .hw_strip_crc = 0, /**< CRC stripped by hardware */ + }, + .rx_adv_conf = { + .rss_conf = { + .rss_key = NULL, + .rss_hf = ETH_RSS_IP, + }, + }, + .txmode = { + .mq_mode = ETH_MQ_TX_NONE, + }, +}; + +static void +slave_port_init(uint8_t portid, struct rte_mempool *mbuf_pool) +{ + int retval; + + if (portid >= rte_eth_dev_count()) + rte_exit(EXIT_FAILURE, "Invalid port\n"); + + retval = rte_eth_dev_configure(portid, 1, 1, &port_conf); + if (retval != 0) + rte_exit(EXIT_FAILURE, "port %u: configuration failed (res=%d)\n", \ + portid, retval); + + /* RX setup */ + retval = rte_eth_rx_queue_setup(portid, 0, RTE_RX_DESC_DEFAULT, + rte_eth_dev_socket_id(portid), &rx_conf_default, + mbuf_pool); + if (retval < 0) + rte_exit(retval, " port %u: RX queue 0 setup failed (res=%d)", \ + portid, retval); + + /* TX setup */ + retval = rte_eth_tx_queue_setup(portid, 0, RTE_TX_DESC_DEFAULT, + 
rte_eth_dev_socket_id(portid), &tx_conf_default); + + if (retval < 0) + rte_exit(retval, "port %u: TX queue 0 setup failed (res=%d)", \ + portid, retval); + + retval = rte_eth_dev_start(portid); + if (retval < 0) + rte_exit(retval, \ + "Start port %d failed (res=%d)", \ + portid, retval); + + struct ether_addr addr; + rte_eth_macaddr_get(portid, &addr); + printf("Port %u MAC: ", (unsigned)portid); + PRINT_MAC(addr); + printf("\n"); +} + +static void +bond_port_init(struct rte_mempool *mbuf_pool) +{ + int retval; + uint8_t i; + + retval = rte_eth_bond_create("bond0", BONDING_MODE_ALB, 0 /*SOCKET_ID_ANY*/); + if (retval < 0) + rte_exit(EXIT_FAILURE, \ + "Faled to create bond port\n"); + + BOND_PORT = (uint8_t)retval; + + retval = rte_eth_dev_configure(BOND_PORT, 1, 1, &port_conf); + if (retval != 0) + rte_exit(EXIT_FAILURE, "port %u: configuration failed (res=%d)\n", \ + BOND_PORT, retval); + + /* RX setup */ + retval = rte_eth_rx_queue_setup(BOND_PORT, 0, RTE_RX_DESC_DEFAULT, + rte_eth_dev_socket_id(BOND_PORT), &rx_conf_default, + mbuf_pool); + if (retval < 0) + rte_exit(retval, " port %u: RX queue 0 setup failed (res=%d)", \ + BOND_PORT, retval); + + /* TX setup */ + retval = rte_eth_tx_queue_setup(BOND_PORT, 0, RTE_TX_DESC_DEFAULT, + rte_eth_dev_socket_id(BOND_PORT), &tx_conf_default); + + if (retval < 0) + rte_exit(retval, "port %u: TX queue 0 setup failed (res=%d)", \ + BOND_PORT, retval); + + for (i = 0; i < slaves_count; i++) { + if (rte_eth_bond_slave_add(BOND_PORT, slaves[i]) == -1) + rte_exit(-1, "Oooops! 
adding slave (%u) to bond (%u) failed!\n", \ + slaves[i], BOND_PORT); + + } + + retval = rte_eth_dev_start(BOND_PORT); + if (retval < 0) + rte_exit(retval, "Start port %d failed (res=%d)", BOND_PORT, retval); + + rte_eth_promiscuous_enable(BOND_PORT); + + struct ether_addr addr; + rte_eth_macaddr_get(BOND_PORT, &addr); + printf("Port %u MAC: ", (unsigned)BOND_PORT); + PRINT_MAC(addr); + printf("\n"); +} + +struct global_flag_stru_t { + int LcoreMainIsRunning; + int LcoreMainCore; + uint32_t port_packets[4]; + rte_spinlock_t lock; +}; +struct global_flag_stru_t global_flag_stru; +struct global_flag_stru_t *global_flag_stru_p = &global_flag_stru; + +/* + * Main thread that does the work, reading from INPUT_PORT + * and writing to OUTPUT_PORT + */ +static int lcore_main(__attribute__((unused)) void *arg1) +{ + struct rte_mbuf *pkts[MAX_PKT_BURST] __rte_cache_aligned; + struct ether_addr d_addr; + + struct ether_hdr *eth_hdr; + struct arp_hdr *arp_hdr; + + uint16_t rx_cnt; + uint32_t bond_ip; + int i = 0; + uint8_t is_free; + + bond_ip = BOND_IP_1 | (BOND_IP_2 << 8) | + (BOND_IP_3 << 16) | (BOND_IP_4 << 24); + + rte_spinlock_trylock(&global_flag_stru_p->lock); + + while (global_flag_stru_p->LcoreMainIsRunning) { + rte_spinlock_unlock(&global_flag_stru_p->lock); + rx_cnt = rte_eth_rx_burst(BOND_PORT, 0, pkts, MAX_PKT_BURST); + is_free = 0; + + /* If didn't receive any packets, wait and go to next iteration */ + if (rx_cnt == 0) { + rte_delay_us(50); + continue; + } + + /* Search incoming data for ARP packets and prepare response */ + for (i = 0; i < rx_cnt; i++) { + if (rte_spinlock_trylock(&global_flag_stru_p->lock) == 1) { + global_flag_stru_p->port_packets[0]++; + rte_spinlock_unlock(&global_flag_stru_p->lock); + } + eth_hdr = rte_pktmbuf_mtod(pkts[i], struct ether_hdr *); + if (eth_hdr->ether_type == rte_cpu_to_be_16(ETHER_TYPE_ARP)) { + if (rte_spinlock_trylock(&global_flag_stru_p->lock) == 1) { + global_flag_stru_p->port_packets[1]++; + 
rte_spinlock_unlock(&global_flag_stru_p->lock); + } + arp_hdr = (struct arp_hdr *)((char *)eth_hdr + sizeof(struct ether_hdr)); + if (arp_hdr->arp_data.arp_tip == bond_ip) { + if (arp_hdr->arp_op == rte_cpu_to_be_16(ARP_OP_REQUEST)) { + arp_hdr->arp_op = rte_cpu_to_be_16(ARP_OP_REPLY); + + /* Switch src and dst data and set bonding MAC */ + ether_addr_copy(&eth_hdr->s_addr, &eth_hdr->d_addr); + rte_eth_macaddr_get(BOND_PORT, &eth_hdr->s_addr); + ether_addr_copy(&arp_hdr->arp_data.arp_sha, &arp_hdr->arp_data.arp_tha); + arp_hdr->arp_data.arp_tip = arp_hdr->arp_data.arp_sip; + rte_eth_macaddr_get(BOND_PORT, &d_addr); + ether_addr_copy(&d_addr, &arp_hdr->arp_data.arp_sha); + arp_hdr->arp_data.arp_sip = bond_ip; + rte_eth_tx_burst(BOND_PORT, 0, &pkts[i], 1); + is_free = 1; + } else { + rte_eth_tx_burst(BOND_PORT, 0, NULL, 0); + } + } + } + + /* Free processed packets */ + if (is_free == 0) + rte_pktmbuf_free(pkts[i]); + } + rte_spinlock_trylock(&global_flag_stru_p->lock); + } + rte_spinlock_unlock(&global_flag_stru_p->lock); + printf("BYE lcore_main\n"); + return 0; +} + +/**********************************************************/ +/**********************************************************/ +struct cmd_obj_send_result { + cmdline_fixed_string_t action; + uint32_t count; + cmdline_ipaddr_t ip; +}; + +static void cmd_obj_send_parsed(void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + + struct cmd_obj_send_result *res = parsed_result; + char ip_str[INET6_ADDRSTRLEN]; + + struct rte_mbuf *created_pkt; + struct ether_hdr *eth_hdr; + struct arp_hdr *arp_hdr; + + uint32_t bond_ip; + size_t pkt_size; + + if (res->ip.family == AF_INET) + snprintf(ip_str, sizeof(ip_str), NIPQUAD_FMT, + NIPQUAD(res->ip.addr.ipv4)); + else + cmdline_printf(cl, "Wrong IP format. 
Only IPv4 is supported\n"); + + bond_ip = BOND_IP_1 | (BOND_IP_2 << 8) | + (BOND_IP_3 << 16) | (BOND_IP_4 << 24); + + created_pkt = rte_pktmbuf_alloc(mbuf_pool); + while (created_pkt == NULL); + + pkt_size = sizeof(struct ether_hdr) + sizeof(struct arp_hdr); + created_pkt->data_len = pkt_size; + created_pkt->pkt_len = pkt_size; + + eth_hdr = rte_pktmbuf_mtod(created_pkt, struct ether_hdr *); + rte_eth_macaddr_get(BOND_PORT, &eth_hdr->s_addr); + memset(&eth_hdr->d_addr, 0xFF, ETHER_ADDR_LEN); + eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_ARP); + + arp_hdr = (struct arp_hdr *)((char *)eth_hdr + sizeof(struct ether_hdr)); + arp_hdr->arp_hrd = rte_cpu_to_be_16(ARP_HRD_ETHER); + arp_hdr->arp_pro = rte_cpu_to_be_16(ETHER_TYPE_IPv4); + arp_hdr->arp_hln = ETHER_ADDR_LEN; + arp_hdr->arp_pln = sizeof(uint32_t); + arp_hdr->arp_op = rte_cpu_to_be_16(ARP_OP_REQUEST); + + rte_eth_macaddr_get(BOND_PORT, &arp_hdr->arp_data.arp_sha); + arp_hdr->arp_data.arp_sip = bond_ip; + memset(&arp_hdr->arp_data.arp_tha, 0, ETHER_ADDR_LEN); + arp_hdr->arp_data.arp_tip = + ((unsigned char *)&res->ip.addr.ipv4)[0] | + (((unsigned char *)&res->ip.addr.ipv4)[1] << 8) | + (((unsigned char *)&res->ip.addr.ipv4)[2] << 16) | + (((unsigned char *)&res->ip.addr.ipv4)[3] << 24); + rte_eth_tx_burst(BOND_PORT, 0, &created_pkt, 1); + + rte_delay_ms(500); + cmdline_printf(cl, "\n"); +} + +cmdline_parse_token_string_t cmd_obj_action_send = + TOKEN_STRING_INITIALIZER(struct cmd_obj_send_result, action, "send"); +cmdline_parse_token_num_t cmd_obj_count = + TOKEN_NUM_INITIALIZER(struct cmd_obj_send_result, count, UINT32); +cmdline_parse_token_ipaddr_t cmd_obj_ip = + TOKEN_IPADDR_INITIALIZER(struct cmd_obj_send_result, ip); + +cmdline_parse_inst_t cmd_obj_send = { + .f = cmd_obj_send_parsed, /* function to call */ + .data = NULL, /* 2nd arg of func */ + .help_str = "send arp", + .tokens = { /* token list, NULL terminated */ + (void *)&cmd_obj_action_send, + (void *)&cmd_obj_count, + (void *)&cmd_obj_ip, + NULL, + 
}, +}; + +/**********************************************************/ + + +/**********************************************************/ +struct cmd_start_result { + cmdline_fixed_string_t start; +}; + +static void cmd_start_parsed(__attribute__((unused)) void *parsed_result, + struct cmdline *cl, + __attribute__((unused)) void *data) +{ + int slave_core_id = rte_lcore_id(); + + rte_spinlock_trylock(&global_flag_stru_p->lock); + if (global_flag_stru_p->LcoreMainIsRunning == 0) { + if (lcore_config[global_flag_stru_p->LcoreMainCore].state != WAIT) { + rte_spinlock_unlock(&global_flag_stru_p->lock); + return; + } + rte_spinlock_unlock(&global_flag_stru_p->lock); + } else { + cmdline_printf(cl, "lcore_main already running on core:%d\n", + global_flag_stru_p->LcoreMainCore); + rte_spinlock_unlock(&global_flag_stru_p->lock); + return; + } + + /* start lcore main on core != master_core - ARP response thread */ + slave_core_id = rte_get_next_lcore(rte_lcore_id(), 1, 0); + if ((slave_core_id >= RTE_MAX_LCORE) || (slave_core_id == 0)) + return; + + rte_spinlock_trylock(&global_flag_stru_p->lock); + global_flag_stru_p->LcoreMainIsRunning = 1; + rte_spinlock_unlock(&global_flag_stru_p->lock); + cmdline_printf(cl, "Starting lcore_main on core %d:%d " + "Our IP:%d.%d.%d.%d\n", + slave_core_id, + rte_eal_remote_launch(lcore_main, NULL, slave_core_id), + BOND_IP_1, + BOND_IP_2, + BOND_IP_3, + BOND_IP_4 + ); +} + +cmdline_parse_token_string_t cmd_start_start = + TOKEN_STRING_INITIALIZER(struct cmd_start_result, start, "start"); + +cmdline_parse_inst_t cmd_start = { + .f = cmd_start_parsed, /* function to call */ + .data = NULL, /* 2nd arg of func */ + .help_str = "starts listening if not started at startup", + .tokens = { /* token list, NULL terminated */ + (void *)&cmd_start_start, + NULL, + }, +}; + +/**********************************************************/ + +struct cmd_help_result { + cmdline_fixed_string_t help; +}; + +static void cmd_help_parsed(__attribute__((unused)) 
void *parsed_result, + struct cmdline *cl, + __attribute__((unused)) void *data) +{ + cmdline_printf(cl, + "ALB - link bonding mode 6 example\n" + "send COUNT IP - sends COUNT ARPrequests thru bonding for IP.\n" + "start - starts listening ARPs.\n" + "stop - stops lcore_main.\n" + "show - shows some bond info: ex. active slaves etc.\n" + "help - prints help.\n" + "quit - terminate all threads and quit.\n" + ); +} + +cmdline_parse_token_string_t cmd_help_help = + TOKEN_STRING_INITIALIZER(struct cmd_help_result, help, "help"); + +cmdline_parse_inst_t cmd_help = { + .f = cmd_help_parsed, /* function to call */ + .data = NULL, /* 2nd arg of func */ + .help_str = "show help", + .tokens = { /* token list, NULL terminated */ + (void *)&cmd_help_help, + NULL, + }, +}; + +/**********************************************************/ +struct cmd_stop_result { + cmdline_fixed_string_t stop; +}; + +static void cmd_stop_parsed(__attribute__((unused)) void *parsed_result, + struct cmdline *cl, + __attribute__((unused)) void *data) +{ + rte_spinlock_trylock(&global_flag_stru_p->lock); + if (global_flag_stru_p->LcoreMainIsRunning == 0) { + cmdline_printf(cl, + "lcore_main not running on core:%d\n", + global_flag_stru_p->LcoreMainCore); + rte_spinlock_unlock(&global_flag_stru_p->lock); + return; + } + global_flag_stru_p->LcoreMainIsRunning = 0; + rte_eal_wait_lcore(global_flag_stru_p->LcoreMainCore); + cmdline_printf(cl, + "lcore_main stopped on core:%d\n", + global_flag_stru_p->LcoreMainCore); + rte_spinlock_unlock(&global_flag_stru_p->lock); +} + +cmdline_parse_token_string_t cmd_stop_stop = + TOKEN_STRING_INITIALIZER(struct cmd_stop_result, stop, "stop"); + +cmdline_parse_inst_t cmd_stop = { + .f = cmd_stop_parsed, /* function to call */ + .data = NULL, /* 2nd arg of func */ + .help_str = "this command do not handle any arguments", + .tokens = { /* token list, NULL terminated */ + (void *)&cmd_stop_stop, + NULL, + }, +}; 
+/**********************************************************/ + +/**********************************************************/ +struct cmd_quit_result { + cmdline_fixed_string_t quit; +}; + +static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result, + struct cmdline *cl, + __attribute__((unused)) void *data) +{ + cmdline_printf(cl, + "quit - for quit just do ctrl+d\n" + ); + exit(EXIT_SUCCESS); +} + +cmdline_parse_token_string_t cmd_quit_quit = + TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit"); + +cmdline_parse_inst_t cmd_quit = { + .f = cmd_quit_parsed, /* function to call */ + .data = NULL, /* 2nd arg of func */ + .help_str = "this command do not handle any arguments", + .tokens = { /* token list, NULL terminated */ + (void *)&cmd_quit_quit, + NULL, + }, +}; +/**********************************************************/ + +/**********************************************************/ +struct cmd_show_result { + cmdline_fixed_string_t show; +}; + +static void cmd_show_parsed(__attribute__((unused)) void *parsed_result, + struct cmdline *cl, + __attribute__((unused)) void *data) +{ + uint8_t slaves[16] = {0}; + uint8_t len = 16; + struct ether_addr addr; + uint8_t i = 0; + + while (i < slaves_count) { + rte_eth_macaddr_get(i, &addr); + PRINT_MAC(addr); + if (i == BOND_PORT) + printf(" - current primary slave"); + printf("\n"); + i++; + } + rte_eth_macaddr_get(i, &addr); + PRINT_MAC(addr); + if (i == BOND_PORT) + printf(" - current primary slave"); + printf("\n"); + rte_spinlock_trylock(&global_flag_stru_p->lock); + cmdline_printf(cl, + "Active_slaves:%d " + "packets received:Tot:%d Arp:%d\n", + rte_eth_bond_active_slaves_get(BOND_PORT, slaves, len), + global_flag_stru_p->port_packets[0], + global_flag_stru_p->port_packets[1]); + rte_spinlock_unlock(&global_flag_stru_p->lock); +} + +cmdline_parse_token_string_t cmd_show_show = + TOKEN_STRING_INITIALIZER(struct cmd_show_result, show, "show"); + +cmdline_parse_inst_t cmd_show = { + .f = 
cmd_show_parsed, /* function to call */ + .data = NULL, /* 2nd arg of func */ + .help_str = "this command do not handle any arguments", + .tokens = { /* token list, NULL terminated */ + (void *)&cmd_show_show, + NULL, + }, +}; +/**********************************************************/ + +/**********************************************************/ +/****** CONTEXT (list of instruction) */ + +cmdline_parse_ctx_t main_ctx[] = { + (cmdline_parse_inst_t *)&cmd_start, + (cmdline_parse_inst_t *)&cmd_obj_send, + (cmdline_parse_inst_t *)&cmd_stop, + (cmdline_parse_inst_t *)&cmd_show, + (cmdline_parse_inst_t *)&cmd_quit, + (cmdline_parse_inst_t *)&cmd_help, + NULL, +}; + +/* prompt function, called from main on MASTER lcore */ +static void *prompt(__attribute__((unused)) void *arg1) +{ + struct cmdline *cl; + + cl = cmdline_stdin_new(main_ctx, "bond6>"); + if (cl != NULL) { + cmdline_interact(cl); + cmdline_stdin_exit(cl); + } +} + +/* Main function, does initialisation and calls the per-lcore functions */ +int +MAIN(int argc, char *argv[]) +{ + int ret; + uint8_t nb_ports, i; + + /* init EAL */ + ret = rte_eal_init(argc, argv); + rte_eal_devargs_dump(stdout); + if (ret < 0) + rte_exit(EXIT_FAILURE, "Error with EAL initialization\n"); + argc -= ret; + argv += ret; + + nb_ports = rte_eth_dev_count(); + if (nb_ports == 0) + rte_exit(EXIT_FAILURE, "Give at least one port\n"); + else if (nb_ports > MAX_PORTS) + rte_exit(EXIT_FAILURE, "You can have max 4 ports\n"); + + mbuf_pool = rte_mempool_create("MBUF_POOL", NB_MBUF, + MBUF_SIZE, 32, + sizeof(struct rte_pktmbuf_pool_private), + rte_pktmbuf_pool_init, NULL, + rte_pktmbuf_init, NULL, + rte_socket_id(), 0); + if (mbuf_pool == NULL) + rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n"); + + /* initialize all ports */ + slaves_count = nb_ports; + for (i = 0; i < nb_ports; i++) { + slave_port_init(i, mbuf_pool); + slaves[i] = i; + } + + bond_port_init(mbuf_pool); + + rte_spinlock_init(&global_flag_stru_p->lock); + int 
slave_core_id = rte_lcore_id(); + + /* check state of lcores */ + RTE_LCORE_FOREACH_SLAVE(slave_core_id) { + if (lcore_config[slave_core_id].state != WAIT) + return -EBUSY; + } + /* start lcore main on core != master_core - ARP response thread */ + slave_core_id = rte_get_next_lcore(rte_lcore_id(), 1, 0); + if ((slave_core_id >= RTE_MAX_LCORE) || (slave_core_id == 0)) + return -EPERM; + + global_flag_stru_p->LcoreMainIsRunning = 1; + global_flag_stru_p->LcoreMainCore = slave_core_id; + printf("Starting lcore_main on core %d:%d Our IP:%d.%d.%d.%d\n", + slave_core_id, + rte_eal_remote_launch((lcore_function_t *)lcore_main, + NULL, + slave_core_id), + BOND_IP_1, + BOND_IP_2, + BOND_IP_3, + BOND_IP_4 + ); + + /* Start prompt for user interact */ + prompt(NULL); + + + return 0; +} diff --git a/examples/bond/main.h b/examples/bond/main.h new file mode 100644 index 0000000..2682d15 --- /dev/null +++ b/examples/bond/main.h @@ -0,0 +1,46 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _MAIN_H_ +#define _MAIN_H_ + + +#ifdef RTE_EXEC_ENV_BAREMETAL +#define MAIN _main +#else +#define MAIN main +#endif + +int MAIN(int argc, char *argv[]); + +#endif /* ifndef _MAIN_H_ */ -- 1.7.9.5 ^ permalink raw reply [flat|nested] 7+ messages in thread
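A note on the address packing in examples/bond/main.c above: bond_ip is built as BOND_IP_1 | (BOND_IP_2 << 8) | (BOND_IP_3 << 16) | (BOND_IP_4 << 24), which puts the first octet at the lowest memory address only on a little-endian host. The sketch below is not part of the patch. It assumes arp_sip and arp_tip hold the address in network byte order, and ipv4_wire, ipv4_octet and ipv4_shifted are hypothetical helper names used only to illustrate a byte-order-independent way to get the same on-wire layout.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): build a uint32_t whose bytes in
 * memory are a, b, c, d in that order, i.e. a.b.c.d in network byte order,
 * regardless of host endianness.
 */
static inline uint32_t
ipv4_wire(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	const uint8_t octets[4] = { a, b, c, d };
	uint32_t ip;

	memcpy(&ip, octets, sizeof(ip));
	return ip;
}

/* Hypothetical helper: read back octet idx (0..3) as it sits in memory. */
static inline uint8_t
ipv4_octet(uint32_t ip, unsigned int idx)
{
	uint8_t octets[4];

	memcpy(octets, &ip, sizeof(octets));
	return octets[idx];
}

/*
 * The shift-and-or form used by the example, written with unsigned
 * operands. Its numeric value is fixed; its memory layout depends on
 * the host byte order.
 */
static inline uint32_t
ipv4_shifted(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return (uint32_t)a | ((uint32_t)b << 8) |
	       ((uint32_t)c << 16) | ((uint32_t)d << 24);
}
```

With the example's octets, ipv4_wire(BOND_IP_1, BOND_IP_2, BOND_IP_3, BOND_IP_4) lays out 7.0.0.10 in memory on any host, while ipv4_shifted produces the same layout only on a little-endian host.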
* Re: [dpdk-dev] [PATCH 4/4] bond: added example application for link bonding mode 6. 2015-01-30 10:57 ` [dpdk-dev] [PATCH 4/4] bond: added example application for link bonding mode 6 Michal Jastrzebski @ 2015-01-30 11:27 ` Jastrzebski, MichalX K 0 siblings, 0 replies; 7+ messages in thread From: Jastrzebski, MichalX K @ 2015-01-30 11:27 UTC (permalink / raw) To: dev > -----Original Message----- > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Michal Jastrzebski > Sent: Friday, January 30, 2015 11:58 AM > To: dev@dpdk.org > Subject: [dpdk-dev] [PATCH 4/4] bond: added example application for link > bonding mode 6. > > > Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com> > Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com> > --- > examples/bond/Makefile | 57 ++++ > examples/bond/main.c | 790 > ++++++++++++++++++++++++++++++++++++++++++++++++ > examples/bond/main.h | 46 +++ > 3 files changed, 893 insertions(+) > create mode 100644 examples/bond/Makefile > create mode 100644 examples/bond/main.c > create mode 100644 examples/bond/main.h > > diff --git a/examples/bond/Makefile b/examples/bond/Makefile > new file mode 100644 > index 0000000..9262249 > --- /dev/null > +++ b/examples/bond/Makefile > @@ -0,0 +1,57 @@ > +# BSD LICENSE > +# > +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. > +# All rights reserved. > +# > +# Redistribution and use in source and binary forms, with or without > +# modification, are permitted provided that the following conditions > +# are met: > +# > +# * Redistributions of source code must retain the above copyright > +# notice, this list of conditions and the following disclaimer. > +# * Redistributions in binary form must reproduce the above copyright > +# notice, this list of conditions and the following disclaimer in > +# the documentation and/or other materials provided with the > +# distribution. 
> +# * Neither the name of Intel Corporation nor the names of its > +# contributors may be used to endorse or promote products derived > +# from this software without specific prior written permission. > +# > +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND > CONTRIBUTORS > +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT > NOT > +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND > FITNESS FOR > +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE > COPYRIGHT > +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, > INCIDENTAL, > +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT > NOT > +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS > OF USE, > +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED > AND ON ANY > +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR > TORT > +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF > THE USE > +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH > DAMAGE. > + > +ifeq ($(RTE_SDK),) > +$(error "Please define RTE_SDK environment variable") > +endif > + > +# Default target, can be overriden by command line or environment > +RTE_TARGET ?= x86_64-native-linuxapp-gcc > + > +include $(RTE_SDK)/mk/rte.vars.mk > + > +# binary name > +APP = bond_app > + > +# all source are stored in SRCS-y > +SRCS-y := main.c > + > +CFLAGS += $(WERROR_FLAGS) > + > +# workaround for a gcc bug with noreturn attribute > +# http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12603 > +ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y) > +CFLAGS_main.o += -Wno-return-type > +endif > + > +CFLAGS += -O3 > + > +include $(RTE_SDK)/mk/rte.extapp.mk > diff --git a/examples/bond/main.c b/examples/bond/main.c > new file mode 100644 > index 0000000..57cc672 > --- /dev/null > +++ b/examples/bond/main.c > @@ -0,0 +1,790 @@ > +/*- > + * BSD LICENSE > + * > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. > + * All rights reserved. 
> + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of Intel Corporation nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND > CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT > NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND > FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE > COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, > INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT > NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS > OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED > AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR > TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF > THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH > DAMAGE. 
> + */ > + > +#include <stdint.h> > +#include <sys/queue.h> > +#include <stdlib.h> > +#include <string.h> > +#include <stdio.h> > +#include <assert.h> > +#include <errno.h> > +#include <signal.h> > +#include <stdarg.h> > +#include <inttypes.h> > +#include <getopt.h> > +#include <termios.h> > +#include <unistd.h> > +#include <pthread.h> > + > +#include <rte_common.h> > +#include <rte_log.h> > +#include <rte_memory.h> > +#include <rte_memcpy.h> > +#include <rte_memzone.h> > +#include <rte_tailq.h> > +#include <rte_eal.h> > +#include <rte_per_lcore.h> > +#include <rte_launch.h> > +#include <rte_atomic.h> > +#include <rte_cycles.h> > +#include <rte_prefetch.h> > +#include <rte_lcore.h> > +#include <rte_per_lcore.h> > +#include <rte_branch_prediction.h> > +#include <rte_interrupts.h> > +#include <rte_pci.h> > +#include <rte_random.h> > +#include <rte_debug.h> > +#include <rte_ether.h> > +#include <rte_ethdev.h> > +#include <rte_ring.h> > +#include <rte_log.h> > +#include <rte_mempool.h> > +#include <rte_mbuf.h> > +#include <rte_memcpy.h> > +#include <rte_ip.h> > +#include <rte_tcp.h> > +#include <rte_arp.h> > +#include <rte_spinlock.h> > + > +#include <cmdline_rdline.h> > +#include <cmdline_parse.h> > +#include <cmdline_parse_num.h> > +#include <cmdline_parse_string.h> > +#include <cmdline_parse_ipaddr.h> > +#include <cmdline_parse_etheraddr.h> > +#include <cmdline_socket.h> > +#include <cmdline.h> > + > +#include "main.h" > + > +#include <rte_devargs.h> > + > + > +#include "rte_byteorder.h" > +#include "rte_cpuflags.h" > +#include "rte_eth_bond.h" > + > +#define RTE_LOGTYPE_DCB RTE_LOGTYPE_USER1 > + > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + > RTE_PKTMBUF_HEADROOM) > +#define NB_MBUF (1024*8) > + > +#define MAX_PKT_BURST 32 > +#define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */ > +#define BURST_RX_INTERVAL_NS (10) /* RX poll interval ~100ns */ > + > +/* > + * RX and TX Prefetch, Host, and Write-back threshold values should be > + * carefully set for 
optimal performance. Consult the network > + * controller's datasheet and supporting DPDK documentation for guidance > + * on how these parameters should be set. > + */ > +#define RX_PTHRESH 8 /**< Default values of RX prefetch threshold reg. */ > +#define RX_HTHRESH 8 /**< Default values of RX host threshold reg. */ > +#define RX_WTHRESH 4 /**< Default values of RX write-back threshold reg. > */ > +#define RX_FTHRESH (MAX_PKT_BURST * 2)/**< Default values of RX free > threshold reg. */ > + > +/* > + * These default values are optimized for use with the Intel(R) 82599 10 GbE > + * Controller and the DPDK ixgbe PMD. Consider using other values for other > + * network controllers and/or network drivers. > + */ > +#define TX_PTHRESH 36 /**< Default values of TX prefetch threshold reg. */ > +#define TX_HTHRESH 0 /**< Default values of TX host threshold reg. */ > +#define TX_WTHRESH 0 /**< Default values of TX write-back threshold reg. > */ > + > +/* > + * Configurable number of RX/TX ring descriptors > + */ > +#define RTE_RX_DESC_DEFAULT 128 > +#define RTE_TX_DESC_DEFAULT 512 > + > +#define BOND_IP_1 7 > +#define BOND_IP_2 0 > +#define BOND_IP_3 0 > +#define BOND_IP_4 10 > + > +/* not defined under linux */ > +#ifndef NIPQUAD > +#define NIPQUAD_FMT "%u.%u.%u.%u" > +#define NIPQUAD(addr) \ > + (unsigned)((unsigned char *)&addr)[0], \ > + (unsigned)((unsigned char *)&addr)[1], \ > + (unsigned)((unsigned char *)&addr)[2], \ > + (unsigned)((unsigned char *)&addr)[3] > +#endif > + > +#define MAX_PORTS 4 > +#define PRINT_MAC(addr) > printf("%02"PRIx8":%02"PRIx8":%02"PRIx8 \ > + ":%02"PRIx8":%02"PRIx8":%02"PRIx8, \ > + addr.addr_bytes[0], addr.addr_bytes[1], addr.addr_bytes[2], \ > + addr.addr_bytes[3], addr.addr_bytes[4], addr.addr_bytes[5]) > + > +uint8_t slaves[RTE_MAX_ETHPORTS]; > +uint8_t slaves_count; > + > +static uint8_t BOND_PORT = 0xff; > + > +static struct rte_mempool *mbuf_pool; > + > +/* > + * RX and TX Prefetch, Host, and Write-back threshold values should be > 
+ * carefully set for optimal performance. Consult the network > + * controller's datasheet and supporting DPDK documentation for guidance > + * on how these parameters should be set. > + */ > +/* Default configuration for rx and tx thresholds etc. */ > +static const struct rte_eth_rxconf rx_conf_default = { > + .rx_thresh = { > + .pthresh = RX_PTHRESH, > + .hthresh = RX_HTHRESH, > + .wthresh = RX_WTHRESH, > + > + }, > + .rx_free_thresh = RX_FTHRESH, > +}; > + > +/* > + * These default values are optimized for use with the Intel(R) 82599 10 GbE > + * Controller and the DPDK ixgbe PMD. Consider using other values for other > + * network controllers and/or network drivers. > + */ > +static const struct rte_eth_txconf tx_conf_default = { > + .tx_thresh = { > + .pthresh = TX_PTHRESH, > + .hthresh = TX_HTHRESH, > + .wthresh = TX_WTHRESH, > + }, > + .tx_free_thresh = 0, /* Use PMD default values */ > + .tx_rs_thresh = 0, /* Use PMD default values */ > + .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS | > ETH_TXQ_FLAGS_NOOFFLOADS, > +}; > + > +static struct rte_eth_conf port_conf = { > + .rxmode = { > + .mq_mode = ETH_MQ_RX_NONE, > + .max_rx_pkt_len = ETHER_MAX_LEN, > + .split_hdr_size = 0, > + .header_split = 0, /**< Header Split disabled */ > + .hw_ip_checksum = 0, /**< IP checksum offload enabled */ > + .hw_vlan_filter = 0, /**< VLAN filtering disabled */ > + .jumbo_frame = 0, /**< Jumbo Frame Support disabled */ > + .hw_strip_crc = 0, /**< CRC stripped by hardware */ > + }, > + .rx_adv_conf = { > + .rss_conf = { > + .rss_key = NULL, > + .rss_hf = ETH_RSS_IP, > + }, > + }, > + .txmode = { > + .mq_mode = ETH_MQ_TX_NONE, > + }, > +}; > + > +static void > +slave_port_init(uint8_t portid, struct rte_mempool *mbuf_pool) > +{ > + int retval; > + > + if (portid >= rte_eth_dev_count()) > + rte_exit(EXIT_FAILURE, "Invalid port\n"); > + > + retval = rte_eth_dev_configure(portid, 1, 1, &port_conf); > + if (retval != 0) > + rte_exit(EXIT_FAILURE, "port %u: configuration failed > 
(res=%d)\n", \ > + portid, retval); > + > + /* RX setup */ > + retval = rte_eth_rx_queue_setup(portid, 0, RTE_RX_DESC_DEFAULT, > + rte_eth_dev_socket_id(portid), > &rx_conf_default, > + mbuf_pool); > + if (retval < 0) > + rte_exit(retval, " port %u: RX queue 0 setup failed (res=%d)", > \ > + portid, retval); > + > + /* TX setup */ > + retval = rte_eth_tx_queue_setup(portid, 0, RTE_TX_DESC_DEFAULT, > + rte_eth_dev_socket_id(portid), > &tx_conf_default); > + > + if (retval < 0) > + rte_exit(retval, "port %u: TX queue 0 setup failed (res=%d)", \ > + portid, retval); > + > + retval = rte_eth_dev_start(portid); > + if (retval < 0) > + rte_exit(retval, \ > + "Start port %d failed (res=%d)", \ > + portid, retval); > + > + struct ether_addr addr; > + rte_eth_macaddr_get(portid, &addr); > + printf("Port %u MAC: ", (unsigned)portid); > + PRINT_MAC(addr); > + printf("\n"); > +} > + > +static void > +bond_port_init(struct rte_mempool *mbuf_pool) > +{ > + int retval; > + uint8_t i; > + > + retval = rte_eth_bond_create("bond0", BONDING_MODE_ALB, 0 > /*SOCKET_ID_ANY*/); > + if (retval < 0) > + rte_exit(EXIT_FAILURE, \ > + "Faled to create bond port\n"); > + > + BOND_PORT = (uint8_t)retval; > + > + retval = rte_eth_dev_configure(BOND_PORT, 1, 1, &port_conf); > + if (retval != 0) > + rte_exit(EXIT_FAILURE, "port %u: configuration failed > (res=%d)\n", \ > + BOND_PORT, retval); > + > + /* RX setup */ > + retval = rte_eth_rx_queue_setup(BOND_PORT, 0, > RTE_RX_DESC_DEFAULT, > + rte_eth_dev_socket_id(BOND_PORT), > &rx_conf_default, > + mbuf_pool); > + if (retval < 0) > + rte_exit(retval, " port %u: RX queue 0 setup failed (res=%d)", > \ > + BOND_PORT, retval); > + > + /* TX setup */ > + retval = rte_eth_tx_queue_setup(BOND_PORT, 0, > RTE_TX_DESC_DEFAULT, > + rte_eth_dev_socket_id(BOND_PORT), > &tx_conf_default); > + > + if (retval < 0) > + rte_exit(retval, "port %u: TX queue 0 setup failed (res=%d)", \ > + BOND_PORT, retval); > + > + for (i = 0; i < slaves_count; i++) { > + if 
(rte_eth_bond_slave_add(BOND_PORT, slaves[i]) == -1) > + rte_exit(-1, "Oooops! adding slave (%u) to bond (%u) > failed!\n", \ > + slaves[i], BOND_PORT); > + > + } > + > + retval = rte_eth_dev_start(BOND_PORT); > + if (retval < 0) > + rte_exit(retval, "Start port %d failed (res=%d)", BOND_PORT, > retval); > + > + rte_eth_promiscuous_enable(BOND_PORT); > + > + struct ether_addr addr; > + rte_eth_macaddr_get(BOND_PORT, &addr); > + printf("Port %u MAC: ", (unsigned)BOND_PORT); > + PRINT_MAC(addr); > + printf("\n"); > +} > + > +struct global_flag_stru_t { > + int LcoreMainIsRunning; > + int LcoreMainCore; > + uint32_t port_packets[4]; > + rte_spinlock_t lock; > +}; > +struct global_flag_stru_t global_flag_stru; > +struct global_flag_stru_t *global_flag_stru_p = &global_flag_stru; > + > +/* > + * Main thread that does the work, reading from INPUT_PORT > + * and writing to OUTPUT_PORT > + */ > +static int lcore_main(__attribute__((unused)) void *arg1) > +{ > + struct rte_mbuf *pkts[MAX_PKT_BURST] __rte_cache_aligned; > + struct ether_addr d_addr; > + > + struct ether_hdr *eth_hdr; > + struct arp_hdr *arp_hdr; > + > + uint16_t rx_cnt; > + uint32_t bond_ip; > + int i = 0; > + uint8_t is_free; > + > + bond_ip = BOND_IP_1 | (BOND_IP_2 << 8) | > + (BOND_IP_3 << 16) | (BOND_IP_4 << 24); > + > + rte_spinlock_trylock(&global_flag_stru_p->lock); > + > + while (global_flag_stru_p->LcoreMainIsRunning) { > + rte_spinlock_unlock(&global_flag_stru_p->lock); > + rx_cnt = rte_eth_rx_burst(BOND_PORT, 0, pkts, > MAX_PKT_BURST); > + is_free = 0; > + > + /* If didn't receive any packets, wait and go to next iteration > */ > + if (rx_cnt == 0) { > + rte_delay_us(50); > + continue; > + } > + > + /* Search incoming data for ARP packets and prepare > response */ > + for (i = 0; i < rx_cnt; i++) { > + if (rte_spinlock_trylock(&global_flag_stru_p->lock) == > 1) { > + global_flag_stru_p->port_packets[0]++; > + rte_spinlock_unlock(&global_flag_stru_p- > >lock); > + } > + eth_hdr = 
rte_pktmbuf_mtod(pkts[i], struct > ether_hdr *); > + if (eth_hdr->ether_type == > rte_cpu_to_be_16(ETHER_TYPE_ARP)) { > + if (rte_spinlock_trylock(&global_flag_stru_p- > >lock) == 1) { > + > global_flag_stru_p->port_packets[1]++; > + > rte_spinlock_unlock(&global_flag_stru_p->lock); > + } > + arp_hdr = (struct arp_hdr *)((char *)eth_hdr > + sizeof(struct ether_hdr)); > + if (arp_hdr->arp_data.arp_tip == bond_ip) { > + if (arp_hdr->arp_op == > rte_cpu_to_be_16(ARP_OP_REQUEST)) { > + arp_hdr->arp_op = > rte_cpu_to_be_16(ARP_OP_REPLY); > + > + /* Switch src and dst data > and set bonding MAC */ > + ether_addr_copy(&eth_hdr- > >s_addr, &eth_hdr->d_addr); > + > rte_eth_macaddr_get(BOND_PORT, &eth_hdr->s_addr); > + ether_addr_copy(&arp_hdr- > >arp_data.arp_sha, &arp_hdr->arp_data.arp_tha); > + arp_hdr->arp_data.arp_tip = > arp_hdr->arp_data.arp_sip; > + > rte_eth_macaddr_get(BOND_PORT, &d_addr); > + ether_addr_copy(&d_addr, > &arp_hdr->arp_data.arp_sha); > + arp_hdr->arp_data.arp_sip = > bond_ip; > + > rte_eth_tx_burst(BOND_PORT, 0, &pkts[i], 1); > + is_free = 1; > + } else { > + > rte_eth_tx_burst(BOND_PORT, 0, NULL, 0); > + } > + } > + } > + > + /* Free processed packets */ > + if (is_free == 0) > + rte_pktmbuf_free(pkts[i]); > + } > + rte_spinlock_trylock(&global_flag_stru_p->lock); > + } > + rte_spinlock_unlock(&global_flag_stru_p->lock); > + printf("BYE lcore_main\n"); > + return 0; > +} > + > +/**********************************************************/ > +/**********************************************************/ > +struct cmd_obj_send_result { > + cmdline_fixed_string_t action; > + uint32_t count; > + cmdline_ipaddr_t ip; > +}; > + > +static void cmd_obj_send_parsed(void *parsed_result, > + __attribute__((unused)) struct cmdline *cl, > + __attribute__((unused)) void *data) > +{ > + > + struct cmd_obj_send_result *res = parsed_result; > + char ip_str[INET6_ADDRSTRLEN]; > + > + struct rte_mbuf *created_pkt; > + struct ether_hdr *eth_hdr; > + struct arp_hdr *arp_hdr; > + > 
+ uint32_t bond_ip; > + size_t pkt_size; > + > + if (res->ip.family == AF_INET) > + snprintf(ip_str, sizeof(ip_str), NIPQUAD_FMT, > + NIPQUAD(res->ip.addr.ipv4)); > + else > + cmdline_printf(cl, "Wrong IP format. Only IPv4 is > supported\n"); > + > + bond_ip = BOND_IP_1 | (BOND_IP_2 << 8) | > + (BOND_IP_3 << 16) | (BOND_IP_4 << 24); > + > + created_pkt = rte_pktmbuf_alloc(mbuf_pool); > + while (created_pkt == NULL); > + > + pkt_size = sizeof(struct ether_hdr) + sizeof(struct arp_hdr); > + created_pkt->data_len = pkt_size; > + created_pkt->pkt_len = pkt_size; > + > + eth_hdr = rte_pktmbuf_mtod(created_pkt, struct ether_hdr *); > + rte_eth_macaddr_get(BOND_PORT, &eth_hdr->s_addr); > + memset(&eth_hdr->d_addr, 0xFF, ETHER_ADDR_LEN); > + eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_ARP); > + > + arp_hdr = (struct arp_hdr *)((char *)eth_hdr + sizeof(struct > ether_hdr)); > + arp_hdr->arp_hrd = rte_cpu_to_be_16(ARP_HRD_ETHER); > + arp_hdr->arp_pro = rte_cpu_to_be_16(ETHER_TYPE_IPv4); > + arp_hdr->arp_hln = ETHER_ADDR_LEN; > + arp_hdr->arp_pln = sizeof(uint32_t); > + arp_hdr->arp_op = rte_cpu_to_be_16(ARP_OP_REQUEST); > + > + rte_eth_macaddr_get(BOND_PORT, &arp_hdr->arp_data.arp_sha); > + arp_hdr->arp_data.arp_sip = bond_ip; > + memset(&arp_hdr->arp_data.arp_tha, 0, ETHER_ADDR_LEN); > + arp_hdr->arp_data.arp_tip = > + ((unsigned char *)&res->ip.addr.ipv4)[0] | > + (((unsigned char *)&res->ip.addr.ipv4)[1] << 8) | > + (((unsigned char *)&res->ip.addr.ipv4)[2] << 16) | > + (((unsigned char *)&res->ip.addr.ipv4)[3] << 24); > + rte_eth_tx_burst(BOND_PORT, 0, &created_pkt, 1); > + > + rte_delay_ms(500); > + cmdline_printf(cl, "\n"); > +} > + > +cmdline_parse_token_string_t cmd_obj_action_send = > + TOKEN_STRING_INITIALIZER(struct cmd_obj_send_result, action, > "send"); > +cmdline_parse_token_num_t cmd_obj_count = > + TOKEN_NUM_INITIALIZER(struct cmd_obj_send_result, count, > UINT32); > +cmdline_parse_token_ipaddr_t cmd_obj_ip = > + TOKEN_IPADDR_INITIALIZER(struct 
cmd_obj_send_result, ip); > + > +cmdline_parse_inst_t cmd_obj_send = { > + .f = cmd_obj_send_parsed, /* function to call */ > + .data = NULL, /* 2nd arg of func */ > + .help_str = "send arp", > + .tokens = { /* token list, NULL terminated */ > + (void *)&cmd_obj_action_send, > + (void *)&cmd_obj_count, > + (void *)&cmd_obj_ip, > + NULL, > + }, > +}; > + > +/**********************************************************/ > + > + > +/**********************************************************/ > +struct cmd_start_result { > + cmdline_fixed_string_t start; > +}; > + > +static void cmd_start_parsed(__attribute__((unused)) void *parsed_result, > + struct cmdline *cl, > + __attribute__((unused)) void *data) > +{ > + int slave_core_id = rte_lcore_id(); > + > + rte_spinlock_trylock(&global_flag_stru_p->lock); > + if (global_flag_stru_p->LcoreMainIsRunning == 0) { > + if (lcore_config[global_flag_stru_p->LcoreMainCore].state != > WAIT) { > + rte_spinlock_unlock(&global_flag_stru_p->lock); > + return; > + } > + rte_spinlock_unlock(&global_flag_stru_p->lock); > + } else { > + cmdline_printf(cl, "lcore_main already running on > core:%d\n", > + global_flag_stru_p->LcoreMainCore); > + rte_spinlock_unlock(&global_flag_stru_p->lock); > + return; > + } > + > + /* start lcore main on core != master_core - ARP response thread */ > + slave_core_id = rte_get_next_lcore(rte_lcore_id(), 1, 0); > + if ((slave_core_id >= RTE_MAX_LCORE) || (slave_core_id == 0)) > + return; > + > + rte_spinlock_trylock(&global_flag_stru_p->lock); > + global_flag_stru_p->LcoreMainIsRunning = 1; > + rte_spinlock_unlock(&global_flag_stru_p->lock); > + cmdline_printf(cl, "Starting lcore_main on core %d:%d " > + "Our IP:%d.%d.%d.%d\n", > + slave_core_id, > + rte_eal_remote_launch(lcore_main, NULL, > slave_core_id), > + BOND_IP_1, > + BOND_IP_2, > + BOND_IP_3, > + BOND_IP_4 > + ); > +} > + > +cmdline_parse_token_string_t cmd_start_start = > + TOKEN_STRING_INITIALIZER(struct cmd_start_result, start, "start"); > + > 
+cmdline_parse_inst_t cmd_start = { > + .f = cmd_start_parsed, /* function to call */ > + .data = NULL, /* 2nd arg of func */ > + .help_str = "starts listening if not started at startup", > + .tokens = { /* token list, NULL terminated */ > + (void *)&cmd_start_start, > + NULL, > + }, > +}; > + > +/**********************************************************/ > + > +struct cmd_help_result { > + cmdline_fixed_string_t help; > +}; > + > +static void cmd_help_parsed(__attribute__((unused)) void *parsed_result, > + struct cmdline *cl, > + __attribute__((unused)) void *data) > +{ > + cmdline_printf(cl, > + "ALB - link bonding mode 6 example\n" > + "send COUNT IP - sends COUNT ARPrequests thru > bonding for IP.\n" > + "start - starts listening ARPs.\n" > + "stop - stops lcore_main.\n" > + "show - shows some bond info: ex. active > slaves etc.\n" > + "help - prints help.\n" > + "quit - terminate all threads and quit.\n" > + ); > +} > + > +cmdline_parse_token_string_t cmd_help_help = > + TOKEN_STRING_INITIALIZER(struct cmd_help_result, help, "help"); > + > +cmdline_parse_inst_t cmd_help = { > + .f = cmd_help_parsed, /* function to call */ > + .data = NULL, /* 2nd arg of func */ > + .help_str = "show help", > + .tokens = { /* token list, NULL terminated */ > + (void *)&cmd_help_help, > + NULL, > + }, > +}; > + > +/**********************************************************/ > +struct cmd_stop_result { > + cmdline_fixed_string_t stop; > +}; > + > +static void cmd_stop_parsed(__attribute__((unused)) void *parsed_result, > + struct cmdline *cl, > + __attribute__((unused)) void *data) > +{ > + rte_spinlock_trylock(&global_flag_stru_p->lock); > + if (global_flag_stru_p->LcoreMainIsRunning == 0) { > + cmdline_printf(cl, > + "lcore_main not running on > core:%d\n", > + global_flag_stru_p->LcoreMainCore); > + rte_spinlock_unlock(&global_flag_stru_p->lock); > + return; > + } > + global_flag_stru_p->LcoreMainIsRunning = 0; > + rte_eal_wait_lcore(global_flag_stru_p->LcoreMainCore); > + 
cmdline_printf(cl, > + "lcore_main stopped on core:%d\n", > + global_flag_stru_p->LcoreMainCore); > + rte_spinlock_unlock(&global_flag_stru_p->lock); > +} > + > +cmdline_parse_token_string_t cmd_stop_stop = > + TOKEN_STRING_INITIALIZER(struct cmd_stop_result, stop, "stop"); > + > +cmdline_parse_inst_t cmd_stop = { > + .f = cmd_stop_parsed, /* function to call */ > + .data = NULL, /* 2nd arg of func */ > + .help_str = "this command do not handle any arguments", > + .tokens = { /* token list, NULL terminated */ > + (void *)&cmd_stop_stop, > + NULL, > + }, > +}; > +/**********************************************************/ > + > +/**********************************************************/ > +struct cmd_quit_result { > + cmdline_fixed_string_t quit; > +}; > + > +static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result, > + struct cmdline *cl, > + __attribute__((unused)) void *data) > +{ > + cmdline_printf(cl, > + "quit - for quit just do ctrl+d\n" > + ); > + exit(EXIT_SUCCESS); > +} > + > +cmdline_parse_token_string_t cmd_quit_quit = > + TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit"); > + > +cmdline_parse_inst_t cmd_quit = { > + .f = cmd_quit_parsed, /* function to call */ > + .data = NULL, /* 2nd arg of func */ > + .help_str = "this command do not handle any arguments", > + .tokens = { /* token list, NULL terminated */ > + (void *)&cmd_quit_quit, > + NULL, > + }, > +}; > +/**********************************************************/ > + > +/**********************************************************/ > +struct cmd_show_result { > + cmdline_fixed_string_t show; > +}; > + > +static void cmd_show_parsed(__attribute__((unused)) void *parsed_result, > + struct cmdline *cl, > + __attribute__((unused)) void *data) > +{ > + uint8_t slaves[16] = {0}; > + uint8_t len = 16; > + struct ether_addr addr; > + uint8_t i = 0; > + > + while (i < slaves_count) { > + rte_eth_macaddr_get(i, &addr); > + PRINT_MAC(addr); > + if (i == BOND_PORT) > + printf(" - 
current primary slave"); > + printf("\n"); > + i++; > + } > + rte_eth_macaddr_get(i, &addr); > + PRINT_MAC(addr); > + if (i == BOND_PORT) > + printf(" - current primary slave"); > + printf("\n"); > + rte_spinlock_trylock(&global_flag_stru_p->lock); > + cmdline_printf(cl, > + "Active_slaves:%d " > + "packets received:Tot:%d Arp:%d\n", > + rte_eth_bond_active_slaves_get(BOND_PORT, slaves, > len), > + global_flag_stru_p->port_packets[0], > + global_flag_stru_p->port_packets[1]); > + rte_spinlock_unlock(&global_flag_stru_p->lock); > +} > + > +cmdline_parse_token_string_t cmd_show_show = > + TOKEN_STRING_INITIALIZER(struct cmd_show_result, show, "show"); > + > +cmdline_parse_inst_t cmd_show = { > + .f = cmd_show_parsed, /* function to call */ > + .data = NULL, /* 2nd arg of func */ > + .help_str = "this command do not handle any arguments", > + .tokens = { /* token list, NULL terminated */ > + (void *)&cmd_show_show, > + NULL, > + }, > +}; > +/**********************************************************/ > + > +/**********************************************************/ > +/****** CONTEXT (list of instruction) */ > + > +cmdline_parse_ctx_t main_ctx[] = { > + (cmdline_parse_inst_t *)&cmd_start, > + (cmdline_parse_inst_t *)&cmd_obj_send, > + (cmdline_parse_inst_t *)&cmd_stop, > + (cmdline_parse_inst_t *)&cmd_show, > + (cmdline_parse_inst_t *)&cmd_quit, > + (cmdline_parse_inst_t *)&cmd_help, > + NULL, > +}; > + > +/* prompt function, called from main on MASTER lcore */ > +static void *prompt(__attribute__((unused)) void *arg1) > +{ > + struct cmdline *cl; > + > + cl = cmdline_stdin_new(main_ctx, "bond6>"); > + if (cl != NULL) { > + cmdline_interact(cl); > + cmdline_stdin_exit(cl); > + } > +} > + > +/* Main function, does initialisation and calls the per-lcore functions */ > +int > +MAIN(int argc, char *argv[]) > +{ > + int ret; > + uint8_t nb_ports, i; > + > + /* init EAL */ > + ret = rte_eal_init(argc, argv); > + rte_eal_devargs_dump(stdout); > + if (ret < 0) > + 
rte_exit(EXIT_FAILURE, "Error with EAL initialization\n"); > + argc -= ret; > + argv += ret; > + > + nb_ports = rte_eth_dev_count(); > + if (nb_ports == 0) > + rte_exit(EXIT_FAILURE, "Give at least one port\n"); > + else if (nb_ports > MAX_PORTS) > + rte_exit(EXIT_FAILURE, "You can have max 4 ports\n"); > + > + mbuf_pool = rte_mempool_create("MBUF_POOL", NB_MBUF, > + MBUF_SIZE, 32, > + sizeof(struct rte_pktmbuf_pool_private), > + rte_pktmbuf_pool_init, NULL, > + rte_pktmbuf_init, NULL, > + rte_socket_id(), 0); > + if (mbuf_pool == NULL) > + rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n"); > + > + /* initialize all ports */ > + slaves_count = nb_ports; > + for (i = 0; i < nb_ports; i++) { > + slave_port_init(i, mbuf_pool); > + slaves[i] = i; > + } > + > + bond_port_init(mbuf_pool); > + > + rte_spinlock_init(&global_flag_stru_p->lock); > + int slave_core_id = rte_lcore_id(); > + > + /* check state of lcores */ > + RTE_LCORE_FOREACH_SLAVE(slave_core_id) { > + if (lcore_config[slave_core_id].state != WAIT) > + return -EBUSY; > + } > + /* start lcore main on core != master_core - ARP response thread */ > + slave_core_id = rte_get_next_lcore(rte_lcore_id(), 1, 0); > + if ((slave_core_id >= RTE_MAX_LCORE) || (slave_core_id == 0)) > + return -EPERM; > + > + global_flag_stru_p->LcoreMainIsRunning = 1; > + global_flag_stru_p->LcoreMainCore = slave_core_id; > + printf("Starting lcore_main on core %d:%d Our IP:%d.%d.%d.%d\n", > + slave_core_id, > + rte_eal_remote_launch((lcore_function_t > *)lcore_main, > + NULL, > + slave_core_id), > + BOND_IP_1, > + BOND_IP_2, > + BOND_IP_3, > + BOND_IP_4 > + ); > + > + /* Start prompt for user interact */ > + prompt(NULL); > + > + > + return 0; > +} > diff --git a/examples/bond/main.h b/examples/bond/main.h > new file mode 100644 > index 0000000..2682d15 > --- /dev/null > +++ b/examples/bond/main.h > @@ -0,0 +1,46 @@ > +/*- > + * BSD LICENSE > + * > + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. 
> + * All rights reserved. > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions > + * are met: > + * > + * * Redistributions of source code must retain the above copyright > + * notice, this list of conditions and the following disclaimer. > + * * Redistributions in binary form must reproduce the above copyright > + * notice, this list of conditions and the following disclaimer in > + * the documentation and/or other materials provided with the > + * distribution. > + * * Neither the name of Intel Corporation nor the names of its > + * contributors may be used to endorse or promote products derived > + * from this software without specific prior written permission. > + * > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND > CONTRIBUTORS > + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT > NOT > + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND > FITNESS FOR > + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE > COPYRIGHT > + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, > INCIDENTAL, > + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT > NOT > + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS > OF USE, > + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED > AND ON ANY > + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR > TORT > + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF > THE USE > + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH > DAMAGE. > + */ > + > +#ifndef _MAIN_H_ > +#define _MAIN_H_ > + > + > +#ifdef RTE_EXEC_ENV_BAREMETAL > +#define MAIN _main > +#else > +#define MAIN main > +#endif > + > +int MAIN(int argc, char *argv[]); > + > +#endif /* ifndef _MAIN_H_ */ > -- > 1.7.9.5 This patch contains an example for link bonding mode 6. It interact with user by a command prompt. 
Available commands are:
  start - starts the ARP thread, which responds to ARP requests and sends ARP updates (enabled by default after startup)
  stop - stops the ARP thread
  send COUNT IP - sends COUNT ARP requests for IP
  show - prints basic bond information
  help
  quit

The best way to test mode 6 is to use this example together with the previous patch, [PATCH 3/4] bond: add debug info for mode 6 link bonding. Connect the clients through a switch to the bonding machine and run: arping -c 1 bond_ip or telnet bond_ip. IPv4 traffic from different clients should then be balanced across the slaves in a round-robin manner.

Best regards
Michal
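The round-robin balancing described in this message can be illustrated with a minimal, standalone sketch. It is not the code from rte_eth_bond_alb.c; the names client_table, client_entry, assign_slave and MAX_CLIENTS are hypothetical and exist only for this illustration. The sketch assumes that a client keeps the slave it was first given and that each new client IP gets the next slave in turn.

```c
#include <stdint.h>

#define MAX_CLIENTS 8	/* hypothetical table size, illustration only */

struct client_entry {
	uint32_t ip;	/* client IPv4 address */
	int slave;	/* index of the slave assigned to this client */
};

struct client_table {
	struct client_entry entries[MAX_CLIENTS];
	int count;	/* number of known clients */
	int next_slave;	/* round-robin cursor */
	int nb_slaves;	/* number of active slaves */
};

/*
 * Return the slave index for client_ip. A known client keeps its slave;
 * a new client gets the slave under the cursor, and the cursor advances.
 * Returns -1 when the table is full or there are no slaves.
 */
static int
assign_slave(struct client_table *t, uint32_t client_ip)
{
	int i;

	for (i = 0; i < t->count; i++)
		if (t->entries[i].ip == client_ip)
			return t->entries[i].slave;

	if (t->count == MAX_CLIENTS || t->nb_slaves <= 0)
		return -1;

	t->entries[t->count].ip = client_ip;
	t->entries[t->count].slave = t->next_slave;
	t->count++;
	t->next_slave = (t->next_slave + 1) % t->nb_slaves;

	return t->entries[t->count - 1].slave;
}
```

With two slaves, three distinct client addresses would be mapped to slaves 0, 1 and 0 in that order, and a repeated lookup for the first client would return slave 0 again. In the real driver the assignment is communicated to each client through ARP, so that the client sends its traffic to the MAC address of its assigned slave.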