* [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump
@ 2016-02-12 14:57 Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
` (6 more replies)
0 siblings, 7 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-02-12 14:57 UTC
To: dev
This patch set includes a design for capturing DPDK port packets for tcpdump.
It also includes the test-pmd changes needed to verify the patch set listed under Dependencies 1.
The current patch set depends on the patch set below, which must be applied first.
Dependencies 1:
http://dpdk.org/dev/patchwork/patch/9750/
http://dpdk.org/dev/patchwork/patch/9751/
http://dpdk.org/dev/patchwork/patch/9752/
Dependencies 2:
Patches 2/5 and 3/5 of the current patch set contain the pcap-based design.
To compile and run these patches, libpcap must be installed and the pcap config option must be enabled, i.e. CONFIG_RTE_LIBRTE_PMD_PCAP=y.
packet capture flow for tcpdump:
================================
Part of the design is implemented in the secondary process (app/proc_info/main.c) and part in the primary process (eal_interrupts.c).
The two processes communicate over a socket and rte_rings.
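The control channel can be sketched as below. This is only an illustration under stated assumptions, not the code in patch 3/5 or 4/5: it uses an AF_UNIX SOCK_SEQPACKET socket on Linux, and the struct capture_request layout and the send_capture_request()/recv_capture_request() helpers are hypothetical names. Only the two message type values are taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define TUPLE_BUF_LEN 256

/* Message types, with the same values as in patch 3/5. */
enum tcpdump_msg_type {
	REMOVE_RXTX_CBS = 1,
	REGISTER_RXTX_CBS = 2
};

/* Hypothetical request: type byte, "(port,queue)" string, src ip filter. */
struct capture_request {
	uint8_t type;
	char tuples[TUPLE_BUF_LEN];
	uint32_t src_ip_be;	/* network byte order */
};

#define REQUEST_WIRE_LEN \
	(sizeof(uint8_t) + TUPLE_BUF_LEN + sizeof(uint32_t))

/* Describe the three request fields as a scatter/gather list. */
static void
fill_iov(struct iovec iov[3], struct capture_request *req)
{
	iov[0].iov_base = &req->type;
	iov[0].iov_len = sizeof(req->type);
	iov[1].iov_base = req->tuples;
	iov[1].iov_len = sizeof(req->tuples);
	iov[2].iov_base = &req->src_ip_be;
	iov[2].iov_len = sizeof(req->src_ip_be);
}

/* Send one request as a single SEQPACKET record; 0 on success, -1 on error. */
static int
send_capture_request(int fd, struct capture_request *req)
{
	struct iovec iov[3];
	struct msghdr mh;

	assert(req != NULL);
	fill_iov(iov, req);
	memset(&mh, 0, sizeof(mh));
	mh.msg_iov = iov;
	mh.msg_iovlen = 3;
	return sendmsg(fd, &mh, 0) == (ssize_t)REQUEST_WIRE_LEN ? 0 : -1;
}

/* Receive one request; a short or oversized record is treated as an error. */
static int
recv_capture_request(int fd, struct capture_request *req)
{
	struct iovec iov[3];
	struct msghdr mh;

	assert(req != NULL);
	fill_iov(iov, req);
	memset(&mh, 0, sizeof(mh));
	mh.msg_iov = iov;
	mh.msg_iovlen = 3;
	if (recvmsg(fd, &mh, 0) != (ssize_t)REQUEST_WIRE_LEN)
		return -1;
	req->tuples[TUPLE_BUF_LEN - 1] = '\0';	/* never trust the peer */
	return 0;
}
```

SOCK_SEQPACKET preserves record boundaries, so each request arrives as one unit and the receiver can reject anything that is not exactly REQUEST_WIRE_LEN bytes. The packets themselves do not travel over this socket; they go through the shared rte_rings.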
[Secondary Process]:
*Changes are included in patch 3/5.
*The user requests packet capture via the proc_info app command line, giving port, queue and src ip filter information.
Note: In this initial development only a basic src ip filter option is provided.
*proc_info sends the port, queue and src ip filter information to the primary along with the register rx tx cbs message.
*proc_info creates two pcap devices for writing the ingress and egress packets of the given port and queue.
*proc_info runs in a while loop, dequeues packets sent by the primary over the shared rte_rings and writes them to the respective pcap files.
[Primary Process]:
*Changes are included in patch 4/5.
*Creates the rte_rings and mempool used for passing packets to the secondary.
*Creates a socket and waits on it for messages from the secondary.
*Upon receiving the register rx tx cbs message, registers rte_eth_rxtx_callbacks for receiving ingress and egress packets of the given port and queue.
*RX callback:
Gets packets and applies the src ip filter. Matched packets are duplicated using
mbufs from the mempool, and the duplicates are enqueued to an
rte_ring for the secondary to dequeue and write to pcap.
*TX callback:
Gets packets and applies the src ip filter. For matched packets it increments the
reference counter of the packet and enqueues it to the other rte_ring for the
secondary to dequeue and write to pcap.
[Secondary Process]:
*When the secondary is terminated with ctrl+c, it sends the remove rx tx cbs message to the primary.
[Primary Process]:
*When the primary receives the remove rx tx cbs message, it removes the registered rxtx callbacks.
Users who wish to view the captured packets can run "tcpdump -r /tmp/RX_pcap.pcap"
or "tcpdump -r /tmp/TX_pcap.pcap".
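The src ip filter step used by both callbacks can be sketched as below. This is a minimal illustration, not the code from patch 4/5: it works on a raw frame buffer instead of an rte_mbuf, assumes Ethernet II framing without a VLAN tag, and frame_matches_src_ip() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN      14u	/* Ethernet II header, no VLAN tag (assumption) */
#define ETHERTYPE_IPV4   0x0800u
#define IPV4_MIN_HDR_LEN 20u
#define IPV4_SRC_OFFSET  12u	/* source address offset inside the IPv4 header */

/*
 * Return 1 if the frame carries IPv4 and its source address equals
 * filter_be (network byte order, as produced by inet_pton), else 0.
 */
static int
frame_matches_src_ip(const uint8_t *frame, size_t len, uint32_t filter_be)
{
	uint16_t ether_type;
	uint32_t src_be;

	assert(frame != NULL);

	/* too short to hold an Ethernet header plus a minimal IPv4 header */
	if (len < ETH_HDR_LEN + IPV4_MIN_HDR_LEN)
		return 0;

	ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ether_type != ETHERTYPE_IPV4)
		return 0;

	/* memcpy avoids an unaligned 32-bit load from the packet buffer */
	memcpy(&src_be, frame + ETH_HDR_LEN + IPV4_SRC_OFFSET, sizeof(src_be));
	return src_be == filter_be;
}
```

In a callback, a check of this kind would run once per packet in the burst. Only packets for which it returns 1 would then be duplicated (RX) or have their reference counter incremented (TX) before being enqueued to the ring.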
Running the changes:
===================
1) Start any primary sample application.
ex: sudo ./examples/rxtx_callbacks/build/rxtx_callbacks -c 0x2 -n 2
2) Start the proc_info application (it runs as a secondary process by default) with the new parameters for tcpdump.
ex: sudo ./build/app/proc_info/dpdk_proc_info -c 0x4 -n 2 -- -p 0x3 --tcpdump '(0,0)(1,0)' --src-ip-filter="2.2.2.2"
3) Start traffic from the traffic generator.
4) Ingress and egress packets of the dpdk ports that match src-ip-filter are now written to /tmp/RX_pcap.pcap and /tmp/TX_pcap.pcap respectively.
5) Stop the secondary process using ctrl+c and rerun it; packet capture should resume.
Note: Writing to the pcap files stops once the folder containing them reaches its maximum size.
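For reference, the '(port,queue)' list given to --tcpdump could be parsed along the lines below. This is a standalone sketch, not the parse_tcpdump() function in patch 3/5: it avoids rte_strsplit so that it builds without DPDK, struct pq_tuples and both helper names are hypothetical, and unlike the patch it treats an argument with no tuples as an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TUPLES 54	/* same limit as APP_ARG_TCPDUMP_MAX_TUPLES */

struct pq_tuples {
	int num;
	uint8_t port_id[MAX_TUPLES];
	uint8_t queue_id[MAX_TUPLES];
};

/* Parse one number in [0, 255] that must be followed by delim. */
static int
parse_u8(const char **p, char delim, uint8_t *out)
{
	char *end;
	unsigned long v;

	/* strtoul accepts leading blanks and signs; reject them here */
	if (**p < '0' || **p > '9')
		return -1;
	errno = 0;
	v = strtoul(*p, &end, 0);
	if (errno != 0 || end == *p || v > 255 || *end != delim)
		return -1;
	*out = (uint8_t)v;
	*p = end + 1;	/* continue after the delimiter */
	return 0;
}

/* Parse "(p,q)(p,q)..." into t; return 0 on success, -1 on bad input. */
static int
parse_pq_tuples(const char *arg, struct pq_tuples *t)
{
	const char *p = arg;

	assert(arg != NULL && t != NULL);
	t->num = 0;
	while ((p = strchr(p, '(')) != NULL) {
		uint8_t port, queue;

		p++;
		if (parse_u8(&p, ',', &port) < 0 ||
		    parse_u8(&p, ')', &queue) < 0)
			return -1;
		if (t->num >= MAX_TUPLES)
			return -1;
		t->port_id[t->num] = port;
		t->queue_id[t->num] = queue;
		t->num++;
	}
	return t->num > 0 ? 0 : -1;
}
```

With the command line from step 2, parse_pq_tuples("(0,0)(1,0)", &t) fills two tuples: port 0 queue 0 and port 1 queue 0.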
v2:
* extended nb_rxq/nb_txq check to other fwd modes along with rx_only and tx_only.
* changed some of the global variables to static in proc_info/main.c and eal_interrupts.c.
* release notes updated.
Reshma Pattan (5):
app/test-pmd: fix nb_rxq and nb_txq checks
drivers/net/pcap: add public api to create pcap device
app/proc_info: add tcpdump support in secondary process
lib/librte_eal: add tcpdump support in primary process
doc: update doc for tcpdump feature
app/proc_info/main.c | 451 +++++++++++++++++++++++++-
app/test-pmd/cmdline.c | 11 +-
app/test-pmd/parameters.c | 14 +-
app/test-pmd/testpmd.c | 28 ++-
doc/guides/rel_notes/release_16_04.rst | 9 +-
drivers/net/pcap/Makefile | 4 +-
drivers/net/pcap/rte_eth_pcap.c | 156 ++++++++-
drivers/net/pcap/rte_eth_pcap.h | 87 +++++
drivers/net/pcap/rte_pmd_pcap_version.map | 8 +
lib/librte_eal/linuxapp/eal/Makefile | 5 +-
lib/librte_eal/linuxapp/eal/eal_interrupts.c | 375 +++++++++++++++++++++-
11 files changed, 1105 insertions(+), 43 deletions(-)
create mode 100644 drivers/net/pcap/rte_eth_pcap.h
--
1.7.4.1
* [dpdk-dev] [PATCH v2 1/5] app/test-pmd: fix nb_rxq and nb_txq checks
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
@ 2016-02-12 14:57 ` Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 2/5] drivers/net/pcap: add public api to create pcap device Reshma Pattan
` (5 subsequent siblings)
6 siblings, 0 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-02-12 14:57 UTC
To: dev
Changed the testpmd checks to accept a zero value for nb_rxq or nb_txq
(but not both), in order to validate the zero value changes of librte_ether.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
app/test-pmd/cmdline.c | 11 +++++------
app/test-pmd/parameters.c | 14 +++++++++-----
app/test-pmd/testpmd.c | 28 +++++++++++++++++++++++++---
3 files changed, 39 insertions(+), 14 deletions(-)
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 52e9f5f..f8e71a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* Copyright(c) 2014 6WIND S.A.
* All rights reserved.
*
@@ -1163,17 +1163,16 @@ cmd_config_rx_tx_parsed(void *parsed_result,
printf("Please stop all ports first\n");
return;
}
-
if (!strcmp(res->name, "rxq")) {
- if (res->value <= 0) {
- printf("rxq %d invalid - must be > 0\n", res->value);
+ if (!res->value && !nb_txq) {
+ printf("Warning: Either rx or tx queues should be non zero\n");
return;
}
nb_rxq = res->value;
}
else if (!strcmp(res->name, "txq")) {
- if (res->value <= 0) {
- printf("txq %d invalid - must be > 0\n", res->value);
+ if (!res->value && !nb_rxq) {
+ printf("Warning: Either rx or tx queues should be non zero\n");
return;
}
nb_txq = res->value;
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 4b421c8..55572eb 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -810,22 +810,26 @@ launch_args_parse(int argc, char** argv)
rss_hf = ETH_RSS_UDP;
if (!strcmp(lgopts[opt_idx].name, "rxq")) {
n = atoi(optarg);
- if (n >= 1 && n <= (int) MAX_QUEUE_ID)
+ if (n >= 0 && n <= (int) MAX_QUEUE_ID)
nb_rxq = (queueid_t) n;
else
rte_exit(EXIT_FAILURE, "rxq %d invalid - must be"
- " >= 1 && <= %d\n", n,
+ " >= 0 && <= %d\n", n,
(int) MAX_QUEUE_ID);
}
if (!strcmp(lgopts[opt_idx].name, "txq")) {
n = atoi(optarg);
- if (n >= 1 && n <= (int) MAX_QUEUE_ID)
+ if (n >= 0 && n <= (int) MAX_QUEUE_ID)
nb_txq = (queueid_t) n;
else
rte_exit(EXIT_FAILURE, "txq %d invalid - must be"
- " >= 1 && <= %d\n", n,
+ " >= 0 && <= %d\n", n,
(int) MAX_QUEUE_ID);
}
+ if (!nb_rxq && !nb_txq) {
+ rte_exit(EXIT_FAILURE, "Either rx or tx queues should "
+ "be non-zero\n");
+ }
if (!strcmp(lgopts[opt_idx].name, "burst")) {
n = atoi(optarg);
if ((n >= 1) && (n <= MAX_PKT_BURST))
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 1319917..a523442 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -608,6 +608,7 @@ init_fwd_streams(void)
portid_t pid;
struct rte_port *port;
streamid_t sm_id, nb_fwd_streams_new;
+ queueid_t q;
/* set socket id according to numa or not */
FOREACH_PORT(pid, ports) {
@@ -643,7 +644,12 @@ init_fwd_streams(void)
}
}
- nb_fwd_streams_new = (streamid_t)(nb_ports * nb_rxq);
+ q = RTE_MAX(nb_rxq, nb_txq);
+ if (q == 0) {
+ printf("Fail:Cannot allocate fwd streams as number of queues is 0\n");
+ return -1;
+ }
+ nb_fwd_streams_new = (streamid_t)(nb_ports * q);
if (nb_fwd_streams_new == nb_fwd_streams)
return 0;
/* clear the old */
@@ -955,6 +961,19 @@ start_packet_forwarding(int with_tx_first)
portid_t pt_id;
streamid_t sm_id;
+ if (strcmp(cur_fwd_eng->fwd_mode_name, "rxonly") == 0 && !nb_rxq)
+ rte_exit(EXIT_FAILURE, "rxq are 0, cannot use rxonly fwd mode\n");
+
+ if (strcmp(cur_fwd_eng->fwd_mode_name, "txonly") == 0 && !nb_txq)
+ rte_exit(EXIT_FAILURE, "txq are 0, cannot use txonly fwd mode\n");
+
+ if ((strcmp(cur_fwd_eng->fwd_mode_name, "rxonly") != 0 &&
+ strcmp(cur_fwd_eng->fwd_mode_name, "txonly") != 0) &&
+ (!nb_rxq || !nb_txq))
+ rte_exit(EXIT_FAILURE,
+ "Either rxq or txq are 0, cannot use %s fwd mode\n",
+ cur_fwd_eng->fwd_mode_name);
+
if (all_ports_started() == 0) {
printf("Not all ports were started\n");
return;
@@ -2037,7 +2056,10 @@ main(int argc, char** argv)
if (argc > 1)
launch_args_parse(argc, argv);
- if (nb_rxq > nb_txq)
+ if (!nb_rxq && !nb_txq)
+ printf("Warning: Either rx or tx queues should be non-zero\n");
+
+ if (nb_rxq > 1 && nb_rxq > nb_txq)
printf("Warning: nb_rxq=%d enables RSS configuration, "
"but nb_txq=%d will prevent to fully test it.\n",
nb_rxq, nb_txq);
--
1.7.4.1
* [dpdk-dev] [PATCH v2 2/5] drivers/net/pcap: add public api to create pcap device
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
@ 2016-02-12 14:57 ` Reshma Pattan
2016-02-17 9:03 ` Pavel Fedin
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 3/5] app/proc_info: add tcpdump support in secondary process Reshma Pattan
` (4 subsequent siblings)
6 siblings, 1 reply; 21+ messages in thread
From: Reshma Pattan @ 2016-02-12 14:57 UTC
To: dev
Added a new public API to create a pcap device from pcaps.
Added a new header file for the API declaration.
Added the new public API to the version map.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
drivers/net/pcap/Makefile | 4 +-
drivers/net/pcap/rte_eth_pcap.c | 156 +++++++++++++++++++++++++---
drivers/net/pcap/rte_eth_pcap.h | 87 ++++++++++++++++
drivers/net/pcap/rte_pmd_pcap_version.map | 8 ++
4 files changed, 236 insertions(+), 19 deletions(-)
create mode 100644 drivers/net/pcap/rte_eth_pcap.h
diff --git a/drivers/net/pcap/Makefile b/drivers/net/pcap/Makefile
index b41d8a2..8e424bf 100644
--- a/drivers/net/pcap/Makefile
+++ b/drivers/net/pcap/Makefile
@@ -1,6 +1,6 @@
# BSD LICENSE
#
-# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
# Copyright(c) 2014 6WIND S.A.
# All rights reserved.
#
@@ -53,7 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += rte_eth_pcap.c
#
# Export include files
#
-SYMLINK-y-include +=
+SYMLINK-y-include += rte_eth_pcap.h
# this lib depends upon:
DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += lib/librte_mbuf
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index f9230eb..1da7913 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* Copyright(c) 2014 6WIND S.A.
* All rights reserved.
*
@@ -44,7 +44,7 @@
#include <net/if.h>
-#include <pcap.h>
+#include "rte_eth_pcap.h"
#define RTE_ETH_PCAP_SNAPSHOT_LEN 65535
#define RTE_ETH_PCAP_SNAPLEN ETHER_MAX_JUMBO_FRAME_LEN
@@ -85,21 +85,6 @@ struct pcap_tx_queue {
char type[ETH_PCAP_ARG_MAXLEN];
};
-struct rx_pcaps {
- unsigned num_of_rx;
- pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
- const char *names[RTE_PMD_RING_MAX_RX_RINGS];
- const char *types[RTE_PMD_RING_MAX_RX_RINGS];
-};
-
-struct tx_pcaps {
- unsigned num_of_tx;
- pcap_dumper_t *dumpers[RTE_PMD_RING_MAX_TX_RINGS];
- pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
- const char *names[RTE_PMD_RING_MAX_RX_RINGS];
- const char *types[RTE_PMD_RING_MAX_RX_RINGS];
-};
-
struct pmd_internals {
struct pcap_rx_queue rx_queue[RTE_PMD_RING_MAX_RX_RINGS];
struct pcap_tx_queue tx_queue[RTE_PMD_RING_MAX_TX_RINGS];
@@ -875,6 +860,143 @@ error:
return -1;
}
+int
+rte_eth_from_pcapsndumpers(const char *name,
+ struct rx_pcaps *rx_queues,
+ const unsigned nb_rx_queues,
+ struct tx_pcaps *tx_queues,
+ const unsigned nb_tx_queues,
+ const unsigned numa_node)
+{
+ struct rte_eth_dev_data *data = NULL;
+ struct pmd_internals *internals = NULL;
+ struct rte_eth_dev *eth_dev = NULL;
+ unsigned i;
+ pcap_dumper_t *dumper;
+ pcap_t *pcap = NULL;
+
+ hz = rte_get_timer_hz();
+ /* do some parameter checking */
+ if (!rx_queues && nb_rx_queues > 0)
+ return -1;
+ if (!tx_queues && nb_tx_queues > 0)
+ return -1;
+
+ /* initialize rx and tx pcaps */
+ for (i = 0; i < nb_rx_queues; i++) {
+ if (open_single_rx_pcap(rx_queues->names[i], &pcap) < 0)
+ return -1;
+ rx_queues->pcaps[i] = pcap;
+ }
+ for (i = 0; i < nb_tx_queues; i++) {
+ if (open_single_tx_pcap(tx_queues->names[i], &dumper) < 0)
+ return -1;
+ tx_queues->dumpers[i] = dumper;
+ }
+
+ RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket %u\n", numa_node);
+
+ /* now do all data allocation - for eth_dev structure, dummy pci driver
+ * and internal (private) data
+ */
+ data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
+ if (!data)
+ goto error;
+
+ if (nb_rx_queues) {
+ data->rx_queues = rte_zmalloc_socket(name, sizeof(void *) * nb_rx_queues,
+ 0, numa_node);
+ if (!data->rx_queues)
+ goto error;
+ }
+
+ if (nb_tx_queues) {
+ data->tx_queues = rte_zmalloc_socket(name, sizeof(void *) * nb_tx_queues,
+ 0, numa_node);
+ if (data->tx_queues == NULL)
+ goto error;
+ }
+
+ internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
+ if (!internals)
+ goto error;
+
+ /* reserve an ethdev entry */
+ eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
+ if (!eth_dev)
+ goto error;
+
+ /* check length of device name */
+ if ((strlen(eth_dev->data->name) + 1) > sizeof(data->name))
+ goto error;
+
+ /* now put it all together
+ * - store queue data in internals,
+ * - store numa_node info in eth_dev_data
+ * - point eth_dev_data to internals
+ * - and point eth_dev structure to new eth_dev_data structure
+ */
+ internals->nb_rx_queues = nb_rx_queues;
+ internals->nb_tx_queues = nb_tx_queues;
+ internals->if_index = if_nametoindex(name);
+
+ data->dev_private = internals;
+ data->port_id = eth_dev->data->port_id;
+ strncpy(data->name, eth_dev->data->name, strlen(eth_dev->data->name));
+ data->nb_rx_queues = (uint16_t)nb_rx_queues;
+ data->nb_tx_queues = (uint16_t)nb_tx_queues;
+ data->dev_link = pmd_link;
+ data->mac_addrs = &eth_addr;
+
+ strncpy(data->name,
+ eth_dev->data->name, strlen(eth_dev->data->name));
+ eth_dev->data = data;
+ eth_dev->driver = NULL;
+ eth_dev->dev_ops = &ops;
+ eth_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+ eth_dev->data->kdrv = RTE_KDRV_NONE;
+ eth_dev->data->drv_name = drivername;
+ eth_dev->data->numa_node = numa_node;
+
+ for (i = 0; i < nb_rx_queues; i++) {
+ internals->rx_queue[i].pcap = rx_queues->pcaps[i];
+ snprintf(internals->rx_queue[i].name,
+ sizeof(internals->rx_queue[i].name), "%s",
+ rx_queues->names[i]);
+ snprintf(internals->rx_queue[i].type,
+ sizeof(internals->rx_queue[i].type), "%s",
+ rx_queues->types[i]);
+ }
+ for (i = 0; i < nb_tx_queues; i++) {
+ internals->tx_queue[i].dumper = tx_queues->dumpers[i];
+ snprintf(internals->tx_queue[i].name,
+ sizeof(internals->tx_queue[i].name), "%s",
+ tx_queues->names[i]);
+ snprintf(internals->tx_queue[i].type,
+ sizeof(internals->tx_queue[i].type), "%s",
+ tx_queues->types[i]);
+ }
+
+ /* using multiple pcaps/interfaces */
+ internals->single_iface = 0;
+
+ /* finally assign rx and tx ops */
+ eth_dev->rx_pkt_burst = eth_pcap_rx;
+ eth_dev->tx_pkt_burst = eth_pcap_tx_dumper;
+
+ return data->port_id;
+
+error:
+ if (data) {
+ rte_free(data->rx_queues);
+ rte_free(data->tx_queues);
+ }
+ rte_free(data);
+ rte_free(internals);
+
+ return -1;
+}
+
static int
rte_eth_from_pcaps_n_dumpers(const char *name,
struct rx_pcaps *rx_queues,
diff --git a/drivers/net/pcap/rte_eth_pcap.h b/drivers/net/pcap/rte_eth_pcap.h
new file mode 100644
index 0000000..5bcfb5d
--- /dev/null
+++ b/drivers/net/pcap/rte_eth_pcap.h
@@ -0,0 +1,87 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETH_PCAP_H_
+#define _RTE_ETH_PCAP_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <pcap.h>
+
+struct rx_pcaps {
+ unsigned num_of_rx;
+ pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *names[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *types[RTE_PMD_RING_MAX_RX_RINGS];
+};
+
+struct tx_pcaps {
+ unsigned num_of_tx;
+ pcap_dumper_t *dumpers[RTE_PMD_RING_MAX_TX_RINGS];
+ pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *names[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *types[RTE_PMD_RING_MAX_RX_RINGS];
+};
+
+/**
+ * Create a new ethdev port from pcaps
+ *
+ * @param name
+ * name to be given to the new ethdev port
+ * @param rx_queues
+ * pointer to array of pcaps to be used as RX queues
+ * @param nb_rx_queues
+ * number of elements in the rx_queues array
+ * @param tx_queues
+ * pointer to array of pcaps to be used as TX queues
+ * @param nb_tx_queues
+ * number of elements in the tx_queues array
+ * @param numa_node
+ * the numa node on which the memory for this port is to be allocated
+ * @return
+ * the port number of the newly created the ethdev or -1 on error.
+ */
+int rte_eth_from_pcapsndumpers(const char *name,
+ struct rx_pcaps *rx_queues,
+ const unsigned nb_rx_queues,
+ struct tx_pcaps *tx_queues,
+ const unsigned nb_tx_queues,
+ const unsigned numa_node);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/drivers/net/pcap/rte_pmd_pcap_version.map b/drivers/net/pcap/rte_pmd_pcap_version.map
index ef35398..104dc4d 100644
--- a/drivers/net/pcap/rte_pmd_pcap_version.map
+++ b/drivers/net/pcap/rte_pmd_pcap_version.map
@@ -2,3 +2,11 @@ DPDK_2.0 {
local: *;
};
+
+DPDK_2.3 {
+ global:
+
+ rte_eth_from_pcapsndumpers;
+
+} DPDK_2.0;
+
--
1.7.4.1
* [dpdk-dev] [PATCH v2 3/5] app/proc_info: add tcpdump support in secondary process
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 2/5] drivers/net/pcap: add public api to create pcap device Reshma Pattan
@ 2016-02-12 14:57 ` Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 4/5] lib/librte_eal: add tcpdump support in primary process Reshma Pattan
` (3 subsequent siblings)
6 siblings, 0 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-02-12 14:57 UTC
To: dev
Added "--tcpdump" and "--src-ip-filter" command line options
for tcpdump support.
Added pcap device creation and writing of packets to pcap device
for tcpdump.
Added socket functionality to communicate with primary process.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
app/proc_info/main.c | 451 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 447 insertions(+), 4 deletions(-)
diff --git a/app/proc_info/main.c b/app/proc_info/main.c
index 341176d..fe4d9a9 100644
--- a/app/proc_info/main.c
+++ b/app/proc_info/main.c
@@ -1,7 +1,7 @@
/*
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -38,8 +38,25 @@
#include <stdarg.h>
#include <inttypes.h>
#include <sys/queue.h>
+#include <sys/socket.h>
#include <stdlib.h>
#include <getopt.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <arpa/inet.h>
+
+/* sys/un.h with __USE_MISC uses strlen, which is unsafe */
+#ifdef __USE_MISC
+#define REMOVED_USE_MISC
+#undef __USE_MISC
+#endif
+#include <sys/un.h>
+/* make sure we redefine __USE_MISC only if it was previously undefined */
+#ifdef REMOVED_USE_MISC
+#define __USE_MISC
+#undef REMOVED_USE_MISC
+#endif
#include <rte_eal.h>
#include <rte_common.h>
@@ -57,11 +74,25 @@
#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_string_fns.h>
+#include <rte_errno.h>
+
+#ifdef RTE_LIBRTE_PMD_PCAP
+#include <rte_eth_pcap.h>
+#endif
/* Maximum long option length for option parsing. */
#define MAX_LONG_OPT_SZ 64
#define RTE_LOGTYPE_APP RTE_LOGTYPE_USER1
-
+#define APP_ARG_TCPDUMP_MAX_TUPLES 54
+#define TCPDUMP_SOCKET_PATH "%s/tcpdump_mp_socket"
+#define CMSGLEN CMSG_LEN(sizeof(int))
+#define TX_DESC_PER_QUEUE 512
+#define RX_DESC_PER_QUEUE 128
+#define BURST_SIZE 32
+#define MBUF_PER_POOL 65535
+#define MBUF_POOL_CACHE_SIZE 250
+
+static struct rte_eth_conf port_conf_default;
/**< mask of enabled ports */
static uint32_t enabled_port_mask;
/**< Enable stats. */
@@ -75,13 +106,59 @@ static uint32_t reset_xstats;
/**< Enable memory info. */
static uint32_t mem_info;
+enum tcpdump_msg_type {
+ REMOVE_RXTX_CBS = 1,
+ REGISTER_RXTX_CBS = 2
+};
+
+enum rx_tx_type {
+ RX = 1,
+ TX = 2,
+ RX_TX_TYPES = 2
+};
+
+/**< src ip filter for tcpdump. */
+static uint32_t src_ip_filter;
+/**< socket for connecting to primary. */
+static int socket_fd;
+/**< vdev port ids. */
+static int pcap_vdev_port_id[RX_TX_TYPES];
+volatile uint8_t quit_signal;
+/**< Enable tcpdump feature. */
+bool is_tcpdump_enabled;
+
+static volatile struct tcpdump_app_stats {
+ struct {
+ uint64_t dequeue_pkts;
+ uint64_t tx_pkts;
+ uint64_t freed_pkts;
+ } in __rte_cache_aligned;
+ struct {
+ uint64_t dequeue_pkts;
+ uint64_t tx_pkts;
+ uint64_t freed_pkts;
+ } out __rte_cache_aligned;
+} tcpdump_app_stats __rte_cache_aligned;
+
+struct tcpdump_port_queue_tuples {
+ int num_pq_tuples;
+ uint8_t port_id[APP_ARG_TCPDUMP_MAX_TUPLES];
+ uint8_t queue_id[APP_ARG_TCPDUMP_MAX_TUPLES];
+} __rte_cache_aligned;
+
+static struct tcpdump_port_queue_tuples tcpdump_pq_t;
+
/**< display usage */
+
static void
proc_info_usage(const char *prgname)
{
printf("%s [EAL options] -- -p PORTMASK\n"
" -m to display DPDK memory zones, segments and TAILQ information\n"
" -p PORTMASK: hexadecimal bitmask of ports to retrieve stats for\n"
+ " --tcpdump (port,queue): port and queue info for capturing packets "
+ "for tcpdump\n"
+ " --src-ip-filter \"A.B.C.D\": src ip for tcpdump filtering\n"
" --stats: to display port statistics, enabled by default\n"
" --xstats: to display extended port statistics, disabled by "
"default\n"
@@ -116,14 +193,79 @@ parse_portmask(const char *portmask)
}
+static int
+parse_tcpdump(const char *q_arg)
+{
+ char s[256];
+ const char *p, *p0 = q_arg;
+ char *end;
+
+ enum fieldnames {
+ FLD_PORT = 0,
+ FLD_QUEUE,
+ _NUM_FLD
+ };
+
+ unsigned long int_fld[_NUM_FLD];
+ char *str_fld[_NUM_FLD];
+ int i;
+ unsigned size;
+ uint32_t nb_tcpdump_params;
+
+ nb_tcpdump_params = 0;
+
+ while ((p = strchr(p0, '(')) != NULL) {
+ ++p;
+ p0 = strchr(p, ')');
+ if (p0 == NULL)
+ return -1;
+
+ size = p0 - p;
+ if (size >= sizeof(s))
+ return -1;
+
+ snprintf(s, sizeof(s), "%.*s", size, p);
+ if (rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',') != _NUM_FLD)
+ return -1;
+ for (i = 0; i < _NUM_FLD; i++) {
+ errno = 0;
+ int_fld[i] = strtoul(str_fld[i], &end, 0);
+ if (errno != 0 || end == str_fld[i] || int_fld[i] > 255)
+ return -1;
+ }
+ if (nb_tcpdump_params >= APP_ARG_TCPDUMP_MAX_TUPLES) {
+ printf("exceeded max number of port params: %"PRIu32"\n",
+ nb_tcpdump_params);
+ return -1;
+ }
+ tcpdump_pq_t.port_id[tcpdump_pq_t.num_pq_tuples] =
+ (uint8_t)int_fld[FLD_PORT];
+ tcpdump_pq_t.queue_id[tcpdump_pq_t.num_pq_tuples] =
+ (uint8_t)int_fld[FLD_QUEUE];
+ tcpdump_pq_t.num_pq_tuples++;
+ }
+ return 0;
+}
+
+static int
+parse_ip(const char *q_arg)
+{
+ if (!inet_pton(AF_INET, q_arg, &src_ip_filter))
+ return 1;
+
+ return 0;
+}
+
/* Parse the argument given in the command line of the application */
static int
proc_info_parse_args(int argc, char **argv)
{
- int opt;
+ int opt, ret;
int option_index;
char *prgname = argv[0];
static struct option long_option[] = {
+ {"tcpdump", 1, 0, 0},
+ {"src-ip-filter", 1, 0, 0},
{"stats", 0, NULL, 0},
{"stats-reset", 0, NULL, 0},
{"xstats", 0, NULL, 0},
@@ -151,6 +293,27 @@ proc_info_parse_args(int argc, char **argv)
mem_info = 1;
break;
case 0:
+ if (!strncmp(long_option[option_index].name, "tcpdump",
+ MAX_LONG_OPT_SZ)) {
+ ret = parse_tcpdump(optarg);
+ if (ret) {
+ printf("invalid tcpdump\n");
+ proc_info_usage(prgname);
+ return -1;
+ }
+ is_tcpdump_enabled = true;
+ }
+
+ if (!strncmp(long_option[option_index].name, "src-ip-filter",
+ MAX_LONG_OPT_SZ)) {
+ ret = parse_ip(optarg);
+ if (ret) {
+ printf("invalid src-ip-filter\n");
+ proc_info_usage(prgname);
+ return -1;
+ }
+ }
+
/* Print stats */
if (!strncmp(long_option[option_index].name, "stats",
MAX_LONG_OPT_SZ))
@@ -285,6 +448,202 @@ nic_xstats_clear(uint8_t port_id)
printf("\n NIC extended statistics for port %d cleared\n", port_id);
}
+/* get socket path (/var/run if root, $HOME otherwise) */
+static void
+tcpdump_get_socket_path(char *buffer, int bufsz)
+{
+ const char *dir = "/var/run/tcpdump_socket";
+ const char *home_dir = getenv("HOME/tcpdump_socket");
+
+ if (getuid() != 0 && home_dir != NULL)
+ dir = home_dir;
+ /* use current prefix as file path */
+ snprintf(buffer, bufsz, TCPDUMP_SOCKET_PATH, dir);
+}
+
+static int
+tcpdump_connect_to_primary(void)
+{
+ struct sockaddr_un addr;
+ socklen_t sockaddr_len;
+
+ /* set up a socket */
+ socket_fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+ if (socket_fd < 0) {
+ RTE_LOG(ERR, EAL, "Failed to create socket!\n");
+ return -1;
+ }
+
+ tcpdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path));
+ addr.sun_family = AF_UNIX;
+
+ sockaddr_len = sizeof(struct sockaddr_un);
+
+ if (connect(socket_fd, (struct sockaddr *) &addr, sockaddr_len) == 0)
+ return socket_fd;
+
+ /* if connect failed */
+ close(socket_fd);
+ return -1;
+}
+
+/* send a request, return -1 on error */
+static int
+tcpdump_send_request(int socket, enum tcpdump_msg_type type)
+{
+ char buffer[256];
+ struct msghdr reg_cb_msg;
+ struct iovec msg[3];
+ int ret, wc, buf, i, n = 0;
+
+ buf = type;
+ for (i = 0; i < tcpdump_pq_t.num_pq_tuples; i++) {
+ wc = snprintf(buffer + n, sizeof(buffer) - n, "(%d,%d)",
+ tcpdump_pq_t.port_id[i], tcpdump_pq_t.queue_id[i]);
+ n += wc;
+ }
+
+ memset(msg, 0, sizeof(msg));
+ msg[0].iov_base = (char *) &buf;
+ msg[0].iov_len = 1;
+ msg[1].iov_base = (char *)buffer;
+ msg[1].iov_len = sizeof(buffer);
+ msg[2].iov_base = (char *) &src_ip_filter;
+ msg[2].iov_len = sizeof(src_ip_filter);
+
+ memset(&reg_cb_msg, 0, sizeof(reg_cb_msg));
+ reg_cb_msg.msg_iov = msg;
+ reg_cb_msg.msg_iovlen = 3;
+
+ ret = sendmsg(socket, &reg_cb_msg, 0);
+ if (ret < 0)
+ return -1;
+ return 0;
+}
+
+static void
+int_handler(int sig_num)
+{
+ /* connect to primary process using AF_UNIX socket */
+ socket_fd = tcpdump_connect_to_primary();
+ if (socket_fd < 0)
+ printf("cannot connect to primary process for RX/TX CBs removal!\n");
+
+ /* send request to remove rx/tx callbacks */
+ if (tcpdump_send_request(socket_fd, REMOVE_RXTX_CBS) < 0) {
+ printf("cannot send tcpdump remove rxtx cbs request!\n");
+ close(socket_fd);
+ }
+
+ /* close tcpdump socket fd */
+ close(socket_fd);
+ printf("Exiting on signal %d\n", sig_num);
+ quit_signal = 1;
+}
+
+static inline int
+configure_pcap_vdev(uint8_t port_id)
+{
+ struct ether_addr addr;
+ const uint16_t rxRings = 0, txRings = 1;
+ const uint8_t nb_ports = rte_eth_dev_count();
+ int ret;
+ uint16_t q;
+
+ if (port_id > nb_ports)
+ return -1;
+
+ ret = rte_eth_dev_configure(port_id, rxRings, txRings, &port_conf_default);
+ if (ret != 0)
+ return ret;
+
+ for (q = 0; q < txRings; q++) {
+ ret = rte_eth_tx_queue_setup(port_id, q, TX_DESC_PER_QUEUE,
+ rte_eth_dev_socket_id(port_id), NULL);
+ if (ret < 0) {
+ rte_exit(EXIT_FAILURE, "queue setup failed\n");
+ return ret;
+ }
+ }
+
+ ret = rte_eth_dev_start(port_id);
+ if (ret < 0)
+ return ret;
+
+ rte_eth_macaddr_get(port_id, &addr);
+ printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
+ " %02"PRIx8" %02"PRIx8" %02"PRIx8"\n",
+ (unsigned)port_id,
+ addr.addr_bytes[0], addr.addr_bytes[1],
+ addr.addr_bytes[2], addr.addr_bytes[3],
+ addr.addr_bytes[4], addr.addr_bytes[5]);
+
+ rte_eth_promiscuous_enable(port_id);
+
+ return 0;
+}
+
+static int
+create_pcap_pmd_vdev(enum rx_tx_type type) {
+ char pcap_vdev_name[32];
+ char pcap_filename[32];
+#ifdef RTE_LIBRTE_PMD_PCAP
+ struct rx_pcaps rxpcap;
+ struct tx_pcaps txpcap;
+#endif
+ int port_id;
+
+ if (type == RX) {
+ snprintf(pcap_vdev_name, sizeof(pcap_vdev_name),
+ "eth_pcap_tcpdump_%s", "RX");
+ snprintf(pcap_filename, sizeof(pcap_filename),
+ "/tmp/%s_pcap.pcap", "RX");
+ } else if (type == TX) {
+ snprintf(pcap_vdev_name, sizeof(pcap_vdev_name),
+ "eth_pcap_tcpdump_%s", "TX");
+ snprintf(pcap_filename, sizeof(pcap_filename),
+ "/tmp/%s_pcap.pcap", "TX");
+ }
+
+#ifdef RTE_LIBRTE_PMD_PCAP
+ rxpcap.names[0] = "";
+ rxpcap.types[0] = "";
+ rxpcap.num_of_rx = 0;
+ txpcap.names[0] = pcap_filename;
+ txpcap.types[0] = "tx_pcap";
+ txpcap.num_of_tx = 1;
+
+ port_id = rte_eth_from_pcapsndumpers(pcap_vdev_name,
+ &rxpcap, rxpcap.num_of_rx,
+ &txpcap, txpcap.num_of_tx, rte_socket_id());
+#else
+ port_id = -1;
+#endif
+ if (port_id < 0)
+ rte_exit(EXIT_FAILURE, "Failed to create pcap_vdev\n");
+
+ return port_id;
+}
+
+static void
+print_tcpdump_stats(void)
+{
+ printf("##### TCPDUMP DEBUG STATS #####\n");
+ printf(" - Input packets dequeued: %"PRIu64"\n",
+ tcpdump_app_stats.in.dequeue_pkts);
+ printf(" - Input packets transmitted to pcap: %"PRIu64"\n",
+ tcpdump_app_stats.in.tx_pkts);
+ printf(" - Input packets freed: %"PRIu64"\n",
+ tcpdump_app_stats.in.freed_pkts);
+ printf(" - Output packets dequeued: %"PRIu64"\n",
+ tcpdump_app_stats.out.dequeue_pkts);
+ printf(" - Output packets transmitted to pcap: %"PRIu64"\n",
+ tcpdump_app_stats.out.tx_pkts);
+ printf(" - Output packets freed: %"PRIu64"\n",
+ tcpdump_app_stats.out.freed_pkts);
+ printf("################################\n");
+}
+
int
main(int argc, char **argv)
{
@@ -295,6 +654,10 @@ main(int argc, char **argv)
char mp_flag[] = "--proc-type=secondary";
char *argp[argc + 3];
uint8_t nb_ports;
+ struct rte_ring *rx_ring, *tx_ring;
+
+ /* catch ctrl-c so we can print on exit */
+ signal(SIGINT, int_handler);
argp[0] = argv[0];
argp[1] = c_flag;
@@ -327,7 +690,6 @@ main(int argc, char **argv)
if (nb_ports == 0)
rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
-
if (nb_ports > RTE_MAX_ETHPORTS)
nb_ports = RTE_MAX_ETHPORTS;
@@ -348,5 +710,86 @@ main(int argc, char **argv)
}
}
+ if (is_tcpdump_enabled == true) {
+
+ /* create pcap virtual devices for rx and tx */
+ pcap_vdev_port_id[0] = create_pcap_pmd_vdev(RX);
+ configure_pcap_vdev(pcap_vdev_port_id[0]);
+
+ pcap_vdev_port_id[1] = create_pcap_pmd_vdev(TX);
+ configure_pcap_vdev(pcap_vdev_port_id[1]);
+
+ /* connect to primary process using AF_UNIX socket */
+ socket_fd = tcpdump_connect_to_primary();
+ if (socket_fd < 0) {
+ printf("cannot connect to primary process!\n");
+ return -1;
+ }
+
+ if (tcpdump_send_request(socket_fd, REGISTER_RXTX_CBS) < 0) {
+ printf("cannot send tcpdump register rxtx cbs request!\n");
+ close(socket_fd);
+ return -1;
+ }
+
+ while (1) {
+ rx_ring = rte_ring_lookup("prim_to_sec_rx");
+ tx_ring = rte_ring_lookup("prim_to_sec_tx");
+ if (rx_ring != NULL && tx_ring != NULL)
+ break;
+ }
+
+ while (!quit_signal) {
+ /* write input packets of port to pcap file for tcpdump */
+ struct rte_mbuf *rx_bufs[BURST_SIZE];
+
+ /* first dequeue packets from ring of primary process */
+ const uint16_t nb_in_deq = rte_ring_dequeue_burst(rx_ring,
+ (void *)rx_bufs, BURST_SIZE);
+ tcpdump_app_stats.in.dequeue_pkts += nb_in_deq;
+
+ if (nb_in_deq) {
+ /* then send to the pcap file */
+ uint16_t nb_in_txd = rte_eth_tx_burst(
+ pcap_vdev_port_id[0],
+ 0, rx_bufs, nb_in_deq);
+ tcpdump_app_stats.in.tx_pkts += nb_in_txd;
+
+ if (unlikely(nb_in_txd < nb_in_deq)) {
+ do {
+ rte_pktmbuf_free(rx_bufs[nb_in_txd]);
+ tcpdump_app_stats.in.freed_pkts++;
+ } while (++nb_in_txd < nb_in_deq);
+ }
+
+ }
+
+ /* write output packets of port to pcap file for tcpdump */
+ struct rte_mbuf *tx_bufs[BURST_SIZE];
+
+ /* first dequeue from ring of primary process */
+ const uint16_t nb_out_deq = rte_ring_dequeue_burst(tx_ring,
+ (void *)tx_bufs, BURST_SIZE);
+ tcpdump_app_stats.out.dequeue_pkts += nb_out_deq;
+
+ if (nb_out_deq) {
+ /* then send to the pcap file */
+ uint16_t nb_out_txd = rte_eth_tx_burst(
+ pcap_vdev_port_id[1],
+ 0, tx_bufs, nb_out_deq);
+ tcpdump_app_stats.out.tx_pkts += nb_out_txd;
+ if (unlikely(nb_out_txd < nb_out_deq)) {
+ do {
+ rte_pktmbuf_free(tx_bufs[nb_out_txd]);
+ tcpdump_app_stats.out.freed_pkts++;
+ } while (++nb_out_txd < nb_out_deq);
+
+ }
+ }
+ }
+
+ print_tcpdump_stats();
+
+ }
return 0;
}
--
1.7.4.1
* [dpdk-dev] [PATCH v2 4/5] lib/librte_eal: add tcpdump support in primary process
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
` (2 preceding siblings ...)
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 3/5] app/proc_info: add tcpdump support in secondary process Reshma Pattan
@ 2016-02-12 14:57 ` Reshma Pattan
2016-02-17 9:57 ` Pavel Fedin
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 5/5] doc: update doc for tcpdump feature Reshma Pattan
` (2 subsequent siblings)
6 siblings, 1 reply; 21+ messages in thread
From: Reshma Pattan @ 2016-02-12 14:57 UTC (permalink / raw)
To: dev
Added tcpdump functionality to eal interrupt thread.
Enhanced interrupt thread to support tcpdump socket
and message processing from secondary.
Created new mempool and rings to handle packets of tcpdump.
Added rte_eth_rxtx_callbacks for ingress/egress packets processing
for tcpdump.
Added functionality to remove registered rte_eth_rxtx_callbacks
once secondary process is terminated.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
lib/librte_eal/linuxapp/eal/Makefile | 5 +-
lib/librte_eal/linuxapp/eal/eal_interrupts.c | 375 +++++++++++++++++++++++++-
2 files changed, 377 insertions(+), 3 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 6e26250..425152c 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -1,6 +1,6 @@
# BSD LICENSE
#
-# Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+# Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
@@ -47,6 +47,9 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
CFLAGS += -I$(RTE_SDK)/lib/librte_ring
CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem
+CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
+CFLAGS += -I$(RTE_SDK)/lib/librte_ether
+CFLAGS += -I$(RTE_SDK)/lib/librte_net
CFLAGS += $(WERROR_FLAGS) -O3
# specific to linuxapp exec-env
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 06b26a9..dafe0bb 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -45,7 +45,11 @@
#include <sys/signalfd.h>
#include <sys/ioctl.h>
#include <sys/eventfd.h>
+#include <sys/socket.h>
+#include <sys/un.h>
#include <assert.h>
+#include <arpa/inet.h>
+#include <sys/stat.h>
#include <rte_common.h>
#include <rte_interrupts.h>
@@ -65,15 +69,40 @@
#include <rte_malloc.h>
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_memcpy.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_ether.h>
+#include <rte_ip.h>
#include "eal_private.h"
#include "eal_vfio.h"
#include "eal_thread.h"
+#include "eal_internal_cfg.h"
#define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
#define NB_OTHER_INTR 1
+#define TCPDUMP_SOCKET_PATH "%s/tcpdump_mp_socket"
+#define TCPDUMP_SOCKET_ERR 0xFF
+#define TCPDUMP_REQ 0x1
+#define RING_SIZE 1024
+#define BURST_SIZE 32
+#define NUM_MBUFS 65536
+#define MBUF_CACHE_SIZE 250
+#define MAX_CBS 54
static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+static uint32_t src_ip_filter;
+static int tcpdump_socket_fd;
+struct rte_ring *prim_to_sec_rx;
+struct rte_ring *prim_to_sec_tx;
+struct rte_mempool *tcpdump_pktmbuf_pool;
+static struct rxtx_cbs {
+ uint8_t port;
+ uint16_t queue;
+ struct rte_eth_rxtx_callback *rx_cb;
+ struct rte_eth_rxtx_callback *tx_cb;
+} cbs[54];
/**
* union for pipe fds.
@@ -644,6 +673,259 @@ rte_intr_disable(struct rte_intr_handle *intr_handle)
return 0;
}
+static inline void
+tcpdump_pktmbuf_duplicate(struct rte_mbuf *mi, struct rte_mbuf *m)
+{
+
+ mi->data_len = m->data_len;
+ mi->port = m->port;
+ mi->vlan_tci = m->vlan_tci;
+ mi->vlan_tci_outer = m->vlan_tci_outer;
+ mi->tx_offload = m->tx_offload;
+ mi->hash = m->hash;
+
+ mi->pkt_len = mi->data_len;
+ mi->ol_flags = m->ol_flags;
+ mi->packet_type = m->packet_type;
+
+ rte_memcpy(rte_pktmbuf_mtod(mi, void *),
+ rte_pktmbuf_mtod(m, void *),
+ rte_pktmbuf_data_len(mi));
+
+ __rte_mbuf_sanity_check(mi, 1);
+ __rte_mbuf_sanity_check(m, 0);
+}
+
+static inline struct rte_mbuf *
+tcpdump_pktmbuf_clone(struct rte_mbuf *md, struct rte_mempool *mp)
+{
+ struct rte_mbuf *mc, *mi, **prev;
+ uint32_t pktlen;
+ uint8_t nseg;
+
+ mc = rte_pktmbuf_alloc(mp);
+ if (unlikely(mc == NULL))
+ return NULL;
+
+ mi = mc;
+ prev = &mi->next;
+ pktlen = md->pkt_len;
+ nseg = 0;
+
+ do {
+ nseg++;
+ tcpdump_pktmbuf_duplicate(mi, md);
+ *prev = mi;
+ prev = &mi->next;
+ } while ((md = md->next) != NULL &&
+ (mi = rte_pktmbuf_alloc(mp)) != NULL);
+
+ *prev = NULL;
+ mc->nb_segs = nseg;
+ mc->pkt_len = pktlen;
+
+ /* Allocation of new indirect segment failed */
+ if (unlikely(mi == NULL)) {
+ rte_pktmbuf_free(mc);
+ return NULL;
+ }
+
+ __rte_mbuf_sanity_check(mc, 1);
+ return mc;
+
+}
+
+static int
+compare_filter(struct rte_mbuf *pkt)
+{
+ struct ipv4_hdr *pkt_hdr = rte_pktmbuf_mtod_offset(pkt, struct ipv4_hdr *,
+ sizeof(struct ether_hdr));
+ if (pkt_hdr->src_addr != src_ip_filter)
+ return -1;
+
+ return 0;
+}
+
+static uint16_t
+tcpdump_rx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+ struct rte_mbuf **pkts, uint16_t nb_pkts,
+ uint16_t max_pkts __rte_unused, void *_ __rte_unused)
+{
+ unsigned i;
+ uint16_t filtered_pkts = 0;
+ int ring_enq = 0;
+ struct rte_mbuf *dup_bufs[nb_pkts];
+
+ for (i = 0; i < nb_pkts; i++) {
+ if (compare_filter(pkts[i]) == 0)
+ dup_bufs[filtered_pkts++] = tcpdump_pktmbuf_clone(pkts[i],
+ tcpdump_pktmbuf_pool);
+ }
+
+ ring_enq = rte_ring_enqueue_burst(prim_to_sec_rx, (void *)dup_bufs,
+ filtered_pkts);
+ if (unlikely(ring_enq < filtered_pkts)) {
+ do {
+ rte_pktmbuf_free(dup_bufs[ring_enq]);
+ } while (++ring_enq < filtered_pkts);
+ }
+ return nb_pkts;
+}
+
+static uint16_t
+tcpdump_tx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+ struct rte_mbuf **pkts, uint16_t nb_pkts,
+ void *_ __rte_unused)
+{
+ int i;
+ int ring_enq = 0;
+ uint16_t filtered_pkts = 0;
+ struct rte_mbuf *dup_bufs[nb_pkts];
+
+ /*
+ * Increment reference count of mbuf to avoid accidental return of mbuf
+ * to pool while tcpdump processing is still on.
+ */
+ for (i = 0; i < nb_pkts; i++) {
+ if (compare_filter(pkts[i]) == 0) {
+ rte_pktmbuf_refcnt_update(pkts[i], 1);
+ dup_bufs[filtered_pkts++] = pkts[i];
+ }
+ }
+
+ ring_enq = rte_ring_enqueue_burst(prim_to_sec_tx, (void *)dup_bufs,
+ filtered_pkts);
+ if (unlikely(ring_enq < filtered_pkts)) {
+ do {
+ rte_pktmbuf_free(dup_bufs[ring_enq]);
+ } while (++ring_enq < filtered_pkts);
+ }
+ return nb_pkts;
+}
+
+static void
+tcpdump_create_mpool_n_rings(void)
+{
+ /* Create the mbuf pool */
+ tcpdump_pktmbuf_pool = rte_pktmbuf_pool_create("tcpdump_pktmbuf_pool", NUM_MBUFS,
+ MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+ if (tcpdump_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Could not initialize tcpdump_pktmbuf_pool\n");
+
+ /* Create rings */
+ prim_to_sec_rx = rte_ring_create("prim_to_sec_rx", RING_SIZE, rte_socket_id(),
+ RING_F_SC_DEQ);
+ prim_to_sec_tx = rte_ring_create("prim_to_sec_tx", RING_SIZE, rte_socket_id(),
+ RING_F_SC_DEQ);
+}
+
+static void
+tcpdump_register_rxtx_callbacks(int port, int queue)
+{
+ static int cnt;
+
+ cbs[cnt].port = port;
+ cbs[cnt].queue = queue;
+ cbs[cnt].rx_cb = rte_eth_add_rx_callback(port, queue, tcpdump_rx, NULL);
+ cbs[cnt].tx_cb = rte_eth_add_tx_callback(port, queue, tcpdump_tx, NULL);
+ cnt++;
+}
+
+static void
+tcpdump_remove_rxtx_callbacks(int port, int queue)
+{
+ int i;
+
+ for (i = 0; i < MAX_CBS; i++) {
+ if ((cbs[i].port == port) && (cbs[i].queue == queue)) {
+ rte_eth_remove_rx_callback(port, queue, cbs[i].rx_cb);
+ rte_eth_remove_tx_callback(port, queue, cbs[i].tx_cb);
+ }
+ }
+}
+
+/* receive a request and return it */
+static int
+tcpdump_receive_request(int socket)
+{
+ char buffer[256];
+ char *buf;
+ int msg_type;
+
+ int port, queue;
+ int rval;
+ int buf_offset;
+
+ struct msghdr reg_cbs_msg;
+ struct iovec msg[3];
+
+ memset(&reg_cbs_msg, 0, sizeof(reg_cbs_msg));
+ reg_cbs_msg.msg_iov = msg;
+ reg_cbs_msg.msg_iovlen = 3;
+
+ msg[0].iov_base = (char *) &msg_type;
+ msg[0].iov_len = 1;
+
+ msg[1].iov_base = (char *) buffer;
+ msg[1].iov_len = sizeof(buffer);
+
+ msg[2].iov_base = (char *) &src_ip_filter;
+ msg[2].iov_len = sizeof(uint32_t);
+
+ rval = recvmsg(socket, &reg_cbs_msg, 0);
+ if (rval < 0) {
+ RTE_LOG(ERR, EAL, "Error reading from file descriptor %d: %s\n",
+ socket,
+ strerror(errno));
+ return -1;
+ } else if (rval == 0) {
+ RTE_LOG(ERR, EAL, "Read nothing from file "
+ "descriptor %d\n", socket);
+ return -1;
+ }
+
+ buf = buffer;
+
+ /* Update port and queue */
+ while (sscanf(buf, "%*[^0123456789]%d%*[^0123456789]%d%n", &port,
+ &queue, &buf_offset) == 2) {
+ if (msg_type == 2)
+ tcpdump_register_rxtx_callbacks(port, queue);
+ else if (msg_type == 1)
+ tcpdump_remove_rxtx_callbacks(port, queue);
+ buf += buf_offset;
+ }
+
+ return 0;
+}
+
+static void
+tcpdump_socket_ready(int socket)
+{
+ for (;;) {
+ int conn_sock;
+ struct sockaddr_un addr;
+
+ socklen_t sockaddr_len = sizeof(addr);
+ /* this is a blocking call */
+ conn_sock = accept(socket, (struct sockaddr *) &addr, &sockaddr_len);
+ /* just restart on error */
+ if (conn_sock == -1)
+ continue;
+
+ /* set socket to linger after close */
+ struct linger l;
+
+ l.l_onoff = 1;
+ l.l_linger = 60;
+ setsockopt(conn_sock, SOL_SOCKET, SO_LINGER, &l, sizeof(l));
+
+ tcpdump_receive_request(conn_sock);
+ close(conn_sock);
+ break;
+ }
+}
+
static int
eal_intr_process_interrupts(struct epoll_event *events, int nfds)
{
@@ -655,6 +937,13 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
for (n = 0; n < nfds; n++) {
+ if (internal_config.process_type == RTE_PROC_PRIMARY) {
+
+ /** tcpdump socket fd */
+ if (events[n].data.fd == tcpdump_socket_fd)
+ tcpdump_socket_ready(tcpdump_socket_fd);
+ }
+
/**
* if the pipe fd is ready to read, return out to
* rebuild the wait list.
@@ -786,6 +1075,61 @@ eal_intr_handle_interrupts(int pfd, unsigned totalfds)
}
}
+/* get socket path (/var/run if root, $HOME otherwise) */
+ static void
+tcpdump_get_socket_path(char *buffer, int bufsz)
+{
+ const char *dir = "/var/run/tcpdump_socket";
+ const char *home_dir = getenv("HOME/tcpdump_socket");
+
+ if (getuid() != 0 && home_dir != NULL)
+ dir = home_dir;
+ mkdir(dir, 0700);
+ /* use current prefix as file path */
+ snprintf(buffer, bufsz, TCPDUMP_SOCKET_PATH, dir);
+}
+
+static int
+tcpdump_create_primary_socket(void)
+{
+ int ret, socket_fd;
+ struct sockaddr_un addr;
+ socklen_t sockaddr_len;
+
+ /* set up a socket */
+ socket_fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+ if (socket_fd < 0) {
+ RTE_LOG(ERR, EAL, "Failed to create socket!\n");
+ return -1;
+ }
+
+ tcpdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path));
+ addr.sun_family = AF_UNIX;
+ sockaddr_len = sizeof(struct sockaddr_un);
+
+ /* unlink() before bind() to remove the socket if it already exists */
+ unlink(addr.sun_path);
+
+ ret = bind(socket_fd, (struct sockaddr *) &addr, sockaddr_len);
+ if (ret) {
+ RTE_LOG(ERR, EAL, "Failed to bind socket: %s!\n", strerror(errno));
+ close(socket_fd);
+ return -1;
+ }
+
+ ret = listen(socket_fd, 1);
+ if (ret) {
+ RTE_LOG(ERR, EAL, "Failed to listen: %s!\n", strerror(errno));
+ close(socket_fd);
+ return -1;
+ }
+
+ /* save the socket in local configuration */
+ tcpdump_socket_fd = socket_fd;
+
+ return 0;
+}
+
/**
* It builds/rebuilds up the epoll file descriptor with all the
* file descriptors being waited on. Then handles the interrupts.
@@ -800,9 +1144,9 @@ static __attribute__((noreturn)) void *
eal_intr_thread_main(__rte_unused void *arg)
{
struct epoll_event ev;
-
/* host thread, never break out */
for (;;) {
+
/* build up the epoll fd with all descriptors we are to
* wait on then pass it to the handle_interrupts function
*/
@@ -829,6 +1173,23 @@ eal_intr_thread_main(__rte_unused void *arg)
}
numfds++;
+ /* build up the epoll fd with tcpdump descriptor.
+ */
+ static struct epoll_event tcpdump_event = {
+ .events = EPOLLIN | EPOLLPRI,
+ };
+
+ if (internal_config.process_type == RTE_PROC_PRIMARY) {
+ tcpdump_event.data.fd = tcpdump_socket_fd;
+ if (epoll_ctl(pfd, EPOLL_CTL_ADD, tcpdump_socket_fd,
+ &tcpdump_event) < 0) {
+ rte_panic("Error adding tcpdump socket fd to %d "
+ "epoll_ctl, %s\n",
+ tcpdump_socket_fd, strerror(errno));
+ }
+ numfds++;
+ }
+
rte_spinlock_lock(&intr_lock);
TAILQ_FOREACH(src, &intr_sources, next) {
@@ -877,6 +1238,16 @@ rte_eal_intr_init(void)
if (pipe(intr_pipe.pipefd) < 0)
return -1;
+ /* if primary, try to open tcpdump socket */
+ if (internal_config.process_type == RTE_PROC_PRIMARY) {
+ if (tcpdump_create_primary_socket() < 0) {
+ RTE_LOG(ERR, EAL, "Failed to set up tcpdump_socket_fd for "
+ "tcpdump in primary\n");
+ return -1;
+ }
+ tcpdump_create_mpool_n_rings();
+ }
+
/* create the host thread to wait/handle the interrupt */
ret = pthread_create(&intr_thread, NULL,
eal_intr_thread_main, NULL);
--
1.7.4.1
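Note on tcpdump_get_socket_path() in the patch above: getenv() looks up an environment variable by name, so getenv("HOME/tcpdump_socket") will normally return NULL and non-root users fall back to the /var/run path. A minimal sketch of the lookup that the function's comment describes could look like the following. The name socket_dir_sketch and its is_root/home parameters are assumptions made for illustration and testability; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<base>/tcpdump_socket" into buf, where base is "/var/run" for
 * root and the given home directory otherwise (hypothetical helper).
 * Returns 0 on success, -1 if the result would not fit in buf.
 */
static int
socket_dir_sketch(char *buf, size_t bufsz, int is_root, const char *home)
{
	const char *base = "/var/run";
	int n;

	assert(buf != NULL && bufsz > 0);

	/* only use the home directory when it is set and non-empty */
	if (!is_root && home != NULL && home[0] != '\0')
		base = home;

	n = snprintf(buf, bufsz, "%s/tcpdump_socket", base);
	if (n < 0 || (size_t)n >= bufsz)
		return -1; /* truncated or encoding error */

	return 0;
}
```

A caller would pass getuid() == 0 and getenv("HOME"), then create the directory with mkdir(buf, 0700), with the mode written in octal.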
* [dpdk-dev] [PATCH v2 5/5] doc: update doc for tcpdump feature
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
` (3 preceding siblings ...)
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 4/5] lib/librte_eal: add tcpdump support in primary process Reshma Pattan
@ 2016-02-12 14:57 ` Reshma Pattan
2016-02-22 10:01 ` Mcnamara, John
2016-02-18 14:08 ` [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Pavel Fedin
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
6 siblings, 1 reply; 21+ messages in thread
From: Reshma Pattan @ 2016-02-12 14:57 UTC (permalink / raw)
To: dev
Added tcpdump design changes to proc_info section of
sample application user guide.
Added tcpdump design changes to env abstraction layer section
of programmers guide.
Updated Release notes.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
doc/guides/rel_notes/release_16_04.rst | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index 27fc624..7b005bb 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -39,6 +39,9 @@ This section should contain new features added in this release. Sample format:
Enabled virtio 1.0 support for virtio pmd driver.
+* **Added dpdk packet capturing support for tcpdump.**
+
+Now users have facility to see packets on dpdk ports using proc_info app and tcpdump.
Resolved Issues
---------------
@@ -58,11 +61,11 @@ EAL
Drivers
~~~~~~~
-
+* **PCAP: Added public API support for creation of PCAP device using pcaps and dumpers.**
Libraries
~~~~~~~~~
-
+* **Enhanced eal library to support dpdk packet capturing support for tcpdump.**
Examples
~~~~~~~~
@@ -70,7 +73,7 @@ Examples
Other
~~~~~
-
+* **Enhanced app/proc_info for dpdk packet capturing support for tcpdump.**
Known Issues
------------
--
1.7.4.1
* Re: [dpdk-dev] [PATCH v2 2/5] drivers/net/pcap: add public api to create pcap device
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 2/5] drivers/net/pcap: add public api to create pcap device Reshma Pattan
@ 2016-02-17 9:03 ` Pavel Fedin
0 siblings, 0 replies; 21+ messages in thread
From: Pavel Fedin @ 2016-02-17 9:03 UTC (permalink / raw)
To: 'Reshma Pattan', dev
Hello!
> diff --git a/drivers/net/pcap/rte_pmd_pcap_version.map
> b/drivers/net/pcap/rte_pmd_pcap_version.map
> index ef35398..104dc4d 100644
> --- a/drivers/net/pcap/rte_pmd_pcap_version.map
> +++ b/drivers/net/pcap/rte_pmd_pcap_version.map
> @@ -2,3 +2,11 @@ DPDK_2.0 {
>
> local: *;
> };
> +
> +DPDK_2.3 {
> + global:
> +
> + rte_eth_from_pcapsndumpers;
> +
> +} DPDK_2.0;
> +
This one produces style warning upon git am:
--- cut ---
Applying: drivers/net/pcap: add public api to create pcap
/home/p.fedin/dpdk/.git/rebase-apply/patch:333: new blank line at EOF.
+
--- cut ---
I guess the last empty line is not needed
Kind regards,
Pavel Fedin
Senior Engineer
Samsung Electronics Research center Russia
* Re: [dpdk-dev] [PATCH v2 4/5] lib/librte_eal: add tcpdump support in primary process
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 4/5] lib/librte_eal: add tcpdump support in primary process Reshma Pattan
@ 2016-02-17 9:57 ` Pavel Fedin
0 siblings, 0 replies; 21+ messages in thread
From: Pavel Fedin @ 2016-02-17 9:57 UTC (permalink / raw)
To: 'Reshma Pattan', dev
Hello!
> +static int
> +compare_filter(struct rte_mbuf *pkt)
> +{
> + struct ipv4_hdr *pkt_hdr = rte_pktmbuf_mtod_offset(pkt, struct ipv4_hdr *,
> + sizeof(struct ether_hdr));
> + if (pkt_hdr->src_addr != src_ip_filter)
> + return -1;
> +
> + return 0;
> +}
Some critics to this...
What if i want to capture packets coming from more than one host?
What if i want to capture all packets?
What if it's not IPv4 at all?
May be this function should always return 0 if src_ip_filter == 0? This would at least be a quick way to disable filtering.
Kind regards,
Pavel Fedin
Senior Engineer
Samsung Electronics Research center Russia
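The review comments above (treat a zero src_ip_filter as "capture everything", and do not assume every frame is IPv4) could be sketched roughly as follows. This is a standalone illustration, not the patch's code: it works on a raw byte buffer instead of an rte_mbuf, the SKETCH_* constants and the function name compare_filter_sketch are made up here, and VLAN-tagged frames are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Wire-format offsets; hypothetical names so the sketch builds without DPDK */
#define SKETCH_ETH_HDR_LEN     14
#define SKETCH_ETH_TYPE_OFFSET 12
#define SKETCH_ETH_TYPE_IPV4   0x0800
#define SKETCH_IPV4_MIN_LEN    20
#define SKETCH_IPV4_SRC_OFFSET 12 /* source address inside the IPv4 header */

/*
 * Returns 0 if the frame should be captured, -1 otherwise.
 * src_ip_filter is in network byte order; 0 disables filtering.
 */
static int
compare_filter_sketch(const uint8_t *frame, size_t len, uint32_t src_ip_filter)
{
	uint16_t ether_type;
	uint32_t src_addr;

	assert(frame != NULL);

	/* no filter configured: capture every packet, IPv4 or not */
	if (src_ip_filter == 0)
		return 0;

	/* too short to carry an Ethernet plus IPv4 header */
	if (len < SKETCH_ETH_HDR_LEN + SKETCH_IPV4_MIN_LEN)
		return -1;

	/* a source IP filter can only match IPv4 frames */
	ether_type = (uint16_t)((frame[SKETCH_ETH_TYPE_OFFSET] << 8) |
			frame[SKETCH_ETH_TYPE_OFFSET + 1]);
	if (ether_type != SKETCH_ETH_TYPE_IPV4)
		return -1;

	/* memcpy avoids an unaligned 32-bit load from the packet data */
	memcpy(&src_addr, frame + SKETCH_ETH_HDR_LEN + SKETCH_IPV4_SRC_OFFSET,
			sizeof(src_addr));

	return src_addr == src_ip_filter ? 0 : -1;
}
```

Supporting several source hosts would need a list of addresses rather than a single value; that is left out of this sketch.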
* Re: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
` (4 preceding siblings ...)
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 5/5] doc: update doc for tcpdump feature Reshma Pattan
@ 2016-02-18 14:08 ` Pavel Fedin
2016-02-23 13:16 ` Pattan, Reshma
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
6 siblings, 1 reply; 21+ messages in thread
From: Pavel Fedin @ 2016-02-18 14:08 UTC (permalink / raw)
To: 'Reshma Pattan', dev
Hello!
With the aforementioned fix (disabling src_ip_filter if zero) i've got the patch series working. Now i have some more notes on
usability:
> 2)Start proc_info(runs as secondary process by default)application with new parameters for
> tcpdump.
> ex: sudo ./build/app/proc_info/dpdk_proc_info -c 0x4 -n 2 -- -p 0x3 --tcpdump '(0,0)(1,0)' --
> src-ip-filter="2.2.2.2"
1. Perhaps, ability to separate queues is useful for something. But not always. For example, what if i want to capture all the
traffic which passes through some interface (common use case)? For example, with OpenVSwitch i can have 9 queues on my networking
card. So, i have to enumerate all of them: (0,0)(0,1)(0,2)... It's insane and inconvenient with many queues. What if you could have
shorthand notation, like (0) or (0,*) for this?
2. What if i don't want separate RX and TX streams either? It only prevents me from seeing the complete picture.
3. vhostuser ports are missing. Perhaps not really related to this patchset, i just don't know how much code "server" part of
vhostuser shares with normal PMDs, but anyway, ability to dump them too would be nice to have.
Not directly related, but could we have some interface to tcpdump or wireshark? Would be good to have ability to dump packets in
real time.
Kind regards,
Pavel Fedin
Senior Engineer
Samsung Electronics Research center Russia
* Re: [dpdk-dev] [PATCH v2 5/5] doc: update doc for tcpdump feature
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 5/5] doc: update doc for tcpdump feature Reshma Pattan
@ 2016-02-22 10:01 ` Mcnamara, John
0 siblings, 0 replies; 21+ messages in thread
From: Mcnamara, John @ 2016-02-22 10:01 UTC (permalink / raw)
To: Pattan, Reshma, dev
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, February 12, 2016 2:57 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 5/5] doc: update doc for tcpdump feature
>
> Added tcpdump design changes to proc_info section of sample application
> user guide.
> Added tcpdump design changes to env abstraction layer section of
> programmers guide.
> Updated Release notes.
>
> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
* Re: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump
2016-02-18 14:08 ` [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Pavel Fedin
@ 2016-02-23 13:16 ` Pattan, Reshma
2016-02-24 15:04 ` Pavel Fedin
0 siblings, 1 reply; 21+ messages in thread
From: Pattan, Reshma @ 2016-02-23 13:16 UTC (permalink / raw)
To: Pavel Fedin; +Cc: dev
Hi,
> -----Original Message-----
> From: Pavel Fedin [mailto:p.fedin@samsung.com]
> Sent: Thursday, February 18, 2016 2:08 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for
> tcpdump
>
>
> 1. Perhaps, ability to separate queues is useful for something. But not always.
> For example, what if i want to capture all the traffic which passes through some
> interface (common use case)? For example, with OpenVSwitch i can have 9
> queues on my networking card. So, i have to enumerate all of them:
> (0,0)(0,1)(0,2)... It's insane and inconvenient with many queues. What if you
> could have shorthand notation, like (0) or (0,*) for this?
I will fix this in my next version of patch.
> 2. What if i don't want separate RX and TX streams either? It only prevents me
> from seeing the complete picture.
Do you mean not to have separate pcap files for tx and rx? If so, I would prefer to keep this as it is.
Because pcap changes need to be replaced with TUN/TAP pmd once available in future.
> 3. vhostuser ports are missing. Perhaps not really related to this patchset, i just
> don't know how much code "server" part of vhostuser shares with normal PMDs,
> but anyway, ability to dump them too would be nice to have.
>
I think this can be done in future i.e. when vhost as PMD is available. But as of now vhost is library.
> Not directly related, but could we have some interface to tcpdump or
> wireshark? Would be good to have ability to dump packets in real time.
This can be done in future once KNI or TUN/TAP PMDs is available in DPDK.
Thanks,
Reshma
* Re: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump
2016-02-23 13:16 ` Pattan, Reshma
@ 2016-02-24 15:04 ` Pavel Fedin
2016-02-29 16:11 ` Pattan, Reshma
0 siblings, 1 reply; 21+ messages in thread
From: Pavel Fedin @ 2016-02-24 15:04 UTC (permalink / raw)
To: 'Pattan, Reshma'; +Cc: dev
Hello!
> > 2. What if i don't want separate RX and TX streams either? It only prevents me
> > from seeing the complete picture.
>
> Do you mean not to have separate pcap files for tx and rx? If so, I would prefer to keep this
> as it is.
I mean - add an option not to have separate files.
> Because pcap changes need to be replaced with TUN/TAP pmd once available in future.
I believe it's lo-o-o-ong way to get there...
> > 3. vhostuser ports are missing. Perhaps not really related to this patchset, i just
> > don't know how much code "server" part of vhostuser shares with normal PMDs,
> > but anyway, ability to dump them too would be nice to have.
> >
>
> I think this can be done in future i.e. when vhost as PMD is available. But as of now vhost
> is library.
I expected "server"-side vhost to be the same as "client" part (AKA virtio), just use another mechanism for exchanging control
information (via socket). Is it not true? I suppose, driving queues from both sides should be quite symmetric.
Kind regards,
Pavel Fedin
Senior Engineer
Samsung Electronics Research center Russia
* Re: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump
2016-02-24 15:04 ` Pavel Fedin
@ 2016-02-29 16:11 ` Pattan, Reshma
0 siblings, 0 replies; 21+ messages in thread
From: Pattan, Reshma @ 2016-02-29 16:11 UTC (permalink / raw)
To: Pavel Fedin; +Cc: dev
Hi,
> -----Original Message-----
> From: Pavel Fedin [mailto:p.fedin@samsung.com]
> Sent: Wednesday, February 24, 2016 3:05 PM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for
> tcpdump
>
> Hello!
>
> > > 2. What if i don't want separate RX and TX streams either? It only
> > > prevents me from seeing the complete picture.
> >
> > Do you mean not to have separate pcap files for tx and rx? If so, I
> > would prefer to keep this as it is.
>
> I mean - add an option not to have separate files.
OK, I will make changes in v3.
>
> > > 3. vhostuser ports are missing. Perhaps not really related to this
> > > patchset, i just don't know how much code "server" part of vhostuser
> > > shares with normal PMDs, but anyway, ability to dump them too would be
> nice to have.
> > >
> >
> > I think this can be done in future i.e. when vhost as PMD is
> > available. But as of now vhost is library.
>
> I expected "server"-side vhost to be the same as "client" part (AKA virtio), just
> use another mechanism for exchanging control information (via socket). Is it not
> true? I suppose, driving queues from both sides should be quite symmetric.
>
At this stage of the release, adding these changes is difficult as I don't have knowledge of vhost.
But at the same time, if anyone from the committee would like to make these enhancements, they are welcome.
Thanks,
Reshma
* [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for tcpdump
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
` (5 preceding siblings ...)
2016-02-18 14:08 ` [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Pavel Fedin
@ 2016-03-02 12:16 ` Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
` (5 more replies)
6 siblings, 6 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-03-02 12:16 UTC (permalink / raw)
To: dev
This patch set includes the design to capture packets from dpdk ports for tcpdump.
This patch set includes test-pmd changes to verify the patch set below.
http://dpdk.org/dev/patchwork/patch/9750/
http://dpdk.org/dev/patchwork/patch/9751/
http://dpdk.org/dev/patchwork/patch/9752/
Dependencies 1:
Patches 2/5 and 3/5 of the current patch set contain a pcap based design.
So to compile and run these patches, libpcap must be installed and the pcap
config option should be set to yes, i.e. CONFIG_RTE_LIBRTE_PMD_PCAP=y.
packet capture flow for tcpdump:
================================
Part of the design is implemented in the secondary process (proc_info.c) and the other part
in the primary process (eal_interrupts.c).
Communication between the primary and secondary processes is provided using a socket and rte_ring.
[Secondary process: proc_info application]
*Changes are included in patch 3/5.
*The user should request packet capture via the proc_info application command line by passing the newly
added tcpdump command line options, i.e. [--tcpdump (port,queue)] [ --src-ip-filter \"A.B.C.D\"]
[--single-tcpdump-file].
Note: As basic support, a src ip filter option is provided for filtering the packets.
This is optional. If the user doesn't provide a src ip filter option, all packets will be captured
for tcpdump.
*The proc_info application sends port, queue and src ip filter information to the primary process
along with the register rx tx callbacks message.
*The proc_info application either writes ingress and egress packets to separate RX and TX pcap files,
or writes both ingress and egress packets to a single RX_TX pcap file.
This behaviour can be controlled by passing the command line option "--single-tcpdump-file".
*The proc_info application runs in a while loop, dequeues packets sent by the primary process over
the shared rte_ring and writes the packets to the pcap file.
[Primary Process]:
*Changes are included in the patch 4/5.
*Creates the rte_rings and mempool used for passing packets to the secondary process.
*Creates a socket and waits on it for messages from the secondary process.
*Upon receiving the register rx tx callbacks message, registers ''rte_eth_rxtx_callbacks''
for receiving ingress and egress packets of the given port and queue.
*RX callback:
Gets the packets and applies the src ip filter. Matched packets are duplicated into mbufs
from the new mempool, and the duplicates are enqueued to an rte_ring for the
secondary process to dequeue and write to pcap.
Note: If the user does not provide a src ip filter option, all packets are captured.
*TX callback:
Gets the packets and applies the src ip filter. For matched packets, increments the reference
counter of the packet and enqueues it to the other rte_ring for the secondary process to dequeue and
write to pcap.
Note: If the user does not provide a src ip filter option, all packets are captured.
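The src ip filter check applied by both callbacks can be sketched on a raw Ethernet frame as follows. This is only an illustration under stated assumptions: the real callbacks operate on rte_mbufs, the function name and the SKETCH_* constants are hypothetical, a filter value of 0 is assumed to mean "no filter given", and the filter is assumed to be kept in network byte order (as inet_pton produces it). VLAN-tagged frames are not handled.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ETH_HDR_LEN	14	/* dst MAC + src MAC + ethertype */
#define SKETCH_IPV4_MIN_HDR	20	/* IPv4 header without options */
#define SKETCH_IPV4_SRC_OFF	12	/* src address offset in IPv4 header */
#define SKETCH_ETHERTYPE_IPV4	0x0800

/*
 * Hypothetical helper: return 1 if the frame should be captured.
 * filter_be is the src ip in network byte order; 0 is assumed to
 * mean that no filter was given, so every frame matches.
 */
static int
frame_matches_src_ip(const uint8_t *frame, size_t len, uint32_t filter_be)
{
	uint16_t ether_type;
	uint32_t src_be;

	if (filter_be == 0)
		return 1;
	if (len < SKETCH_ETH_HDR_LEN + SKETCH_IPV4_MIN_HDR)
		return 0;

	/* ethertype is stored big-endian in bytes 12 and 13 */
	ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ether_type != SKETCH_ETHERTYPE_IPV4)
		return 0;

	/* memcpy avoids an unaligned 32-bit load from the frame */
	memcpy(&src_be, frame + SKETCH_ETH_HDR_LEN + SKETCH_IPV4_SRC_OFF,
	       sizeof(src_be));

	return src_be == filter_be;
}
```

Comparing both values in network byte order avoids a byte swap per packet in the callback.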
[Secondary Process]:
*When the secondary process is terminated with ''ctrl+c'', it sends the remove
rx tx callbacks message to the primary process.
[Primary Process]:
*When the primary process receives the remove rx tx callbacks message, it removes the registered rxtx callbacks.
To view the captured packets, run "tcpdump -r" on the pcap file of interest, i.e. RX_pcap.pcap,
TX_pcap.pcap or RX_TX_pcap.pcap.
Running the changes:
===================
1)Start any primary sample application.
ex:sudo ./examples/rxtx_callbacks/build/rxtx_callbacks -c 0x2 -n 2
2)Start traffic from the traffic generator.
3)Start the proc_info application (runs as a secondary process by default) with the new tcpdump parameters.
These parameters can be mixed and matched to achieve different usage scenarios.
ex1:
sudo ./build/app/proc_info/dpdk_proc_info -c 0x4 -n 2 -- -p 0x3 --tcpdump '(0,0)(1,0)' --src-ip-filter="2.2.2.2"
Packets from queue 0 of each port that match the src ip filter are captured to /tmp/RX_pcap.pcap and /tmp/TX_pcap.pcap.
ex2:
sudo ./build/app/proc_info/dpdk_proc_info -c 0x4 -n 2 -- -p 0x3 --tcpdump '(0,*)(1,*)' --src-ip-filter="2.2.2.2" --single-tcpdump-file
Packets from all queues of each port that match the src ip filter are captured to a single pcap file, i.e. /tmp/RX_TX_pcap.pcap.
ex3:
sudo ./build/app/proc_info/dpdk_proc_info -c 0x4 -n 2 -- -p 0x3 --tcpdump '(0,*)(1,0)'
Packets from all queues of port 0 and packets from queue 0 of port 1 are captured to /tmp/RX_pcap.pcap and /tmp/TX_pcap.pcap.
4)Stop the secondary process using "ctrl+c" and rerun it; packet capturing should start again.
Note 1: On every start of the proc_info application, existing pcap files in the /tmp/ folder are removed and
new ones are created.
Note 2: The secondary process must use cores different from the primary process cores.
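The --tcpdump argument format used in the examples above, such as '(0,*)(1,0)', can be parsed along the following lines. This is a standalone sketch in plain C, not the parse_tcpdump() function from patch 3/5 (which uses rte_strsplit and keeps the fields as strings); the names pq_tuple, parse_field, parse_pq_tuples and SKETCH_ALL_QUEUES, as well as the 65535 upper bound, are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ALL_QUEUES (-1)	/* hypothetical marker for the '*' wildcard */

struct pq_tuple {
	int port;
	int queue;	/* SKETCH_ALL_QUEUES when given as '*' */
};

/* Convert one field to a number; '*' is accepted only if allowed. */
static int
parse_field(const char *s, int allow_wildcard, int *out)
{
	char *end;
	long v;

	if (allow_wildcard && strcmp(s, "*") == 0) {
		*out = SKETCH_ALL_QUEUES;
		return 0;
	}
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0' || v < 0 || v > 65535)
		return -1;
	*out = (int)v;
	return 0;
}

/* Parse "(port,queue)(port,queue)..."; return tuple count or -1. */
static int
parse_pq_tuples(const char *arg, struct pq_tuple *out, int max)
{
	const char *p = arg;
	int n = 0;

	while ((p = strchr(p, '(')) != NULL) {
		char port_s[16], queue_s[16];
		int consumed = 0;

		if (n >= max)
			return -1;
		/* %n is stored only if the closing ')' was matched */
		if (sscanf(p, "(%15[^,)],%15[^,)])%n",
			   port_s, queue_s, &consumed) != 2 || consumed == 0)
			return -1;
		if (parse_field(port_s, 0, &out[n].port) < 0 ||
		    parse_field(queue_s, 1, &out[n].queue) < 0)
			return -1;
		n++;
		p += consumed;
	}

	return n;
}
```

With the argument from ex3, '(0,*)(1,0)', the sketch returns two tuples: port 0 with the wildcard queue and port 1 with queue 0.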
Known limitations:
1: Writing to pcap files stops once the folder containing the pcap files reaches its maximum size.
2: Because of the underlying pcap writing overhead, packets can only be captured at slow rates.
v3:
* added tcpdump design changes to programmers guide and proc_info sample
application guide.
* updated packet capture logic in eal_interrupts.c and proc_info/main.c
to capture packets from all queues of given port when queue is specified as '*'
in --tcpdump '(port,queue)' command line option.
* updated packet capture logic in eal_interrupts.c, to capture all packets when
"--src-ip-filter" option is not specified.
* added new "--single-tcpdump-file" command line option to the proc_info application to
enable packet capturing to single pcap file. If the option is not specified, ingress and
egress packets will be captured to separate RX & TX pcap files.
* removed blank line from EOF of rte_pmd_pcap_version.map.
v2:
* extended nb_rxq/nb_txq check to other fwd modes along with rx_only and tx_only.
* changed some of the global variables to static in proc_info/main.c and eal_interrupts.c.
* release notes updated.
Reshma Pattan (5):
app/test-pmd: fix nb_rxq and nb_txq checks
drivers/net/pcap: add public api to create pcap device
app/proc_info: add tcpdump support in secondary process
lib/librte_eal: add tcpdump support in primary process
doc: update doc for tcpdump feature
app/proc_info/main.c | 472 ++++++++++++++++++++++-
app/test-pmd/cmdline.c | 11 +-
app/test-pmd/parameters.c | 14 +-
app/test-pmd/testpmd.c | 28 ++-
doc/guides/prog_guide/env_abstraction_layer.rst | 43 ++-
doc/guides/rel_notes/release_16_04.rst | 6 +
doc/guides/sample_app_ug/proc_info.rst | 57 +++-
drivers/net/pcap/Makefile | 4 +-
drivers/net/pcap/rte_eth_pcap.c | 156 +++++++-
drivers/net/pcap/rte_eth_pcap.h | 87 +++++
drivers/net/pcap/rte_pmd_pcap_version.map | 7 +
lib/librte_eal/linuxapp/eal/Makefile | 5 +-
lib/librte_eal/linuxapp/eal/eal_interrupts.c | 422 ++++++++++++++++++++-
13 files changed, 1261 insertions(+), 51 deletions(-)
create mode 100644 drivers/net/pcap/rte_eth_pcap.h
--
1.7.4.1
* [dpdk-dev] [PATCH v3 1/5] app/test-pmd: fix nb_rxq and nb_txq checks
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
@ 2016-03-02 12:16 ` Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 2/5] drivers/net/pcap: add public api to create pcap device Reshma Pattan
` (4 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-03-02 12:16 UTC (permalink / raw)
To: dev
Made testpmd changes to validate nb_rxq/nb_txq zero
value changes of librte_ether.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
app/test-pmd/cmdline.c | 11 +++++------
app/test-pmd/parameters.c | 14 +++++++++-----
app/test-pmd/testpmd.c | 28 +++++++++++++++++++++++++---
3 files changed, 39 insertions(+), 14 deletions(-)
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 52e9f5f..f8e71a3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* Copyright(c) 2014 6WIND S.A.
* All rights reserved.
*
@@ -1163,17 +1163,16 @@ cmd_config_rx_tx_parsed(void *parsed_result,
printf("Please stop all ports first\n");
return;
}
-
if (!strcmp(res->name, "rxq")) {
- if (res->value <= 0) {
- printf("rxq %d invalid - must be > 0\n", res->value);
+ if (!res->value && !nb_txq) {
+ printf("Warning: Either rx or tx queues should be non zero\n");
return;
}
nb_rxq = res->value;
}
else if (!strcmp(res->name, "txq")) {
- if (res->value <= 0) {
- printf("txq %d invalid - must be > 0\n", res->value);
+ if (!res->value && !nb_rxq) {
+ printf("Warning: Either rx or tx queues should be non zero\n");
return;
}
nb_txq = res->value;
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 4b421c8..55572eb 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -810,22 +810,26 @@ launch_args_parse(int argc, char** argv)
rss_hf = ETH_RSS_UDP;
if (!strcmp(lgopts[opt_idx].name, "rxq")) {
n = atoi(optarg);
- if (n >= 1 && n <= (int) MAX_QUEUE_ID)
+ if (n >= 0 && n <= (int) MAX_QUEUE_ID)
nb_rxq = (queueid_t) n;
else
rte_exit(EXIT_FAILURE, "rxq %d invalid - must be"
- " >= 1 && <= %d\n", n,
+ " >= 0 && <= %d\n", n,
(int) MAX_QUEUE_ID);
}
if (!strcmp(lgopts[opt_idx].name, "txq")) {
n = atoi(optarg);
- if (n >= 1 && n <= (int) MAX_QUEUE_ID)
+ if (n >= 0 && n <= (int) MAX_QUEUE_ID)
nb_txq = (queueid_t) n;
else
rte_exit(EXIT_FAILURE, "txq %d invalid - must be"
- " >= 1 && <= %d\n", n,
+ " >= 0 && <= %d\n", n,
(int) MAX_QUEUE_ID);
}
+ if (!nb_rxq && !nb_txq) {
+ rte_exit(EXIT_FAILURE, "Either rx or tx queues should "
+ "be non-zero\n");
+ }
if (!strcmp(lgopts[opt_idx].name, "burst")) {
n = atoi(optarg);
if ((n >= 1) && (n <= MAX_PKT_BURST))
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 1319917..a523442 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -608,6 +608,7 @@ init_fwd_streams(void)
portid_t pid;
struct rte_port *port;
streamid_t sm_id, nb_fwd_streams_new;
+ queueid_t q;
/* set socket id according to numa or not */
FOREACH_PORT(pid, ports) {
@@ -643,7 +644,12 @@ init_fwd_streams(void)
}
}
- nb_fwd_streams_new = (streamid_t)(nb_ports * nb_rxq);
+ q = RTE_MAX(nb_rxq, nb_txq);
+ if (q == 0) {
+ printf("Fail:Cannot allocate fwd streams as number of queues is 0\n");
+ return -1;
+ }
+ nb_fwd_streams_new = (streamid_t)(nb_ports * q);
if (nb_fwd_streams_new == nb_fwd_streams)
return 0;
/* clear the old */
@@ -955,6 +961,19 @@ start_packet_forwarding(int with_tx_first)
portid_t pt_id;
streamid_t sm_id;
+ if (strcmp(cur_fwd_eng->fwd_mode_name, "rxonly") == 0 && !nb_rxq)
+ rte_exit(EXIT_FAILURE, "rxq are 0, cannot use rxonly fwd mode\n");
+
+ if (strcmp(cur_fwd_eng->fwd_mode_name, "txonly") == 0 && !nb_txq)
+ rte_exit(EXIT_FAILURE, "txq are 0, cannot use txonly fwd mode\n");
+
+ if ((strcmp(cur_fwd_eng->fwd_mode_name, "rxonly") != 0 &&
+ strcmp(cur_fwd_eng->fwd_mode_name, "txonly") != 0) &&
+ (!nb_rxq || !nb_txq))
+ rte_exit(EXIT_FAILURE,
+ "Either rxq or txq are 0, cannot use %s fwd mode\n",
+ cur_fwd_eng->fwd_mode_name);
+
if (all_ports_started() == 0) {
printf("Not all ports were started\n");
return;
@@ -2037,7 +2056,10 @@ main(int argc, char** argv)
if (argc > 1)
launch_args_parse(argc, argv);
- if (nb_rxq > nb_txq)
+ if (!nb_rxq && !nb_txq)
+ printf("Warning: Either rx or tx queues should be non-zero\n");
+
+ if (nb_rxq > 1 && nb_rxq > nb_txq)
printf("Warning: nb_rxq=%d enables RSS configuration, "
"but nb_txq=%d will prevent to fully test it.\n",
nb_rxq, nb_txq);
--
1.7.4.1
* [dpdk-dev] [PATCH v3 2/5] drivers/net/pcap: add public api to create pcap device
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
@ 2016-03-02 12:16 ` Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 3/5] app/proc_info: add tcpdump support in secondary process Reshma Pattan
` (3 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-03-02 12:16 UTC (permalink / raw)
To: dev
Added a new public API to create a pcap device from pcaps.
Added a new header file for the API declaration.
Added the new public API to the version map.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
drivers/net/pcap/Makefile | 4 +-
drivers/net/pcap/rte_eth_pcap.c | 156 +++++++++++++++++++++++++---
drivers/net/pcap/rte_eth_pcap.h | 87 ++++++++++++++++
drivers/net/pcap/rte_pmd_pcap_version.map | 7 ++
4 files changed, 235 insertions(+), 19 deletions(-)
create mode 100644 drivers/net/pcap/rte_eth_pcap.h
diff --git a/drivers/net/pcap/Makefile b/drivers/net/pcap/Makefile
index b41d8a2..8e424bf 100644
--- a/drivers/net/pcap/Makefile
+++ b/drivers/net/pcap/Makefile
@@ -1,6 +1,6 @@
# BSD LICENSE
#
-# Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+# Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
# Copyright(c) 2014 6WIND S.A.
# All rights reserved.
#
@@ -53,7 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += rte_eth_pcap.c
#
# Export include files
#
-SYMLINK-y-include +=
+SYMLINK-y-include += rte_eth_pcap.h
# this lib depends upon:
DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += lib/librte_mbuf
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index f9230eb..b7b9fd9 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* Copyright(c) 2014 6WIND S.A.
* All rights reserved.
*
@@ -44,7 +44,7 @@
#include <net/if.h>
-#include <pcap.h>
+#include "rte_eth_pcap.h"
#define RTE_ETH_PCAP_SNAPSHOT_LEN 65535
#define RTE_ETH_PCAP_SNAPLEN ETHER_MAX_JUMBO_FRAME_LEN
@@ -85,21 +85,6 @@ struct pcap_tx_queue {
char type[ETH_PCAP_ARG_MAXLEN];
};
-struct rx_pcaps {
- unsigned num_of_rx;
- pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
- const char *names[RTE_PMD_RING_MAX_RX_RINGS];
- const char *types[RTE_PMD_RING_MAX_RX_RINGS];
-};
-
-struct tx_pcaps {
- unsigned num_of_tx;
- pcap_dumper_t *dumpers[RTE_PMD_RING_MAX_TX_RINGS];
- pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
- const char *names[RTE_PMD_RING_MAX_RX_RINGS];
- const char *types[RTE_PMD_RING_MAX_RX_RINGS];
-};
-
struct pmd_internals {
struct pcap_rx_queue rx_queue[RTE_PMD_RING_MAX_RX_RINGS];
struct pcap_tx_queue tx_queue[RTE_PMD_RING_MAX_TX_RINGS];
@@ -875,6 +860,143 @@ error:
return -1;
}
+int
+rte_eth_from_pcapsndumpers(const char *name,
+ struct rx_pcaps *rx_queues,
+ const unsigned nb_rx_queues,
+ struct tx_pcaps *tx_queues,
+ const unsigned nb_tx_queues,
+ const unsigned numa_node)
+{
+ struct rte_eth_dev_data *data = NULL;
+ struct pmd_internals *internals = NULL;
+ struct rte_eth_dev *eth_dev = NULL;
+ unsigned i;
+ pcap_dumper_t *dumper;
+ pcap_t *pcap = NULL;
+
+ hz = rte_get_timer_hz();
+ /* do some parameter checking */
+ if (!rx_queues && nb_rx_queues > 0)
+ return -1;
+ if (!tx_queues && nb_tx_queues > 0)
+ return -1;
+
+ /* initialize rx and tx pcaps */
+ for (i = 0; i < nb_rx_queues; i++) {
+ if (open_single_rx_pcap(rx_queues->names[i], &pcap) < 0)
+ return -1;
+ rx_queues->pcaps[i] = pcap;
+ }
+ for (i = 0; i < nb_tx_queues; i++) {
+ if (open_single_tx_pcap(tx_queues->names[i], &dumper) < 0)
+ return -1;
+ tx_queues->dumpers[i] = dumper;
+ }
+
+ RTE_LOG(INFO, PMD, "Creating pcap-backed ethdev on numa socket %u\n", numa_node);
+
+ /* now do all data allocation - for eth_dev structure, dummy pci driver
+ * and internal (private) data
+ */
+ data = rte_zmalloc_socket(name, sizeof(*data), 0, numa_node);
+ if (!data)
+ goto error;
+
+ if (nb_rx_queues) {
+ data->rx_queues = rte_zmalloc_socket(name, sizeof(void *) * nb_rx_queues,
+ 0, numa_node);
+ if (!data->rx_queues)
+ goto error;
+ }
+
+ if (nb_tx_queues) {
+ data->tx_queues = rte_zmalloc_socket(name, sizeof(void *) * nb_tx_queues,
+ 0, numa_node);
+ if (!data->tx_queues)
+ goto error;
+ }
+
+ internals = rte_zmalloc_socket(name, sizeof(*internals), 0, numa_node);
+ if (!internals)
+ goto error;
+
+ /* reserve an ethdev entry */
+ eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_VIRTUAL);
+ if (!eth_dev)
+ goto error;
+
+ /* check length of device name */
+ if ((strlen(eth_dev->data->name) + 1) > sizeof(data->name))
+ goto error;
+
+ /* now put it all together
+ * - store queue data in internals,
+ * - store numa_node info in eth_dev_data
+ * - point eth_dev_data to internals
+ * - and point eth_dev structure to new eth_dev_data structure
+ */
+ internals->nb_rx_queues = nb_rx_queues;
+ internals->nb_tx_queues = nb_tx_queues;
+ internals->if_index = if_nametoindex(name);
+
+ data->dev_private = internals;
+ data->port_id = eth_dev->data->port_id;
+ strncpy(data->name, eth_dev->data->name, strlen(eth_dev->data->name));
+ data->nb_rx_queues = (uint16_t)nb_rx_queues;
+ data->nb_tx_queues = (uint16_t)nb_tx_queues;
+ data->dev_link = pmd_link;
+ data->mac_addrs = &eth_addr;
+
+ strncpy(data->name,
+ eth_dev->data->name, strlen(eth_dev->data->name));
+ eth_dev->data = data;
+ eth_dev->driver = NULL;
+ eth_dev->dev_ops = &ops;
+ eth_dev->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
+ eth_dev->data->kdrv = RTE_KDRV_NONE;
+ eth_dev->data->drv_name = drivername;
+ eth_dev->data->numa_node = numa_node;
+
+ for (i = 0; i < nb_rx_queues; i++) {
+ internals->rx_queue[i].pcap = rx_queues->pcaps[i];
+ snprintf(internals->rx_queue[i].name,
+ sizeof(internals->rx_queue[i].name), "%s",
+ rx_queues->names[i]);
+ snprintf(internals->rx_queue[i].type,
+ sizeof(internals->rx_queue[i].type), "%s",
+ rx_queues->types[i]);
+ }
+ for (i = 0; i < nb_tx_queues; i++) {
+ internals->tx_queue[i].dumper = tx_queues->dumpers[i];
+ snprintf(internals->tx_queue[i].name,
+ sizeof(internals->tx_queue[i].name), "%s",
+ tx_queues->names[i]);
+ snprintf(internals->tx_queue[i].type,
+ sizeof(internals->tx_queue[i].type), "%s",
+ tx_queues->types[i]);
+ }
+
+ /* using multiple pcaps/interfaces */
+ internals->single_iface = 0;
+
+ /* finally assign rx and tx ops */
+ eth_dev->rx_pkt_burst = eth_pcap_rx;
+ eth_dev->tx_pkt_burst = eth_pcap_tx_dumper;
+
+ return data->port_id;
+
+error:
+ if (data) {
+ rte_free(data->rx_queues);
+ rte_free(data->tx_queues);
+ }
+ rte_free(data);
+ rte_free(internals);
+
+ return -1;
+}
+
static int
rte_eth_from_pcaps_n_dumpers(const char *name,
struct rx_pcaps *rx_queues,
diff --git a/drivers/net/pcap/rte_eth_pcap.h b/drivers/net/pcap/rte_eth_pcap.h
new file mode 100644
index 0000000..5bcfb5d
--- /dev/null
+++ b/drivers/net/pcap/rte_eth_pcap.h
@@ -0,0 +1,87 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2016 Intel Corporation. All rights reserved.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETH_PCAP_H_
+#define _RTE_ETH_PCAP_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <pcap.h>
+
+struct rx_pcaps {
+ unsigned num_of_rx;
+ pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *names[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *types[RTE_PMD_RING_MAX_RX_RINGS];
+};
+
+struct tx_pcaps {
+ unsigned num_of_tx;
+ pcap_dumper_t *dumpers[RTE_PMD_RING_MAX_TX_RINGS];
+ pcap_t *pcaps[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *names[RTE_PMD_RING_MAX_RX_RINGS];
+ const char *types[RTE_PMD_RING_MAX_RX_RINGS];
+};
+
+/**
+ * Create a new ethdev port from pcaps
+ *
+ * @param name
+ * name to be given to the new ethdev port
+ * @param rx_queues
+ * pointer to array of pcaps to be used as RX queues
+ * @param nb_rx_queues
+ * number of elements in the rx_queues array
+ * @param tx_queues
+ * pointer to array of pcaps to be used as TX queues
+ * @param nb_tx_queues
+ * number of elements in the tx_queues array
+ * @param numa_node
+ * the numa node on which the memory for this port is to be allocated
+ * @return
+ * the port number of the newly created the ethdev or -1 on error.
+ */
+int rte_eth_from_pcapsndumpers(const char *name,
+ struct rx_pcaps *rx_queues,
+ const unsigned nb_rx_queues,
+ struct tx_pcaps *tx_queues,
+ const unsigned nb_tx_queues,
+ const unsigned numa_node);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/drivers/net/pcap/rte_pmd_pcap_version.map b/drivers/net/pcap/rte_pmd_pcap_version.map
index ef35398..bffc35b 100644
--- a/drivers/net/pcap/rte_pmd_pcap_version.map
+++ b/drivers/net/pcap/rte_pmd_pcap_version.map
@@ -2,3 +2,10 @@ DPDK_2.0 {
local: *;
};
+
+DPDK_16.04 {
+ global:
+
+ rte_eth_from_pcapsndumpers;
+
+} DPDK_2.0;
--
1.7.4.1
* [dpdk-dev] [PATCH v3 3/5] app/proc_info: add tcpdump support in secondary process
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 2/5] drivers/net/pcap: add public api to create pcap device Reshma Pattan
@ 2016-03-02 12:16 ` Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 4/5] lib/librte_eal: add tcpdump support in primary process Reshma Pattan
` (2 subsequent siblings)
5 siblings, 0 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-03-02 12:16 UTC (permalink / raw)
To: dev
Added the following optional command line options for tcpdump support.
1)--tcpdump '(port,queue)' ==> port id and queue id; the queue can be a valid
queue number or '*'. Specify a queue id to capture packets
on that queue of the given port, or '*' to capture packets
from all queues of the given port.
2)--src-ip-filter "A.B.C.D" ==> src ip used to filter the
packets. If the src ip filter option is not passed, all packets are
captured.
3)--single-tcpdump-file ==> If this option is passed, ingress and egress
packets are captured to a single pcap file. Otherwise they are captured
to two separate RX & TX pcap files.
Added pcap device creation and writing of packets to the pcap device
for tcpdump.
Added socket functionality to communicate with the primary process.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
app/proc_info/main.c | 472 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 469 insertions(+), 3 deletions(-)
diff --git a/app/proc_info/main.c b/app/proc_info/main.c
index 341176d..96d4b6e 100644
--- a/app/proc_info/main.c
+++ b/app/proc_info/main.c
@@ -1,7 +1,7 @@
/*
* BSD LICENSE
*
- * Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -38,8 +38,25 @@
#include <stdarg.h>
#include <inttypes.h>
#include <sys/queue.h>
+#include <sys/socket.h>
#include <stdlib.h>
#include <getopt.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <arpa/inet.h>
+
+/* sys/un.h with __USE_MISC uses strlen, which is unsafe */
+#ifdef __USE_MISC
+#define REMOVED_USE_MISC
+#undef __USE_MISC
+#endif
+#include <sys/un.h>
+/* make sure we redefine __USE_MISC only if it was previously undefined */
+#ifdef REMOVED_USE_MISC
+#define __USE_MISC
+#undef REMOVED_USE_MISC
+#endif
#include <rte_eal.h>
#include <rte_common.h>
@@ -57,10 +74,23 @@
#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_string_fns.h>
+#include <rte_errno.h>
+
+#ifdef RTE_LIBRTE_PMD_PCAP
+#include <rte_eth_pcap.h>
+#endif
/* Maximum long option length for option parsing. */
#define MAX_LONG_OPT_SZ 64
#define RTE_LOGTYPE_APP RTE_LOGTYPE_USER1
+#define APP_ARG_TCPDUMP_MAX_TUPLES 54
+#define TCPDUMP_SOCKET_PATH "%s/tcpdump_mp_socket"
+#define CMSGLEN CMSG_LEN(sizeof(int))
+#define TX_DESC_PER_QUEUE 512
+#define RX_DESC_PER_QUEUE 128
+#define BURST_SIZE 32
+#define MBUF_PER_POOL 65535
+#define MBUF_POOL_CACHE_SIZE 250
/**< mask of enabled ports */
static uint32_t enabled_port_mask;
@@ -75,13 +105,63 @@ static uint32_t reset_xstats;
/**< Enable memory info. */
static uint32_t mem_info;
+enum tcpdump_msg_type {
+ REMOVE_RXTX_CBS = 1,
+ REGISTER_RXTX_CBS = 2
+};
+
+enum rx_tx_type {
+ RX = 1,
+ TX = 2,
+ RX_TX = 3,
+ RX_TX_TYPES = 2
+};
+
+/**< src ip filter for tcpdump. */
+static uint32_t src_ip_filter;
+/**< socket for connecting to primary. */
+static int socket_fd;
+/**< vdev port ids. */
+static int pcap_vdev_port_id[RX_TX_TYPES];
+volatile uint8_t quit_signal;
+/**< Enable tcpdump feature. */
+bool is_tcpdump_enabled;
+/** Enable single tcpdump pcap file */
+bool single_tcpdump_file;
+
+static volatile struct tcpdump_app_stats {
+ struct {
+ uint64_t dequeue_pkts;
+ uint64_t tx_pkts;
+ uint64_t freed_pkts;
+ } in __rte_cache_aligned;
+ struct {
+ uint64_t dequeue_pkts;
+ uint64_t tx_pkts;
+ uint64_t freed_pkts;
+ } out __rte_cache_aligned;
+} tcpdump_app_stats __rte_cache_aligned;
+
+struct tcpdump_port_queue_tuples {
+ int num_pq_tuples;
+ char *port_id[APP_ARG_TCPDUMP_MAX_TUPLES];
+ char *queue_id[APP_ARG_TCPDUMP_MAX_TUPLES];
+} __rte_cache_aligned;
+
+static struct tcpdump_port_queue_tuples tcpdump_pq_t;
+
/**< display usage */
+
static void
proc_info_usage(const char *prgname)
{
printf("%s [EAL options] -- -p PORTMASK\n"
" -m to display DPDK memory zones, segments and TAILQ information\n"
" -p PORTMASK: hexadecimal bitmask of ports to retrieve stats for\n"
+ " --tcpdump (port,queue): port and queue info for capturing packets "
+ "for tcpdump\n"
+ " --src-ip-filter \"A.B.C.D\": src ip for tcpdump filtering\n"
+ " --single-tcpdump-file: capture packets to single pcap file\n"
" --stats: to display port statistics, enabled by default\n"
" --xstats: to display extended port statistics, disabled by "
"default\n"
@@ -116,14 +196,74 @@ parse_portmask(const char *portmask)
}
+static int
+parse_tcpdump(const char *q_arg)
+{
+ char s[256];
+ const char *p, *p0 = q_arg;
+ unsigned size;
+
+ enum fieldnames {
+ FLD_PORT = 0,
+ FLD_QUEUE,
+ _NUM_FLD
+ };
+
+ char *str_fld[_NUM_FLD];
+
+ while ((p = strchr(p0, '(')) != NULL) {
+ ++p;
+ p0 = strchr(p, ')');
+ if (p0 == NULL)
+ return -1;
+
+ size = p0 - p;
+ if (size >= sizeof(s))
+ return -1;
+
+ snprintf(s, sizeof(s), "%.*s", size, p);
+ if (rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',') != _NUM_FLD)
+ return -1;
+
+ if (tcpdump_pq_t.num_pq_tuples >= APP_ARG_TCPDUMP_MAX_TUPLES) {
+ printf("exceeded max number of port params: %"PRIu32"\n",
+ tcpdump_pq_t.num_pq_tuples);
+ return -1;
+ }
+ tcpdump_pq_t.port_id[tcpdump_pq_t.num_pq_tuples] =
+ rte_malloc(NULL, strlen(str_fld[FLD_PORT]), 0);
+ tcpdump_pq_t.queue_id[tcpdump_pq_t.num_pq_tuples] =
+ rte_malloc(NULL, strlen(str_fld[FLD_QUEUE]), 0);
+ strncpy(tcpdump_pq_t.port_id[tcpdump_pq_t.num_pq_tuples],
+ str_fld[FLD_PORT], strlen(str_fld[FLD_PORT]));
+ strncpy(tcpdump_pq_t.queue_id[tcpdump_pq_t.num_pq_tuples],
+ str_fld[FLD_QUEUE], strlen(str_fld[FLD_QUEUE]));
+ tcpdump_pq_t.num_pq_tuples++;
+ }
+
+ return 0;
+}
+
+static int
+parse_ip(const char *q_arg)
+{
+ if (!inet_pton(AF_INET, q_arg, &src_ip_filter))
+ return 1;
+
+ return 0;
+}
+
/* Parse the argument given in the command line of the application */
static int
proc_info_parse_args(int argc, char **argv)
{
- int opt;
+ int opt, ret;
int option_index;
char *prgname = argv[0];
static struct option long_option[] = {
+ {"tcpdump", 1, 0, 0},
+ {"src-ip-filter", 1, 0, 0},
+ {"single-tcpdump-file", 0, NULL, 0},
{"stats", 0, NULL, 0},
{"stats-reset", 0, NULL, 0},
{"xstats", 0, NULL, 0},
@@ -151,6 +291,33 @@ proc_info_parse_args(int argc, char **argv)
mem_info = 1;
break;
case 0:
+ if (!strncmp(long_option[option_index].name, "tcpdump",
+ MAX_LONG_OPT_SZ)) {
+ ret = parse_tcpdump(optarg);
+ if (ret) {
+ printf("invalid tcpdump\n");
+ proc_info_usage(prgname);
+ return -1;
+ }
+ is_tcpdump_enabled = true;
+ }
+
+ if (!strncmp(long_option[option_index].name, "src-ip-filter",
+ MAX_LONG_OPT_SZ)) {
+ ret = parse_ip(optarg);
+ if (ret) {
+ printf("invalid src-ip-filter\n");
+ proc_info_usage(prgname);
+ return -1;
+ }
+ }
+
+ /* enable single pcap file */
+ if (!strncmp(long_option[option_index].name,
+ "single-tcpdump-file",
+ MAX_LONG_OPT_SZ))
+ single_tcpdump_file = true;
+
/* Print stats */
if (!strncmp(long_option[option_index].name, "stats",
MAX_LONG_OPT_SZ))
@@ -285,16 +452,225 @@ nic_xstats_clear(uint8_t port_id)
printf("\n NIC extended statistics for port %d cleared\n", port_id);
}
+/* get socket path (/var/run if root, $HOME otherwise) */
+static void
+tcpdump_get_socket_path(char *buffer, int bufsz)
+{
+ const char *dir = "/var/run/tcpdump_socket";
+ const char *home_dir = getenv("HOME/tcpdump_socket");
+
+ if (getuid() != 0 && home_dir != NULL)
+ dir = home_dir;
+ /* use current prefix as file path */
+ snprintf(buffer, bufsz, TCPDUMP_SOCKET_PATH, dir);
+}
+
+static int
+tcpdump_connect_to_primary(void)
+{
+ struct sockaddr_un addr;
+ socklen_t sockaddr_len;
+
+ /* set up a socket */
+ socket_fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+ if (socket_fd < 0) {
+ RTE_LOG(ERR, EAL, "Failed to create socket!\n");
+ return -1;
+ }
+
+ tcpdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path));
+ addr.sun_family = AF_UNIX;
+
+ sockaddr_len = sizeof(struct sockaddr_un);
+
+ if (connect(socket_fd, (struct sockaddr *) &addr, sockaddr_len) == 0)
+ return socket_fd;
+
+ /* if connect failed */
+ close(socket_fd);
+ return -1;
+}
+
+/* send a request, return -1 on error */
+static int
+tcpdump_send_request(int socket, enum tcpdump_msg_type type)
+{
+ char buffer[256];
+ struct msghdr reg_cb_msg;
+ struct iovec msg[3];
+ int ret, wc, buf, i, n = 0;
+
+ buf = type;
+ for (i = 0; i < tcpdump_pq_t.num_pq_tuples; i++) {
+ wc = snprintf(buffer + n, sizeof(buffer) - n, "%s,%s,",
+ tcpdump_pq_t.port_id[i], tcpdump_pq_t.queue_id[i]);
+ n += wc;
+ }
+
+ memset(msg, 0, sizeof(msg));
+ msg[0].iov_base = (int *) &buf;
+ msg[0].iov_len = sizeof(int);
+ msg[1].iov_base = (char *)buffer;
+ msg[1].iov_len = sizeof(buffer);
+ msg[2].iov_base = (char *) &src_ip_filter;
+ msg[2].iov_len = sizeof(src_ip_filter);
+
+ memset(&reg_cb_msg, 0, sizeof(reg_cb_msg));
+ reg_cb_msg.msg_iov = msg;
+ reg_cb_msg.msg_iovlen = 3;
+ ret = sendmsg(socket, &reg_cb_msg, 0);
+ if (ret < 0)
+ return -1;
+ return 0;
+}
+
+static void
+int_handler(int sig_num)
+{
+ int i;
+
+ /* connect to primary process using AF_UNIX socket */
+ socket_fd = tcpdump_connect_to_primary();
+ if (socket_fd < 0)
+ printf("cannot connect to primary process for RX/TX CBs removal!\n");
+
+ /* send request to remove rx/tx callbacks */
+ if (tcpdump_send_request(socket_fd, REMOVE_RXTX_CBS) < 0) {
+ printf("cannot send tcpdump remove rxtx cbs request!\n");
+ close(socket_fd);
+ }
+
+ /* close tcpdump socket fd */
+ close(socket_fd);
+
+ /* free port and queue tuples */
+ for (i = 0; i < tcpdump_pq_t.num_pq_tuples; i++) {
+ rte_free(tcpdump_pq_t.port_id[i]);
+ rte_free(tcpdump_pq_t.queue_id[i]);
+ }
+ printf("Exiting on signal %d\n", sig_num);
+ quit_signal = 1;
+}
+
+static inline int
+configure_pcap_vdev(uint8_t port_id)
+{
+ struct ether_addr addr;
+ const uint16_t txRings = 1;
+ const uint8_t nb_ports = rte_eth_dev_count();
+ int ret;
+ uint16_t q;
+
+ if (port_id > nb_ports)
+ return -1;
+
+ for (q = 0; q < txRings; q++) {
+ ret = rte_eth_tx_queue_setup(port_id, q, TX_DESC_PER_QUEUE,
+ rte_eth_dev_socket_id(port_id), NULL);
+ if (ret < 0) {
+ rte_exit(EXIT_FAILURE, "queue setup failed\n");
+ return ret;
+ }
+ }
+
+ ret = rte_eth_dev_start(port_id);
+ if (ret < 0)
+ return ret;
+
+ rte_eth_macaddr_get(port_id, &addr);
+ printf("Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
+ " %02"PRIx8" %02"PRIx8" %02"PRIx8"\n",
+ (unsigned)port_id,
+ addr.addr_bytes[0], addr.addr_bytes[1],
+ addr.addr_bytes[2], addr.addr_bytes[3],
+ addr.addr_bytes[4], addr.addr_bytes[5]);
+
+ rte_eth_promiscuous_enable(port_id);
+
+ return 0;
+}
+
+static int
+create_pcap_pmd_vdev(enum rx_tx_type type) {
+ char pcap_vdev_name[32];
+ char pcap_filename[32];
+#ifdef RTE_LIBRTE_PMD_PCAP
+ struct rx_pcaps rxpcap;
+ struct tx_pcaps txpcap;
+#endif
+ int port_id;
+
+ if (type == RX) {
+ snprintf(pcap_vdev_name, sizeof(pcap_vdev_name),
+ "eth_pcap_tcpdump_%s", "RX");
+ snprintf(pcap_filename, sizeof(pcap_filename),
+ "/tmp/%s_pcap.pcap", "RX");
+ } else if (type == TX) {
+ snprintf(pcap_vdev_name, sizeof(pcap_vdev_name),
+ "eth_pcap_tcpdump_%s", "TX");
+ snprintf(pcap_filename, sizeof(pcap_filename),
+ "/tmp/%s_pcap.pcap", "TX");
+ } else if (type == RX_TX) {
+ snprintf(pcap_vdev_name, sizeof(pcap_vdev_name),
+ "eth_pcap_tcpdump_%s", "RX_TX");
+ snprintf(pcap_filename, sizeof(pcap_filename),
+ "/tmp/%s_pcap.pcap", "RX_TX");
+ }
+
+#ifdef RTE_LIBRTE_PMD_PCAP
+ rxpcap.names[0] = "";
+ rxpcap.types[0] = "";
+ rxpcap.num_of_rx = 0;
+ txpcap.names[0] = pcap_filename;
+ txpcap.types[0] = "tx_pcap";
+ txpcap.num_of_tx = 1;
+
+ port_id = rte_eth_from_pcapsndumpers(pcap_vdev_name,
+ &rxpcap, rxpcap.num_of_rx,
+ &txpcap, txpcap.num_of_tx, rte_socket_id());
+#else
+ port_id = -1;
+#endif
+ if (port_id < 0)
+ rte_exit(EXIT_FAILURE, "Failed to create pcap_vdev\n");
+
+ return port_id;
+}
+
+static void
+print_tcpdump_stats(void)
+{
+ printf("##### TCPDUMP DEBUG STATS #####\n");
+ printf(" - Input packets dequeued: %"PRIu64"\n",
+ tcpdump_app_stats.in.dequeue_pkts);
+ printf(" - Input packets transmitted to pcap: %"PRIu64"\n",
+ tcpdump_app_stats.in.tx_pkts);
+ printf(" - Input packets freed: %"PRIu64"\n",
+ tcpdump_app_stats.in.freed_pkts);
+ printf(" - Output packets dequeued: %"PRIu64"\n",
+ tcpdump_app_stats.out.dequeue_pkts);
+ printf(" - Output packets transmitted to pcap: %"PRIu64"\n",
+ tcpdump_app_stats.out.tx_pkts);
+ printf(" - Output packets freed: %"PRIu64"\n",
+ tcpdump_app_stats.out.freed_pkts);
+ printf("################################\n");
+}
+
int
main(int argc, char **argv)
{
int ret;
int i;
+ int vdev_port;
char c_flag[] = "-c1";
char n_flag[] = "-n4";
char mp_flag[] = "--proc-type=secondary";
char *argp[argc + 3];
uint8_t nb_ports;
+ struct rte_ring *rx_ring, *tx_ring;
+
+ /* catch ctrl-c so we can print on exit */
+ signal(SIGINT, int_handler);
argp[0] = argv[0];
argp[1] = c_flag;
@@ -327,7 +703,6 @@ main(int argc, char **argv)
if (nb_ports == 0)
rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
-
if (nb_ports > RTE_MAX_ETHPORTS)
nb_ports = RTE_MAX_ETHPORTS;
@@ -348,5 +723,96 @@ main(int argc, char **argv)
}
}
+ if (is_tcpdump_enabled == true) {
+ /* remove existing pcap files from /tmp folder */
+ remove("/tmp/RX_TX_pcap.pcap");
+ remove("/tmp/RX_pcap.pcap");
+ remove("/tmp/TX_pcap.pcap");
+ if (single_tcpdump_file == true) {
+ vdev_port = create_pcap_pmd_vdev(RX_TX);
+ pcap_vdev_port_id[0] = vdev_port;
+ pcap_vdev_port_id[1] = vdev_port;
+ configure_pcap_vdev(pcap_vdev_port_id[0]);
+ } else {
+ /* create pcap virtual devices for rx and tx */
+ pcap_vdev_port_id[0] = create_pcap_pmd_vdev(RX);
+ configure_pcap_vdev(pcap_vdev_port_id[0]);
+
+ pcap_vdev_port_id[1] = create_pcap_pmd_vdev(TX);
+ configure_pcap_vdev(pcap_vdev_port_id[1]);
+ }
+
+ /* connect to primary process using AF_UNIX socket */
+ socket_fd = tcpdump_connect_to_primary();
+ if (socket_fd < 0) {
+ printf("cannot connect to primary process!\n");
+ return -1;
+ }
+
+ if (tcpdump_send_request(socket_fd, REGISTER_RXTX_CBS) < 0) {
+ printf("cannot send tcpdump register rxtx cbs request!\n");
+ close(socket_fd);
+ return -1;
+ }
+
+ while (1) {
+ rx_ring = rte_ring_lookup("prim_to_sec_rx");
+ tx_ring = rte_ring_lookup("prim_to_sec_tx");
+ if (rx_ring != NULL && tx_ring != NULL)
+ break;
+ }
+
+ while (!quit_signal) {
+ /* write input packets of port to pcap file for tcpdump */
+ struct rte_mbuf *rx_bufs[BURST_SIZE];
+
+ /* first dequeue packets from ring of primary process */
+ const uint16_t nb_in_deq = rte_ring_dequeue_burst(rx_ring,
+ (void *)rx_bufs, BURST_SIZE);
+ tcpdump_app_stats.in.dequeue_pkts += nb_in_deq;
+
+ if (nb_in_deq) {
+ /* then send them to the pcap file */
+ uint16_t nb_in_txd = rte_eth_tx_burst(
+ pcap_vdev_port_id[0],
+ 0, rx_bufs, nb_in_deq);
+ tcpdump_app_stats.in.tx_pkts += nb_in_txd;
+
+ if (unlikely(nb_in_txd < nb_in_deq)) {
+ do {
+ rte_pktmbuf_free(rx_bufs[nb_in_txd]);
+ tcpdump_app_stats.in.freed_pkts++;
+ } while (++nb_in_txd < nb_in_deq);
+ }
+
+ }
+
+ /* write output packets of port to pcap file for tcpdump */
+ struct rte_mbuf *tx_bufs[BURST_SIZE];
+
+ /* first dequeue from ring of primary process */
+ const uint16_t nb_out_deq = rte_ring_dequeue_burst(tx_ring,
+ (void *)tx_bufs, BURST_SIZE);
+ tcpdump_app_stats.out.dequeue_pkts += nb_out_deq;
+
+ if (nb_out_deq) {
+ /* then send them to the pcap file */
+ uint16_t nb_out_txd = rte_eth_tx_burst(
+ pcap_vdev_port_id[1],
+ 0, tx_bufs, nb_out_deq);
+ tcpdump_app_stats.out.tx_pkts += nb_out_txd;
+ if (unlikely(nb_out_txd < nb_out_deq)) {
+ do {
+ rte_pktmbuf_free(tx_bufs[nb_out_txd]);
+ tcpdump_app_stats.out.freed_pkts++;
+ } while (++nb_out_txd < nb_out_deq);
+
+ }
+ }
+ }
+
+ print_tcpdump_stats();
+
+ }
return 0;
}
--
1.7.4.1
* [dpdk-dev] [PATCH v3 4/5] lib/librte_eal: add tcpdump support in primary process
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
` (2 preceding siblings ...)
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 3/5] app/proc_info: add tcpdump support in secondary process Reshma Pattan
@ 2016-03-02 12:16 ` Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 5/5] doc: update doc for tcpdump feature Reshma Pattan
2016-03-09 0:33 ` [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for tcpdump Thomas Monjalon
5 siblings, 0 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-03-02 12:16 UTC (permalink / raw)
To: dev
Added tcpdump functionality to eal interrupt thread.
Enhanced interrupt thread to support tcpdump socket
and message processing from secondary.
Created new mempool and rings to handle packets of tcpdump.
Added rte_eth_rxtx_callbacks for ingress/egress packets processing
for tcpdump.
Added functionality to remove registered rte_eth_rxtx_callbacks
once secondary process is terminated.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
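Note (not part of the commit): the register helpers in this patch index
rx_cbs[]/tx_cbs[] with a static counter that is not checked against MAX_CBS,
and the remove helpers do not clear an entry once its callback is removed.
Below is a minimal sketch of a bounded slot table that covers both cases. The
names cb_slot, cb_table_add and cb_table_remove are hypothetical, and a plain
void * stands in for the callback handle so the sketch builds without DPDK
headers.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CBS 54 /* same bound as the patch; the slot layout is illustrative */

struct cb_slot {
	int in_use;
	uint8_t port;
	uint16_t queue;
	void *cb; /* stands in for struct rte_eth_rxtx_callback * */
};

/* Store a callback handle; return the slot index, or -1 if the table is full. */
static int
cb_table_add(struct cb_slot *tbl, uint8_t port, uint16_t queue, void *cb)
{
	int i;

	for (i = 0; i < MAX_CBS; i++) {
		if (!tbl[i].in_use) {
			tbl[i].in_use = 1;
			tbl[i].port = port;
			tbl[i].queue = queue;
			tbl[i].cb = cb;
			return i;
		}
	}
	return -1;
}

/* Release every slot matching (port, queue); return how many were released. */
static int
cb_table_remove(struct cb_slot *tbl, uint8_t port, uint16_t queue)
{
	int i, removed = 0;

	for (i = 0; i < MAX_CBS; i++) {
		if (tbl[i].in_use && tbl[i].port == port &&
				tbl[i].queue == queue) {
			/* real code would remove the ethdev callback here */
			tbl[i].in_use = 0;
			tbl[i].cb = NULL;
			removed++;
		}
	}
	return removed;
}
```

With an explicit in_use flag, a zero-initialised slot can no longer be mistaken
for a callback registered on port 0, queue 0, and a freed slot can be reused by
a later "register" message.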
lib/librte_eal/linuxapp/eal/Makefile | 5 +-
lib/librte_eal/linuxapp/eal/eal_interrupts.c | 422 +++++++++++++++++++++++++-
2 files changed, 424 insertions(+), 3 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 6e26250..425152c 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -1,6 +1,6 @@
# BSD LICENSE
#
-# Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+# Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
@@ -47,6 +47,9 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
CFLAGS += -I$(RTE_SDK)/lib/librte_ring
CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem
+CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
+CFLAGS += -I$(RTE_SDK)/lib/librte_ether
+CFLAGS += -I$(RTE_SDK)/lib/librte_net
CFLAGS += $(WERROR_FLAGS) -O3
# specific to linuxapp exec-env
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 06b26a9..3b82b7b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -1,7 +1,7 @@
/*-
* BSD LICENSE
*
- * Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ * Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
@@ -45,7 +45,11 @@
#include <sys/signalfd.h>
#include <sys/ioctl.h>
#include <sys/eventfd.h>
+#include <sys/socket.h>
+#include <sys/un.h>
#include <assert.h>
+#include <arpa/inet.h>
+#include <sys/stat.h>
#include <rte_common.h>
#include <rte_interrupts.h>
@@ -65,15 +69,40 @@
#include <rte_malloc.h>
#include <rte_errno.h>
#include <rte_spinlock.h>
+#include <rte_memcpy.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_ether.h>
+#include <rte_ip.h>
#include "eal_private.h"
#include "eal_vfio.h"
#include "eal_thread.h"
+#include "eal_internal_cfg.h"
#define EAL_INTR_EPOLL_WAIT_FOREVER (-1)
#define NB_OTHER_INTR 1
+#define TCPDUMP_SOCKET_PATH "%s/tcpdump_mp_socket"
+#define TCPDUMP_SOCKET_ERR 0xFF
+#define TCPDUMP_REQ 0x1
+#define RING_SIZE 1024
+#define BURST_SIZE 32
+#define NUM_MBUFS 65536
+#define MBUF_CACHE_SIZE 250
+#define MAX_CBS 54
+#define PORT_QUEUE_SIZE 5
static RTE_DEFINE_PER_LCORE(int, _epfd) = -1; /**< epoll fd per thread */
+static uint32_t src_ip_filter;
+static int tcpdump_socket_fd;
+struct rte_ring *prim_to_sec_rx;
+struct rte_ring *prim_to_sec_tx;
+struct rte_mempool *tcpdump_pktmbuf_pool;
+static struct rxtx_cbs {
+ uint8_t port;
+ uint16_t queue;
+ struct rte_eth_rxtx_callback *cb;
+} rx_cbs[MAX_CBS], tx_cbs[MAX_CBS];
/**
* union for pipe fds.
@@ -644,6 +673,306 @@ rte_intr_disable(struct rte_intr_handle *intr_handle)
return 0;
}
+static inline void
+tcpdump_pktmbuf_duplicate(struct rte_mbuf *mi, struct rte_mbuf *m)
+{
+
+ mi->data_len = m->data_len;
+ mi->port = m->port;
+ mi->vlan_tci = m->vlan_tci;
+ mi->vlan_tci_outer = m->vlan_tci_outer;
+ mi->tx_offload = m->tx_offload;
+ mi->hash = m->hash;
+
+ mi->pkt_len = mi->data_len;
+ mi->ol_flags = m->ol_flags;
+ mi->packet_type = m->packet_type;
+
+ rte_memcpy(rte_pktmbuf_mtod(mi, void *),
+ rte_pktmbuf_mtod(m, void *),
+ rte_pktmbuf_data_len(mi));
+
+ __rte_mbuf_sanity_check(mi, 1);
+ __rte_mbuf_sanity_check(m, 0);
+}
+
+static inline struct rte_mbuf *
+tcpdump_pktmbuf_clone(struct rte_mbuf *md, struct rte_mempool *mp)
+{
+ struct rte_mbuf *mc, *mi, **prev;
+ uint32_t pktlen;
+ uint8_t nseg;
+
+ mc = rte_pktmbuf_alloc(mp);
+ if (unlikely(mc == NULL))
+ return NULL;
+
+ mi = mc;
+ prev = &mi->next;
+ pktlen = md->pkt_len;
+ nseg = 0;
+
+ do {
+ nseg++;
+ tcpdump_pktmbuf_duplicate(mi, md);
+ *prev = mi;
+ prev = &mi->next;
+ } while ((md = md->next) != NULL &&
+ (mi = rte_pktmbuf_alloc(mp)) != NULL);
+
+ *prev = NULL;
+ mc->nb_segs = nseg;
+ mc->pkt_len = pktlen;
+
+ /* Allocation of new indirect segment failed */
+ if (unlikely(mi == NULL)) {
+ rte_pktmbuf_free(mc);
+ return NULL;
+ }
+
+ __rte_mbuf_sanity_check(mc, 1);
+ return mc;
+
+}
+
+static int
+compare_filter(struct rte_mbuf *pkt)
+{
+ struct ipv4_hdr *pkt_hdr = rte_pktmbuf_mtod_offset(pkt, struct ipv4_hdr *,
+ sizeof(struct ether_hdr));
+ if (!src_ip_filter)
+ return 0;
+ else if (pkt_hdr->src_addr != src_ip_filter)
+ return -1;
+
+ return 0;
+}
+
+static uint16_t
+tcpdump_rx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+ struct rte_mbuf **pkts, uint16_t nb_pkts,
+ uint16_t max_pkts __rte_unused, void *_ __rte_unused)
+{
+ unsigned i;
+ uint16_t filtered_pkts = 0;
+ int ring_enq = 0;
+ struct rte_mbuf *dup_bufs[nb_pkts];
+
+ for (i = 0; i < nb_pkts; i++) {
+ if (compare_filter(pkts[i]) == 0)
+ dup_bufs[filtered_pkts++] = tcpdump_pktmbuf_clone(pkts[i],
+ tcpdump_pktmbuf_pool);
+ }
+
+ ring_enq = rte_ring_enqueue_burst(prim_to_sec_rx, (void *)dup_bufs,
+ filtered_pkts);
+ if (unlikely(ring_enq < filtered_pkts)) {
+ do {
+ rte_pktmbuf_free(dup_bufs[ring_enq]);
+ } while (++ring_enq < filtered_pkts);
+ }
+ return nb_pkts;
+}
+
+static uint16_t
+tcpdump_tx(uint8_t port __rte_unused, uint16_t qidx __rte_unused,
+ struct rte_mbuf **pkts, uint16_t nb_pkts,
+ void *_ __rte_unused)
+{
+ int i;
+ int ring_enq = 0;
+ uint16_t filtered_pkts = 0;
+ struct rte_mbuf *dup_bufs[nb_pkts];
+
+ /*
+ * Increment reference count of mbuf to avoid accidental return of mbuf
+ * to pool while tcpdump processing is still on.
+ */
+ for (i = 0; i < nb_pkts; i++) {
+ if (compare_filter(pkts[i]) == 0) {
+ rte_pktmbuf_refcnt_update(pkts[i], 1);
+ dup_bufs[filtered_pkts++] = pkts[i];
+ }
+ }
+
+ ring_enq = rte_ring_enqueue_burst(prim_to_sec_tx, (void *)dup_bufs,
+ filtered_pkts);
+ if (unlikely(ring_enq < filtered_pkts)) {
+ do {
+ rte_pktmbuf_free(dup_bufs[ring_enq]);
+ } while (++ring_enq < filtered_pkts);
+ }
+ return nb_pkts;
+}
+
+static void
+tcpdump_create_mpool_n_rings(void)
+{
+ /* Create the mbuf pool */
+ tcpdump_pktmbuf_pool = rte_pktmbuf_pool_create("tcpdump_pktmbuf_pool", NUM_MBUFS,
+ MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
+ if (tcpdump_pktmbuf_pool == NULL)
+ rte_exit(EXIT_FAILURE, "Could not initialize tcpdump_pktmbuf_pool\n");
+
+ /* Create rings */
+ prim_to_sec_rx = rte_ring_create("prim_to_sec_rx", RING_SIZE, rte_socket_id(),
+ RING_F_SC_DEQ);
+ prim_to_sec_tx = rte_ring_create("prim_to_sec_tx", RING_SIZE, rte_socket_id(),
+ RING_F_SC_DEQ);
+}
+
+static void
+tcpdump_register_rx_callbacks(int port, int queue)
+{
+ static int cnt;
+
+ rx_cbs[cnt].port = port;
+ rx_cbs[cnt].queue = queue;
+ rx_cbs[cnt].cb = rte_eth_add_rx_callback(port, queue, tcpdump_rx, NULL);
+ cnt++;
+}
+
+static void
+tcpdump_register_tx_callbacks(int port, int queue)
+{
+ static int cnt;
+
+ tx_cbs[cnt].port = port;
+ tx_cbs[cnt].queue = queue;
+ tx_cbs[cnt].cb = rte_eth_add_tx_callback(port, queue, tcpdump_tx, NULL);
+ cnt++;
+}
+
+static void
+tcpdump_remove_rx_callbacks(int port, int queue)
+{
+ int i;
+
+ for (i = 0; i < MAX_CBS; i++) {
+ if ((rx_cbs[i].port == port) && (rx_cbs[i].queue == queue))
+ rte_eth_remove_rx_callback(port, queue, rx_cbs[i].cb);
+ }
+}
+
+static void
+tcpdump_remove_tx_callbacks(int port, int queue)
+{
+ int i;
+
+ for (i = 0; i < MAX_CBS; i++) {
+ if ((tx_cbs[i].port == port) && (tx_cbs[i].queue == queue))
+ rte_eth_remove_tx_callback(port, queue, tx_cbs[i].cb);
+ }
+}
+
+/* receive a request and return it */
+static int
+tcpdump_receive_request(int socket)
+{
+ struct msghdr reg_cbs_msg;
+ struct iovec msg[3];
+
+ char buffer[256];
+ char port[PORT_QUEUE_SIZE], queue[PORT_QUEUE_SIZE];
+ char *buf;
+
+ int msg_type;
+ int rval;
+ int buf_offset;
+ int i;
+ uint8_t port_id;
+ uint16_t queue_id;
+ uint16_t nb_rxq, nb_txq;
+
+ memset(&reg_cbs_msg, 0, sizeof(reg_cbs_msg));
+ reg_cbs_msg.msg_iov = msg;
+ reg_cbs_msg.msg_iovlen = 3;
+
+ msg[0].iov_base = (int *) &msg_type;
+ msg[0].iov_len = sizeof(int);
+
+ msg[1].iov_base = (char *) buffer;
+ msg[1].iov_len = sizeof(buffer);
+
+ msg[2].iov_base = (char *) &src_ip_filter;
+ msg[2].iov_len = sizeof(uint32_t);
+
+ rval = recvmsg(socket, &reg_cbs_msg, 0);
+ if (rval < 0) {
+ RTE_LOG(ERR, EAL, "Error reading from file descriptor %d: %s\n",
+ socket,
+ strerror(errno));
+ return -1;
+ } else if (rval == 0) {
+ RTE_LOG(ERR, EAL, "Read nothing from file "
+ "descriptor %d\n", socket);
+ return -1;
+ }
+
+ buf = buffer;
+
+ /* Update port and queue */
+ while (sscanf(buf, "%[^','],%[^','],%n", port, queue, &buf_offset) == 2) {
+ port_id = atoi(port);
+ queue_id = atoi(queue);
+ if (msg_type == 2) {
+ if (!strcmp(queue, "*")) {
+ nb_rxq = rte_eth_devices[port_id].data->nb_rx_queues;
+ nb_txq = rte_eth_devices[port_id].data->nb_tx_queues;
+ for (i = 0; i < nb_rxq; i++)
+ tcpdump_register_rx_callbacks(port_id, i);
+ for (i = 0; i < nb_txq; i++)
+ tcpdump_register_tx_callbacks(port_id, i);
+ } else {
+ tcpdump_register_rx_callbacks(port_id, queue_id);
+ tcpdump_register_tx_callbacks(port_id, queue_id);
+ }
+ } else if (msg_type == 1) {
+ if (!strcmp(queue, "*")) {
+ nb_rxq = rte_eth_devices[port_id].data->nb_rx_queues;
+ nb_txq = rte_eth_devices[port_id].data->nb_tx_queues;
+ for (i = 0; i < nb_rxq; i++)
+ tcpdump_remove_rx_callbacks(port_id, i);
+ for (i = 0; i < nb_txq; i++)
+ tcpdump_remove_tx_callbacks(port_id, i);
+ } else {
+ tcpdump_remove_rx_callbacks(port_id, queue_id);
+ tcpdump_remove_tx_callbacks(port_id, queue_id);
+ }
+ }
+ buf += buf_offset;
+ }
+
+ return 0;
+}
+
+static void
+tcpdump_socket_ready(int socket)
+{
+ for (;;) {
+ int conn_sock;
+ struct sockaddr_un addr;
+
+ socklen_t sockaddr_len = sizeof(addr);
+ /* this is a blocking call */
+ conn_sock = accept(socket, (struct sockaddr *) &addr, &sockaddr_len);
+ /* just restart on error */
+ if (conn_sock == -1)
+ continue;
+
+ /* set socket to linger after close */
+ struct linger l;
+
+ l.l_onoff = 1;
+ l.l_linger = 60;
+ setsockopt(conn_sock, SOL_SOCKET, SO_LINGER, &l, sizeof(l));
+
+ tcpdump_receive_request(conn_sock);
+ close(conn_sock);
+ break;
+ }
+}
+
static int
eal_intr_process_interrupts(struct epoll_event *events, int nfds)
{
@@ -655,6 +984,13 @@ eal_intr_process_interrupts(struct epoll_event *events, int nfds)
for (n = 0; n < nfds; n++) {
+ if (internal_config.process_type == RTE_PROC_PRIMARY) {
+
+ /** tcpdump socket fd */
+ if (events[n].data.fd == tcpdump_socket_fd)
+ tcpdump_socket_ready(tcpdump_socket_fd);
+ }
+
/**
* if the pipe fd is ready to read, return out to
* rebuild the wait list.
@@ -786,6 +1122,61 @@ eal_intr_handle_interrupts(int pfd, unsigned totalfds)
}
}
+/* get socket path (/var/run if root, $HOME otherwise) */
+ static void
+tcpdump_get_socket_path(char *buffer, int bufsz)
+{
+ const char *dir = "/var/run/tcpdump_socket";
+ const char *home_dir = getenv("HOME/tcpdump_socket");
+
+ if (getuid() != 0 && home_dir != NULL)
+ dir = home_dir;
+ mkdir(dir, 0700);
+ /* use current prefix as file path */
+ snprintf(buffer, bufsz, TCPDUMP_SOCKET_PATH, dir);
+}
+
+static int
+tcpdump_create_primary_socket(void)
+{
+ int ret, socket_fd;
+ struct sockaddr_un addr;
+ socklen_t sockaddr_len;
+
+ /* set up a socket */
+ socket_fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
+ if (socket_fd < 0) {
+ RTE_LOG(ERR, EAL, "Failed to create socket!\n");
+ return -1;
+ }
+
+ tcpdump_get_socket_path(addr.sun_path, sizeof(addr.sun_path));
+ addr.sun_family = AF_UNIX;
+ sockaddr_len = sizeof(struct sockaddr_un);
+
+ /* unlink() before bind() to remove the socket if it already exists */
+ unlink(addr.sun_path);
+
+ ret = bind(socket_fd, (struct sockaddr *) &addr, sockaddr_len);
+ if (ret) {
+ RTE_LOG(ERR, EAL, "Failed to bind socket: %s!\n", strerror(errno));
+ close(socket_fd);
+ return -1;
+ }
+
+ ret = listen(socket_fd, 1);
+ if (ret) {
+ RTE_LOG(ERR, EAL, "Failed to listen: %s!\n", strerror(errno));
+ close(socket_fd);
+ return -1;
+ }
+
+ /* save the socket in local configuration */
+ tcpdump_socket_fd = socket_fd;
+
+ return 0;
+}
+
/**
* It builds/rebuilds up the epoll file descriptor with all the
* file descriptors being waited on. Then handles the interrupts.
@@ -800,9 +1191,9 @@ static __attribute__((noreturn)) void *
eal_intr_thread_main(__rte_unused void *arg)
{
struct epoll_event ev;
-
/* host thread, never break out */
for (;;) {
+
/* build up the epoll fd with all descriptors we are to
* wait on then pass it to the handle_interrupts function
*/
@@ -829,6 +1220,23 @@ eal_intr_thread_main(__rte_unused void *arg)
}
numfds++;
+ /* build up the epoll fd with tcpdump descriptor.
+ */
+ static struct epoll_event tcpdump_event = {
+ .events = EPOLLIN | EPOLLPRI,
+ };
+
+ if (internal_config.process_type == RTE_PROC_PRIMARY) {
+ tcpdump_event.data.fd = tcpdump_socket_fd;
+ if (epoll_ctl(pfd, EPOLL_CTL_ADD, tcpdump_socket_fd,
+ &tcpdump_event) < 0) {
+ rte_panic("Error adding tcpdump socket fd to %d "
+ "epoll_ctl, %s\n",
+ tcpdump_socket_fd, strerror(errno));
+ }
+ numfds++;
+ }
+
rte_spinlock_lock(&intr_lock);
TAILQ_FOREACH(src, &intr_sources, next) {
@@ -877,6 +1285,16 @@ rte_eal_intr_init(void)
if (pipe(intr_pipe.pipefd) < 0)
return -1;
+ /* if primary, try to open tcpdump socket */
+ if (internal_config.process_type == RTE_PROC_PRIMARY) {
+ if (tcpdump_create_primary_socket() < 0) {
+ RTE_LOG(ERR, EAL, "Failed to set up tcpdump_socket_fd for "
+ "tcpdump in primary\n");
+ return -1;
+ }
+ tcpdump_create_mpool_n_rings();
+ }
+
/* create the host thread to wait/handle the interrupt */
ret = pthread_create(&intr_thread, NULL,
eal_intr_thread_main, NULL);
--
1.7.4.1
* [dpdk-dev] [PATCH v3 5/5] doc: update doc for tcpdump feature
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
` (3 preceding siblings ...)
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 4/5] lib/librte_eal: add tcpdump support in primary process Reshma Pattan
@ 2016-03-02 12:16 ` Reshma Pattan
2016-03-09 0:33 ` [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for tcpdump Thomas Monjalon
5 siblings, 0 replies; 21+ messages in thread
From: Reshma Pattan @ 2016-03-02 12:16 UTC (permalink / raw)
To: dev
Added tcpdump design changes to proc_info section of
sample application user guide.
Added tcpdump design changes to env abstraction layer section
of programmers guide.
Updated Release notes
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
---
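Note (not part of the commit): the documentation below describes a
``--tcpdump (port,queue)`` option where the queue is either a queue id or '*'.
One possible way to validate such a tuple is sketched here. The function name
parse_port_queue, the ALL_QUEUES marker and the accepted ranges (port 0-255,
queue 0-65535) are assumptions made for illustration; they are not taken from
the proc_info code.

```c
#include <stdio.h>

#define ALL_QUEUES (-1) /* hypothetical marker for the '*' wildcard */

/*
 * Parse one "(port,queue)" tuple, where queue is a number or '*'.
 * Returns 0 on success and -1 on malformed or out-of-range input.
 */
static int
parse_port_queue(const char *arg, int *port, int *queue)
{
	unsigned int p = 0, q = 0;
	int end = 0;
	int wildcard = 0;

	/* %n is stored only if the whole pattern, including ')', matched */
	if (sscanf(arg, "(%u,*)%n", &p, &end) == 1 && end > 0) {
		wildcard = 1;
	} else {
		end = 0;
		if (sscanf(arg, "(%u,%u)%n", &p, &q, &end) != 2 || end == 0)
			return -1;
	}

	/* reject trailing characters and values outside the assumed ranges */
	if (arg[end] != '\0' || p > 255 || q > 65535)
		return -1;

	*port = (int)p;
	*queue = wildcard ? ALL_QUEUES : (int)q;
	return 0;
}
```

For example, "(0,*)" yields port 0 with ALL_QUEUES, "(1,3)" yields port 1 and
queue 3, and inputs such as "(1,3" or "(300,1)" are rejected.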
doc/guides/prog_guide/env_abstraction_layer.rst | 43 ++++++++++++++++-
doc/guides/rel_notes/release_16_04.rst | 6 ++
doc/guides/sample_app_ug/proc_info.rst | 57 +++++++++++++++++++----
3 files changed, 94 insertions(+), 12 deletions(-)
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 4737dc2..be06764 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -1,5 +1,5 @@
.. BSD LICENSE
- Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
All rights reserved.
Redistribution and use in source and binary forms, with or without
@@ -169,7 +169,8 @@ The EAL can query the CPU at runtime (using the rte_cpu_get_feature() function)
User Space Interrupt Event
~~~~~~~~~~~~~~~~~~~~~~~~~~
-+ User Space Interrupt and Alarm Handling in Host Thread
+User Space Interrupt and Alarm Handling in Host Thread
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The EAL creates a host thread to poll the UIO device file descriptors to detect the interrupts.
Callbacks can be registered or unregistered by the EAL functions for a specific interrupt event
@@ -182,7 +183,8 @@ The EAL also allows timed callbacks to be used in the same way as for NIC interr
i.e. link up and link down notification.
-+ RX Interrupt Event
+RX Interrupt Event
+^^^^^^^^^^^^^^^^^^
The receive and transmit routines provided by each PMD don't limit themselves to execute in polling thread mode.
To ease the idle polling with tiny throughput, it's useful to pause the polling and wait until the wake-up event happens.
@@ -207,6 +209,41 @@ The eth_dev driver takes responsibility to program the latter mapping.
The RX interrupt are controlled/enabled/disabled by ethdev APIs - 'rte_eth_dev_rx_intr_*'. They return failure if the PMD
hasn't support them yet. The intr_conf.rxq flag is used to turn on the capability of RX interrupt per device.
+Tcpdump
+^^^^^^^
+
+The EAL thread polls for the tcpdump file descriptor.
+If the polled file descriptor matches the tcpdump file descriptor it will initiate tcpdump processing.
+
+The EAL thread creates the socket for the tcpdump connection with the secondary process and registers the socket with
+the tcpdump epoll event.
+The tcpdump event will be polled as part of the interrupt thread.
+
+The EAL thread also creates a mempool for packet duplication and two rte_rings for sharing the packets
+with the secondary process.
+
+Upon receiving the tcpdump event, the EAL thread either receives a "register" or "remove" message for RX/TX callbacks
+on the socket.
+
+For the "register" RX/TX callback message:
+
+* The EAL thread parses the port and queue information of the message and registers the ``rte_eth_rxtx_callbacks``
+ for the given port and queue.
+
+* The RX callback will apply a src ip filter to the received packets and the matched packets will be duplicated to the
+ new mempool.
+ Duplicated packets will be enqueued to one of the rte_rings for the secondary process to use.
+ If no filter is provided, all the packets will be duplicated to the new mempool.
+
+* The TX callback will apply a src ip filter to the transmitted packets and the reference count of the matched
+ packets will be incremented before enqueuing to the second rte_ring for the secondary process to use.
+ If no filter is provided, the reference count of all the packets will be incremented.
+
+For the "remove" RX/TX callback message:
+
+* The EAL thread parses the port and queue information of the message and removes the ``rte_eth_rxtx_callbacks``
+ for the given port and queue.
+
Blacklisting
~~~~~~~~~~~~
diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
index fd7dd1a..bd6f33b 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -51,6 +51,9 @@ This section should contain new features added in this release. Sample format:
* **Added vhost-user live migration support.**
+* **Added dpdk packet capturing support for tcpdump.**
+
+ Users can now capture the packets of DPDK ports using the ``dpdk_proc_info`` application.
Resolved Issues
---------------
@@ -83,6 +86,7 @@ Drivers
This made impossible the creation of more than one aesni_mb device
from command line.
+* **pcap: Added public API support for creating pcap device using pcaps and dumpers.**
Libraries
~~~~~~~~~
@@ -92,6 +96,7 @@ Libraries
Fix crc32c hash functions to return a valid crc32c value for data lengths
not multiple of 4 bytes.
+* **Enhanced the EAL library to support DPDK packet capturing for tcpdump.**
Examples
~~~~~~~~
@@ -100,6 +105,7 @@ Examples
Other
~~~~~
+* **Enhanced app/proc_info to support DPDK packet capturing for tcpdump.**
Known Issues
------------
diff --git a/doc/guides/sample_app_ug/proc_info.rst b/doc/guides/sample_app_ug/proc_info.rst
index 542950b..274c9e0 100644
--- a/doc/guides/sample_app_ug/proc_info.rst
+++ b/doc/guides/sample_app_ug/proc_info.rst
@@ -1,6 +1,6 @@
.. BSD LICENSE
- Copyright(c) 2015 Intel Corporation. All rights reserved.
+ Copyright(c) 2015-2016 Intel Corporation. All rights reserved.
All rights reserved.
Redistribution and use in source and binary forms, with or without
@@ -39,33 +39,72 @@ statistics, resetting port statistics and printing DPDK memory information.
This application extends the original functionality that was supported by
dump_cfg.
+The ``dpdk_proc_info`` application supports the command line options for configuring the packet
+capturing support for tcpdump.
+
+Overview of tcpdump flow
+------------------------
+
+* The ``dpdk_proc_info`` application parses the command line options for the port, queue, src ip filter
+ and other tcpdump options.
+
+* The application creates a socket to communicate with the primary process and sends the "register" RX/TX callback
+ message containing port, queue and src ip filter information.
+
+* Based on the ``--single-tcpdump-file`` command line option, it either creates a single pcap or two pcap devices for
+ writing ingress and egress packets for tcpdump.
+
+* It dequeues the packets sent by the primary process over the rte_rings and writes the packets to the pcap devices.
+
+* Upon application termination, e.g. with ``ctrl+c``, it sends the "remove" RX/TX callbacks message to the primary
+ process containing the port and queue information.
+
Running the Application
-----------------------
+
The application has a number of command line options:
.. code-block:: console
- ./$(RTE_TARGET)/app/dpdk_proc_info -- -m | [-p PORTMASK] [--stats | --xstats |
- --stats-reset | --xstats-reset]
+ ./$(RTE_TARGET)/app/dpdk_proc_info -- -m | [-p PORTMASK] [--tcpdump (port,queue)] \
+ [ --src-ip-filter \"A.B.C.D\"] \
+ [--single-tcpdump-file] \
+ [--stats | --xstats | \
+ --stats-reset | --xstats-reset]
Parameters
~~~~~~~~~~
-**-p PORTMASK**: Hexadecimal bitmask of ports to configure.
-**--stats**
+``-p PORTMASK``: Hexadecimal bitmask of ports to configure.
+
+``--tcpdump (port,queue)``
+The tcpdump (port,queue) parameter controls the packet capturing support on a given port and queue.
+The user needs to specify either a unique queue id or '*' to capture the packets from a specific queue or from all the
+queues of a given port.
+
+``--src-ip-filter "A.B.C.D"``
+The src-ip-filter parameter controls the packet filtering support.
+If the src ip filter option is not specified, all the packets will be captured.
+
+``--single-tcpdump-file``
+The single-tcpdump-file parameter controls packet capturing to a single pcap file.
+If the option is specified, both the ingress and egress packets will be captured to a single pcap file; otherwise they
+will be captured to two separate RX and TX pcap files.
+
+``--stats``
The stats parameter controls the printing of generic port statistics. If no
port mask is specified stats are printed for all DPDK ports.
-**--xstats**
+``--xstats``
The stats parameter controls the printing of extended port statistics. If no
port mask is specified xstats are printed for all DPDK ports.
-**--stats-reset**
+``--stats-reset``
The stats-reset parameter controls the resetting of generic port statistics. If
no port mask is specified, the generic stats are reset for all DPDK ports.
-**--xstats-reset**
+``--xstats-reset``
The xstats-reset parameter controls the resetting of extended port statistics.
If no port mask is specified xstats are reset for all DPDK ports.
-**-m**: Print DPDK memory information.
+``-m``: Print DPDK memory information.
--
1.7.4.1
* Re: [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for tcpdump
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
` (4 preceding siblings ...)
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 5/5] doc: update doc for tcpdump feature Reshma Pattan
@ 2016-03-09 0:33 ` Thomas Monjalon
2016-03-11 14:18 ` Pattan, Reshma
5 siblings, 1 reply; 21+ messages in thread
From: Thomas Monjalon @ 2016-03-09 0:33 UTC (permalink / raw)
To: Reshma Pattan; +Cc: dev
Hi,
This series has not been reviewed enough to be ready for 16.04.
So it would be good to restart the discussion about the tcpdump requirements.
> packet capture flow for tcpdump:
> ================================
> Part of the design is implemented in secondary process (proc_info.c) and other part
> in primary process (eal_interrupt.c).
Why proc_info is used? Why not a dedicated tool?
> *User should request packet capture via proc_info application command line by passing newly
> added tcpdump command line options i.e. [--tcpdump (port,queue)] [ --src-ip-filter \"A.B.C.D\"]
> [--single-tcpdump-file].
>
> Note: As basic support, a src ip filter option is provided for filtering the packets.
> This is optional. If user dont provide any src ip filter option all packets will be captured
> for tcpdump.
Why filtering? Why only on IP address? Why not BPF?
> 2: Because of the underlying pcap writing overhead packets can only be captured at slow rates.
What is the benefit of slow rate capture in DPDK?
Shouldn't we target a high rate mechanism?
* Re: [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for tcpdump
2016-03-09 0:33 ` [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for tcpdump Thomas Monjalon
@ 2016-03-11 14:18 ` Pattan, Reshma
0 siblings, 0 replies; 21+ messages in thread
From: Pattan, Reshma @ 2016-03-11 14:18 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev
Hi Thomas,
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Wednesday, March 9, 2016 12:34 AM
> To: Pattan, Reshma <reshma.pattan@intel.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for
> tcpdump
>
> Hi,
>
> This series has not been reviewed enough to be ready for 16.04.
> So it would be good to restart the discussion about the tcpdump requirements.
Yes, will plan for next steps.
>
> > packet capture flow for tcpdump:
> > ================================
> > Part of the design is implemented in secondary process (proc_info.c)
> > and other part in primary process (eal_interrupt.c).
>
> Why proc_info is used? Why not a dedicated tool?
proc_info or any other new tool, it must be secondary process.
proc_info is already simple secondary process and does take care of printing dpdk port's packet statistics upon users request,
hence same application was enhanced for dpdk packet capturing support.
>
> > *User should request packet capture via proc_info application command
> > line by passing newly added tcpdump command line options i.e.
> > [--tcpdump (port,queue)] [ --src-ip-filter \"A.B.C.D\"] [--single-tcpdump-file].
> >
> > Note: As basic support, a src ip filter option is provided for filtering the
> packets.
> > This is optional. If user dont provide any src ip filter option all
> > packets will be captured for tcpdump.
>
> Why filtering? Why only on IP address? Why not BPF?
>
Here, simple src-ip filtering was used to demonstrate where filtering logic can fit into this design.
The filtering logic can be enhanced with BPF or other filtering methods. Filtering also improves performance, since fewer packets have to be captured and written to the pcap files.
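To make the src-ip check concrete, here is a minimal sketch in plain C. It is not the code from the patch set: the helper names parse_src_ip_filter() and src_ip_matches() are hypothetical, the check works on raw frame bytes instead of an rte_mbuf, and it assumes untagged Ethernet frames carrying IPv4. In the proposed design a check like this would run in the rx/tx callback before a packet is handed over for capture.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define ETH_HDR_LEN      14u     /* untagged Ethernet header length */
#define ETH_TYPE_IPV4    0x0800u /* EtherType value for IPv4 */
#define IPV4_MIN_HDR_LEN 20u     /* IPv4 header without options */
#define IPV4_SRC_OFFSET  12u     /* offset of source address in IPv4 header */

/*
 * Hypothetical helper: convert the "A.B.C.D" string given with
 * --src-ip-filter into a network-byte-order address.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address.
 */
static int
parse_src_ip_filter(const char *text, uint32_t *filter_be)
{
	struct in_addr addr;

	if (inet_pton(AF_INET, text, &addr) != 1)
		return -1;
	*filter_be = addr.s_addr;
	return 0;
}

/*
 * Hypothetical helper: return 1 if the frame is an untagged IPv4 frame
 * whose source address equals filter_be, otherwise 0.
 * Frames that are too short or not IPv4 never match.
 */
static int
src_ip_matches(const uint8_t *frame, size_t len, uint32_t filter_be)
{
	uint16_t ether_type;
	uint32_t src_be;

	if (len < ETH_HDR_LEN + IPV4_MIN_HDR_LEN)
		return 0;

	/* EtherType is stored big-endian in bytes 12 and 13. */
	ether_type = (uint16_t)((frame[12] << 8) | frame[13]);
	if (ether_type != ETH_TYPE_IPV4)
		return 0;

	/* memcpy avoids an unaligned 32-bit load from the frame. */
	memcpy(&src_be, frame + ETH_HDR_LEN + IPV4_SRC_OFFSET, sizeof(src_be));
	return src_be == filter_be;
}
```

Only frames for which src_ip_matches() returns 1 would be captured. A BPF-based filter would replace this single comparison with a compiled filter program, but it would sit at the same point in the flow.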
> > 2: Because of the underlying pcap writing overhead packets can only be
> > captured at slow rates.
>
> What is the benefit of slow rate capture in DPDK?
> Shouldn't we target a high rate mechanism?
I believe performance will improve if we also use a TUN/TAP PMD, but that PMD is not upstreamed into DPDK.
Thanks,
Reshma
Thread overview: 21+ messages
2016-02-12 14:57 [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 2/5] drivers/net/pcap: add public api to create pcap device Reshma Pattan
2016-02-17 9:03 ` Pavel Fedin
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 3/5] app/proc_info: add tcpdump support in secondary process Reshma Pattan
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 4/5] lib/librte_eal: add tcpdump support in primary process Reshma Pattan
2016-02-17 9:57 ` Pavel Fedin
2016-02-12 14:57 ` [dpdk-dev] [PATCH v2 5/5] doc: update doc for tcpdump feature Reshma Pattan
2016-02-22 10:01 ` Mcnamara, John
2016-02-18 14:08 ` [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump Pavel Fedin
2016-02-23 13:16 ` Pattan, Reshma
2016-02-24 15:04 ` Pavel Fedin
2016-02-29 16:11 ` Pattan, Reshma
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 " Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 1/5] app/test-pmd: fix nb_rxq and nb_txq checks Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 2/5] drivers/net/pcap: add public api to create pcap device Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 3/5] app/proc_info: add tcpdump support in secondary process Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 4/5] lib/librte_eal: add tcpdump support in primary process Reshma Pattan
2016-03-02 12:16 ` [dpdk-dev] [PATCH v3 5/5] doc: update doc for tcpdump feature Reshma Pattan
2016-03-09 0:33 ` [dpdk-dev] [PATCH v3 0/5] add dpdk packet capture support for tcpdump Thomas Monjalon
2016-03-11 14:18 ` Pattan, Reshma