* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-14 2:16 Dhana Eadala
@ 2016-03-14 2:30 ` 张伟
2016-03-14 4:38 ` 张伟
2016-03-22 11:42 ` Thomas Monjalon
2 siblings, 0 replies; 13+ messages in thread
From: 张伟 @ 2016-03-14 2:30 UTC (permalink / raw)
To: Dhana Eadala; +Cc: bruce.richardson, pablo.de.lara.guarch, michael.qiu, dev
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=GBK, Size: 14159 bytes --]
I met a problem which I used the DPDK hash table for multi processes. One started as primary process and the other as secondary process.
I based on the client and server multiprocess example. My aim is that server creates a hash table, then share it to the client. The client will read and write the hash table, and the server will read the hash table. I use rte_calloc allocate the space for hash table, use memzone tells the client the hash table address.
But once I add an entry into the hash table, calling "lookup" function will have the segment fault. But for the lookup function, I have exactly the same parameters for lookup when the first time calls the lookup.
If I create the hash table inside the client, everything works correctly.
I put pieces of codes for server and client codes related to the hash table. I have spent almost 3 days on this bug. But there is no any clue which can help to solve this bug. If any of you can give some suggestions, I will be appreciated. I post the question into the mail list, but have not yet got any reply.
This problem is that in dpdk multi process - client and server example, dpdk-2.2.0/examples/multi_process/client_server_mp
My aim is that server create a hash table, then share it to client. Client will write the hash table, server will read the hash table. I am using dpdk hash table. What I did is that server create a hash table (table and array entries), return the table address. I use memzone pass the table address to client. In client, the second lookup gets segment fault. The system gets crashed. I will put some related code here.
create hash table function:
struct onvm_ft*
onvm_ft_create(int cnt, int entry_size) {
struct rte_hash* hash;
struct onvm_ft* ft;
struct rte_hash_parameters ipv4_hash_params = {
.name = NULL,
.entries = cnt,
.key_len = sizeof(struct onvm_ft_ipv4_5tuple),
.hash_func = NULL,
.hash_func_init_val = 0,
};
char s[64];
/* create ipv4 hash table. use core number and cycle counter to get a unique name. */
ipv4_hash_params.name = s;
ipv4_hash_params.socket_id = rte_socket_id();
snprintf(s, sizeof(s), "onvm_ft_%d-%"PRIu64, rte_lcore_id(), rte_get_tsc_cycles());
hash = rte_hash_create(&ipv4_hash_params);
if (hash == NULL) {
return NULL;
}
ft = (struct onvm_ft*)rte_calloc("table", 1, sizeof(struct onvm_ft), 0);
if (ft == NULL) {
rte_hash_free(hash);
return NULL;
}
ft->hash = hash;
ft->cnt = cnt;
ft->entry_size = entry_size;
/* Create data array for storing values */
ft->data = rte_calloc("entry", cnt, entry_size, 0);
if (ft->data == NULL) {
rte_hash_free(hash);
rte_free(ft);
return NULL;
}
return ft;
}
related structure:
struct onvm_ft {
struct rte_hash* hash;
char* data;
int cnt;
int entry_size;
};
in server side, I will call the create function, use memzone share it to client. The following is what I do:
related variables:
struct onvm_ft *sdn_ft;
struct onvm_ft **sdn_ft_p;
const struct rte_memzone *mz_ftp;
sdn_ft = onvm_ft_create(1024, sizeof(struct onvm_flow_entry));
if(sdn_ft == NULL) {
rte_exit(EXIT_FAILURE, "Unable to create flow table\n");
}
mz_ftp = rte_memzone_reserve(MZ_FTP_INFO, sizeof(struct onvm_ft *),
rte_socket_id(), NO_FLAGS);
if (mz_ftp == NULL) {
rte_exit(EXIT_FAILURE, "Canot reserve memory zone for flow table pointer\n");
}
memset(mz_ftp->addr, 0, sizeof(struct onvm_ft *));
sdn_ft_p = mz_ftp->addr;
*sdn_ft_p = sdn_ft;
In client side:
struct onvm_ft *sdn_ft;
static void
map_flow_table(void) {
const struct rte_memzone *mz_ftp;
struct onvm_ft **ftp;
mz_ftp = rte_memzone_lookup(MZ_FTP_INFO);
if (mz_ftp == NULL)
rte_exit(EXIT_FAILURE, "Cannot get flow table pointer\n");
ftp = mz_ftp->addr;
sdn_ft = *ftp;
}
The following is my debug message: I set a breakpoint in lookup table line. To narrow down the problem, I just send one flow. So the second time and the first time, the packets are the same.
For the first time, it works. I print out the parameters: inside the onvm_ft_lookup function, if there is a related entry, it will return the address by flow_entry.
Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
191 ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
(gdb) print *sdn_ft
$1 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
(gdb) print *fk
$2 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) s
onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99d00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
151 softrss = onvm_softrss(key);
(gdb) n
152 printf("software rss %d\n", softrss);
(gdb)
software rss 403183624
154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
(gdb) print table->hash
$3 = (struct rte_hash *) 0x7fff32cce740
(gdb) print *key
$4 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) print softrss
$5 = 403183624
(gdb) c
After I hit c, it will do the second lookup,
Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
191 ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
(gdb) print *sdn_ft
$7 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
(gdb) print *fk
$8 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) s
onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99c00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
151 softrss = onvm_softrss(key);
(gdb) n
152 printf("software rss %d\n", softrss);
(gdb) n
software rss 403183624
154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
(gdb) print table->hash
$9 = (struct rte_hash *) 0x7fff32cce740
(gdb) print *key
$10 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) print softrss
$11 = 403183624
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
0x000000000045fb97 in __rte_hash_lookup_bulk ()
(gdb) bt
#0 0x000000000045fb97 in __rte_hash_lookup_bulk ()
#1 0x0000000000000000 in ?? ()
From the debug message, the parameters are exactly the same. I do not know why it has the segmentation fault.
my lookup function:
int
onvm_ft_lookup(struct onvm_ft* table, struct onvm_ft_ipv4_5tuple *key, char** data) {
int32_t tbl_index;
uint32_t softrss;
softrss = onvm_softrss(key);
printf("software rss %d\n", softrss);
tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
if (tbl_index >= 0) {
*data = onvm_ft_get_data(table, tbl_index);
return 0;
}
else {
return tbl_index;
}
}
At 2016-03-14 10:16:48, "Dhana Eadala" <edreddy@gmail.com> wrote:
>We found a problem in dpdk-2.2 using under multi-process environment.
>Here is the brief description how we are using the dpdk:
>
>We have two processes proc1, proc2 using dpdk. These proc1 and proc2 are
>two different compiled binaries.
>proc1 is started as primary process and proc2 as secondary process.
>
>proc1:
>Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash structure.
>As part of this, this api initalized the rte_hash structure and set the
>srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1 address space.
>
>proc2:
>calls srcHash = rte_hash_find_existing("src_hash_name").
>This function call returns the rte_hash created by proc1.
>This srcHash->rte_hash_cmp_eq still points to the address of
>memcmp() from proc1 address space.
>Later proc2 calls
>rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
>rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
>which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
>This leads to a crash as h->rte_hash_cmp_eq is an address
>from proc1 address space and is invalid address in proc2 address space.
>
>We found, from dpdk documentation, that
>
>"
> The use of function pointers between multiple processes
> running based of different compiled
> binaries is not supported, since the location of a given function
> in one process may be different to
> its location in a second. This prevents the librte_hash library
> from behaving properly as in a multi-
> threaded instance, since it uses a pointer to the hash function internally.
>
> To work around this issue, it is recommended that
> multi-process applications perform the hash
> calculations by directly calling the hashing function
> from the code and then using the
> rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
> instead of the functions which do
> the hashing internally, such as rte_hash_add()/rte_hash_lookup().
>"
>
>We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
>It was no issue up to and including dpdk-2.0.
>In later releases started crashing because rte_hash_cmp_eq is
>introduced in dpdk-2.1
>
>We fixed it with the following patch and would like to
>submit the patch to dpdk.org.
>Patch is created such that, if anyone wanted to use dpdk in
>multi-process environment with function pointers not shared, they need to
>define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
>Without defining this flag in Makefile, it works as it is now.
>
>Signed-off-by: Dhana Eadala <edreddy@gmail.com>
>---
> lib/librte_hash/rte_cuckoo_hash.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
>diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
>index 3e3167c..0946777 100644
>--- a/lib/librte_hash/rte_cuckoo_hash.c
>+++ b/lib/librte_hash/rte_cuckoo_hash.c
>@@ -594,7 +594,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> prim_bkt->signatures[i].alt == alt_hash) {
> k = (struct rte_hash_key *) ((char *)keys +
> prim_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> /* Enqueue index of free slot back in the ring. */
> enqueue_slot_back(h, cached_free_slots, slot_id);
> /* Update data */
>@@ -614,7 +618,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> sec_bkt->signatures[i].current == alt_hash) {
> k = (struct rte_hash_key *) ((char *)keys +
> sec_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> /* Enqueue index of free slot back in the ring. */
> enqueue_slot_back(h, cached_free_slots, slot_id);
> /* Update data */
>@@ -725,7 +733,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp (key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> if (data != NULL)
> *data = k->pdata;
> /*
>@@ -748,7 +760,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].alt == sig) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> if (data != NULL)
> *data = k->pdata;
> /*
>@@ -840,7 +856,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> remove_entry(h, bkt, i);
>
> /*
>@@ -863,7 +883,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> remove_entry(h, bkt, i);
>
> /*
>@@ -980,7 +1004,11 @@ lookup_stage3(unsigned idx, const struct rte_hash_key *key_slot, const void * co
> unsigned hit;
> unsigned key_idx;
>
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ hit = !memcmp(key_slot->key, keys[idx], h->key_len);
>+#else
> hit = !h->rte_hash_cmp_eq(key_slot->key, keys[idx], h->key_len);
>+#endif
> if (data != NULL)
> data[idx] = key_slot->pdata;
>
>--
>2.5.0
>
\x16º&Î\x17j!yÈ"²\r4âÈb×¥r¢s\x1a¯^4ã¸çm5é\x17º¹ÏjØs
ëZ¢È^r\b¬M8²\x18§µé\¢d^qè¯y×ë¢i kMb×¥r¦5{^Ê&×ݹç¼ñ¼¥Ù(®\x03è²×âÇ\bDLø\x13 ´ÛÂ\rú+uëÝ¥Ù(®\x03(^\fj½´×8ãn8ûMtÐ!\x13Eç\x1e÷~º&~k&4Ùù¢×¥rµÓnwÛºo'æ²h\x1a×M_)ízW(\bDLýw1ªöÓ^¶Ó¸ßNôÑq1´º(§]W"º'>í\x01Q
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-14 2:16 Dhana Eadala
2016-03-14 2:30 ` 张伟
@ 2016-03-14 4:38 ` 张伟
2016-03-15 0:57 ` Dhananjaya Eadala
2016-03-22 11:42 ` Thomas Monjalon
2 siblings, 1 reply; 13+ messages in thread
From: 张伟 @ 2016-03-14 4:38 UTC (permalink / raw)
To: Dhana Eadala; +Cc: bruce.richardson, pablo.de.lara.guarch, michael.qiu, dev
BTW, the following is my backtrace when the system crashes.
Program received signal SIGSEGV, Segmentation fault.
0x00000000004883ab in rte_hash_reset (h=0x0)
at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:444
444while (rte_ring_dequeue(h->free_slots, &ptr) == 0)
(gdb) bt
#0 0x00000000004883ab in rte_hash_reset (h=0x0)
at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:444
#1 0x000000000048fdfb in rte_hash_lookup_with_hash (h=0x7fff32cce740, key=0x7fffffffe220, sig=403183624)
at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:771
#2 0x000000000042b551 in onvm_ft_lookup_with_hash (table=0x7fff32cbe4c0, pkt=0x7fff390ea9c0,
data=0x7fffffffe298) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:104
#3 0x000000000042b8c3 in onvm_flow_dir_get_with_hash (table=0x7fff32cbe4c0, pkt=0x7fff390ea9c0,
flow_entry=0x7fffffffe298)
at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_dir.c:14
#4 0x00000000004251d7 in packet_handler (pkt=0x7fff390ea9c0, meta=0x7fff390eaa00)
at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/flow_table.c:212
#5 0x0000000000429502 in onvm_nf_run ()
#6 0x00000000004253f1 in main (argc=1, argv=0x7fffffffe648)
at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/flow_table.c:272
(gdb)
I met a problem which I used the DPDK hash table for multi processes. One started as primary process and the other as secondary process.
I based on the client and server multiprocess example. My aim is that server creates a hash table, then share it to the client. The client will read and write the hash table, and the server will read the hash table. I use rte_calloc allocate the space for hash table, use memzone tells the client the hash table address.
But once I add an entry into the hash table, calling "lookup" function will have the segment fault. But for the lookup function, I have exactly the same parameters for lookup when the first time calls the lookup.
If I create the hash table inside the client, everything works correctly.
I put pieces of codes for server and client codes related to the hash table. I have spent almost 3 days on this bug. But there is no any clue which can help to solve this bug. If any of you can give some suggestions, I will be appreciated. I post the question into the mail list, but have not yet got any reply.
This problem is that in dpdk multi process - client and server example, dpdk-2.2.0/examples/multi_process/client_server_mp
My aim is that server create a hash table, then share it to client. Client will write the hash table, server will read the hash table. I am using dpdk hash table. What I did is that server create a hash table (table and array entries), return the table address. I use memzone pass the table address to client. In client, the second lookup gets segment fault. The system gets crashed. I will put some related code here.
create hash table function:
struct onvm_ft*
onvm_ft_create(int cnt, int entry_size) {
struct rte_hash* hash;
struct onvm_ft* ft;
struct rte_hash_parameters ipv4_hash_params = {
.name = NULL,
.entries = cnt,
.key_len = sizeof(struct onvm_ft_ipv4_5tuple),
.hash_func = NULL,
.hash_func_init_val = 0,
};
char s[64];
/* create ipv4 hash table. use core number and cycle counter to get a unique name. */
ipv4_hash_params.name = s;
ipv4_hash_params.socket_id = rte_socket_id();
snprintf(s, sizeof(s), "onvm_ft_%d-%"PRIu64, rte_lcore_id(), rte_get_tsc_cycles());
hash = rte_hash_create(&ipv4_hash_params);
if (hash == NULL) {
return NULL;
}
ft = (struct onvm_ft*)rte_calloc("table", 1, sizeof(struct onvm_ft), 0);
if (ft == NULL) {
rte_hash_free(hash);
return NULL;
}
ft->hash = hash;
ft->cnt = cnt;
ft->entry_size = entry_size;
/* Create data array for storing values */
ft->data = rte_calloc("entry", cnt, entry_size, 0);
if (ft->data == NULL) {
rte_hash_free(hash);
rte_free(ft);
return NULL;
}
return ft;
}
related structure:
struct onvm_ft {
struct rte_hash* hash;
char* data;
int cnt;
int entry_size;
};
in server side, I will call the create function, use memzone share it to client. The following is what I do:
related variables:
struct onvm_ft *sdn_ft;
struct onvm_ft **sdn_ft_p;
const struct rte_memzone *mz_ftp;
sdn_ft = onvm_ft_create(1024, sizeof(struct onvm_flow_entry));
if(sdn_ft == NULL) {
rte_exit(EXIT_FAILURE, "Unable to create flow table\n");
}
mz_ftp = rte_memzone_reserve(MZ_FTP_INFO, sizeof(struct onvm_ft *),
rte_socket_id(), NO_FLAGS);
if (mz_ftp == NULL) {
rte_exit(EXIT_FAILURE, "Canot reserve memory zone for flow table pointer\n");
}
memset(mz_ftp->addr, 0, sizeof(struct onvm_ft *));
sdn_ft_p = mz_ftp->addr;
*sdn_ft_p = sdn_ft;
In client side:
struct onvm_ft *sdn_ft;
static void
map_flow_table(void) {
const struct rte_memzone *mz_ftp;
struct onvm_ft **ftp;
mz_ftp = rte_memzone_lookup(MZ_FTP_INFO);
if (mz_ftp == NULL)
rte_exit(EXIT_FAILURE, "Cannot get flow table pointer\n");
ftp = mz_ftp->addr;
sdn_ft = *ftp;
}
The following is my debug message: I set a breakpoint in lookup table line. To narrow down the problem, I just send one flow. So the second time and the first time, the packets are the same.
For the first time, it works. I print out the parameters: inside the onvm_ft_lookup function, if there is a related entry, it will return the address by flow_entry.
Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
191 ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
(gdb) print *sdn_ft
$1 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
(gdb) print *fk
$2 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) s
onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99d00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
151 softrss = onvm_softrss(key);
(gdb) n
152 printf("software rss %d\n", softrss);
(gdb)
software rss 403183624
154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
(gdb) print table->hash
$3 = (struct rte_hash *) 0x7fff32cce740
(gdb) print *key
$4 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) print softrss
$5 = 403183624
(gdb) c
After I hit c, it will do the second lookup,
Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
191 ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
(gdb) print *sdn_ft
$7 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
(gdb) print *fk
$8 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) s
onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99c00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
151 softrss = onvm_softrss(key);
(gdb) n
152 printf("software rss %d\n", softrss);
(gdb) n
software rss 403183624
154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
(gdb) print table->hash
$9 = (struct rte_hash *) 0x7fff32cce740
(gdb) print *key
$10 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) print softrss
$11 = 403183624
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
0x000000000045fb97 in __rte_hash_lookup_bulk ()
(gdb) bt
#0 0x000000000045fb97 in __rte_hash_lookup_bulk ()
#1 0x0000000000000000 in ?? ()
From the debug message, the parameters are exactly the same. I do not know why it has the segmentation fault.
my lookup function:
int
onvm_ft_lookup(struct onvm_ft* table, struct onvm_ft_ipv4_5tuple *key, char** data) {
int32_t tbl_index;
uint32_t softrss;
softrss = onvm_softrss(key);
printf("software rss %d\n", softrss);
tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
if (tbl_index >= 0) {
*data = onvm_ft_get_data(table, tbl_index);
return 0;
}
else {
return tbl_index;
}
}
At 2016-03-14 10:16:48, "Dhana Eadala" <edreddy@gmail.com> wrote:
>We found a problem in dpdk-2.2 using under multi-process environment.
>Here is the brief description how we are using the dpdk:
>
>We have two processes proc1, proc2 using dpdk. These proc1 and proc2 are
>two different compiled binaries.
>proc1 is started as primary process and proc2 as secondary process.
>
>proc1:
>Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash structure.
>As part of this, this api initalized the rte_hash structure and set the
>srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1 address space.
>
>proc2:
>calls srcHash = rte_hash_find_existing("src_hash_name").
>This function call returns the rte_hash created by proc1.
>This srcHash->rte_hash_cmp_eq still points to the address of
>memcmp() from proc1 address space.
>Later proc2 calls
>rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
>rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
>which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
>This leads to a crash as h->rte_hash_cmp_eq is an address
>from proc1 address space and is invalid address in proc2 address space.
>
>We found, from dpdk documentation, that
>
>"
> The use of function pointers between multiple processes
> running based of different compiled
> binaries is not supported, since the location of a given function
> in one process may be different to
> its location in a second. This prevents the librte_hash library
> from behaving properly as in a multi-
> threaded instance, since it uses a pointer to the hash function internally.
>
> To work around this issue, it is recommended that
> multi-process applications perform the hash
> calculations by directly calling the hashing function
> from the code and then using the
> rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
> instead of the functions which do
> the hashing internally, such as rte_hash_add()/rte_hash_lookup().
>"
>
>We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
>It was no issue up to and including dpdk-2.0.
>In later releases started crashing because rte_hash_cmp_eq is
>introduced in dpdk-2.1
>
>We fixed it with the following patch and would like to
>submit the patch to dpdk.org.
>Patch is created such that, if anyone wanted to use dpdk in
>multi-process environment with function pointers not shared, they need to
>define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
>Without defining this flag in Makefile, it works as it is now.
>
>Signed-off-by: Dhana Eadala <edreddy@gmail.com>
>---
> lib/librte_hash/rte_cuckoo_hash.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
>diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
>index 3e3167c..0946777 100644
>--- a/lib/librte_hash/rte_cuckoo_hash.c
>+++ b/lib/librte_hash/rte_cuckoo_hash.c
>@@ -594,7 +594,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> prim_bkt->signatures[i].alt == alt_hash) {
> k = (struct rte_hash_key *) ((char *)keys +
> prim_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> /* Enqueue index of free slot back in the ring. */
> enqueue_slot_back(h, cached_free_slots, slot_id);
> /* Update data */
>@@ -614,7 +618,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> sec_bkt->signatures[i].current == alt_hash) {
> k = (struct rte_hash_key *) ((char *)keys +
> sec_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> /* Enqueue index of free slot back in the ring. */
> enqueue_slot_back(h, cached_free_slots, slot_id);
> /* Update data */
>@@ -725,7 +733,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp (key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> if (data != NULL)
> *data = k->pdata;
> /*
>@@ -748,7 +760,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].alt == sig) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> if (data != NULL)
> *data = k->pdata;
> /*
>@@ -840,7 +856,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> remove_entry(h, bkt, i);
>
> /*
>@@ -863,7 +883,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> remove_entry(h, bkt, i);
>
> /*
>@@ -980,7 +1004,11 @@ lookup_stage3(unsigned idx, const struct rte_hash_key *key_slot, const void * co
> unsigned hit;
> unsigned key_idx;
>
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ hit = !memcmp(key_slot->key, keys[idx], h->key_len);
>+#else
> hit = !h->rte_hash_cmp_eq(key_slot->key, keys[idx], h->key_len);
>+#endif
> if (data != NULL)
> data[idx] = key_slot->pdata;
>
>--
>2.5.0
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-14 4:38 ` 张伟
@ 2016-03-15 0:57 ` Dhananjaya Eadala
2016-03-15 1:02 ` 张伟
0 siblings, 1 reply; 13+ messages in thread
From: Dhananjaya Eadala @ 2016-03-15 0:57 UTC (permalink / raw)
To: 张伟; +Cc: bruce.richardson, pablo.de.lara.guarch, michael.qiu, dev
Hi
I looked at your info from gdb and source code.
From source code, rte_hash_lookup_with_hash /
__rte_hash_lookup_with_hash() function doesn't invoke rte_hash_reset()
function.
It may be possible that rte_hash_cmp_eq function pointer from one
process is matching to address of rte_hash_reset () in other process.
For quick test, you can do two things.
1. just apply the patch I provided and recompile the dpdk with a -D flag
"RTE_LIB_MP_NO_FUNC_PTR" defined and re-run your test
2. If it still crashes, disable the optimization in Makefiles and see,
using gdb, how you are entering rte_hash_reset() from
__rte_hash_lookup_with_hash(). this will tell you if it was a problem
with function pointers in multi-process environment
Thanks
Dhana
On 03/14/2016 12:38 AM, 张伟 wrote:
> BTW, the following is my backtrace when the system crashes.
>
> Program received signal SIGSEGV, Segmentation fault.
>
> 0x00000000004883ab in rte_hash_reset (h=0x0)
>
> at
> /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:444
>
> 444while (rte_ring_dequeue(h->free_slots, &ptr) == 0)
>
> (gdb) bt
>
> #0 0x00000000004883ab in rte_hash_reset (h=0x0)
>
> at
> /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:444
>
> #1 0x000000000048fdfb in rte_hash_lookup_with_hash (h=0x7fff32cce740,
> key=0x7fffffffe220, sig=403183624)
>
> at
> /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:771
>
> #2 0x000000000042b551 in onvm_ft_lookup_with_hash
> (table=0x7fff32cbe4c0, pkt=0x7fff390ea9c0,
>
> data=0x7fffffffe298) at
> /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:104
>
> #3 0x000000000042b8c3 in onvm_flow_dir_get_with_hash
> (table=0x7fff32cbe4c0, pkt=0x7fff390ea9c0,
>
> flow_entry=0x7fffffffe298)
>
> at
> /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_dir.c:14
>
> #4 0x00000000004251d7 in packet_handler (pkt=0x7fff390ea9c0,
> meta=0x7fff390eaa00)
>
> at
> /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/flow_table.c:212
>
> #5 0x0000000000429502 in onvm_nf_run ()
>
> #6 0x00000000004253f1 in main (argc=1, argv=0x7fffffffe648)
>
> at
> /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/flow_table.c:272
>
> (gdb)
>
>
>
>
> I met a problem which I used the DPDK hash table for multi processes.
> One started as primary process and the other as secondary process.
> I based on the client and server multiprocess example. My aim is that
> server creates a hash table, then share it to the client. The client
> will read and write the hash table, and the server will read the hash
> table. I use rte_calloc allocate the space for hash table, use memzone
> tells the client the hash table address.
> But once I add an entry into the hash table, calling "lookup" function
> will have the segment fault. But for the lookup function, I have
> exactly the same parameters for lookup when the first time calls the
> lookup.
> If I create the hash table inside the client, everything works correctly.
> I put pieces of codes for server and client codes related to the hash
> table. I have spent almost 3 days on this bug. But there is no any
> clue which can help to solve this bug. If any of you can give some
> suggestions, I will be appreciated. I post the question into the mail
> list, but have not yet got any reply.
>
> This problem is that in dpdk multi process - client and server
> example, dpdk-2.2.0/examples/multi_process/client_server_mp
> My aim is that server create a hash table, then share it to client.
> Client will write the hash table, server will read the hash table. I
> am using dpdk hash table. What I did is that server create a hash
> table (table and array entries), return the table address. I use
> memzone pass the table address to client. In client, the second
> lookup gets segment fault. The system gets crashed. I will put some
> related code here.
> create hash table function:
>
> struct onvm_ft*
>
> onvm_ft_create(int cnt, int entry_size) {
>
> struct rte_hash* hash;
>
> struct onvm_ft* ft;
>
> struct rte_hash_parameters ipv4_hash_params = {
>
> .name = NULL,
>
> .entries = cnt,
>
> .key_len = sizeof(struct onvm_ft_ipv4_5tuple),
>
> .hash_func = NULL,
>
> .hash_func_init_val = 0,
>
> };
>
>
> char s[64];
>
> /* create ipv4 hash table. use core number and cycle counter
> to get a unique name. */
>
> ipv4_hash_params.name <http://ipv4_hash_params.name/> = s;
>
> ipv4_hash_params.socket_id = rte_socket_id();
>
> snprintf(s, sizeof(s), "onvm_ft_%d-%"PRIu64, rte_lcore_id(),
> rte_get_tsc_cycles());
>
> hash = rte_hash_create(&ipv4_hash_params);
>
> if (hash == NULL) {
>
> return NULL;
>
> }
>
> ft =
> (struct onvm_ft*)rte_calloc("table", 1, sizeof(struct onvm_ft), 0);
>
> if (ft == NULL) {
>
> rte_hash_free(hash);
>
> return NULL;
>
> }
>
> ft->hash = hash;
>
> ft->cnt = cnt;
>
> ft->entry_size = entry_size;
>
> /* Create data array for storing values */
>
> ft->data = rte_calloc("entry", cnt, entry_size, 0);
>
> if (ft->data == NULL) {
>
> rte_hash_free(hash);
>
> rte_free(ft);
>
> return NULL;
>
> }
>
> return ft;
>
> }
>
>
> related structure:
>
> struct onvm_ft {
>
> struct rte_hash* hash;
>
> char* data;
>
> int cnt;
>
> int entry_size;
>
> };
>
>
> in server side, I will call the create function, use memzone share it
> to client. The following is what I do:
>
> related variables:
>
> struct onvm_ft *sdn_ft;
>
> struct onvm_ft **sdn_ft_p;
>
> const struct rte_memzone *mz_ftp;
>
>
> sdn_ft = onvm_ft_create(1024, sizeof(struct onvm_flow_entry));
>
> if(sdn_ft == NULL) {
>
> rte_exit(EXIT_FAILURE, "Unable to create flow table\n");
>
> }
>
> mz_ftp =
> rte_memzone_reserve(MZ_FTP_INFO, sizeof(struct onvm_ft *),
>
> rte_socket_id(), NO_FLAGS);
>
> if (mz_ftp == NULL) {
>
> rte_exit(EXIT_FAILURE, "Canot reserve memory zone for
> flow table pointer\n");
>
> }
>
> memset(mz_ftp->addr, 0, sizeof(struct onvm_ft *));
>
> sdn_ft_p = mz_ftp->addr;
>
> *sdn_ft_p = sdn_ft;
>
>
> In client side:
>
> struct onvm_ft *sdn_ft;
>
> static void
>
> map_flow_table(void) {
>
> const struct rte_memzone *mz_ftp;
>
> struct onvm_ft **ftp;
>
>
> mz_ftp = rte_memzone_lookup(MZ_FTP_INFO);
>
> if (mz_ftp == NULL)
>
> rte_exit(EXIT_FAILURE, "Cannot get flow table pointer\n");
>
> ftp = mz_ftp->addr;
>
> sdn_ft = *ftp;
>
> }
>
>
> The following is my debug message: I set a breakpoint in lookup table
> line. To narrow down the problem, I just send one flow. So the second
> time and the first time, the packets are the same.
>
> For the first time, it works. I print out the parameters: inside the
> onvm_ft_lookup function, if there is a related entry, it will return
> the address by flow_entry.
>
> Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at
> /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
>
> 191 ret = onvm_ft_lookup(sdn_ft, fk,
> (char**)&flow_entry);
>
> (gdb) print *sdn_ft
>
> $1 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024,
> entry_size = 56}
>
> (gdb) print *fk
>
> $2 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764,
> dst_port = 11798, proto = 17 '\021'}
>
> (gdb) s
>
> onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99d00,
> data=0x7ffff68d2b00) at
> /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
>
> 151 softrss = onvm_softrss(key);
>
> (gdb) n
>
> 152 printf("software rss %d\n", softrss);
>
> (gdb)
>
> software rss 403183624
>
> 154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const
> void *)key, softrss);
>
> (gdb) print table->hash
>
> $3 = (struct rte_hash *) 0x7fff32cce740
>
> (gdb) print *key
>
> $4 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764,
> dst_port = 11798, proto = 17 '\021'}
>
> (gdb) print softrss
>
> $5 = 403183624
>
> (gdb) c
>
>
> After I hit c, it will do the second lookup,
>
> Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at
> /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
>
> 191 ret = onvm_ft_lookup(sdn_ft, fk,
> (char**)&flow_entry);
>
> (gdb) print *sdn_ft
>
> $7 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024,
> entry_size = 56}
>
> (gdb) print *fk
>
> $8 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764,
> dst_port = 11798, proto = 17 '\021'}
>
> (gdb) s
>
> onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99c00,
> data=0x7ffff68d2b00) at
> /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
>
> 151 softrss = onvm_softrss(key);
>
> (gdb) n
>
> 152 printf("software rss %d\n", softrss);
>
> (gdb) n
>
> software rss 403183624
>
> 154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const
> void *)key, softrss);
>
> (gdb) print table->hash
>
> $9 = (struct rte_hash *) 0x7fff32cce740
>
> (gdb) print *key
>
> $10 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764,
> dst_port = 11798, proto = 17 '\021'}
>
> (gdb) print softrss
>
> $11 = 403183624
>
> (gdb) n
>
>
> Program received signal SIGSEGV, Segmentation fault.
>
> 0x000000000045fb97 in __rte_hash_lookup_bulk ()
>
> (gdb) bt
>
> #0 0x000000000045fb97 in __rte_hash_lookup_bulk ()
>
> #1 0x0000000000000000 in ?? ()
>
>
> From the debug message, the parameters are exactly the same. I do not
> know why it has the segmentation fault.
>
> my lookup function:
>
> int
>
> onvm_ft_lookup(struct onvm_ft* table, struct onvm_ft_ipv4_5tuple
> *key, char** data) {
>
> int32_t tbl_index;
>
> uint32_t softrss;
>
>
> softrss = onvm_softrss(key);
>
> printf("software rss %d\n", softrss);
>
>
> tbl_index = rte_hash_lookup_with_hash(table->hash,
> (const void *)key, softrss);
>
> if (tbl_index >= 0) {
>
> *data = onvm_ft_get_data(table, tbl_index);
>
> return 0;
>
> }
>
> else {
>
> return tbl_index;
>
> }
>
> }
>
>
>
>
> At 2016-03-14 10:16:48, "Dhana Eadala" <edreddy@gmail.com <mailto:edreddy@gmail.com>> wrote:
> >We found a problem in dpdk-2.2 using under multi-process environment.
> >Here is the brief description how we are using the dpdk:
> >
> >We have two processes proc1, proc2 using dpdk. These proc1 and proc2 are
> >two different compiled binaries.
> >proc1 is started as primary process and proc2 as secondary process.
> >
> >proc1:
> >Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash structure.
> >As part of this, this api initalized the rte_hash structure and set the
> >srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1 address space.
> >
> >proc2:
> >calls srcHash = rte_hash_find_existing("src_hash_name").
> >This function call returns the rte_hash created by proc1.
> >This srcHash->rte_hash_cmp_eq still points to the address of
> >memcmp() from proc1 address space.
> >Later proc2 calls
> >rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
> >rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
> >which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
> >This leads to a crash as h->rte_hash_cmp_eq is an address
> >from proc1 address space and is invalid address in proc2 address space.
> >
> >We found, from dpdk documentation, that
> >
> >"
> > The use of function pointers between multiple processes
> > running based of different compiled
> > binaries is not supported, since the location of a given function
> > in one process may be different to
> > its location in a second. This prevents the librte_hash library
> > from behaving properly as in a multi-
> > threaded instance, since it uses a pointer to the hash function internally.
> >
> > To work around this issue, it is recommended that
> > multi-process applications perform the hash
> > calculations by directly calling the hashing function
> > from the code and then using the
> > rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
> > instead of the functions which do
> > the hashing internally, such as rte_hash_add()/rte_hash_lookup().
> >"
> >
> >We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
> >It was no issue up to and including dpdk-2.0.
> >In later releases started crashing because rte_hash_cmp_eq is
> >introduced in dpdk-2.1
> >
> >We fixed it with the following patch and would like to
> >submit the patch to dpdk.org.
> >Patch is created such that, if anyone wanted to use dpdk in
> >multi-process environment with function pointers not shared, they need to
> >define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
> >Without defining this flag in Makefile, it works as it is now.
> >
> >Signed-off-by: Dhana Eadala <edreddy@gmail.com <mailto:edreddy@gmail.com>>
> >---
> > lib/librte_hash/rte_cuckoo_hash.c | 28 ++++++++++++++++++++++++++++
> > 1 file changed, 28 insertions(+)
> >
> >diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
> >index 3e3167c..0946777 100644
> >--- a/lib/librte_hash/rte_cuckoo_hash.c
> >+++ b/lib/librte_hash/rte_cuckoo_hash.c
> >@@ -594,7 +594,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> > prim_bkt->signatures[i].alt == alt_hash) {
> > k = (struct rte_hash_key *) ((char *)keys +
> > prim_bkt->key_idx[i] * h->key_entry_size);
> >+#ifdef RTE_LIB_MP_NO_FUNC_PTR
> >+ if (memcmp(key, k->key, h->key_len) == 0) {
> >+#else
> > if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
> >+#endif
> > /* Enqueue index of free slot back in the ring. */
> > enqueue_slot_back(h, cached_free_slots, slot_id);
> > /* Update data */
> >@@ -614,7 +618,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> > sec_bkt->signatures[i].current == alt_hash) {
> > k = (struct rte_hash_key *) ((char *)keys +
> > sec_bkt->key_idx[i] * h->key_entry_size);
> >+#ifdef RTE_LIB_MP_NO_FUNC_PTR
> >+ if (memcmp(key, k->key, h->key_len) == 0) {
> >+#else
> > if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
> >+#endif
> > /* Enqueue index of free slot back in the ring. */
> > enqueue_slot_back(h, cached_free_slots, slot_id);
> > /* Update data */
> >@@ -725,7 +733,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> > bkt->signatures[i].sig != NULL_SIGNATURE) {
> > k = (struct rte_hash_key *) ((char *)keys +
> > bkt->key_idx[i] * h->key_entry_size);
> >+#ifdef RTE_LIB_MP_NO_FUNC_PTR
> >+ if (memcmp (key, k->key, h->key_len) == 0) {
> >+#else
> > if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
> >+#endif
> > if (data != NULL)
> > *data = k->pdata;
> > /*
> >@@ -748,7 +760,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> > bkt->signatures[i].alt == sig) {
> > k = (struct rte_hash_key *) ((char *)keys +
> > bkt->key_idx[i] * h->key_entry_size);
> >+#ifdef RTE_LIB_MP_NO_FUNC_PTR
> >+ if (memcmp(key, k->key, h->key_len) == 0) {
> >+#else
> > if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
> >+#endif
> > if (data != NULL)
> > *data = k->pdata;
> > /*
> >@@ -840,7 +856,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> > bkt->signatures[i].sig != NULL_SIGNATURE) {
> > k = (struct rte_hash_key *) ((char *)keys +
> > bkt->key_idx[i] * h->key_entry_size);
> >+#ifdef RTE_LIB_MP_NO_FUNC_PTR
> >+ if (memcmp(key, k->key, h->key_len) == 0) {
> >+#else
> > if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
> >+#endif
> > remove_entry(h, bkt, i);
> >
> > /*
> >@@ -863,7 +883,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> > bkt->signatures[i].sig != NULL_SIGNATURE) {
> > k = (struct rte_hash_key *) ((char *)keys +
> > bkt->key_idx[i] * h->key_entry_size);
> >+#ifdef RTE_LIB_MP_NO_FUNC_PTR
> >+ if (memcmp(key, k->key, h->key_len) == 0) {
> >+#else
> > if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
> >+#endif
> > remove_entry(h, bkt, i);
> >
> > /*
> >@@ -980,7 +1004,11 @@ lookup_stage3(unsigned idx, const struct rte_hash_key *key_slot, const void * co
> > unsigned hit;
> > unsigned key_idx;
> >
> >+#ifdef RTE_LIB_MP_NO_FUNC_PTR
> >+ hit = !memcmp(key_slot->key, keys[idx], h->key_len);
> >+#else
> > hit = !h->rte_hash_cmp_eq(key_slot->key, keys[idx], h->key_len);
> >+#endif
> > if (data != NULL)
> > data[idx] = key_slot->pdata;
> >
> >--
> >2.5.0
> >
>
>
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-15 0:57 ` Dhananjaya Eadala
@ 2016-03-15 1:02 ` 张伟
2016-03-24 14:00 ` De Lara Guarch, Pablo
0 siblings, 1 reply; 13+ messages in thread
From: 张伟 @ 2016-03-15 1:02 UTC (permalink / raw)
To: Dhananjaya Eadala
Cc: bruce.richardson, pablo.de.lara.guarch, michael.qiu, dev
Thanks so much for your patch! Your patch exactly solves my bug. :)
At 2016-03-15 08:57:29, "Dhananjaya Eadala" <edreddy@gmail.com> wrote:
Hi
I looked at your info from gdb and source code.
From source code, rte_hash_lookup_with_hash / __rte_hash_lookup_with_hash() function doesn't invoke rte_hash_reset() function.
It may be possible that rte_hash_cmp_eq function pointer from one process is matching to address of rte_hash_reset () in other process.
For quick test, you can do two things.
1. just apply the patch I provided and recompile the dpdk with a -D flag "RTE_LIB_MP_NO_FUNC_PTR" defined and re-run your test
2. If it still crashes, disable the optimization in Makefiles and see, using gdb, how you are entering rte_hash_reset() from __rte_hash_lookup_with_hash(). this will tell you if it was a problem with function pointers in multi-process environment
Thanks
Dhana
On 03/14/2016 12:38 AM, 张伟 wrote:
BTW, the following is my backtrace when the system crashes.
Program received signal SIGSEGV, Segmentation fault.
0x00000000004883ab in rte_hash_reset (h=0x0)
at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:444
444while (rte_ring_dequeue(h->free_slots, &ptr) == 0)
(gdb) bt
#0 0x00000000004883ab in rte_hash_reset (h=0x0)
at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:444
#1 0x000000000048fdfb in rte_hash_lookup_with_hash (h=0x7fff32cce740, key=0x7fffffffe220, sig=403183624)
at /home/zhangwei1984/timopenNetVM/dpdk-2.2.0/lib/librte_hash/rte_cuckoo_hash.c:771
#2 0x000000000042b551 in onvm_ft_lookup_with_hash (table=0x7fff32cbe4c0, pkt=0x7fff390ea9c0,
data=0x7fffffffe298) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:104
#3 0x000000000042b8c3 in onvm_flow_dir_get_with_hash (table=0x7fff32cbe4c0, pkt=0x7fff390ea9c0,
flow_entry=0x7fffffffe298)
at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_dir.c:14
#4 0x00000000004251d7 in packet_handler (pkt=0x7fff390ea9c0, meta=0x7fff390eaa00)
at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/flow_table.c:212
#5 0x0000000000429502 in onvm_nf_run ()
#6 0x00000000004253f1 in main (argc=1, argv=0x7fffffffe648)
at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/flow_table.c:272
(gdb)
I met a problem which I used the DPDK hash table for multi processes. One started as primary process and the other as secondary process.
I based on the client and server multiprocess example. My aim is that server creates a hash table, then share it to the client. The client will read and write the hash table, and the server will read the hash table. I use rte_calloc allocate the space for hash table, use memzone tells the client the hash table address.
But once I add an entry into the hash table, calling "lookup" function will have the segment fault. But for the lookup function, I have exactly the same parameters for lookup when the first time calls the lookup.
If I create the hash table inside the client, everything works correctly.
I put pieces of codes for server and client codes related to the hash table. I have spent almost 3 days on this bug. But there is no any clue which can help to solve this bug. If any of you can give some suggestions, I will be appreciated. I post the question into the mail list, but have not yet got any reply.
This problem is that in dpdk multi process - client and server example, dpdk-2.2.0/examples/multi_process/client_server_mp
My aim is that server create a hash table, then share it to client. Client will write the hash table, server will read the hash table. I am using dpdk hash table. What I did is that server create a hash table (table and array entries), return the table address. I use memzone pass the table address to client. In client, the second lookup gets segment fault. The system gets crashed. I will put some related code here.
create hash table function:
struct onvm_ft*
onvm_ft_create(int cnt, int entry_size) {
struct rte_hash* hash;
struct onvm_ft* ft;
struct rte_hash_parameters ipv4_hash_params = {
.name = NULL,
.entries = cnt,
.key_len = sizeof(struct onvm_ft_ipv4_5tuple),
.hash_func = NULL,
.hash_func_init_val = 0,
};
char s[64];
/* create ipv4 hash table. use core number and cycle counter to get a unique name. */
ipv4_hash_params.name = s;
ipv4_hash_params.socket_id = rte_socket_id();
snprintf(s, sizeof(s), "onvm_ft_%d-%"PRIu64, rte_lcore_id(), rte_get_tsc_cycles());
hash = rte_hash_create(&ipv4_hash_params);
if (hash == NULL) {
return NULL;
}
ft = (struct onvm_ft*)rte_calloc("table", 1, sizeof(struct onvm_ft), 0);
if (ft == NULL) {
rte_hash_free(hash);
return NULL;
}
ft->hash = hash;
ft->cnt = cnt;
ft->entry_size = entry_size;
/* Create data array for storing values */
ft->data = rte_calloc("entry", cnt, entry_size, 0);
if (ft->data == NULL) {
rte_hash_free(hash);
rte_free(ft);
return NULL;
}
return ft;
}
related structure:
struct onvm_ft {
struct rte_hash* hash;
char* data;
int cnt;
int entry_size;
};
in server side, I will call the create function, use memzone share it to client. The following is what I do:
related variables:
struct onvm_ft *sdn_ft;
struct onvm_ft **sdn_ft_p;
const struct rte_memzone *mz_ftp;
sdn_ft = onvm_ft_create(1024, sizeof(struct onvm_flow_entry));
if(sdn_ft == NULL) {
rte_exit(EXIT_FAILURE, "Unable to create flow table\n");
}
mz_ftp = rte_memzone_reserve(MZ_FTP_INFO, sizeof(struct onvm_ft *),
rte_socket_id(), NO_FLAGS);
if (mz_ftp == NULL) {
rte_exit(EXIT_FAILURE, "Canot reserve memory zone for flow table pointer\n");
}
memset(mz_ftp->addr, 0, sizeof(struct onvm_ft *));
sdn_ft_p = mz_ftp->addr;
*sdn_ft_p = sdn_ft;
In client side:
struct onvm_ft *sdn_ft;
static void
map_flow_table(void) {
const struct rte_memzone *mz_ftp;
struct onvm_ft **ftp;
mz_ftp = rte_memzone_lookup(MZ_FTP_INFO);
if (mz_ftp == NULL)
rte_exit(EXIT_FAILURE, "Cannot get flow table pointer\n");
ftp = mz_ftp->addr;
sdn_ft = *ftp;
}
The following is my debug message: I set a breakpoint in lookup table line. To narrow down the problem, I just send one flow. So the second time and the first time, the packets are the same.
For the first time, it works. I print out the parameters: inside the onvm_ft_lookup function, if there is a related entry, it will return the address by flow_entry.
Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
191 ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
(gdb) print *sdn_ft
$1 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
(gdb) print *fk
$2 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) s
onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99d00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
151 softrss = onvm_softrss(key);
(gdb) n
152 printf("software rss %d\n", softrss);
(gdb)
software rss 403183624
154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
(gdb) print table->hash
$3 = (struct rte_hash *) 0x7fff32cce740
(gdb) print *key
$4 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) print softrss
$5 = 403183624
(gdb) c
After I hit c, it will do the second lookup,
Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
191 ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
(gdb) print *sdn_ft
$7 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
(gdb) print *fk
$8 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) s
onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99c00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
151 softrss = onvm_softrss(key);
(gdb) n
152 printf("software rss %d\n", softrss);
(gdb) n
software rss 403183624
154 tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
(gdb) print table->hash
$9 = (struct rte_hash *) 0x7fff32cce740
(gdb) print *key
$10 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
(gdb) print softrss
$11 = 403183624
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
0x000000000045fb97 in __rte_hash_lookup_bulk ()
(gdb) bt
#0 0x000000000045fb97 in __rte_hash_lookup_bulk ()
#1 0x0000000000000000 in ?? ()
From the debug message, the parameters are exactly the same. I do not know why it has the segmentation fault.
my lookup function:
int
onvm_ft_lookup(struct onvm_ft* table, struct onvm_ft_ipv4_5tuple *key, char** data) {
int32_t tbl_index;
uint32_t softrss;
softrss = onvm_softrss(key);
printf("software rss %d\n", softrss);
tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
if (tbl_index >= 0) {
*data = onvm_ft_get_data(table, tbl_index);
return 0;
}
else {
return tbl_index;
}
}
At 2016-03-14 10:16:48, "Dhana Eadala" <edreddy@gmail.com> wrote:
>We found a problem in dpdk-2.2 using under multi-process environment.
>Here is the brief description how we are using the dpdk:
>
>We have two processes proc1, proc2 using dpdk. These proc1 and proc2 are
>two different compiled binaries.
>proc1 is started as primary process and proc2 as secondary process.
>
>proc1:
>Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash structure.
>As part of this, this api initalized the rte_hash structure and set the
>srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1 address space.
>
>proc2:
>calls srcHash = rte_hash_find_existing("src_hash_name").
>This function call returns the rte_hash created by proc1.
>This srcHash->rte_hash_cmp_eq still points to the address of
>memcmp() from proc1 address space.
>Later proc2 calls
>rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
>rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
>which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
>This leads to a crash as h->rte_hash_cmp_eq is an address
>from proc1 address space and is invalid address in proc2 address space.
>
>We found, from dpdk documentation, that
>
>"
> The use of function pointers between multiple processes
> running based of different compiled
> binaries is not supported, since the location of a given function
> in one process may be different to
> its location in a second. This prevents the librte_hash library
> from behaving properly as in a multi-
> threaded instance, since it uses a pointer to the hash function internally.
>
> To work around this issue, it is recommended that
> multi-process applications perform the hash
> calculations by directly calling the hashing function
> from the code and then using the
> rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
> instead of the functions which do
> the hashing internally, such as rte_hash_add()/rte_hash_lookup().
>"
>
>We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
>It was no issue up to and including dpdk-2.0.
>In later releases started crashing because rte_hash_cmp_eq is
>introduced in dpdk-2.1
>
>We fixed it with the following patch and would like to
>submit the patch to dpdk.org.
>Patch is created such that, if anyone wanted to use dpdk in
>multi-process environment with function pointers not shared, they need to
>define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
>Without defining this flag in Makefile, it works as it is now.
>
>Signed-off-by: Dhana Eadala <edreddy@gmail.com>
>---
> lib/librte_hash/rte_cuckoo_hash.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
>diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
>index 3e3167c..0946777 100644
>--- a/lib/librte_hash/rte_cuckoo_hash.c
>+++ b/lib/librte_hash/rte_cuckoo_hash.c
>@@ -594,7 +594,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> prim_bkt->signatures[i].alt == alt_hash) {
> k = (struct rte_hash_key *) ((char *)keys +
> prim_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> /* Enqueue index of free slot back in the ring. */
> enqueue_slot_back(h, cached_free_slots, slot_id);
> /* Update data */
>@@ -614,7 +618,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> sec_bkt->signatures[i].current == alt_hash) {
> k = (struct rte_hash_key *) ((char *)keys +
> sec_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> /* Enqueue index of free slot back in the ring. */
> enqueue_slot_back(h, cached_free_slots, slot_id);
> /* Update data */
>@@ -725,7 +733,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp (key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> if (data != NULL)
> *data = k->pdata;
> /*
>@@ -748,7 +760,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].alt == sig) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> if (data != NULL)
> *data = k->pdata;
> /*
>@@ -840,7 +856,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> remove_entry(h, bkt, i);
>
> /*
>@@ -863,7 +883,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> bkt->signatures[i].sig != NULL_SIGNATURE) {
> k = (struct rte_hash_key *) ((char *)keys +
> bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> remove_entry(h, bkt, i);
>
> /*
>@@ -980,7 +1004,11 @@ lookup_stage3(unsigned idx, const struct rte_hash_key *key_slot, const void * co
> unsigned hit;
> unsigned key_idx;
>
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+ hit = !memcmp(key_slot->key, keys[idx], h->key_len);
>+#else
> hit = !h->rte_hash_cmp_eq(key_slot->key, keys[idx], h->key_len);
>+#endif
> if (data != NULL)
> data[idx] = key_slot->pdata;
>
>--
>2.5.0
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-15 1:02 ` 张伟
@ 2016-03-24 14:00 ` De Lara Guarch, Pablo
2016-04-01 13:14 ` Thomas Monjalon
0 siblings, 1 reply; 13+ messages in thread
From: De Lara Guarch, Pablo @ 2016-03-24 14:00 UTC (permalink / raw)
To: zhangwqh, Dhananjaya Eadala; +Cc: Richardson, Bruce, Qiu, Michael, dev
Hi,
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of ??
> Sent: Tuesday, March 15, 2016 1:03 AM
> To: Dhananjaya Eadala
> Cc: Richardson, Bruce; De Lara Guarch, Pablo; Qiu, Michael; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-
> process environment
>
> Thanks so much for your patch! Your patch exactly solves my bug. :)
I just sent an alternative patch to solve the issue, without requiring #ifdefs.
Could you take a look at it and check that works for you?
Thanks,
Pablo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-24 14:00 ` De Lara Guarch, Pablo
@ 2016-04-01 13:14 ` Thomas Monjalon
0 siblings, 0 replies; 13+ messages in thread
From: Thomas Monjalon @ 2016-04-01 13:14 UTC (permalink / raw)
To: Dhananjaya Eadala
Cc: dev, De Lara Guarch, Pablo, zhangwqh, Richardson, Bruce, Qiu, Michael
2016-03-24 14:00, De Lara Guarch, Pablo:
> Hi,
>
> I just sent an alternative patch to solve the issue, without requiring #ifdefs.
> Could you take a look at it and check that works for you?
Please Dhananjaya Eadala,
Could you confirm the bug is fixed with this patch:
http://dpdk.org/patch/11822
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-14 2:16 Dhana Eadala
2016-03-14 2:30 ` 张伟
2016-03-14 4:38 ` 张伟
@ 2016-03-22 11:42 ` Thomas Monjalon
2016-03-22 19:53 ` De Lara Guarch, Pablo
2 siblings, 1 reply; 13+ messages in thread
From: Thomas Monjalon @ 2016-03-22 11:42 UTC (permalink / raw)
To: pablo.de.lara.guarch, sergio.gonzalez.monroy
Cc: dev, Dhana Eadala, bruce.richardson, michael.qiu
Hi,
Pablo, Sergio, please could you help with this issue?
2016-03-13 22:16, Dhana Eadala:
> We found a problem in dpdk-2.2 using under multi-process environment.
> Here is the brief description how we are using the dpdk:
>
> We have two processes proc1, proc2 using dpdk. These proc1 and proc2 are
> two different compiled binaries.
> proc1 is started as primary process and proc2 as secondary process.
>
> proc1:
> Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash structure.
> As part of this, this api initalized the rte_hash structure and set the
> srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1 address space.
>
> proc2:
> calls srcHash = rte_hash_find_existing("src_hash_name").
> This function call returns the rte_hash created by proc1.
> This srcHash->rte_hash_cmp_eq still points to the address of
> memcmp() from proc1 address space.
> Later proc2 calls
> rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
> rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
> which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
> This leads to a crash as h->rte_hash_cmp_eq is an address
> from proc1 address space and is invalid address in proc2 address space.
>
> We found, from dpdk documentation, that
>
> "
> The use of function pointers between multiple processes
> running based of different compiled
> binaries is not supported, since the location of a given function
> in one process may be different to
> its location in a second. This prevents the librte_hash library
> from behaving properly as in a multi-
> threaded instance, since it uses a pointer to the hash function internally.
>
> To work around this issue, it is recommended that
> multi-process applications perform the hash
> calculations by directly calling the hashing function
> from the code and then using the
> rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
> instead of the functions which do
> the hashing internally, such as rte_hash_add()/rte_hash_lookup().
> "
>
> We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
> It was no issue up to and including dpdk-2.0.
> In later releases started crashing because rte_hash_cmp_eq is
> introduced in dpdk-2.1
>
> We fixed it with the following patch and would like to
> submit the patch to dpdk.org.
> Patch is created such that, if anyone wanted to use dpdk in
> multi-process environment with function pointers not shared, they need to
> define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
> Without defining this flag in Makefile, it works as it is now.
Introducing #ifdef RTE_LIB_MP_NO_FUNC_PTR is not recommended.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-process environment
2016-03-22 11:42 ` Thomas Monjalon
@ 2016-03-22 19:53 ` De Lara Guarch, Pablo
0 siblings, 0 replies; 13+ messages in thread
From: De Lara Guarch, Pablo @ 2016-03-22 19:53 UTC (permalink / raw)
To: Thomas Monjalon, Gonzalez Monroy, Sergio
Cc: dev, Dhana Eadala, Richardson, Bruce, Qiu, Michael
Hi Thomas,
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Tuesday, March 22, 2016 11:42 AM
> To: De Lara Guarch, Pablo; Gonzalez Monroy, Sergio
> Cc: dev@dpdk.org; Dhana Eadala; Richardson, Bruce; Qiu, Michael
> Subject: Re: [dpdk-dev] [PATCH] hash: fix memcmp function pointer in multi-
> process environment
>
> Hi,
>
> Pablo, Sergio, please could you help with this issue?
I agree this is not the best way to fix this. I will try to have a fix without having to use ifdefs.
Thanks,
Pablo
>
> 2016-03-13 22:16, Dhana Eadala:
> > We found a problem in dpdk-2.2 using under multi-process environment.
> > Here is the brief description how we are using the dpdk:
> >
> > We have two processes proc1, proc2 using dpdk. These proc1 and proc2
> are
> > two different compiled binaries.
> > proc1 is started as primary process and proc2 as secondary process.
> >
> > proc1:
> > Calls srcHash = rte_hash_create("src_hash_name") to create rte_hash
> structure.
> > As part of this, this api initalized the rte_hash structure and set the
> > srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1
> address space.
> >
> > proc2:
> > calls srcHash = rte_hash_find_existing("src_hash_name").
> > This function call returns the rte_hash created by proc1.
> > This srcHash->rte_hash_cmp_eq still points to the address of
> > memcmp() from proc1 address space.
> > Later proc2 calls
> > rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
> > rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
> > which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
> > This leads to a crash as h->rte_hash_cmp_eq is an address
> > from proc1 address space and is invalid address in proc2 address space.
> >
> > We found, from dpdk documentation, that
> >
> > "
> > The use of function pointers between multiple processes
> > running based of different compiled
> > binaries is not supported, since the location of a given function
> > in one process may be different to
> > its location in a second. This prevents the librte_hash library
> > from behaving properly as in a multi-
> > threaded instance, since it uses a pointer to the hash function internally.
> >
> > To work around this issue, it is recommended that
> > multi-process applications perform the hash
> > calculations by directly calling the hashing function
> > from the code and then using the
> > rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
> > instead of the functions which do
> > the hashing internally, such as rte_hash_add()/rte_hash_lookup().
> > "
> >
> > We did follow the recommended steps by invoking
> rte_hash_lookup_with_hash().
> > It was no issue up to and including dpdk-2.0.
> > In later releases started crashing because rte_hash_cmp_eq is
> > introduced in dpdk-2.1
> >
> > We fixed it with the following patch and would like to
> > submit the patch to dpdk.org.
> > Patch is created such that, if anyone wanted to use dpdk in
> > multi-process environment with function pointers not shared, they need to
> > define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
> > Without defining this flag in Makefile, it works as it is now.
>
> Introducing #ifdef RTE_LIB_MP_NO_FUNC_PTR is not recommended.
^ permalink raw reply [flat|nested] 13+ messages in thread