I ran into a problem using a DPDK hash table across multiple processes. One process starts as the primary and the other as the secondary, based on the client/server multi-process example (dpdk-2.2.0/examples/multi_process/client_server_mp). My goal is for the server to create a hash table and share it with the client: the client reads and writes the table, and the server only reads it. I allocate the table with rte_calloc and use a memzone to pass the table's address to the client.

The problem: once I add an entry to the hash table in the client, the next call to the lookup function segfaults and the process crashes, even though I pass exactly the same parameters as in the first (successful) lookup. If I create the hash table inside the client instead, everything works correctly. I have spent almost three days on this bug without finding any clue, and a post to the mailing list has not received a reply yet, so any suggestions would be appreciated. The relevant server and client code is below.

Create hash table function:

    struct onvm_ft*
    onvm_ft_create(int cnt, int entry_size) {
            struct rte_hash* hash;
            struct onvm_ft* ft;
            struct rte_hash_parameters ipv4_hash_params = {
                    .name = NULL,
                    .entries = cnt,
                    .key_len = sizeof(struct onvm_ft_ipv4_5tuple),
                    .hash_func = NULL,
                    .hash_func_init_val = 0,
            };
            char s[64];

            /* create ipv4 hash table. use core number and cycle counter to get a unique name. */
            ipv4_hash_params.name = s;
            ipv4_hash_params.socket_id = rte_socket_id();
            snprintf(s, sizeof(s), "onvm_ft_%d-%"PRIu64, rte_lcore_id(), rte_get_tsc_cycles());
            hash = rte_hash_create(&ipv4_hash_params);
            if (hash == NULL) {
                    return NULL;
            }
            ft = (struct onvm_ft*)rte_calloc("table", 1, sizeof(struct onvm_ft), 0);
            if (ft == NULL) {
                    rte_hash_free(hash);
                    return NULL;
            }
            ft->hash = hash;
            ft->cnt = cnt;
            ft->entry_size = entry_size;
            /* Create data array for storing values */
            ft->data = rte_calloc("entry", cnt, entry_size, 0);
            if (ft->data == NULL) {
                    rte_hash_free(hash);
                    rte_free(ft);
                    return NULL;
            }
            return ft;
    }

Related structure:

    struct onvm_ft {
            struct rte_hash* hash;
            char* data;
            int cnt;
            int entry_size;
    };

On the server side, I call the create function and then use a memzone to share the resulting table pointer with the client.
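The 5-tuple key type used for key_len is not shown above; based on the fields printed by gdb further down in this post, it presumably looks something like the following sketch (the exact field types in the real code may differ):

    struct onvm_ft_ipv4_5tuple {
            uint32_t src_addr;   /* IPv4 source address */
            uint32_t dst_addr;   /* IPv4 destination address */
            uint16_t src_port;
            uint16_t dst_port;
            uint8_t  proto;      /* IP protocol, e.g. 17 for UDP */
    };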
Here is how I share the table. Related variables:

    struct onvm_ft *sdn_ft;
    struct onvm_ft **sdn_ft_p;
    const struct rte_memzone *mz_ftp;

Server side:

    sdn_ft = onvm_ft_create(1024, sizeof(struct onvm_flow_entry));
    if (sdn_ft == NULL) {
            rte_exit(EXIT_FAILURE, "Unable to create flow table\n");
    }
    mz_ftp = rte_memzone_reserve(MZ_FTP_INFO, sizeof(struct onvm_ft *),
                                 rte_socket_id(), NO_FLAGS);
    if (mz_ftp == NULL) {
            rte_exit(EXIT_FAILURE, "Cannot reserve memory zone for flow table pointer\n");
    }
    memset(mz_ftp->addr, 0, sizeof(struct onvm_ft *));
    sdn_ft_p = mz_ftp->addr;
    *sdn_ft_p = sdn_ft;

Client side:

    struct onvm_ft *sdn_ft;

    static void
    map_flow_table(void) {
            const struct rte_memzone *mz_ftp;
            struct onvm_ft **ftp;

            mz_ftp = rte_memzone_lookup(MZ_FTP_INFO);
            if (mz_ftp == NULL)
                    rte_exit(EXIT_FAILURE, "Cannot get flow table pointer\n");
            ftp = mz_ftp->addr;
            sdn_ft = *ftp;
    }

The following is my debug session. I set a breakpoint at the lookup line. To narrow down the problem, I send only one flow, so the packet is identical the first and second time. The first lookup works; I print out the parameters. Inside the onvm_ft_lookup function, if there is a matching entry, its address is returned through flow_entry.

    Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
    191             ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
    (gdb) print *sdn_ft
    $1 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
    (gdb) print *fk
    $2 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
    (gdb) s
    onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99d00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
    151             softrss = onvm_softrss(key);
    (gdb) n
    152             printf("software rss %d\n", softrss);
    (gdb)
    software rss 403183624
    154             tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
    (gdb) print table->hash
    $3 = (struct rte_hash *) 0x7fff32cce740
    (gdb) print *key
    $4 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
    (gdb) print softrss
    $5 = 403183624
    (gdb) c

After I hit c, it does the second lookup:

    Breakpoint 1, datapath_handle_read (dp=0x7ffff00008c0) at /home/zhangwei1984/openNetVM-master/openNetVM/examples/flow_table/sdn.c:191
    191             ret = onvm_ft_lookup(sdn_ft, fk, (char**)&flow_entry);
    (gdb) print *sdn_ft
    $7 = {hash = 0x7fff32cce740, data = 0x7fff32cb0480 "", cnt = 1024, entry_size = 56}
    (gdb) print *fk
    $8 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
    (gdb) s
    onvm_ft_lookup (table=0x7fff32cbe4c0, key=0x7fff32b99c00, data=0x7ffff68d2b00) at /home/zhangwei1984/openNetVM-master/openNetVM/onvm/shared/onvm_flow_table.c:151
    151             softrss = onvm_softrss(key);
    (gdb) n
    152             printf("software rss %d\n", softrss);
    (gdb) n
    software rss 403183624
    154             tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
    (gdb) print table->hash
    $9 = (struct rte_hash *) 0x7fff32cce740
    (gdb) print *key
    $10 = {src_addr = 419496202, dst_addr = 453050634, src_port = 53764, dst_port = 11798, proto = 17 '\021'}
    (gdb) print softrss
    $11 = 403183624
    (gdb) n

    Program received signal SIGSEGV, Segmentation fault.
    0x000000000045fb97 in __rte_hash_lookup_bulk ()
    (gdb) bt
    #0  0x000000000045fb97 in __rte_hash_lookup_bulk ()
    #1  0x0000000000000000 in ?? ()

From the debug output, the parameters are exactly the same both times.
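The add path that runs between the two lookups is not shown in this post; it presumably mirrors the lookup by computing the software RSS hash in the application and calling rte_hash_add_key_with_hash, roughly like this sketch (onvm_ft_add_key is assumed here, and onvm_softrss/onvm_ft_get_data are the same helpers used by the lookup above):

    int
    onvm_ft_add_key(struct onvm_ft *table, struct onvm_ft_ipv4_5tuple *key, char **data) {
            int32_t tbl_index;
            uint32_t softrss;

            /* Compute the hash in the application, as recommended for multi-process use. */
            softrss = onvm_softrss(key);
            tbl_index = rte_hash_add_key_with_hash(table->hash, (const void *)key, softrss);
            if (tbl_index < 0)
                    return tbl_index;
            /* Hand back the slot in the data array that corresponds to this key index. */
            *data = onvm_ft_get_data(table, tbl_index);
            return 0;
    }

Note that only after this add succeeds can a later lookup find a matching signature in a bucket and go down the key-comparison path inside librte_hash, which matches the observation that the first lookup on the empty table returns cleanly while the lookup after the add crashes.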
I do not know why it gets the segmentation fault. My lookup function:

    int
    onvm_ft_lookup(struct onvm_ft* table, struct onvm_ft_ipv4_5tuple *key, char** data) {
            int32_t tbl_index;
            uint32_t softrss;

            softrss = onvm_softrss(key);
            printf("software rss %d\n", softrss);

            tbl_index = rte_hash_lookup_with_hash(table->hash, (const void *)key, softrss);
            if (tbl_index >= 0) {
                    *data = onvm_ft_get_data(table, tbl_index);
                    return 0;
            } else {
                    return tbl_index;
            }
    }

At 2016-03-14 10:16:48, "Dhana Eadala" wrote:

>We found a problem in dpdk-2.2 used in a multi-process environment.
>Here is a brief description of how we are using the dpdk:
>
>We have two processes proc1, proc2 using dpdk. proc1 and proc2 are
>two different compiled binaries.
>proc1 is started as the primary process and proc2 as the secondary process.
>
>proc1:
>Calls srcHash = rte_hash_create("src_hash_name") to create the rte_hash structure.
>As part of this, the API initialized the rte_hash structure and set
>srcHash->rte_hash_cmp_eq to the address of memcmp() from proc1's address space.
>
>proc2:
>Calls srcHash = rte_hash_find_existing("src_hash_name").
>This function call returns the rte_hash created by proc1.
>This srcHash->rte_hash_cmp_eq still points to the address of
>memcmp() from proc1's address space.
>Later proc2 calls
>rte_hash_lookup_with_hash(srcHash, (const void*) &key, key.sig);
>rte_hash_lookup_with_hash() invokes __rte_hash_lookup_with_hash(),
>which in turn calls h->rte_hash_cmp_eq(key, k->key, h->key_len).
>This leads to a crash, as h->rte_hash_cmp_eq is an address
>from proc1's address space and is an invalid address in proc2's address space.
>
>We found, from the dpdk documentation, that
>
>"
>	The use of function pointers between multiple processes
>	running based off of different compiled
>	binaries is not supported, since the location of a given function
>	in one process may be different to
>	its location in a second. This prevents the librte_hash library
>	from behaving properly as in a multi-
>	threaded instance, since it uses a pointer to the hash function internally.
>
>	To work around this issue, it is recommended that
>	multi-process applications perform the hash
>	calculations by directly calling the hashing function
>	from the code and then using the
>	rte_hash_add_with_hash()/rte_hash_lookup_with_hash() functions
>	instead of the functions which do
>	the hashing internally, such as rte_hash_add()/rte_hash_lookup().
>"
>
>We did follow the recommended steps by invoking rte_hash_lookup_with_hash().
>There was no issue up to and including dpdk-2.0.
>Later releases started crashing because rte_hash_cmp_eq was
>introduced in dpdk-2.1.
>
>We fixed it with the following patch and would like to
>submit the patch to dpdk.org.
>The patch is written such that anyone who wants to use dpdk in a
>multi-process environment where function pointers are not shared needs to
>define RTE_LIB_MP_NO_FUNC_PTR in their Makefile.
>Without defining this flag in the Makefile, the library works as it does now.
>
>Signed-off-by: Dhana Eadala
>---
> lib/librte_hash/rte_cuckoo_hash.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
>diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
>index 3e3167c..0946777 100644
>--- a/lib/librte_hash/rte_cuckoo_hash.c
>+++ b/lib/librte_hash/rte_cuckoo_hash.c
>@@ -594,7 +594,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> 			prim_bkt->signatures[i].alt == alt_hash) {
> 			k = (struct rte_hash_key *) ((char *)keys +
> 				prim_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+			if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> 			if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> 				/* Enqueue index of free slot back in the ring. */
> 				enqueue_slot_back(h, cached_free_slots, slot_id);
> 				/* Update data */
>@@ -614,7 +618,11 @@ __rte_hash_add_key_with_hash(const struct rte_hash *h, const void *key,
> 			sec_bkt->signatures[i].current == alt_hash) {
> 			k = (struct rte_hash_key *) ((char *)keys +
> 				sec_bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+			if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> 			if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> 				/* Enqueue index of free slot back in the ring. */
> 				enqueue_slot_back(h, cached_free_slots, slot_id);
> 				/* Update data */
>@@ -725,7 +733,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> 			bkt->signatures[i].sig != NULL_SIGNATURE) {
> 			k = (struct rte_hash_key *) ((char *)keys +
> 				bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+			if (memcmp (key, k->key, h->key_len) == 0) {
>+#else
> 			if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> 				if (data != NULL)
> 					*data = k->pdata;
> 				/*
>@@ -748,7 +760,11 @@ __rte_hash_lookup_with_hash(const struct rte_hash *h, const void *key,
> 			bkt->signatures[i].alt == sig) {
> 			k = (struct rte_hash_key *) ((char *)keys +
> 				bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+			if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> 			if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> 				if (data != NULL)
> 					*data = k->pdata;
> 				/*
>@@ -840,7 +856,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> 			bkt->signatures[i].sig != NULL_SIGNATURE) {
> 			k = (struct rte_hash_key *) ((char *)keys +
> 				bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+			if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> 			if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> 				remove_entry(h, bkt, i);
>
> 				/*
>@@ -863,7 +883,11 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
> 			bkt->signatures[i].sig != NULL_SIGNATURE) {
> 			k = (struct rte_hash_key *) ((char *)keys +
> 				bkt->key_idx[i] * h->key_entry_size);
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+			if (memcmp(key, k->key, h->key_len) == 0) {
>+#else
> 			if (h->rte_hash_cmp_eq(key, k->key, h->key_len) == 0) {
>+#endif
> 				remove_entry(h, bkt, i);
>
> 				/*
>@@ -980,7 +1004,11 @@ lookup_stage3(unsigned idx, const struct rte_hash_key *key_slot, const void * co
> 	unsigned hit;
> 	unsigned key_idx;
>
>+#ifdef RTE_LIB_MP_NO_FUNC_PTR
>+	hit = !memcmp(key_slot->key, keys[idx], h->key_len);
>+#else
> 	hit = !h->rte_hash_cmp_eq(key_slot->key, keys[idx], h->key_len);
>+#endif
> 	if (data != NULL)
> 		data[idx] = key_slot->pdata;
>
>--
>2.5.0
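If rebuilding DPDK with a custom flag is not an option, another application-level workaround is to have the secondary process install a key-compare callback that lives in its own address space. Newer DPDK releases than the 2.2 used here expose rte_hash_set_cmp_func() for this; the sketch below assumes such a version is available, and the mp_hash_key_cmp wrapper name is purely illustrative:

    /* Sketch: make a cuckoo hash shared by the primary process usable from a
     * secondary process by overwriting the key-compare function pointer that
     * was inherited from the primary's address space. Assumes a DPDK release
     * that provides rte_hash_set_cmp_func(); dpdk-2.2 itself does not.
     */
    #include <string.h>
    #include <rte_hash.h>

    /* Plain memcmp wrapper matching the rte_hash_cmp_eq_t signature. */
    static int
    mp_hash_key_cmp(const void *key1, const void *key2, size_t key_len)
    {
            return memcmp(key1, key2, key_len);
    }

    static void
    fixup_shared_hash(struct rte_hash *h)
    {
            /* Replace the callback with one local to this process. */
            rte_hash_set_cmp_func(h, mp_hash_key_cmp);
    }

In the setup described in this thread, the client could call something like fixup_shared_hash(sdn_ft->hash) once, right after map_flow_table() and before any add or lookup. Combined with computing the hash in the application (onvm_softrss here) and using the *_with_hash() entry points, that keeps every function pointer the hash library dereferences local to the process that uses it.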