From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from relay-out2.mail.masterhost.ru (relay-out2.mail.masterhost.ru [83.222.12.12]) by dpdk.org (Postfix) with ESMTP id EA6152C60 for ; Wed, 22 Aug 2018 11:48:03 +0200 (CEST) Received: from [37.139.80.50] (helo=nw) by relay2.mail.masterhost.ru with esmtpa envelope from authenticated with alex@therouter.net message id 1fsPk5-0000Bm-Co; Wed, 22 Aug 2018 12:47:58 +0300 Date: Wed, 22 Aug 2018 12:47:48 +0300 From: Alex Kiselev Message-ID: <1327662293.20180822124748@therouter.net> To: "Ananyev, Konstantin" , "dev@dpdk.org" , "Burakov, Anatoly" In-Reply-To: <1208098598.20180718103320@therouter.net> References: <2601191342CEEE43887BDE71AB977258C0C44FE9@irsmsx105.ger.corp.intel.com> <38797181.20180629204653@therouter.net> <2601191342CEEE43887BDE71AB977258DA90E731@irsmsx105.ger.corp.intel.com> <1208098598.20180718103320@therouter.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-KLMS-Rule-ID: 1 X-KLMS-Message-Action: clean X-KLMS-AntiSpam-Lua-Profiles: 128114 [Aug 22 2018] X-KLMS-AntiSpam-Version: 5.8.3.0 X-KLMS-AntiSpam-Envelope-From: alex@therouter.net X-KLMS-AntiSpam-Rate: 0 X-KLMS-AntiSpam-Status: not_detected X-KLMS-AntiSpam-Method: none X-KLMS-AntiSpam-Info: LuaCore: 172 172 e4ecfa39f9528cd78396086af59e393990c9e611, {rep_avail}, DmarcAF: none X-MS-Exchange-Organization-SCL: -1 X-KLMS-AntiSpam-Interceptor-Info: scan successful X-KLMS-AntiPhishing: not scanned, disabled by settings X-KLMS-AntiVirus: Kaspersky Security for Linux Mail Server, version 8.0.2.16, not scanned, license restriction Subject: Re: [dpdk-dev] [PATCH v2 2/2] librte_ip_frag: add mbuf counter X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Aug 2018 09:48:04 -0000 Hi Konstantin. Could we please make a final decision about counting mbufs, since It still feels to me like an unfinished business? Below are my final argumens. if they are not sound to you, just nack ;) > Hi Konstantin. >> Hi Alex, >> Sorry for delay in reply. >>> >> There might be situations (kind of attack when a lot of >>> >> fragmented packets are sent to a dpdk application in order >>> >> to flood the fragmentation table) when no additional mbufs >>> >> must be added to the fragmentations table since it already >>> >> contains to many of them. Currently there is no way to >>> >> determine the number of mbufs holded int the fragmentation >>> >> table. This patch allows to keep track of the number of mbufs >>> >> holded in the fragmentation table. >>> > I understand your intention, but still not sure it is worth it. >>> > My thought was that you can estimate by upper limit (num_entries * entries_per_bucket) or so. >>> No, I can't. The estimation error might be so big that there would be no difference at all. >> Not sure why? If you'll use upper limit, then worst thing could happen - >> you would start your table cleanup a bit earlier. > Since bucket size is 4, an estimation error might be 400%. > So, for example, if I want to setup the upper limit (max number mbufs that > can be stored in frag table) to 20% of all my available mbufs > I have to be ready that 80% of all mbufs might end up in a frag table > (every bucket is full). Or if I take into account bucket size, and devide 20% > by 4 in order the number mbufs to be exactly 20% in the worse case when every bucket is full, > I could end up in the opposite border situation when exactly single mbuf is stored > in every bucket, so upper limit of mbufs would be 20 / 4 = 5%. Both ways are not > good since either you have to reserve extra mbufs just to correct estimation error > or you upper limit would to small and you will be dropping good fragments. >>> > Probably another way to account number of mbufs without changes in the lib - >>> > apply something like that(assuming that your fragmets are not multisegs): >>> > uint32_t mbuf_in_frag_table = 0; >>> > .... >>> n= dr->>cnt; >>> > mb = rte_ipv4_frag_reassemble_packet(...); >>> > if (mb != NULL) >>> > mbuf_in_frag_table += mb->nb_segs; >>> > mbuf_in_frag_table += dr->cnt - n + 1; >> Sorry, my bad, I think it should be >> mbuf_in_frag_table -= dr->cnt - n + 1; >>> > In theory that could be applied even if fragments might be multisegs, but for that, >>> > we'll need to change rte_ip_frag_free_death_row() to return total number of freed segments. >>> That should be a little bit more complicated wrapper code: >>> uint32_t mbuf_in_frag_table = 0; >>> .... >>> n= dr->cnt; >>> reassembled_mbuf = rte_ipv4_frag_reassemble_packet(..., fragmented_mbuf, ...); >>> if (reassembled_mbuf == NULL) >>> mbuf_in_frag_table += fragmented_mbuf->nb_segs; >> We don't know for sure here. >> fragmented_mbuf could be in death row by now. > Yes. That's exactly why you have to keep track of > mbufs here and later after rte_ip_frag_free_death_row(). > User have to think about frag table and death row as a single entity, > kind of a black box, since it's impossible to say where > (in the frag table or in the death row) your mbuf will be > after you call rte_ipv4_frag_reassemble_packet(). So, a caller/user should > keep track of mbuf on every border/interface of that black box. > One interface is rte_ipv4_frag_reassemble_packet and the other is > rte_ip_frag_free_death_row. > So, that's why it's easier to keep track of mbufs inside the library. > >>> else >>> mbuf_in_frag_table -= reassembled_mbuf->nb_segs; >>> mbuf_in_frag_table += dr->cnt - n; >>> Also, in that case every rte_ip_frag_free_death_row() needs a wrapper code too. >>> n= dr->cnt; >>> rte_ip_frag_free_death_row(..) >>> mbuf_in_frag_table += dr->cnt - n; >> I don't think it is necessary. >> After packet is put in the death-row it is no longer in the table. > It's critical, since from a user point of view death row and frag table > is a black box due rte_ipv4_frag_reassemble_packet() doesn't indicate a caller > where his packet has been stored (in the frag table or death row). >> Konstantin >>> I think my approach is simplier. >>> > Konstantin >>> >> Signed-off-by: Alex Kiselev >>> >> --- >>> >> lib/librte_ip_frag/ip_frag_common.h | 16 +++++++++------- >>> >> lib/librte_ip_frag/ip_frag_internal.c | 16 +++++++++------- >>> >> lib/librte_ip_frag/rte_ip_frag.h | 18 +++++++++++++++++- >>> >> lib/librte_ip_frag/rte_ip_frag_common.c | 1 + >>> >> lib/librte_ip_frag/rte_ip_frag_version.map | 1 + >>> >> lib/librte_ip_frag/rte_ipv4_reassembly.c | 2 +- >>> >> lib/librte_ip_frag/rte_ipv6_reassembly.c | 2 +- >>> >> 7 files changed, 39 insertions(+), 17 deletions(-) >>> >> diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h >>> >> index 0fdcc7d0f..9fe5c0559 100644 >>> >> --- a/lib/librte_ip_frag/ip_frag_common.h >>> >> +++ b/lib/librte_ip_frag/ip_frag_common.h >>> >> @@ -32,15 +32,15 @@ >>> >> #endif /* IP_FRAG_TBL_STAT */ >>> >> /* internal functions declarations */ >>> >> -struct rte_mbuf * ip_frag_process(struct ip_frag_pkt *fp, >>> >> - struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb, >>> >> - uint16_t ofs, uint16_t len, uint16_t more_frags); >>> >> +struct rte_mbuf *ip_frag_process(struct rte_ip_frag_tbl *tbl, >>> >> + struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr, >>> >> + struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags); >>> >> -struct ip_frag_pkt * ip_frag_find(struct rte_ip_frag_tbl *tbl, >>> >> +struct ip_frag_pkt *ip_frag_find(struct rte_ip_frag_tbl *tbl, >>> >> struct rte_ip_frag_death_row *dr, >>> >> const struct ip_frag_key *key, uint64_t tms); >>> >> -struct ip_frag_pkt * ip_frag_lookup(struct rte_ip_frag_tbl *tbl, >>> >> +struct ip_frag_pkt *ip_frag_lookup(struct rte_ip_frag_tbl *tbl, >>> >> const struct ip_frag_key *key, uint64_t tms, >>> >> struct ip_frag_pkt **free, struct ip_frag_pkt **stale); >>> >> @@ -91,7 +91,8 @@ ip_frag_key_cmp(const struct ip_frag_key * k1, const struct ip_frag_key * k2) >>> >> /* put fragment on death row */ >>> >> static inline void >>> >> -ip_frag_free(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr) >>> >> +ip_frag_free(struct rte_ip_frag_tbl *tbl, struct ip_frag_pkt *fp, >>> >> + struct rte_ip_frag_death_row *dr) >>> >> { >>> >> uint32_t i, k; >>> >> @@ -100,6 +101,7 @@ ip_frag_free(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr) >>> >> if (fp->frags[i].mb != NULL) { >>> >> dr->row[k++] = fp->frags[i].mb; >>> >> fp->frags[i].mb = NULL; >>> >> + tbl->nb_mbufs--; >>> >> } >>> >> } >>> >> @@ -160,7 +162,7 @@ static inline void >>> >> ip_frag_tbl_del(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr, >>> >> struct ip_frag_pkt *fp) >>> >> { >>> >> - ip_frag_free(fp, dr); >>> >> + ip_frag_free(tbl, fp, dr); >>> >> ip_frag_key_invalidate(&fp->key); >>> >> TAILQ_REMOVE(&tbl->lru, fp, lru); >>> >> tbl->use_entries--; >>> >> diff --git a/lib/librte_ip_frag/ip_frag_internal.c b/lib/librte_ip_frag/ip_frag_internal.c >>> >> index 97470a872..4c47d3fb4 100644 >>> >> --- a/lib/librte_ip_frag/ip_frag_internal.c >>> >> +++ b/lib/librte_ip_frag/ip_frag_internal.c >>> >> @@ -29,14 +29,13 @@ static inline void >>> >> ip_frag_tbl_reuse(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr, >>> >> struct ip_frag_pkt *fp, uint64_t tms) >>> >> { >>> >> - ip_frag_free(fp, dr); >>> >> + ip_frag_free(tbl, fp, dr); >>> >> ip_frag_reset(fp, tms); >>> >> TAILQ_REMOVE(&tbl->lru, fp, lru); >>> >> TAILQ_INSERT_TAIL(&tbl->lru, fp, lru); >>> >> IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, reuse_num, 1); >>> >> } >>> >> - >>> >> static inline void >>> >> ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2) >>> >> { >>> >> @@ -88,8 +87,9 @@ ipv6_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2) >>> >> } >>> >> struct rte_mbuf * >>> >> -ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr, >>> >> - struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags) >>> >> +ip_frag_process(struct rte_ip_frag_tbl *tbl, struct ip_frag_pkt *fp, >>> >> + struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb, uint16_t ofs, >>> >> + uint16_t len, uint16_t more_frags) >>> >> { >>> >> uint32_t idx; >>> >> @@ -147,7 +147,7 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr, >>> >> fp->frags[IP_LAST_FRAG_IDX].len); >>> >> /* free all fragments, invalidate the entry. */ >>> >> - ip_frag_free(fp, dr); >>> >> + ip_frag_free(tbl, fp, dr); >>> >> ip_frag_key_invalidate(&fp->key); >>> >> IP_FRAG_MBUF2DR(dr, mb); >>> >> @@ -157,6 +157,7 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr, >>> >> fp->frags[idx].ofs = ofs; >>> >> fp->frags[idx].len = len; >>> >> fp->frags[idx].mb = mb; >>> >> + tbl->nb_mbufs++; >>> >> mb = NULL; >>> >> @@ -205,8 +206,9 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr, >>> >> fp->frags[IP_LAST_FRAG_IDX].len); >>> >> /* free associated resources. */ >>> >> - ip_frag_free(fp, dr); >>> >> - } >>> >> + ip_frag_free(tbl, fp, dr); >>> >> + } else >>> >> + tbl->nb_mbufs -= fp->last_idx; >>> >> /* we are done with that entry, invalidate it. */ >>> >> ip_frag_key_invalidate(&fp->key); >>> >> diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h >>> >> index 7f425f610..623934d87 100644 >>> >> --- a/lib/librte_ip_frag/rte_ip_frag.h >>> >> +++ b/lib/librte_ip_frag/rte_ip_frag.h >>> >> @@ -96,6 +96,7 @@ struct rte_ip_frag_tbl { >>> >> uint32_t bucket_entries; /**< hash associativity. */ >>> >> uint32_t nb_entries; /**< total size of the table. */ >>> >> uint32_t nb_buckets; /**< num of associativity lines. */ >>> >> + uint32_t nb_mbufs; /**< num of mbufs holded in the tbl. */ >>> >> struct ip_frag_pkt *last; /**< last used entry. */ >>> >> struct ip_pkt_list lru; /**< LRU list for table entries. */ >>> >> struct ip_frag_tbl_stat stat; /**< statistics counters. */ >>> >> @@ -329,8 +330,23 @@ void >>> >> rte_ip_frag_table_statistics_dump(FILE * f, const struct rte_ip_frag_tbl *tbl); >>> >> /** >>> >> - * Delete expired fragments >>> >> + * Number of mbufs holded in the fragmentation table. >>> >> + * >>> >> + * @param tbl >>> >> + * Fragmentation table >>> >> * >>> >> + * @return >>> >> + * Number of mbufs holded in the fragmentation table. >>> >> + */ >>> >> +static inline uint32_t __rte_experimental >>> >> +rte_frag_table_mbuf_count(const struct rte_ip_frag_tbl *tbl) >>> >> +{ >>> >> + return tbl->nb_mbufs; >>> >> +} >>> >> + >>> >> +/** >>> >> + * Delete expired fragments >>> >> + * >>> >> * @param tbl >>> >> * Table to delete expired fragments from >>> >> * @param dr >>> >> diff --git a/lib/librte_ip_frag/rte_ip_frag_common.c b/lib/librte_ip_frag/rte_ip_frag_common.c >>> >> index a23f6f24f..46c2df84a 100644 >>> >> --- a/lib/librte_ip_frag/rte_ip_frag_common.c >>> >> +++ b/lib/librte_ip_frag/rte_ip_frag_common.c >>> >> @@ -75,6 +75,7 @@ rte_ip_frag_table_create(uint32_t bucket_num, uint32_t bucket_entries, >>> >> tbl->nb_buckets = bucket_num; >>> >> tbl->bucket_entries = bucket_entries; >>> >> tbl->entry_mask = (tbl->nb_entries - 1) & ~(tbl->bucket_entries - 1); >>> >> + tbl->nb_mbufs = 0; >>> >> TAILQ_INIT(&(tbl->lru)); >>> >> return tbl; >>> >> diff --git a/lib/librte_ip_frag/rte_ip_frag_version.map b/lib/librte_ip_frag/rte_ip_frag_version.map >>> >> index d40d5515f..f4700f460 100644 >>> >> --- a/lib/librte_ip_frag/rte_ip_frag_version.map >>> >> +++ b/lib/librte_ip_frag/rte_ip_frag_version.map >>> >> @@ -23,4 +23,5 @@ EXPERIMENTAL { >>> >> global: >>> >> rte_frag_table_del_expired_entries; >>> >> + rte_frag_table_mbuf_count; >>> >> }; >>> >> diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c >>> >> index 4956b99ea..fbdfd860a 100644 >>> >> --- a/lib/librte_ip_frag/rte_ipv4_reassembly.c >>> >> +++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c >>> >> @@ -146,7 +146,7 @@ rte_ipv4_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl, >>> >> /* process the fragmented packet. */ >>> >> - mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len, ip_flag); >>> >> + mb = ip_frag_process(tbl, fp, dr, mb, ip_ofs, ip_len, ip_flag); >>> >> ip_frag_inuse(tbl, fp); >>> >> IP_FRAG_LOG(DEBUG, "%s:%d:\n" >>> >> diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c >>> >> index db249fe60..dda5a57b7 100644 >>> >> --- a/lib/librte_ip_frag/rte_ipv6_reassembly.c >>> >> +++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c >>> >> @@ -186,7 +186,7 @@ rte_ipv6_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl, >>> >> /* process the fragmented packet. */ >>> >> - mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len, >>> >> + mb = ip_frag_process(tbl, fp, dr, mb, ip_ofs, ip_len, >>> >> MORE_FRAGS(frag_hdr->frag_data)); >>> >> ip_frag_inuse(tbl, fp); >>> >> -- >>> >> 2.16.1.windows.1 >>> -- >>> Alex -- Alex