From: "Ananyev, Konstantin"
To: Alex Kiselev, "dev@dpdk.org", "Burakov, Anatoly"
Date: Fri, 24 Aug 2018 12:07:30 +0000
Subject: Re: [dpdk-dev] [PATCH v2 2/2] librte_ip_frag: add mbuf counter
Message-ID: <2601191342CEEE43887BDE71AB977258E9FA5280@IRSMSX102.ger.corp.intel.com>
In-Reply-To: <1327662293.20180822124748@therouter.net>
List-Id: DPDK patches and discussions

Hi Alex,

> Hi Konstantin.
>
> Could we please make a final decision about counting mbufs, since it still feels to me like unfinished business?
> Below are my final arguments. If they do not sound right to you, just nack ;)

Sorry, but as I said before - not sure that it is really worth it.
Still think some sort of wrapper might be a better approach.
Konstantin

>
> > Hi Konstantin.
> >> Hi Alex,
> >> Sorry for delay in reply.
>
>
> >>> >> There might be situations (kind of attack when a lot of
> >>> >> fragmented packets are sent to a dpdk application in order
> >>> >> to flood the fragmentation table) when no additional mbufs
> >>> >> must be added to the fragmentation table since it already
> >>> >> contains too many of them. Currently there is no way to
> >>> >> determine the number of mbufs held in the fragmentation
> >>> >> table. This patch allows to keep track of the number of mbufs
> >>> >> held in the fragmentation table.
>
> >>> > I understand your intention, but still not sure it is worth it.
> >>> > My thought was that you can estimate by upper limit (num_entries * entries_per_bucket) or so.
> >>> No, I can't. The estimation error might be so big that there would be no difference at all.
>
> >> Not sure why? If you use the upper limit, then the worst thing that could happen -
> >> you would start your table cleanup a bit earlier.
> > Since bucket size is 4, an estimation error might be 400%.
> > So, for example, if I want to set the upper limit (max number of mbufs that
> > can be stored in the frag table) to 20% of all my available mbufs,
> > I have to be ready that 80% of all mbufs might end up in the frag table
> > (every bucket is full). Or if I take into account the bucket size, and divide 20%
> > by 4 so that the number of mbufs is exactly 20% in the worst case when every bucket is full,
> > I could end up in the opposite border situation when exactly a single mbuf is stored
> > in every bucket, so the upper limit of mbufs would be 20 / 4 = 5%. Both ways are not
> > good since either you have to reserve extra mbufs just to correct the estimation error
> > or your upper limit would be too small and you will be dropping good fragments.
>
>
> >>> > Probably another way to account number of mbufs without changes in the lib -
> >>> > apply something like that (assuming that your fragments are not multisegs):
>
> >>> > uint32_t mbuf_in_frag_table = 0;
> >>> > ....
>
> >>> > n = dr->cnt;
> >>> > mb = rte_ipv4_frag_reassemble_packet(...);
> >>> > if (mb != NULL)
> >>> >     mbuf_in_frag_table += mb->nb_segs;
> >>> > mbuf_in_frag_table += dr->cnt - n + 1;
>
> >> Sorry, my bad, I think it should be
> >> mbuf_in_frag_table -= dr->cnt - n + 1;
>
>
> >>> > In theory that could be applied even if fragments might be multisegs, but for that,
> >>> > we'll need to change rte_ip_frag_free_death_row() to return total number of freed segments.
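
The 20% / 80% / 5% figures in the estimation-error argument above can be checked with a few lines of arithmetic. The following is only a sketch: `mbuf_bounds` and `bounds_for_entry_limit` are hypothetical names invented for illustration (they are not part of librte_ip_frag), and it assumes, as the thread does, that each used entry holds between 1 and 4 single-segment fragments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not a DPDK structure: range of mbufs that can
 * be held when only the number of used entries is capped. */
struct mbuf_bounds {
	uint32_t min_mbufs; /* every used entry holds exactly one fragment */
	uint32_t max_mbufs; /* every used entry holds frags_per_entry fragments */
};

/* Compute the best and worst case mbuf usage for a given cap on entries. */
static inline struct mbuf_bounds
bounds_for_entry_limit(uint32_t max_entries, uint32_t frags_per_entry)
{
	struct mbuf_bounds b;

	assert(frags_per_entry > 0);
	b.min_mbufs = max_entries;
	b.max_mbufs = max_entries * frags_per_entry;
	return b;
}
```

With an assumed pool of 10000 mbufs and a 20% budget (2000), `bounds_for_entry_limit(2000, 4)` gives a range of 2000 to 8000 mbufs (20% to 80% of the pool), while dividing the cap by 4 first, `bounds_for_entry_limit(500, 4)`, gives 500 to 2000 (5% to 20%). Those are the two border cases described in the message.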
>
> >>> That should be a little bit more complicated wrapper code:
>
> >>> uint32_t mbuf_in_frag_table = 0;
> >>> ....
>
> >>> n = dr->cnt;
> >>> reassembled_mbuf = rte_ipv4_frag_reassemble_packet(..., fragmented_mbuf, ...);
> >>> if (reassembled_mbuf == NULL)
> >>>     mbuf_in_frag_table += fragmented_mbuf->nb_segs;
>
> >> We don't know for sure here.
> >> fragmented_mbuf could be in death row by now.
> > Yes. That's exactly why you have to keep track of
> > mbufs here and later after rte_ip_frag_free_death_row().
>
> > The user has to think about the frag table and death row as a single entity,
> > kind of a black box, since it's impossible to say where
> > (in the frag table or in the death row) your mbuf will be
> > after you call rte_ipv4_frag_reassemble_packet(). So, a caller/user should
> > keep track of mbufs on every border/interface of that black box.
> > One interface is rte_ipv4_frag_reassemble_packet and the other is
> > rte_ip_frag_free_death_row.
>
> > So, that's why it's easier to keep track of mbufs inside the library.
> >
>
> >>> else
> >>>     mbuf_in_frag_table -= reassembled_mbuf->nb_segs;
> >>> mbuf_in_frag_table += dr->cnt - n;
>
>
> >>> Also, in that case every rte_ip_frag_free_death_row() needs a wrapper code too.
>
> >>> n = dr->cnt;
> >>> rte_ip_frag_free_death_row(..)
> >>> mbuf_in_frag_table += dr->cnt - n;
>
> >> I don't think it is necessary.
> >> After packet is put in the death-row it is no longer in the table.
> > It's critical, since from a user point of view the death row and frag table
> > are a black box, because rte_ipv4_frag_reassemble_packet() doesn't tell the caller
> > where his packet has been stored (in the frag table or death row).
>
> >> Konstantin
>
>
>
> >>> I think my approach is simpler.
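
The "black box" bookkeeping argued for above (count at both borders: the reassembly call and the death-row free) can be modelled without touching the library. This is a minimal sketch under stated assumptions, not the code from either message: `frag_box_counter` and the three `frag_box_on_*` hooks are hypothetical application-side names, and fragments are assumed to be single-segment so that one death-row slot equals one mbuf.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical application-side counter; not part of librte_ip_frag.
 * Tracks mbufs currently inside the "box" (frag table plus death row). */
struct frag_box_counter {
	uint32_t held;
};

/* Border 1, inbound: a fragment was handed to the reassembly call. */
static inline void
frag_box_on_input(struct frag_box_counter *c, uint16_t nb_segs)
{
	c->held += nb_segs;
}

/* Border 1, outbound: the reassembly call returned a complete packet
 * whose chain carries nb_segs segments out of the box. */
static inline void
frag_box_on_reassembled(struct frag_box_counter *c, uint16_t nb_segs)
{
	assert(c->held >= nb_segs);
	c->held -= nb_segs;
}

/* Border 2, outbound: the death row was drained; nb_freed is the value
 * of dr->cnt sampled by the caller just before the free call. */
static inline void
frag_box_on_death_row_freed(struct frag_box_counter *c, uint32_t nb_freed)
{
	assert(c->held >= nb_freed);
	c->held -= nb_freed;
}
```

The counter deliberately does not distinguish the table from the death row, which is the point made in the thread: the caller cannot tell which of the two a given mbuf ended up in, so only the total is observable from outside.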
>
> >>> > Konstantin
>
>
> >>> >> Signed-off-by: Alex Kiselev
> >>> >> ---
> >>> >> lib/librte_ip_frag/ip_frag_common.h        | 16 +++++++++-------
> >>> >> lib/librte_ip_frag/ip_frag_internal.c      | 16 +++++++++-------
> >>> >> lib/librte_ip_frag/rte_ip_frag.h           | 18 +++++++++++++++++-
> >>> >> lib/librte_ip_frag/rte_ip_frag_common.c    |  1 +
> >>> >> lib/librte_ip_frag/rte_ip_frag_version.map |  1 +
> >>> >> lib/librte_ip_frag/rte_ipv4_reassembly.c   |  2 +-
> >>> >> lib/librte_ip_frag/rte_ipv6_reassembly.c   |  2 +-
> >>> >> 7 files changed, 39 insertions(+), 17 deletions(-)
>
> >>> >> diff --git a/lib/librte_ip_frag/ip_frag_common.h b/lib/librte_ip_frag/ip_frag_common.h
> >>> >> index 0fdcc7d0f..9fe5c0559 100644
> >>> >> --- a/lib/librte_ip_frag/ip_frag_common.h
> >>> >> +++ b/lib/librte_ip_frag/ip_frag_common.h
> >>> >> @@ -32,15 +32,15 @@
> >>> >>  #endif /* IP_FRAG_TBL_STAT */
>
> >>> >>  /* internal functions declarations */
> >>> >> -struct rte_mbuf * ip_frag_process(struct ip_frag_pkt *fp,
> >>> >> -        struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb,
> >>> >> -        uint16_t ofs, uint16_t len, uint16_t more_frags);
> >>> >> +struct rte_mbuf *ip_frag_process(struct rte_ip_frag_tbl *tbl,
> >>> >> +    struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> >>> >> +    struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags);
>
> >>> >> -struct ip_frag_pkt * ip_frag_find(struct rte_ip_frag_tbl *tbl,
> >>> >> +struct ip_frag_pkt *ip_frag_find(struct rte_ip_frag_tbl *tbl,
> >>> >>      struct rte_ip_frag_death_row *dr,
> >>> >>      const struct ip_frag_key *key, uint64_t tms);
>
> >>> >> -struct ip_frag_pkt * ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> >>> >> +struct ip_frag_pkt *ip_frag_lookup(struct rte_ip_frag_tbl *tbl,
> >>> >>      const struct ip_frag_key *key, uint64_t tms,
> >>> >>      struct ip_frag_pkt **free, struct ip_frag_pkt **stale);
>
> >>> >> @@ -91,7 +91,8 @@ ip_frag_key_cmp(const struct ip_frag_key * k1, const struct ip_frag_key * k2)
>
> >>> >>  /* put fragment on death row */
> >>> >>  static inline void
> >>> >> -ip_frag_free(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr)
> >>> >> +ip_frag_free(struct rte_ip_frag_tbl *tbl, struct ip_frag_pkt *fp,
> >>> >> +    struct rte_ip_frag_death_row *dr)
> >>> >>  {
> >>> >>      uint32_t i, k;
>
> >>> >> @@ -100,6 +101,7 @@ ip_frag_free(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr)
> >>> >>          if (fp->frags[i].mb != NULL) {
> >>> >>              dr->row[k++] = fp->frags[i].mb;
> >>> >>              fp->frags[i].mb = NULL;
> >>> >> +            tbl->nb_mbufs--;
> >>> >>          }
> >>> >>      }
>
> >>> >> @@ -160,7 +162,7 @@ static inline void
> >>> >>  ip_frag_tbl_del(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
> >>> >>      struct ip_frag_pkt *fp)
> >>> >>  {
> >>> >> -    ip_frag_free(fp, dr);
> >>> >> +    ip_frag_free(tbl, fp, dr);
> >>> >>      ip_frag_key_invalidate(&fp->key);
> >>> >>      TAILQ_REMOVE(&tbl->lru, fp, lru);
> >>> >>      tbl->use_entries--;
> >>> >> diff --git a/lib/librte_ip_frag/ip_frag_internal.c b/lib/librte_ip_frag/ip_frag_internal.c
> >>> >> index 97470a872..4c47d3fb4 100644
> >>> >> --- a/lib/librte_ip_frag/ip_frag_internal.c
> >>> >> +++ b/lib/librte_ip_frag/ip_frag_internal.c
> >>> >> @@ -29,14 +29,13 @@ static inline void
> >>> >>  ip_frag_tbl_reuse(struct rte_ip_frag_tbl *tbl, struct rte_ip_frag_death_row *dr,
> >>> >>      struct ip_frag_pkt *fp, uint64_t tms)
> >>> >>  {
> >>> >> -    ip_frag_free(fp, dr);
> >>> >> +    ip_frag_free(tbl, fp, dr);
> >>> >>      ip_frag_reset(fp, tms);
> >>> >>      TAILQ_REMOVE(&tbl->lru, fp, lru);
> >>> >>      TAILQ_INSERT_TAIL(&tbl->lru, fp, lru);
> >>> >>      IP_FRAG_TBL_STAT_UPDATE(&tbl->stat, reuse_num, 1);
> >>> >>  }
>
> >>> >> -
> >>> >>  static inline void
> >>> >>  ipv4_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
> >>> >>  {
> >>> >> @@ -88,8 +87,9 @@ ipv6_frag_hash(const struct ip_frag_key *key, uint32_t *v1, uint32_t *v2)
> >>> >>  }
>
> >>> >>  struct rte_mbuf *
> >>> >> -ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> >>> >> -    struct rte_mbuf *mb, uint16_t ofs, uint16_t len, uint16_t more_frags)
> >>> >> +ip_frag_process(struct rte_ip_frag_tbl *tbl, struct ip_frag_pkt *fp,
> >>> >> +    struct rte_ip_frag_death_row *dr, struct rte_mbuf *mb, uint16_t ofs,
> >>> >> +    uint16_t len, uint16_t more_frags)
> >>> >>  {
> >>> >>      uint32_t idx;
>
> >>> >> @@ -147,7 +147,7 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> >>> >>              fp->frags[IP_LAST_FRAG_IDX].len);
>
> >>> >>          /* free all fragments, invalidate the entry. */
> >>> >> -        ip_frag_free(fp, dr);
> >>> >> +        ip_frag_free(tbl, fp, dr);
> >>> >>          ip_frag_key_invalidate(&fp->key);
> >>> >>          IP_FRAG_MBUF2DR(dr, mb);
>
> >>> >> @@ -157,6 +157,7 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> >>> >>      fp->frags[idx].ofs = ofs;
> >>> >>      fp->frags[idx].len = len;
> >>> >>      fp->frags[idx].mb = mb;
> >>> >> +    tbl->nb_mbufs++;
>
> >>> >>      mb = NULL;
>
> >>> >> @@ -205,8 +206,9 @@ ip_frag_process(struct ip_frag_pkt *fp, struct rte_ip_frag_death_row *dr,
> >>> >>              fp->frags[IP_LAST_FRAG_IDX].len);
>
> >>> >>          /* free associated resources. */
> >>> >> -        ip_frag_free(fp, dr);
> >>> >> -    }
> >>> >> +        ip_frag_free(tbl, fp, dr);
> >>> >> +    } else
> >>> >> +        tbl->nb_mbufs -= fp->last_idx;
>
> >>> >>      /* we are done with that entry, invalidate it. */
> >>> >>      ip_frag_key_invalidate(&fp->key);
> >>> >> diff --git a/lib/librte_ip_frag/rte_ip_frag.h b/lib/librte_ip_frag/rte_ip_frag.h
> >>> >> index 7f425f610..623934d87 100644
> >>> >> --- a/lib/librte_ip_frag/rte_ip_frag.h
> >>> >> +++ b/lib/librte_ip_frag/rte_ip_frag.h
> >>> >> @@ -96,6 +96,7 @@ struct rte_ip_frag_tbl {
> >>> >>      uint32_t bucket_entries; /**< hash associativity. */
> >>> >>      uint32_t nb_entries; /**< total size of the table. */
> >>> >>      uint32_t nb_buckets; /**< num of associativity lines. */
> >>> >> +    uint32_t nb_mbufs; /**< num of mbufs holded in the tbl. */
> >>> >>      struct ip_frag_pkt *last; /**< last used entry. */
> >>> >>      struct ip_pkt_list lru; /**< LRU list for table entries. */
> >>> >>      struct ip_frag_tbl_stat stat; /**< statistics counters. */
> >>> >> @@ -329,8 +330,23 @@ void
> >>> >>  rte_ip_frag_table_statistics_dump(FILE * f, const struct rte_ip_frag_tbl *tbl);
>
> >>> >>  /**
> >>> >> - * Delete expired fragments
> >>> >> + * Number of mbufs holded in the fragmentation table.
> >>> >> + *
> >>> >> + * @param tbl
> >>> >> + *   Fragmentation table
> >>> >>   *
> >>> >> + * @return
> >>> >> + *   Number of mbufs holded in the fragmentation table.
> >>> >> + */
> >>> >> +static inline uint32_t __rte_experimental
> >>> >> +rte_frag_table_mbuf_count(const struct rte_ip_frag_tbl *tbl)
> >>> >> +{
> >>> >> +    return tbl->nb_mbufs;
> >>> >> +}
> >>> >> +
> >>> >> +/**
> >>> >> + * Delete expired fragments
> >>> >> + *
> >>> >>   * @param tbl
> >>> >>   *   Table to delete expired fragments from
> >>> >>   * @param dr
> >>> >> diff --git a/lib/librte_ip_frag/rte_ip_frag_common.c b/lib/librte_ip_frag/rte_ip_frag_common.c
> >>> >> index a23f6f24f..46c2df84a 100644
> >>> >> --- a/lib/librte_ip_frag/rte_ip_frag_common.c
> >>> >> +++ b/lib/librte_ip_frag/rte_ip_frag_common.c
> >>> >> @@ -75,6 +75,7 @@ rte_ip_frag_table_create(uint32_t bucket_num, uint32_t bucket_entries,
> >>> >>      tbl->nb_buckets = bucket_num;
> >>> >>      tbl->bucket_entries = bucket_entries;
> >>> >>      tbl->entry_mask = (tbl->nb_entries - 1) & ~(tbl->bucket_entries - 1);
> >>> >> +    tbl->nb_mbufs = 0;
>
> >>> >>      TAILQ_INIT(&(tbl->lru));
> >>> >>      return tbl;
> >>> >> diff --git a/lib/librte_ip_frag/rte_ip_frag_version.map b/lib/librte_ip_frag/rte_ip_frag_version.map
> >>> >> index d40d5515f..f4700f460 100644
> >>> >> --- a/lib/librte_ip_frag/rte_ip_frag_version.map
> >>> >> +++ b/lib/librte_ip_frag/rte_ip_frag_version.map
> >>> >> @@ -23,4 +23,5 @@ EXPERIMENTAL {
> >>> >>      global:
>
> >>> >>      rte_frag_table_del_expired_entries;
> >>> >> +    rte_frag_table_mbuf_count;
> >>> >>  };
> >>> >> diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> >>> >> index 4956b99ea..fbdfd860a 100644
> >>> >> --- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
> >>> >> +++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> >>> >> @@ -146,7 +146,7 @@ rte_ipv4_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
>
>
> >>> >>      /* process the fragmented packet. */
> >>> >> -    mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len, ip_flag);
> >>> >> +    mb = ip_frag_process(tbl, fp, dr, mb, ip_ofs, ip_len, ip_flag);
> >>> >>      ip_frag_inuse(tbl, fp);
>
> >>> >>      IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> >>> >> diff --git a/lib/librte_ip_frag/rte_ipv6_reassembly.c b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> >>> >> index db249fe60..dda5a57b7 100644
> >>> >> --- a/lib/librte_ip_frag/rte_ipv6_reassembly.c
> >>> >> +++ b/lib/librte_ip_frag/rte_ipv6_reassembly.c
> >>> >> @@ -186,7 +186,7 @@ rte_ipv6_frag_reassemble_packet(struct rte_ip_frag_tbl *tbl,
>
>
> >>> >>      /* process the fragmented packet. */
> >>> >> -    mb = ip_frag_process(fp, dr, mb, ip_ofs, ip_len,
> >>> >> +    mb = ip_frag_process(tbl, fp, dr, mb, ip_ofs, ip_len,
> >>> >>          MORE_FRAGS(frag_hdr->frag_data));
> >>> >>      ip_frag_inuse(tbl, fp);
>
> >>> >> --
> >>> >> 2.16.1.windows.1
>
> >>> --
> >>> Alex
>
> --
> Alex