From: "Ananyev, Konstantin"
To: "Hu, Jiayu"
CC: "dev@dpdk.org", "Wiles, Keith", "yuanhan.liu@linux.intel.com"
Date: Fri, 26 May 2017 23:10:21 +0000
Subject: Re: [dpdk-dev] [PATCH v3 1/3] lib: add Generic Receive Offload API framework

Hi Jiayu,

> -----Original Message-----
> From: Hu, Jiayu
> Sent: Friday, May 26, 2017 8:26 AM
> To: Ananyev, Konstantin
> Cc: dev@dpdk.org; Wiles, Keith; yuanhan.liu@linux.intel.com
> Subject: Re: [PATCH v3 1/3] lib: add Generic Receive Offload API framework
>
> Hi Konstantin,
>
> On Wed, May 24, 2017 at 08:38:25PM +0800, Ananyev, Konstantin wrote:
> >
> > Hi Jiayu,
> >
> > >
> > > Hi Konstantin,
> > >
> > > Thanks for your comments. My replies/questions are below.
> > >
> > > BRs,
> > > Jiayu
> > >
> > > On Mon, May 22, 2017 at 05:19:19PM +0800, Ananyev, Konstantin wrote:
> > > > Hi Jiayu,
> > > > My comments/questions below.
> > > > Konstantin
> > > >
> > > > >
> > > > > For applications, DPDK GRO provides three external functions to
> > > > > enable/disable GRO:
> > > > > - rte_gro_init: initialize the GRO environment;
> > > > > - rte_gro_enable: enable GRO for a given port;
> > > > > - rte_gro_disable: disable GRO for a given port.
> > > > > Before using GRO, applications should explicitly call rte_gro_init to
> > > > > initialize the GRO environment. After that, applications can call
> > > > > rte_gro_enable to enable GRO and rte_gro_disable to disable GRO for
> > > > > specific ports.
> > > >
> > > > I think this is too restrictive and wouldn't meet various users' needs.
> > > > A user might want to:
> > > > - enable/disable GRO for a particular RX queue;
> > > > - or even set up different GRO types for different RX queues,
> > > >   e.g. GRO over IPv4/TCP for queue 0 and GRO over IPv6/TCP for queue 1.
> > >
> > > The reason for enabling/disabling GRO per-port instead of per-queue is that Linux
> > > controls GRO per-port.
> > > Controlling GRO per-queue would indeed give applications more flexibility.
> > > But are there any scenarios where different queues of a port require
> > > different GRO control (i.e. GRO types and enabling/disabling GRO)?
> >
> > I think yes.
> >
> > > > - For various reasons, a user might prefer not to use RX callbacks,
> > > >   but invoke gro() manually at some point in his code.
> > >
> > > An application-level GRO library gives applications more flexibility. Besides,
> > > when GRO is performed in the ethdev layer or inside PMD drivers, it becomes unclear
> > > whether rte_eth_rx_burst should return the number of actually received packets or
> > > the number of GROed packets. The same issue happens in GSO, and even more seriously,
> > > because applications like VPP always rely on the return value of rte_eth_tx_burst
> > > to decide further operations. If applications can directly call the GRO/GSO
> > > libraries, this issue won't exist. And DPDK is a library, not a holistic system
> > > like Linux; we don't need to do the same as Linux. Therefore, maybe it's a better
> > > idea to directly provide SW segmentation/reassembly libraries to applications.
> > >
> > > > - Many users would like to control the size (number of flows/items per flow),
> > > >   max allowed packet size, max timeout, etc., for different GRO tables.
> > > > - A user would need a way to flush all, or only timed-out, packets from
> > > >   particular GRO tables.
> > > >
> > > > So I think the API needs to be extended to become much more fine-grained.
> > > > Something like that:
> > > >
> > > > struct rte_gro_tbl_param {
> > > >     int32_t socket_id;
> > > >     size_t max_flows;
> > > >     size_t max_items_per_flow;
> > > >     size_t max_pkt_size;
> > > >     uint64_t packet_timeout_cycles;
> > > >     ...
> > > > };
> > > >
> > > > struct rte_gro_tbl;
> > > > struct rte_gro_tbl *rte_gro_tbl_create(const struct rte_gro_tbl_param *param);
> > > >
> > > > void rte_gro_tbl_destroy(struct rte_gro_tbl *tbl);
> > >
> > > Yes, I agree with you. It's necessary to provide more fine-grained control APIs to
> > > applications. But what is 'packet_timeout_cycles' used for? Is it for TCP packets?
> >
> > For any packet that sits in the gro_table for too long.
> >
> > > >
> > > > /*
> > > >  * Process packets; might store some packets inside the GRO table.
> > > >  * Returns the number of filled entries in pkt[].
> > > >  */
> > > > uint32_t rte_gro_tbl_process(struct rte_gro_tbl *tbl, struct rte_mbuf *pkt[], uint32_t num);
> > > >
> > > > /*
> > > >  * Retrieves up to num timed-out packets from the table.
> > > >  */
> > > > uint32_t rte_gro_tbl_timeout(struct rte_gro_tbl *tbl, uint64_t tmt, struct rte_mbuf *pkt[], uint32_t num);
> > >
> > > Currently, we implement GRO as an RX callback, whose processing logic is simple:
> > > receive burst packets -> perform GRO -> return to application. GRO stops after
> > > it finishes processing the received packets. If we provide rte_gro_tbl_timeout,
> > > when and by whom will it be called?
> >
> > I mean the following scenario:
> > We receive a packet, find it is eligible for GRO, and put it into the gro_table
> > in the expectation that there will be more packets for the same flow.
> > But it could happen that we never (or only after a long time) receive
> > any new packets for that flow.
> > So the first packet would never be delivered to the upper layer,
> > or would be delivered too late.
> > So we need a mechanism to extract such unmerged packets
> > and push them to the upper layer.
> > My thought is that it would be the application's responsibility to call it from
> > time to time. That's actually another reason why I think we shouldn't force the
> > application to always use RX callbacks for GRO.
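[Editor's note: to make the flush-by-timeout idea discussed above concrete, here is a minimal, self-contained C sketch. All names here (gro_tbl, gro_tbl_store, gro_tbl_timeout_flush) and the use of plain ints standing in for rte_mbuf pointers are hypothetical illustration, not the proposed DPDK API.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TBL_SIZE 8

/* A stored packet remembers when its first segment arrived. */
struct gro_item {
    int valid;             /* slot in use */
    uint64_t start_cycles; /* arrival time of the first segment */
    int pkt_id;            /* stand-in for the stored rte_mbuf */
};

struct gro_tbl {
    struct gro_item items[TBL_SIZE];
    uint64_t timeout_cycles; /* max residence time before flush */
};

/* Store a packet; returns 0 on success, -1 if the table is full. */
int gro_tbl_store(struct gro_tbl *tbl, int pkt_id, uint64_t now)
{
    for (size_t i = 0; i < TBL_SIZE; i++) {
        if (!tbl->items[i].valid) {
            tbl->items[i].valid = 1;
            tbl->items[i].start_cycles = now;
            tbl->items[i].pkt_id = pkt_id;
            return 0;
        }
    }
    return -1;
}

/* Extract up to num packets that have sat in the table longer than the
 * timeout, mirroring the semantics sketched for rte_gro_tbl_timeout().
 * Returns the number of entries written to out[]. */
uint32_t gro_tbl_timeout_flush(struct gro_tbl *tbl, uint64_t now,
                               int out[], uint32_t num)
{
    uint32_t n = 0;
    for (size_t i = 0; i < TBL_SIZE && n < num; i++) {
        if (tbl->items[i].valid &&
            now - tbl->items[i].start_cycles > tbl->timeout_cycles) {
            out[n++] = tbl->items[i].pkt_id;
            tbl->items[i].valid = 0;
        }
    }
    return n;
}
```

The point being argued: without some such flush call, a packet whose flow never sees a second segment would stay in the table indefinitely and never reach the upper layer.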
>
> Currently, we only provide one reassembly function, rte_gro_reassemble_burst,
> which merges N input packets at a time. After finishing processing these
> packets, it returns all of them and resets the hashing tables. Therefore, there
> are no packets left in the hashing tables after rte_gro_reassemble_burst returns.

Ok, sorry, I missed that part with rte_hash_reset().
So rte_gro_reassemble_burst() performs only inline processing on the current input packets
and doesn't try to save packets for future merging, correct?
Such an approach is indeed much more lightweight and doesn't require any extra timeouts and flush().
So my opinion: let's keep it like that - nice and simple.
BTW, I think in that case we don't need any hash tables (or any other persistent structures) at all.
What we need is just the set of GROs (tcp4, tcp6, etc.) we want to perform on a given array of packets.

>
> If we provide rte_gro_tbl_timeout, we also need to provide another reassembly
> function, like rte_gro_reassemble, which processes one given packet at a
> time and won't reset the hashing tables. Applications decide when to flush packets
> from the hashing tables, and rte_gro_tbl_timeout is one of the ways that can be used
> to flush them. Do you mean that?

Yes, that's what I meant, but as I said above - I think your approach is probably
preferable - it is much simpler and more lightweight.

Konstantin
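[Editor's note: the inline approach agreed on above - merge only within the current burst, keep no state across calls - can be sketched in a few lines of self-contained C. The toy_* names and the reduction of a packet to (flow, seq, len) are illustrative assumptions, not DPDK code; only forward continuations are merged, which keeps the sketch simple.]

```c
#include <assert.h>
#include <stdint.h>

/* A "packet" reduced to the fields that matter for TCP coalescing. */
struct toy_pkt {
    uint32_t flow; /* flow identifier (stand-in for the 4-tuple) */
    uint32_t seq;  /* TCP sequence number of the first payload byte */
    uint32_t len;  /* payload length */
};

/* In-place, burst-only reassembly: packet j is absorbed into packet i
 * when it belongs to the same flow and is contiguous in sequence space.
 * Returns the new packet count; no state survives the call. The O(n^2)
 * scan is acceptable for a typical RX burst of 32-64 packets. */
uint16_t toy_reassemble_burst(struct toy_pkt *pkts, uint16_t nb)
{
    for (uint16_t i = 0; i < nb; i++) {
        for (uint16_t j = (uint16_t)(i + 1); j < nb; ) {
            if (pkts[j].flow == pkts[i].flow &&
                pkts[j].seq == pkts[i].seq + pkts[i].len) {
                /* j continues i: grow i, drop j from the burst */
                pkts[i].len += pkts[j].len;
                pkts[j] = pkts[nb - 1];
                nb--;
            } else {
                j++;
            }
        }
    }
    return nb;
}
```

Because the function touches nothing but its input array, it needs no hash table, no timeout, and no flush - which is exactly why the burst-only variant is the lighter-weight design.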