DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] RFC: compressdev: append dest data in PMDs instead of in application
@ 2018-09-11 17:16 Trahe, Fiona
  2018-09-15 11:31 ` [dpdk-dev] " Verma, Shally
  0 siblings, 1 reply; 4+ messages in thread
From: Trahe, Fiona @ 2018-09-11 17:16 UTC (permalink / raw)
  To: dev; +Cc: Trahe, Fiona, Verma, Shally, Daly, Lee, Jozwiak, TomaszX, Akhil Goyal

Proposal:
In compressdev, move the responsibility for appending data in the destination
mbuf from the application to the PMD.

Why:
Compression operations are all out-of-place and the output size is unknown.
Source and destination mbufs are passed to the PMDs.
There's no problem with the src, but there is with the destination.
The application allocates the dest mbuf from the pool, guesses how much of
the buffer the PMD will write with output and calls rte_pktmbuf_append() for this amount. 
No data is actually copied into the dest mbuf by the application.
 
The PMD writes output data to the destination mbuf, and returns the amount written in the 
op.produced parameter.

Throughout this the mbuf is not consistent with expectations, i.e. a call to
rte_pktmbuf_data_len() will NOT return the actual amount of data in the buffer. A call to
rte_pktmbuf_append() will NOT add space at end of existing data. rte_pktmbuf_tailroom()
 will not return how much space is available at the end of the data.
Also, some PMDs, e.g. ISA-L, need scratch space at end of the output buffer for checksum
calculation. So though the appl has appended a specific amount in the expectation that it
can be used for compressed data, the PMD needs to reduce this by 4 bytes to reserve
space for the checksum - even though there may be space to append another 4bytes.

It seems more logical that the PMD should take responsibility for appending.
I.e. the application should pass in an empty mbuf, the PMD uses the rte_pktmbuf_tailroom()
to know what space is available, uses this space as it needs and appends the output data to
match the actual amount of data it writes.
This method caters for an mbuf already containing some data, in this case the PMD will
append the output AFTER the data already in the mbuf.
It also needs to cater for SGL aka chained_mbuf case, though I'd propose in this case only
the first mbuf in the chain is allowed to already contain data, the rest must be empty.
An application can adjust a chain to satisfy this condition.

Code impacts:
 * rte_comp_op.dst.offset should be deprecated.
 * comments on the m_dst would be changed to describe this usage
 * applications should alloc destination mbuf(s) from a pool, but not append.
 * PMDs should use append() to correctly reflect the amount of data returned.

This turns out to be more complex than expected for SGLs as rte_mbuf chains DON'T support
empty mbuf chains, i.e. several macros assume there's only tailroom available in the last 
segment of a chain. So essentially if an application chains a bunch of empty mbufs
together the only tailroom available is the buf_len of the last mbuf and the space in
all the other mbufs is "lost".
We've investigated several ways around this, I think either options 1 or 2 would be ok, but
am not keen on option 3. I'm looking for feedback please.

Option1: create an rte_comp_mbuf, identical to rte_mbuf - probably just cast to it - with its own set of
   macros.  Most of these wrap existing mbuf macros, those that need to cater for the empty chain case.
Option2: pass in a pool on the API instead and the PMD allocs dest mbufs as needed and chains them.
   Storage customers expect to use external buffers attached to mbufs so this may not suit their use-case.
   Although it may be an option if all mbufs in the pool are pre-attached to external buffers.
Option3: appl does as today. PMD trims space not used. Would need changes or additions to mbuf macros
   so it could trim from more than just the last segment. PMD would need to free unused segments. 
   I'm not keen on this as doing the work in 2 places and there's potential
   to leak mbufs here as allocating them in API and freeing in PMD.

Regards,
Fiona

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] compressdev: append dest data in PMDs instead of in application
  2018-09-11 17:16 [dpdk-dev] RFC: compressdev: append dest data in PMDs instead of in application Trahe, Fiona
@ 2018-09-15 11:31 ` Verma, Shally
  2018-09-16 10:07   ` Trahe, Fiona
  0 siblings, 1 reply; 4+ messages in thread
From: Verma, Shally @ 2018-09-15 11:31 UTC (permalink / raw)
  To: Trahe, Fiona, dev
  Cc: Daly, Lee, Jozwiak, TomaszX, Akhil Goyal, Sahu, Sunila, Gupta, Ashish

HI Fiona

>-----Original Message-----
>From: Trahe, Fiona <fiona.trahe@intel.com>
>Sent: 11 September 2018 22:47
>To: dev@dpdk.org
>Cc: Trahe, Fiona <fiona.trahe@intel.com>; Verma, Shally <Shally.Verma@cavium.com>; Daly, Lee <lee.daly@intel.com>; Jozwiak,
>TomaszX <tomaszx.jozwiak@intel.com>; Akhil Goyal <akhil.goyal@nxp.com>
>Subject: RFC: compressdev: append dest data in PMDs instead of in application
>
>External Email
>
>Proposal:
>In compressdev, move the responsibility for appending data in the destination
>mbuf from the application to the PMD.
>
>Why:
>Compression operations are all out-of-place and the output size is unknown.
>Source and destination mbufs are passed to the PMDs.
>There's no problem with the src, but there is with the destination.
>The application allocates the dest mbuf from the pool, guesses how much of
>the buffer the PMD will write with output and calls rte_pktmbuf_append() for this amount.
>No data is actually copied into the dest mbuf by the application.
>
>The PMD writes output data to the destination mbuf, and returns the amount written in the
>op.produced parameter.
>
>Throughout this the mbuf is not consistent with expectations, i.e. a call to
>rte_pktmbuf_data_len() will NOT return the actual amount of data in the buffer. A call to
>rte_pktmbuf_append() will NOT add space at end of existing data. rte_pktmbuf_tailroom()
> will not return how much space is available at the end of the data.
>Also, some PMDs, e.g. ISA-L, need scratch space at end of the output buffer for checksum
>calculation. So though the appl has appended a specific amount in the expectation that it
>can be used for compressed data, the PMD needs to reduce this by 4 bytes to reserve
>space for the checksum - even though there may be space to append another 4bytes.
>
Agree to this issue in first place. However before we start on this. I have a scenario and question based on it (probably you've already thought it but I couldn't guess limitations here)
 
Given all limitations of mbuf library(especially) on SGL and compression op behaviour and complexities it's going to add on PMD, why we can't just input length of dst buffer from an app and leave all of 
mbuf management to app,* if it's using mbuf data buffer* i.e. consider this scenario:

App alloc rte_mbuf and call rte_pktmbuf_tailroom() to update dst.len with amount of free space. PMD setup HW according to offset data pointer and length given.
When op is done, then PMD updates op.produced. App call rte_pktmbuf_append() on mbuf to update m->data/pkt_len.

Now there're following possibilities with op.produced:
- if  op.produced < tailroom size ,  app call tailroom size again to find leftover free space and pass offset += op.produced and new length in next call 
- if op.produced == tailroom size, app alloc a new mbuf and chain it to previous mbuf, call tailroom size on new mbuf, set offset += op.produced and length = free space in new mbuf. PMD now will have a chained mbuf in input and offset will now end up in new chained buffer.
  
This, of course, mean PMD can have chained mbuf but only last buffer in chain available for use. 
This is a limitation but only as long as we're working with MBUFs library which were originally designed for other purposes and not so better suited for compression-like operations. 

Why I believe it is more simpler to manage:
-It keeps PMD implementation simple and lives with limitation/expectations of mbuf library
-Having length at input will also be a requirement if we add any other buffer structures types.

Although I am not sure if it can solve checksum issue you mentioned as I don't fully understood how ISAL uses dst buffer for checksum o/p. Does it write checksum in dst buffer for every op? Or just when last chunk of data is processed?
We already have separate checksum variable at input/for output. So, doesn't it just copy from dst buffer to op at output and op.produced updated with only data length?

I remember, initially you had started a discussion to add new and more generic buffer structure type to use in place of rte_mbufs for compression library , such as iovec which simply input pointer to raw bufs with length.
Aren't we thinking in that direction anymore? I believe rather than reworking a library/PMD based/around  mbufs (unless we've any requirement to support mbufs so), we should define a buffer variant to better suit compression and its users need. It also allow us to define lib to input multiple dst buffers either as an array or SGL.


Thanks
Shally

>It seems more logical that the PMD should take responsibility for appending.
>I.e. the application should pass in an empty mbuf, the PMD uses the rte_pktmbuf_tailroom()
>to know what space is available, uses this space as it needs and appends the output data to
>match the actual amount of data it writes.
>This method caters for an mbuf already containing some data, in this case the PMD will
>append the output AFTER the data already in the mbuf.
>It also needs to cater for SGL aka chained_mbuf case, though I'd propose in this case only
>the first mbuf in the chain is allowed to already contain data, the rest must be empty.
>An application can adjust a chain to satisfy this condition.
>
>Code impacts:
> * rte_comp_op.dst.offset should be deprecated.
> * comments on the m_dst would be changed to describe this usage
> * applications should alloc destination mbuf(s) from a pool, but not append.
> * PMDs should use append() to correctly reflect the amount of data returned.
>
>This turns out to be more complex than expected for SGLs as rte_mbuf chains DON'T support
>empty mbuf chains, i.e. several macros assume there's only tailroom available in the last
>segment of a chain. So essentially if an application chains a bunch of empty mbufs
>together the only tailroom available is the buf_len of the last mbuf and the space in
>all the other mbufs is "lost".
>We've investigated several ways around this, I think either options 1 or 2 would be ok, but
>am not keen on option 3. I'm looking for feedback please.
>
>Option1: create an rte_comp_mbuf, identical to rte_mbuf - probably just cast to it - with its own set of
>   macros.  Most of these wrap existing mbuf macros, those that need to cater for the empty chain case.
>Option2: pass in a pool on the API instead and the PMD allocs dest mbufs as needed and chains them.
>   Storage customers expect to use external buffers attached to mbufs so this may not suit their use-case.
>   Although it may be an option if all mbufs in the pool are pre-attached to external buffers.
>Option3: appl does as today. PMD trims space not used. Would need changes or additions to mbuf macros
>   so it could trim from more than just the last segment. PMD would need to free unused segments.
>   I'm not keen on this as doing the work in 2 places and there's potential
>   to leak mbufs here as allocating them in API and freeing in PMD.
>
>Regards,
>Fiona
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] compressdev: append dest data in PMDs instead of in application
  2018-09-15 11:31 ` [dpdk-dev] " Verma, Shally
@ 2018-09-16 10:07   ` Trahe, Fiona
  2018-09-17  5:59     ` Verma, Shally
  0 siblings, 1 reply; 4+ messages in thread
From: Trahe, Fiona @ 2018-09-16 10:07 UTC (permalink / raw)
  To: Verma, Shally, dev
  Cc: Daly, Lee, Jozwiak, TomaszX, Akhil Goyal, Sahu, Sunila, Gupta,
	Ashish, Trahe, Fiona

Hi Shally,

> -----Original Message-----
> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
> Sent: Saturday, September 15, 2018 12:32 PM
> To: Trahe, Fiona <fiona.trahe@intel.com>; dev@dpdk.org
> Cc: Daly, Lee <lee.daly@intel.com>; Jozwiak, TomaszX <tomaszx.jozwiak@intel.com>; Akhil Goyal
> <akhil.goyal@nxp.com>; Sahu, Sunila <Sunila.Sahu@cavium.com>; Gupta, Ashish
> <Ashish.Gupta@cavium.com>
> Subject: RE: compressdev: append dest data in PMDs instead of in application
> 
> HI Fiona
> 
> >-----Original Message-----
> >From: Trahe, Fiona <fiona.trahe@intel.com>
> >Sent: 11 September 2018 22:47
> >To: dev@dpdk.org
> >Cc: Trahe, Fiona <fiona.trahe@intel.com>; Verma, Shally <Shally.Verma@cavium.com>; Daly, Lee
> <lee.daly@intel.com>; Jozwiak,
> >TomaszX <tomaszx.jozwiak@intel.com>; Akhil Goyal <akhil.goyal@nxp.com>
> >Subject: RFC: compressdev: append dest data in PMDs instead of in application
> >
> >External Email
> >
> >Proposal:
> >In compressdev, move the responsibility for appending data in the destination
> >mbuf from the application to the PMD.
> >
> >Why:
> >Compression operations are all out-of-place and the output size is unknown.
> >Source and destination mbufs are passed to the PMDs.
> >There's no problem with the src, but there is with the destination.
> >The application allocates the dest mbuf from the pool, guesses how much of
> >the buffer the PMD will write with output and calls rte_pktmbuf_append() for this amount.
> >No data is actually copied into the dest mbuf by the application.
> >
> >The PMD writes output data to the destination mbuf, and returns the amount written in the
> >op.produced parameter.
> >
> >Throughout this the mbuf is not consistent with expectations, i.e. a call to
> >rte_pktmbuf_data_len() will NOT return the actual amount of data in the buffer. A call to
> >rte_pktmbuf_append() will NOT add space at end of existing data. rte_pktmbuf_tailroom()
> > will not return how much space is available at the end of the data.
> >Also, some PMDs, e.g. ISA-L, need scratch space at end of the output buffer for checksum
> >calculation. So though the appl has appended a specific amount in the expectation that it
> >can be used for compressed data, the PMD needs to reduce this by 4 bytes to reserve
> >space for the checksum - even though there may be space to append another 4bytes.
> >
> Agree to this issue in first place. However before we start on this. I have a scenario and question based on
> it (probably you've already thought it but I couldn't guess limitations here)
> 
> Given all limitations of mbuf library(especially) on SGL and compression op behaviour and complexities it's
> going to add on PMD, why we can't just input length of dst buffer from an app and leave all of
> mbuf management to app,* if it's using mbuf data buffer* i.e. consider this scenario:
> 
> App alloc rte_mbuf and call rte_pktmbuf_tailroom() to update dst.len with amount of free space. PMD
> setup HW according to offset data pointer and length given.
> When op is done, then PMD updates op.produced. App call rte_pktmbuf_append() on mbuf to update m-
> >data/pkt_len.
> 
> Now there're following possibilities with op.produced:
> - if  op.produced < tailroom size ,  app call tailroom size again to find leftover free space and pass offset +=
> op.produced and new length in next call
> - if op.produced == tailroom size, app alloc a new mbuf and chain it to previous mbuf, call tailroom size on
> new mbuf, set offset += op.produced and length = free space in new mbuf. PMD now will have a chained
> mbuf in input and offset will now end up in new chained buffer.
> 
> This, of course, mean PMD can have chained mbuf but only last buffer in chain available for use.
> This is a limitation but only as long as we're working with MBUFs library which were originally designed for
> other purposes and not so better suited for compression-like operations.
[Fiona] I like this idea as it keeps all the mbuf mgmt. in the application.
But it removes the possibility of passing more than 64k-1 in one op, which I think is unacceptable for storage purposes.
As regards mbuf alternatives you mention below, yes, that's still open. The mbuf external memory
feature which was recently added may suffice, we wanted to try that out before proposing alternatives.
I'm focussed on other work for 18.11 at the moment, so will get back to this topic after that release. 

> 
> Why I believe it is more simpler to manage:
> -It keeps PMD implementation simple and lives with limitation/expectations of mbuf library
> -Having length at input will also be a requirement if we add any other buffer structures types.
> 
> Although I am not sure if it can solve checksum issue you mentioned as I don't fully understood how ISAL
> uses dst buffer for checksum o/p. Does it write checksum in dst buffer for every op? Or just when last
> chunk of data is processed?
> We already have separate checksum variable at input/for output. So, doesn't it just copy from dst buffer to
> op at output and op.produced updated with only data length?
> 
> I remember, initially you had started a discussion to add new and more generic buffer structure type to use
> in place of rte_mbufs for compression library , such as iovec which simply input pointer to raw bufs with
> length.
> Aren't we thinking in that direction anymore? I believe rather than reworking a library/PMD based/around
> mbufs (unless we've any requirement to support mbufs so), we should define a buffer variant to better suit
> compression and its users need. It also allow us to define lib to input multiple dst buffers either as an array
> or SGL.
> 
> 
> Thanks
> Shally
> 
> >It seems more logical that the PMD should take responsibility for appending.
> >I.e. the application should pass in an empty mbuf, the PMD uses the rte_pktmbuf_tailroom()
> >to know what space is available, uses this space as it needs and appends the output data to
> >match the actual amount of data it writes.
> >This method caters for an mbuf already containing some data, in this case the PMD will
> >append the output AFTER the data already in the mbuf.
> >It also needs to cater for SGL aka chained_mbuf case, though I'd propose in this case only
> >the first mbuf in the chain is allowed to already contain data, the rest must be empty.
> >An application can adjust a chain to satisfy this condition.
> >
> >Code impacts:
> > * rte_comp_op.dst.offset should be deprecated.
> > * comments on the m_dst would be changed to describe this usage
> > * applications should alloc destination mbuf(s) from a pool, but not append.
> > * PMDs should use append() to correctly reflect the amount of data returned.
> >
> >This turns out to be more complex than expected for SGLs as rte_mbuf chains DON'T support
> >empty mbuf chains, i.e. several macros assume there's only tailroom available in the last
> >segment of a chain. So essentially if an application chains a bunch of empty mbufs
> >together the only tailroom available is the buf_len of the last mbuf and the space in
> >all the other mbufs is "lost".
> >We've investigated several ways around this, I think either options 1 or 2 would be ok, but
> >am not keen on option 3. I'm looking for feedback please.
> >
> >Option1: create an rte_comp_mbuf, identical to rte_mbuf - probably just cast to it - with its own set of
> >   macros.  Most of these wrap existing mbuf macros, those that need to cater for the empty chain case.
> >Option2: pass in a pool on the API instead and the PMD allocs dest mbufs as needed and chains them.
> >   Storage customers expect to use external buffers attached to mbufs so this may not suit their use-case.
> >   Although it may be an option if all mbufs in the pool are pre-attached to external buffers.
> >Option3: appl does as today. PMD trims space not used. Would need changes or additions to mbuf
> macros
> >   so it could trim from more than just the last segment. PMD would need to free unused segments.
> >   I'm not keen on this as doing the work in 2 places and there's potential
> >   to leak mbufs here as allocating them in API and freeing in PMD.
> >
> >Regards,
> >Fiona
> >

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [dpdk-dev] compressdev: append dest data in PMDs instead of in application
  2018-09-16 10:07   ` Trahe, Fiona
@ 2018-09-17  5:59     ` Verma, Shally
  0 siblings, 0 replies; 4+ messages in thread
From: Verma, Shally @ 2018-09-17  5:59 UTC (permalink / raw)
  To: Trahe, Fiona, dev
  Cc: Daly, Lee, Jozwiak, TomaszX, Akhil Goyal, Sahu, Sunila, Gupta, Ashish



>-----Original Message-----
>From: Trahe, Fiona <fiona.trahe@intel.com>
>Sent: 16 September 2018 15:38
>To: Verma, Shally <Shally.Verma@cavium.com>; dev@dpdk.org
>Cc: Daly, Lee <lee.daly@intel.com>; Jozwiak, TomaszX <tomaszx.jozwiak@intel.com>; Akhil Goyal <akhil.goyal@nxp.com>; Sahu,
>Sunila <Sunila.Sahu@cavium.com>; Gupta, Ashish <Ashish.Gupta@cavium.com>; Trahe, Fiona <fiona.trahe@intel.com>
>Subject: RE: compressdev: append dest data in PMDs instead of in application
>
>External Email
>
>Hi Shally,
>
>> -----Original Message-----
>> From: Verma, Shally [mailto:Shally.Verma@cavium.com]
>> Sent: Saturday, September 15, 2018 12:32 PM
>> To: Trahe, Fiona <fiona.trahe@intel.com>; dev@dpdk.org
>> Cc: Daly, Lee <lee.daly@intel.com>; Jozwiak, TomaszX <tomaszx.jozwiak@intel.com>; Akhil Goyal
>> <akhil.goyal@nxp.com>; Sahu, Sunila <Sunila.Sahu@cavium.com>; Gupta, Ashish
>> <Ashish.Gupta@cavium.com>
>> Subject: RE: compressdev: append dest data in PMDs instead of in application
>>
>> HI Fiona
>>
>> >-----Original Message-----
>> >From: Trahe, Fiona <fiona.trahe@intel.com>
>> >Sent: 11 September 2018 22:47
>> >To: dev@dpdk.org
>> >Cc: Trahe, Fiona <fiona.trahe@intel.com>; Verma, Shally <Shally.Verma@cavium.com>; Daly, Lee
>> <lee.daly@intel.com>; Jozwiak,
>> >TomaszX <tomaszx.jozwiak@intel.com>; Akhil Goyal <akhil.goyal@nxp.com>
>> >Subject: RFC: compressdev: append dest data in PMDs instead of in application
>> >
>> >External Email
>> >
>> >Proposal:
>> >In compressdev, move the responsibility for appending data in the destination
>> >mbuf from the application to the PMD.
>> >
>> >Why:
>> >Compression operations are all out-of-place and the output size is unknown.
>> >Source and destination mbufs are passed to the PMDs.
>> >There's no problem with the src, but there is with the destination.
>> >The application allocates the dest mbuf from the pool, guesses how much of
>> >the buffer the PMD will write with output and calls rte_pktmbuf_append() for this amount.
>> >No data is actually copied into the dest mbuf by the application.
>> >
>> >The PMD writes output data to the destination mbuf, and returns the amount written in the
>> >op.produced parameter.
>> >
>> >Throughout this the mbuf is not consistent with expectations, i.e. a call to
>> >rte_pktmbuf_data_len() will NOT return the actual amount of data in the buffer. A call to
>> >rte_pktmbuf_append() will NOT add space at end of existing data. rte_pktmbuf_tailroom()
>> > will not return how much space is available at the end of the data.
>> >Also, some PMDs, e.g. ISA-L, need scratch space at end of the output buffer for checksum
>> >calculation. So though the appl has appended a specific amount in the expectation that it
>> >can be used for compressed data, the PMD needs to reduce this by 4 bytes to reserve
>> >space for the checksum - even though there may be space to append another 4bytes.
>> >
>> Agree to this issue in first place. However before we start on this. I have a scenario and question based on
>> it (probably you've already thought it but I couldn't guess limitations here)
>>
>> Given all limitations of mbuf library(especially) on SGL and compression op behaviour and complexities it's
>> going to add on PMD, why we can't just input length of dst buffer from an app and leave all of
>> mbuf management to app,* if it's using mbuf data buffer* i.e. consider this scenario:
>>
>> App alloc rte_mbuf and call rte_pktmbuf_tailroom() to update dst.len with amount of free space. PMD
>> setup HW according to offset data pointer and length given.
>> When op is done, then PMD updates op.produced. App call rte_pktmbuf_append() on mbuf to update m-
>> >data/pkt_len.
>>
>> Now there're following possibilities with op.produced:
>> - if  op.produced < tailroom size ,  app call tailroom size again to find leftover free space and pass offset +=
>> op.produced and new length in next call
>> - if op.produced == tailroom size, app alloc a new mbuf and chain it to previous mbuf, call tailroom size on
>> new mbuf, set offset += op.produced and length = free space in new mbuf. PMD now will have a chained
>> mbuf in input and offset will now end up in new chained buffer.
>>
>> This, of course, mean PMD can have chained mbuf but only last buffer in chain available for use.
>> This is a limitation but only as long as we're working with MBUFs library which were originally designed for
>> other purposes and not so better suited for compression-like operations.
>[Fiona] I like this idea as it keeps all the mbuf mgmt. in the application.
>But it removes the possibility of passing more than 64k-1 in one op, which I think is unacceptable for storage purposes.
>As regards mbuf alternatives you mention below, yes, that's still open. The mbuf external memory
>feature which was recently added may suffice, we wanted to try that out before proposing alternatives.
>I'm focussed on other work for 18.11 at the moment, so will get back to this topic after that release.

[Shally] For storage, in any case mbuf data buffer doesn't seem an apt data structure as they will always have some limitations.
External buffer attachment maybe worth considering though. I would be curious to know if it behave stably. 

Thanks
Shally

>
>>
>> Why I believe it is more simpler to manage:
>> -It keeps PMD implementation simple and lives with limitation/expectations of mbuf library
>> -Having length at input will also be a requirement if we add any other buffer structures types.
>>
>> Although I am not sure if it can solve checksum issue you mentioned as I don't fully understood how ISAL
>> uses dst buffer for checksum o/p. Does it write checksum in dst buffer for every op? Or just when last
>> chunk of data is processed?
>> We already have separate checksum variable at input/for output. So, doesn't it just copy from dst buffer to
>> op at output and op.produced updated with only data length?
>>
>> I remember, initially you had started a discussion to add new and more generic buffer structure type to use
>> in place of rte_mbufs for compression library , such as iovec which simply input pointer to raw bufs with
>> length.
>> Aren't we thinking in that direction anymore? I believe rather than reworking a library/PMD based/around
>> mbufs (unless we've any requirement to support mbufs so), we should define a buffer variant to better suit
>> compression and its users need. It also allow us to define lib to input multiple dst buffers either as an array
>> or SGL.
>>
>>
>> Thanks
>> Shally
>>
>> >It seems more logical that the PMD should take responsibility for appending.
>> >I.e. the application should pass in an empty mbuf, the PMD uses the rte_pktmbuf_tailroom()
>> >to know what space is available, uses this space as it needs and appends the output data to
>> >match the actual amount of data it writes.
>> >This method caters for an mbuf already containing some data, in this case the PMD will
>> >append the output AFTER the data already in the mbuf.
>> >It also needs to cater for SGL aka chained_mbuf case, though I'd propose in this case only
>> >the first mbuf in the chain is allowed to already contain data, the rest must be empty.
>> >An application can adjust a chain to satisfy this condition.
>> >
>> >Code impacts:
>> > * rte_comp_op.dst.offset should be deprecated.
>> > * comments on the m_dst would be changed to describe this usage
>> > * applications should alloc destination mbuf(s) from a pool, but not append.
>> > * PMDs should use append() to correctly reflect the amount of data returned.
>> >
>> >This turns out to be more complex than expected for SGLs as rte_mbuf chains DON'T support
>> >empty mbuf chains, i.e. several macros assume there's only tailroom available in the last
>> >segment of a chain. So essentially if an application chains a bunch of empty mbufs
>> >together the only tailroom available is the buf_len of the last mbuf and the space in
>> >all the other mbufs is "lost".
>> >We've investigated several ways around this, I think either options 1 or 2 would be ok, but
>> >am not keen on option 3. I'm looking for feedback please.
>> >
>> >Option1: create an rte_comp_mbuf, identical to rte_mbuf - probably just cast to it - with its own set of
>> >   macros.  Most of these wrap existing mbuf macros, those that need to cater for the empty chain case.
>> >Option2: pass in a pool on the API instead and the PMD allocs dest mbufs as needed and chains them.
>> >   Storage customers expect to use external buffers attached to mbufs so this may not suit their use-case.
>> >   Although it may be an option if all mbufs in the pool are pre-attached to external buffers.
>> >Option3: appl does as today. PMD trims space not used. Would need changes or additions to mbuf
>> macros
>> >   so it could trim from more than just the last segment. PMD would need to free unused segments.
>> >   I'm not keen on this as doing the work in 2 places and there's potential
>> >   to leak mbufs here as allocating them in API and freeing in PMD.
>> >
>> >Regards,
>> >Fiona
>> >

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-09-17  5:59 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-11 17:16 [dpdk-dev] RFC: compressdev: append dest data in PMDs instead of in application Trahe, Fiona
2018-09-15 11:31 ` [dpdk-dev] " Verma, Shally
2018-09-16 10:07   ` Trahe, Fiona
2018-09-17  5:59     ` Verma, Shally

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).