DPDK patches and discussions
* [dpdk-dev] two tso related questions
@ 2014-12-15 20:20 Helmut Sim
  2014-12-16  9:10 ` Alex Markuze
  0 siblings, 1 reply; 10+ messages in thread
From: Helmut Sim @ 2014-12-15 20:20 UTC (permalink / raw)
  To: dev

Hi,

While working on a TSO-based solution I faced the following two questions:

1.
Is there a maximum pkt_len to be used with TSO? E.g., if seg_sz
is 1400, can the entire segmented pkt be 256K (higher than 64K)? Then the
driver gets a list of chained mbufs while the first mbuf is set to TSO
offload.

2.
I wonder, is there a specific reason why TSO is supported only for IXGBE
and not for IGB? The 82576 NIC supports TSO, though.
Is it due to some kind of technical barrier, or is it because of priorities?

It would be great if someone from the forum could address this.

Thanks,
Sim

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2014-12-15 20:20 [dpdk-dev] two tso related questions Helmut Sim
@ 2014-12-16  9:10 ` Alex Markuze
  2014-12-16 12:24   ` Helmut Sim
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Markuze @ 2014-12-16  9:10 UTC (permalink / raw)
  To: Helmut Sim; +Cc: dev

On Mon, Dec 15, 2014 at 10:20 PM, Helmut Sim <simhelmut@gmail.com> wrote:
>
> Hi,
>
> While working on a TSO-based solution I faced the following two questions:
>
> 1.
> Is there a maximum pkt_len to be used with TSO? E.g., if seg_sz
> is 1400, can the entire segmented pkt be 256K (higher than 64K)? Then the
> driver gets a list of chained mbufs while the first mbuf is set to TSO
> offload.
>

TSO segments a TCP packet into MTU-sized pieces. The TCP/IP protocols are
limited to 64K because the length fields are 16 bits wide. You can't build a
valid packet longer than 64K regardless of the NIC.
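
To put a number on the 16-bit limit mentioned above: the IPv4 Total Length
field is 16 bits wide, so a single IPv4 packet can describe at most 65535
bytes, headers included. The Python sketch below works that out. The 20-byte
header sizes are an assumption (no IP or TCP options), and the helper name is
made up for illustration.

```python
# IPv4 "Total Length" is a 16-bit field, so its largest value is 2^16 - 1.
IPV4_TOTAL_LENGTH_BITS = 16
MAX_IPV4_TOTAL_LENGTH = (1 << IPV4_TOTAL_LENGTH_BITS) - 1


def max_tcp_payload(ip_hdr_len=20, tcp_hdr_len=20):
    """Largest TCP payload one IPv4 packet can carry (hypothetical helper).

    Header sizes default to 20 bytes each, assuming no IP or TCP options.
    """
    return MAX_IPV4_TOTAL_LENGTH - ip_hdr_len - tcp_hdr_len


print(MAX_IPV4_TOTAL_LENGTH)  # 65535
print(max_tcp_payload())      # 65495
```

This only describes a single packet as it appears on the wire; it says nothing
about what a NIC accepts as input for segmentation.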


> 2.
> I wonder, is there a specific reason why TSO is supported only for IXGBE
> and not for IGB? The 82576 NIC supports TSO, though.
> Is it due to some kind of technical barrier, or is it because of priorities?
>
> It would be great if someone from the forum could address this.
>
> Thanks,
> Sim
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2014-12-16  9:10 ` Alex Markuze
@ 2014-12-16 12:24   ` Helmut Sim
  2014-12-16 14:04     ` Alex Markuze
  0 siblings, 1 reply; 10+ messages in thread
From: Helmut Sim @ 2014-12-16 12:24 UTC (permalink / raw)
  To: Alex Markuze; +Cc: dev

Thanks Alex,

So I'm probably missing something...
What you are saying is correct for IP fragmentation, where the fragmentation
is at the IP level and all fragments are identified according to the
Identification field in the IP header.

However, in TCP segmentation the segments are at the TCP level (aren't they?),
where each frame has a size of
MSS+sizeof(tcp_hdr)+sizeof(ip_hdr)+sizeof(eth_hdr).
Hence, for each of the sent packets, the IP Identification is 0 and the IP
total length is MSS+sizeof(tcp_hdr)+sizeof(ip_hdr).
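
As a quick check of the arithmetic above, here is a small Python sketch that
computes the IP total length and the frame size of one full-sized segment. The
header sizes are assumptions (14-byte Ethernet header without a VLAN tag,
20-byte IP and TCP headers without options), and the function name is only
illustrative.

```python
# Assumed header sizes: no VLAN tag, no IP options, no TCP options.
ETH_HDR_LEN = 14
IP_HDR_LEN = 20
TCP_HDR_LEN = 20


def segment_sizes(mss):
    """Return (ip_total_length, frame_length) for one full-sized segment."""
    ip_total_length = mss + TCP_HDR_LEN + IP_HDR_LEN
    frame_length = ip_total_length + ETH_HDR_LEN
    return ip_total_length, frame_length


ip_len, frame_len = segment_sizes(1400)
print(ip_len, frame_len)  # 1440 1454
```

With an MSS of 1400, each segment's IP total length is far below the 16-bit
maximum, whatever the size of the buffer the segments were cut from.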

Please correct me if I am wrong.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2014-12-16 12:24   ` Helmut Sim
@ 2014-12-16 14:04     ` Alex Markuze
  2014-12-17  7:17       ` Helmut Sim
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Markuze @ 2014-12-16 14:04 UTC (permalink / raw)
  To: Helmut Sim; +Cc: dev

On Tue, Dec 16, 2014 at 2:24 PM, Helmut Sim <simhelmut@gmail.com> wrote:
>
> Thanks Alex,
>
> So I'm probably missing something...
> What you are saying is correct for IP fragmentation, where the fragmentation
> is at the IP level and all fragments are identified according to the
> Identification field in the IP header.
>
> However, in TCP segmentation the segments are at the TCP level (aren't they?),
> where each frame has a size of
> MSS+sizeof(tcp_hdr)+sizeof(ip_hdr)+sizeof(eth_hdr).
> Hence, for each of the sent packets, the IP Identification is 0 and the IP
> total length is MSS+sizeof(tcp_hdr)+sizeof(ip_hdr).
>
> Please correct me if I am wrong.
>
TSO takes one packet with a maximum size of 64KB (not counting the MAC/VLAN
header size) and breaks it into valid MTU-sized packets, each with its own IP
and TCP header. I'm not sure how the Identification/Fragment Offset fields are
filled. You can easily check it by running a short TCP stream (iperf/netperf)
between two machines and capturing the packets with tcpdump (open the capture
in Wireshark). Use ethtool -K to disable LRO/GRO (otherwise the receive-side
kernel driver will rearrange the headers).

I hope this helps.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2014-12-16 14:04     ` Alex Markuze
@ 2014-12-17  7:17       ` Helmut Sim
  2014-12-17 13:02         ` Olivier MATZ
  0 siblings, 1 reply; 10+ messages in thread
From: Helmut Sim @ 2014-12-17  7:17 UTC (permalink / raw)
  To: Alex Markuze; +Cc: dev

Thanks, I will check this.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2014-12-17  7:17       ` Helmut Sim
@ 2014-12-17 13:02         ` Olivier MATZ
  2015-01-04  8:50           ` Helmut Sim
  0 siblings, 1 reply; 10+ messages in thread
From: Olivier MATZ @ 2014-12-17 13:02 UTC (permalink / raw)
  To: Helmut Sim, Alex Markuze; +Cc: dev

Hi Helmut,

On 12/17/2014 08:17 AM, Helmut Sim wrote:
>>>>> While working on a TSO-based solution I faced the following two questions:
>>>>>
>>>>> 1.
>>>>> Is there a maximum pkt_len to be used with TSO? E.g., if seg_sz
>>>>> is 1400, can the entire segmented pkt be 256K (higher than 64K)? Then the
>>>>> driver gets a list of chained mbufs while the first mbuf is set to TSO
>>>>> offload.

I think the limitations depend on:

- the window size advertised by the peer: your stack should handle this
   and not generate more packets than the peer can receive

- the driver: on ixgbe, the maximum payload length is 2^18. I don't know
   if there is a limitation on the number of chained descriptors.

I think we should define a way to know this limitation in the API. Maybe
a comment saying that the TSO length should not be higher than 256KB (or
fix it to 64KB in case future drivers do not support 256KB) is enough.
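
As a rough illustration of the two candidate limits, the Python sketch below
counts how many segments each would produce for the seg_sz of 1400 from the
original question. The 2^18 value is simply the ixgbe figure quoted above; the
constant and function names are made up for this example and are not part of
any DPDK API.

```python
IXGBE_MAX_TSO_PAYLOAD = 2 ** 18     # 262144 bytes (256KB), as cited for ixgbe
CONSERVATIVE_TSO_LIMIT = 64 * 1024  # 65536 bytes (64KB), the fallback value


def segment_count(payload_len, seg_sz):
    """Number of segments needed for payload_len bytes (ceiling division)."""
    return (payload_len + seg_sz - 1) // seg_sz


def fits_limit(payload_len, limit):
    """True if a TSO request of payload_len bytes stays within limit."""
    return payload_len <= limit


for limit in (CONSERVATIVE_TSO_LIMIT, IXGBE_MAX_TSO_PAYLOAD):
    print(limit, segment_count(limit, 1400))
# 65536 47
# 262144 188
```

A stack that wants to stay portable across drivers could check each request
with something like fits_limit() against the smaller value before handing the
chain to the driver.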

Regards,
Olivier

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2014-12-17 13:02         ` Olivier MATZ
@ 2015-01-04  8:50           ` Helmut Sim
  2015-01-04  9:57             ` Alex Markuze
  0 siblings, 1 reply; 10+ messages in thread
From: Helmut Sim @ 2015-01-04  8:50 UTC (permalink / raw)
  To: Olivier MATZ; +Cc: dev

Hi Alex and Olivier,

Alex, I ran the test, and the segmentation is not at the IP level (i.e.,
each packet's IP total length reflects the MSS length), hence the 16-bit
total length limitation is not relevant here.
I went over the 82599 datasheet and, as Olivier mentioned, it is an 18-bit
field, hence allowing up to 256KB length.

Olivier, although the TCP window size field is 16 bits, the advertised window
is typically higher than 64KB thanks to the TCP window scaling option (which
is common today).

Hence I think that the API should allow at least up to 256KB packet length,
while finding a solution to make sure it also supports lower lengths for
other NICs.

Any idea?

Sim

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2015-01-04  8:50           ` Helmut Sim
@ 2015-01-04  9:57             ` Alex Markuze
  2015-01-04 10:13               ` Helmut Sim
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Markuze @ 2015-01-04  9:57 UTC (permalink / raw)
  To: Helmut Sim; +Cc: dev

On Sun, Jan 4, 2015 at 10:50 AM, Helmut Sim <simhelmut@gmail.com> wrote:

> Hi Alex and Olivier,
>
> Alex, I ran the test, and the segmentation is not at the IP level (i.e.,
> each packet's IP total length reflects the MSS length), hence the 16-bit
> total length limitation is not relevant here.
>

Helmut, thanks for reporting back. This is interesting but doesn't come as a
surprise, as the headers must be correct when on the wire. What I couldn't
tell you is what happens with the Identification/Fragment Offset fields.

The IP length limitation comes from the send-side network stack.
Theoretically it's possible to send a packet of any size, as long as your
network stack doesn't mind sending a packet with a malformed IP header (as
the length field is not defined).

The send-side HW receives a single packet with the IP length of the whole
packet. Assuming that the ixgbe HW takes the packet length for its TSO
segmentation from the TX descriptor rather than from the IP header, it
should be able to send as much as the HW supports.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2015-01-04  9:57             ` Alex Markuze
@ 2015-01-04 10:13               ` Helmut Sim
  2015-01-05  8:53                 ` Olivier MATZ
  0 siblings, 1 reply; 10+ messages in thread
From: Helmut Sim @ 2015-01-04 10:13 UTC (permalink / raw)
  To: Alex Markuze; +Cc: dev

Correct.
In that case, a modified API should not require setting the ip_hdr
total_length field, which is 16 bits.
The HW will assign the correct packet length for each transmitted IP packet,
which is l3_len+l4_len+mss (except for the last segment, which may be smaller
than mss).
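
The per-segment length rule above can be sketched in a few lines of Python.
This only models the arithmetic, not what any particular NIC does; the
function name is hypothetical, and the default l3_len and l4_len of 20 bytes
assume IP and TCP headers without options.

```python
def tso_ip_total_lengths(payload_len, mss, l3_len=20, l4_len=20):
    """IP total length of each segment cut from payload_len bytes of TCP data.

    Every segment carries mss bytes of payload except possibly the last one,
    which carries whatever remains.
    """
    lengths = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(mss, remaining)
        lengths.append(l3_len + l4_len + chunk)
        remaining -= chunk
    return lengths


lengths = tso_ip_total_lengths(256 * 1024, 1400)
print(len(lengths), lengths[0], lengths[-1])  # 188 1440 384
print(max(lengths) <= 0xFFFF)                 # True
```

Even for a 256KB input, every resulting packet has an IP total length that
fits comfortably in 16 bits, which is why only the input-side length is
problematic.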

Sim

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [dpdk-dev] two tso related questions
  2015-01-04 10:13               ` Helmut Sim
@ 2015-01-05  8:53                 ` Olivier MATZ
  0 siblings, 0 replies; 10+ messages in thread
From: Olivier MATZ @ 2015-01-05  8:53 UTC (permalink / raw)
  To: Helmut Sim, Alex Markuze; +Cc: dev

Hi,

On 01/04/2015 11:13 AM, Helmut Sim wrote:
> In that case, a modified API should not require setting the ip_hdr
> total_length field, which is 16 bits.
> The HW will assign the correct packet length for each transmitted IP
> packet, which is l3_len+l4_len+mss (except for the last segment, which may
> be smaller than mss).
> [...]
>         I went over the 82599 datasheet and, as Olivier mentioned, it is an
>         18-bit field, hence allowing up to 256KB length.
> 
>         Olivier, although the TCP window size field is 16 bits, the
>         advertised window is typically higher than 64KB thanks to the TCP
>         window scaling option (which is common today).
> 
>         Hence I think that the API should allow at least up to 256KB
>         packet length, while finding a solution to make sure it also
>         supports lower lengths for other NICs.


I don't think that the maximum TSO packet should be bigger than
what we have. TSO does not exempt you from implementing a TCP stack, and
it is not designed to send megabytes of data without the intervention
of the TCP stack.

The objective is to accelerate the segmentation of packets. Indeed,
without TSO, the main costs are the segmentation itself (usually
at ~1.5K) and the fact that each 1.5K packet goes through the low
layer code (driver).

TSO solves these 2 problems even with a length limit at 64K: it
would represent ~40 times fewer packets to segment and transmit to
the driver, dividing the cost by the same amount. I think increasing
the max length won't make any difference in terms of performance.
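
A back-of-the-envelope check of the "~40 times" figure, as a Python sketch. It
assumes roughly 1500 bytes per packet, as in the text above, and ignores
header overhead; the function name is only illustrative.

```python
def packets_through_driver(payload_len, per_packet_len):
    """Packets the stack must build and hand to the driver without TSO."""
    return -(-payload_len // per_packet_len)  # ceiling division


at_64k = packets_through_driver(64 * 1024, 1500)
at_256k = packets_through_driver(256 * 1024, 1500)
print(at_64k)   # 44: one 64K TSO request replaces about 44 packets
print(at_256k)  # 175: one 256K TSO request replaces about 175 packets
```

With a 64K limit the per-packet software cost is already divided by about 44,
so under these assumptions less than 3% of it remains for a larger limit to
remove.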

Regards,
Olivier

^ permalink raw reply	[flat|nested] 10+ messages in thread

