DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Lam, Tiago" <tiago.lam@intel.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>, dev@dpdk.org
Cc: linville@tuxdriver.com, Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] af_packet dev default "framesz" of 2048B
Date: Tue, 20 Nov 2018 10:28:43 +0000	[thread overview]
Message-ID: <fe7b3fc7-4cec-6086-2e06-9178169cc3d1@intel.com> (raw)
In-Reply-To: <507e0be8-f58b-9d8b-873f-9e1d45f14e93@intel.com>

On 16/11/2018 17:20, Ferruh Yigit wrote:
> On 11/15/2018 7:02 PM, Lam, Tiago wrote:
>> Hi guys,
>>
>> OvS-DPDK has recently had small a change that changed the data room
>> available in an mbuf (commit dfaf00e in OvS). This seems to have had the
>> consequence of breaking the initialisation of eth_af_packets interfaces,
>> when using default values ("options:dpdk-
>> devargs=eth_af_packet0,iface=enp61s0f3").
>>
>> After investigating, what seems to be happening is that the
>> eth_af_packet dev expects an available space of "2048B - TPACKET2_HDRLEN
>> + sizeof(struct sockaddr_ll) = 2016B" to be available in the data room
>> of each mbuf.  Previous to the above commit, OvS would allocate some
>> extra space, and this would mean there would be enough room for the
>> checks performed in eth_rx_queue_setup() and eth_dev_mtu_set() in
>> rte_eth_af_packet.c. However, with the recent commit that isn't the case
>> anymore, and without that extra space the first check in
>> eth_rx_queue_setup() will now be hit and setup of a eth_af_packet
>> interface fails.
>>
>> What I'm trying to understand here is, the logic behind setting a
>> default 'framesz' of 2048B and it being hardcoded (instead of being
>> based on the underlying MTU of the interface, or the mbuf data room
>> directly). The documentation in [1] for mmap() and setting up buffer
>> rings mentions the exact same values
>> (tp_block_size=4096,tp_frame_size=2048), which seem to have been
>> introduced on the first commit, back in 2014. The only constraint
>> for the framesize, it seems, its that it fits inside the blocksize (i.e.
>> doesn't span multiple blocksizes), and is aligned to TPACKET_ALIGNMENT.
> 
> Independent from Bruce's comment to not use smaller mbuf size, I believe
> af_packet can be updated to be more flexible.
> 
> Related to 'framesize', it has default value 2048 but can be updated via
> 'framesz=' devarg, so not exactly hardcoded. Your change looks good there,
> use devarg value if exist, get from underlying device and set default if it fails.
> 
> A few comment below.

Thanks for the review, Ferruh. This was only a proposal, but since you
reviewed it, I've addressed the comments and sent a formal patch:
http://patchwork.dpdk.org/patch/48200/

Thanks,
Tiago.

> 
>>
>> Thus, given those constraints, could we instead base the setting of the
>> framesize like on the below patch? It basically tries to set the
>> framesize to the MTU of the underlying interface +  TPACKET2_HDRLEN +
>> sizeof(structsockaddr_ll), aligned to TPACKET_ALIGNMENT. This would
>> allow applications such as OvS to use this interface by default, based
>> on a framesize set dynamically, instead of relying on a default of
>> 2048B; If one is setting up an eth_af_packet interface with MTU 9000B
>> and the underlying MTU is 1500B, for example, this will still fail, and
>> that's fine as they can always supply the "framesz" argument
>> ("options:dpdk-devargs=eth_af_packet0,iface=enp61s0f3,framesz=9000").
>> However, on the default case, or when one has configured manually the
>> underlying interface with the expected MTU already, this would work out
>> of the box.
>>
>> Any thoughts on this?
>>
>> Thanks,
>> Tiago.
>>
>> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
>> index 95a98c6..451cec3 100644
>> --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> @@ -24,6 +24,8 @@
>>  #include <unistd.h>
>>  #include <poll.h>
>>
>> +#define RTE_ROUNDUP(x, y) ((((x) + ((y) - 1)) / (y)) * (y))
> 
> This can go into more generic header
> 
> <...>
> 
>> @@ -622,6 +617,38 @@ rte_pmd_init_internals(struct rte_vdev_device *dev,
>>  		return -1;
>>  	}
>>  	memcpy(&(*internals)->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
>> +	if (!framesize) {
>> +		if (ioctl(sockfd, SIOCGIFMTU, &ifr) == -1) {
>> +			RTE_LOG(ERR, PMD,
>> +				"%s: ioctl failed (SIOCGIFMTU)",
>> +				name);
>> +			framesize = DFLT_FRAME_SIZE;
>> +		} else {
>> +			framesize = ifr.ifr_mtu;
>> +			framesize += TPACKET2_HDRLEN + sizeof(struct sockaddr_ll);
>> +			framesize = RTE_ROUNDUP(framesize, TPACKET_ALIGNMENT);
>> +        }
>> +	}
>> +
>> +	blockcnt = framecnt / (blocksize / framesize);
> 
> If returned framesize > blocksize, result will be 0 since both are uint, which
> will end framecnt/0 and crash.
> 
> <...>
> 
>> @@ -887,21 +913,8 @@ rte_eth_from_packet(struct rte_vdev_device *dev,
>>  		return -1;
>>  	}
>>
>> -	blockcount = framecount / (blocksize / framesize);
>> -	if (!blockcount) {
>> -		PMD_LOG(ERR,
>> -			"%s: invalid AF_PACKET MMAP parameters", name);
>> -		return -1;
>> -	}
>> -
>> -	PMD_LOG(INFO, "%s: AF_PACKET MMAP parameters:", name);
>> -	PMD_LOG(INFO, "%s:\tblock size %d", name, blocksize);
>> -	PMD_LOG(INFO, "%s:\tblock count %d", name, blockcount);
>> -	PMD_LOG(INFO, "%s:\tframe size %d", name, framesize);
>> -	PMD_LOG(INFO, "%s:\tframe count %d", name, framecount);
>> -
> 
> Why move this block or change rte_pmd_init_internals(), adding SIOCGIFMTU above
> this block looks simpler.
> 

      reply	other threads:[~2018-11-20 10:28 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-15 19:02 Lam, Tiago
2018-11-16 10:29 ` Burakov, Anatoly
2018-11-16 10:37 ` Bruce Richardson
2018-11-16 17:20 ` Ferruh Yigit
2018-11-20 10:28   ` Lam, Tiago [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fe7b3fc7-4cec-6086-2e06-9178169cc3d1@intel.com \
    --to=tiago.lam@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=linville@tuxdriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).