From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f172.google.com (mail-wi0-f172.google.com [209.85.212.172]) by dpdk.org (Postfix) with ESMTP id 5F0BC5A73 for ; Fri, 27 Feb 2015 15:13:59 +0100 (CET) Received: by wiwh11 with SMTP id h11so500724wiw.3 for ; Fri, 27 Feb 2015 06:13:59 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:organization :user-agent:in-reply-to:references:mime-version :content-transfer-encoding:content-type; bh=Gwe8o6EmQm3G/r2RVVmmhhfpmmXH5qZMz292iBl4VUI=; b=R6RdtbNGNRRIbB2Y4aPg32dOfXxGvyi2SxzzwwgxAfsO+r/v7/6DFwr4/ttxQ3RwsV Za1KyJQlZVRYvRMNvzo1EV8rJPOg1JUlFf09nDYUqlcu38DRVjuUmyYPzDgk6j+6bpeg B0QB3fTTmiMm55M4QImaDMs1sKMBBJYd/4DxowvsQNtfqej3BdY39ec/1cfIFR5ikbFY RberhG69kcZi5Z07hQl72eeREcSKICrwH2ZSt1XFKrh7WTmlm9C7ayyxuRCenL+QcXrE DgSMh/l1nYwkpq3QN14O5AfBr1t2Eh3DmzR4IRjuAKR3u+1xVUdxrU6Gfj5JiT2WJEn5 R0Gg== X-Gm-Message-State: ALoCoQn3oIjyImdLiRlzSOmD25cCTfLW44nK+NwpCuNEEuugWqUTc0f/LtX9oo02rHdrhOVlzmTg X-Received: by 10.180.206.14 with SMTP id lk14mr6814862wic.71.1425046439177; Fri, 27 Feb 2015 06:13:59 -0800 (PST) Received: from xps13.localnet (136-92-190-109.dsl.ovh.fr. [109.190.92.136]) by mx.google.com with ESMTPSA id j9sm6151589wjy.18.2015.02.27.06.13.57 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 27 Feb 2015 06:13:58 -0800 (PST) From: Thomas Monjalon To: "Liang, Cunming" Date: Fri, 27 Feb 2015 15:13:21 +0100 Message-ID: <7961783.gxFFSSThyb@xps13> Organization: 6WIND User-Agent: KMail/4.14.4 (Linux/3.18.4-1-ARCH; KDE/4.14.4; x86_64; ; ) In-Reply-To: References: <1424710542-14637-1-git-send-email-danny.zhou@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Cc: dev@dpdk.org Subject: Re: [dpdk-dev] [PATCH v6 4/8] eal/linux: add per rx queue interrupt handling based on VFIO X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Feb 2015 14:13:59 -0000 Hi Cunming, First, sorry to have to say that, but it is not easy to read discussion= s where quote marks are not used. I re-insert them for clarity. Comments below. 2015-02-27 12:22, Liang, Cunming: > From: David Marchand [mailto:david.marchand@6wind.com] > Sent: Friday, February 27, 2015 6:34 PM >=20 > > I am not really comfortable with this api. > >=20 > > This is just creating something on top of the standard epoll api wi= th > > limitations. In the end, we could just use an external lib that doe= s this > > already. >=20 > [Liang, Cunming] Not really, I think. We try to protect the data insi= de > =E2=80=98rte_intr_handle=E2=80=99, it doesn=E2=80=99t expect user to = understand the things defined > inside =E2=80=98rte_intr_handle=E2=80=99. > It=E2=80=99s better typedef =E2=80=98rte_intr_handle=E2=80=99 as a ra= w integer ID, having a function > to get it from a ethdev. Then all the interrupt api is around it. > It provides the common pci NIC devices rxtx interrupt processing appr= oach. > For the limitations, we can fix it step by step. >=20 > > So ok, this will work for your limited use case, but this will not = be > > really useful for anything else. > > Not sure it has its place in eal, this is more an example to me. >=20 > [Liang, Cunming] =E2=80=98limited use case=E2=80=99 do you means only= for rxtx ? > It don=E2=80=99t expect to provide a generic event mechanism (like li= bev/libevent > does), but a simple way to allow PMD work with DMA interrupt. It main= ly > abstract for rx interrupt purpose. I appreciate if you could help to = list > more useful cases. You don't expect to provide a generic event mechanism but application developpers could need to wait for many events at once, not only Rx one= s. That's why it's better to provide only the needed parts to use somethin= g generic like libevent. And we should avoid reinventing the wheel. > > > +static void > > > +eal_intr_process_rxtx_interrupts(struct rte_intr_handle *intr_ha= ndle, > > > + struct epoll_event *events, > > > + uint32_t *vec, int nfds) > > > +{ > > > + int i, bytes_read; > > > + union rte_intr_read_buffer buf; > > > + int fd; > > > + > > > + for (i =3D 0; i < nfds; i++) { > > > + /* set the length to be read for different handle= type */ > > > + switch (intr_handle->type) { > > > + case RTE_INTR_HANDLE_UIO: > > > + bytes_read =3D sizeof(buf.uio_intr_count)= ; > > > + break; > > > + case RTE_INTR_HANDLE_ALARM: > > > + bytes_read =3D sizeof(buf.timerfd_num); > > > + break; > > > +#ifdef VFIO_PRESENT > > > + case RTE_INTR_HANDLE_VFIO_MSIX: > > > + case RTE_INTR_HANDLE_VFIO_MSI: > > > + case RTE_INTR_HANDLE_VFIO_LEGACY: > > > + bytes_read =3D sizeof(buf.vfio_intr_count= ); > > > + break; > > > +#endif > > > + default: > > > + bytes_read =3D 1; > > > + break; > > > + } > > > + > > > + /** > > > + * read out to clear the ready-to-be-read flag > > > + * for epoll_wait. > > > + */ > > > + vec[i] =3D events[i].data.u32; > > > + assert(vec[i] < VFIO_MAX_RXTX_INTR_ID); > > > + > > > + fd =3D intr_handle->efds[vec[i]]; > > > + bytes_read =3D read(fd, &buf, bytes_read); > > > + if (bytes_read < 0) > > > + RTE_LOG(ERR, EAL, "Error reading from fil= e " > > > + "descriptor %d: %s\n", fd, strerr= or(errno)); > > > + else if (bytes_read =3D=3D 0) > > > + RTE_LOG(ERR, EAL, "Read nothing from file= " > > > + "descriptor %d\n", fd); > > > + } > > > +} > >=20 > > Why unconditionnally read ? > > You are absorbing events from the application if the application ga= ve you > > an external epfd and populated it with its own fds. >=20 > [Liang, Cunming] The vector number was checked. If an external epfd > populated some event carry fd rather than a data.u32 but the value > inside the valid range, it considers as a valid vector number. No mat= ter > the read success or not, it always notify the event. Do you have any > suggestion used here to check the condition ? >=20 > > > +static int init_tls_epfd(void) > > > +{ > > > + int pfd =3D epoll_create(1); > > > + if (pfd < 0) { > > > + RTE_LOG(ERR, EAL, > > > + "Cannot create epoll instance\n"); > > > + return -1; > > > + } > > > + return pfd; > > > +} > > > + > > > +int > > > +rte_intr_rx_wait(struct rte_intr_handle *intr_handle, int epfd, > > > + uint32_t *vec, uint16_t num) > > > +{ > >=20 > > In the end, this "rx" does not mean anything to eal. >=20 > [Liang, Cunming] That=E2=80=99s a good point. I tried to remove =E2=80= =98rx=E2=80=99 and use a > generic word here. =E2=80=98rte_intr_wait=E2=80=99 looks like too gen= eric, > =E2=80=98rte_intr_epfd_wait=E2=80=99 looks not abstract with bsd. > As the function only serves for rxtx vector, so using the rx prefix. > Which name do you prefer ? You should understand that you are trying to wrongly replace a generic = lib. The best name is probably /dev/null. > > > +#define MAX_EVENTS 8 > > > + struct epoll_event events[MAX_EVENTS]; > > > + int ret, nfds =3D 0; > > > + > > > + if (!intr_handle || !vec) { > > > + RTE_LOG(ERR, EAL, "invalid input parameter\n"); > > > + return -1; > > > + } > > > + > > > + if (intr_handle->type !=3D RTE_INTR_HANDLE_VFIO_MSIX) { > > > + RTE_LOG(ERR, EAL, "intr type should be VFIO_MSIX\= n"); > > > + return -1; > > > + } > > > + > > > + if (epfd =3D=3D RTE_EPOLL_FD_ANY) { > > > + /* using per thread epoll fd */ > > > + if (unlikely(RTE_PER_LCORE(_epfd) =3D=3D -1)) > > > + RTE_PER_LCORE(_epfd) =3D init_tls_epfd();= > > > + epfd =3D RTE_PER_LCORE(_epfd); > > > + } > >=20 > > Rather than testing every time, this should be set by the caller, > > i.e. epfd is always valid. > > If application does not want to create a epfd, then it calls > > rte_intr_rx_wait with RTE_EPOLL_FD_ANY (this name is not well chose= n) > > that is a macro wrapped to RTE_PER_LCORE(_epfd). >=20 > [Liang, Cunming] It sounds good to me. As we don=E2=80=99t expect to = expose > *rte_per_lcore__epfd* as an public symbol, so will define rte_epfd() > instread. > Within rte_epfd(), if RTE_PER_LCORE(_epfd) not assigned, then > init_tls_epfd() once. >=20 > > init_tls_epfd() should be called only once at init time. > > No need to check every time. >=20 > [Liang, Cunming] As it probably not need per thread epfd at all. > So I prefer to create it when it real needed as above I mentioned. > > > + do { > > > + ret =3D epoll_wait(epfd, events, > > > + RTE_MIN(num, MAX_EVENTS), > > > + EAL_INTR_EPOLL_WAIT_FOREVER); > > > + if (unlikely(ret < 0)) { > > > + /* epoll_wait fail */ > > > + RTE_LOG(ERR, EAL, "epoll_wait returns wit= h fail\n"); > > > + return -1; > > > + } else if (ret > 0) { > > > + /* epoll_wait has at least one fd ready t= o read */ > > > + eal_intr_process_rxtx_interrupts(intr_han= dle, events, > > > + vec, ret= ); > > > + num -=3D ret; > > > + vec +=3D ret; > > > + nfds +=3D ret; > > > + } else if (nfds > 0) > > > + break; > > > + } while (num > 0); > > > + > > > + return nfds; > > > +} > > > > You are blocking unless all fds have been set, so you are serialisi= ng > > all events. >=20 > [Liang, Cunming] I=E2=80=99m not sure fully got your point. If any ev= ent arrives, > it gets back. Do you means if no fds added in, it=E2=80=99s always bl= ocking. > You expect to have a timeout return ? > > > RTE_LOG(ERR, EAL, " cannot get IRQ info,= " > > >=20 > > > - "error %i (%s)\n", errno,= strerror(errno)); > > > + "error %i (%s)\n", errno, strerro= r(errno)); > >=20 > > Garbage, this has nothing to do with the patch. >=20 > [Liang, Cunming] It=E2=80=99s for line number exceed 80 margin compla= in. The title of the patch is "add per rx queue interrupt handling based on= VFIO". So this kind of modification is a garbage. Sorry, I won't play the game "idem / the same" below ;) > > > - if (internal_config.vfio_intr_mode !=3D R= TE_INTR_MODE_NONE) { > > > + if (internal_config.vfio_intr_mode !=3D > > > + RTE_INTR_MODE_NONE) { > > > RTE_LOG(ERR, EAL, > > > - " interrupt vect= or does not support eventfd!\n"); > > > + " interrupt vector " > > > + "does not support eventfd= !\n");> >=20 > >=20 > > Idem. >=20 > [Liang, Cunming] The same. > > > - "error %i (%s)\n", errno,= strerror(errno)); > > > + "error %i (%s)\n", errno, strerro= r(errno)); > >=20 > > Idem. >=20 > [Liang, Cunming] The same. > > > dev->intr_handle.vfio_dev_fd =3D vfio_dev_fd; > > > - > >=20 > > Idem. >=20 > [Liang, Cunming] Accept.