Doing basic operations like info_get or get_stats was broken in af_xdp PMD. The info_get would crash because dev->device was NULL in secondary process. Fix this by doing same initialization as af_packet and tap devices. The get_stats would crash because the XDP socket is not open in primary process. As a workaround don't query kernel for dropped packets when called from secondary process. Note: this does not address the other bug which is that transmitting in secondary process is broken because the send() in tx_kick will fail because XDP socket fd is not valid in secondary process. Bugzilla ID: 805 Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD") Cc: stable@dpdk.org Cc: xiaolong.ye@intel.com Ciara Loftus <ciara.loftus@intel.com> Qi Zhang <qi.z.zhang@intel.com> Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- drivers/net/af_xdp/rte_eth_af_xdp.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c index 74ffa4511284..70abc14fa753 100644 --- a/drivers/net/af_xdp/rte_eth_af_xdp.c +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c @@ -860,7 +860,7 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) struct pkt_rx_queue *rxq; struct pkt_tx_queue *txq; socklen_t optlen; - int i, ret; + int i; for (i = 0; i < dev->data->nb_rx_queues; i++) { optlen = sizeof(struct xdp_statistics); @@ -876,13 +876,12 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) stats->ibytes += stats->q_ibytes[i]; stats->imissed += rxq->stats.rx_dropped; stats->oerrors += txq->stats.tx_dropped; - ret = getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, - XDP_STATISTICS, &xdp_stats, &optlen); - if (ret != 0) { - AF_XDP_LOG(ERR, "getsockopt() failed for XDP_STATISTICS.\n"); - return -1; - } - stats->imissed += xdp_stats.rx_dropped; + + /* The socket fd is not valid in secondary process */ + if (rte_eal_process_type() != RTE_PROC_SECONDARY && + getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, + XDP_STATISTICS, &xdp_stats, &optlen) == 0) + stats->imissed += xdp_stats.rx_dropped; stats->opackets += stats->q_opackets[i]; stats->obytes += stats->q_obytes[i]; @@ -1799,7 +1798,9 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) AF_XDP_LOG(ERR, "Failed to probe %s\n", name); return -EINVAL; } + /* TODO: reconnect socket from primary */ eth_dev->dev_ops = &ops; + eth_dev->device = &dev->device; rte_eth_dev_probing_finish(eth_dev); return 0; } -- 2.30.2
On 9/3/2021 5:15 PM, Stephen Hemminger wrote: > Doing basic operations like info_get or get_stats was broken > in af_xdp PMD. The info_get would crash because dev->device > was NULL in secondary process. Fix this by doing same initialization > as af_packet and tap devices. > > The get_stats would crash because the XDP socket is not open in > primary process. As a workaround don't query kernel for dropped > packets when called from secondary process. > > Note: this does not address the other bug which is that transmitting > in secondary process is broken because the send() in tx_kick > will fail because XDP socket fd is not valid in secondary process. > > Bugzilla ID: 805 > Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD") > Cc: stable@dpdk.org > Cc: xiaolong.ye@intel.com > Ciara Loftus <ciara.loftus@intel.com> > Qi Zhang <qi.z.zhang@intel.com> > Anatoly Burakov <anatoly.burakov@intel.com> > +cc Ciara, Qi & Anatoly > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> > --- > drivers/net/af_xdp/rte_eth_af_xdp.c | 17 +++++++++-------- > 1 file changed, 9 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c > index 74ffa4511284..70abc14fa753 100644 > --- a/drivers/net/af_xdp/rte_eth_af_xdp.c > +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c > @@ -860,7 +860,7 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) > struct pkt_rx_queue *rxq; > struct pkt_tx_queue *txq; > socklen_t optlen; > - int i, ret; > + int i; > > for (i = 0; i < dev->data->nb_rx_queues; i++) { > optlen = sizeof(struct xdp_statistics); > @@ -876,13 +876,12 @@ eth_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) > stats->ibytes += stats->q_ibytes[i]; > stats->imissed += rxq->stats.rx_dropped; > stats->oerrors += txq->stats.tx_dropped; > - ret = getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, > - XDP_STATISTICS, &xdp_stats, &optlen); > - if (ret != 0) { > - AF_XDP_LOG(ERR, "getsockopt() failed for XDP_STATISTICS.\n"); > - return -1; > - } > - stats->imissed += xdp_stats.rx_dropped; > + > + /* The socket fd is not valid in secondary process */ > + if (rte_eal_process_type() != RTE_PROC_SECONDARY && > + getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, > + XDP_STATISTICS, &xdp_stats, &optlen) == 0) > + stats->imissed += xdp_stats.rx_dropped; > > stats->opackets += stats->q_opackets[i]; > stats->obytes += stats->q_obytes[i]; > @@ -1799,7 +1798,9 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev) > AF_XDP_LOG(ERR, "Failed to probe %s\n", name); > return -EINVAL; > } > + /* TODO: reconnect socket from primary */ > eth_dev->dev_ops = &ops; > + eth_dev->device = &dev->device; > rte_eth_dev_probing_finish(eth_dev); > return 0; > } >
> -----Original Message----- > From: dev <dev-bounces@dpdk.org> On Behalf Of Stephen Hemminger > Sent: Friday 3 September 2021 17:15 > To: dev@dpdk.org > Cc: Stephen Hemminger <stephen@networkplumber.org>; > stable@dpdk.org; xiaolong.ye@intel.com > Subject: [dpdk-dev] [PATCH] net/af_xdp: fix support of secondary process > > Doing basic operations like info_get or get_stats was broken > in af_xdp PMD. The info_get would crash because dev->device > was NULL in secondary process. Fix this by doing same initialization > as af_packet and tap devices. > > The get_stats would crash because the XDP socket is not open in > primary process. As a workaround don't query kernel for dropped > packets when called from secondary process. > > Note: this does not address the other bug which is that transmitting > in secondary process is broken because the send() in tx_kick > will fail because XDP socket fd is not valid in secondary process. Hi Stephen, Apologies for the delayed reply, I was on vacation. In the Bugzilla report you suggest we: "mark AF_XDP as broken in with primary/secondary and return an error in probe in secondary process". I agree with this suggestion. However with this patch we still permit secondary, and just make sure it doesn't crash for get_stats. Did you change your mind? Personally, I would prefer to have primary/secondary either working 100% or else not allowed at all by throwing an error during probe. What do you think? Do you have a reason/use case to permit secondary processes despite some features not being available eg. full stats, tx? Thanks, Ciara > > Bugzilla ID: 805 > Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD") > Cc: stable@dpdk.org > Cc: xiaolong.ye@intel.com > Ciara Loftus <ciara.loftus@intel.com> > Qi Zhang <qi.z.zhang@intel.com> > Anatoly Burakov <anatoly.burakov@intel.com> > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> > --- > drivers/net/af_xdp/rte_eth_af_xdp.c | 17 +++++++++-------- > 1 file changed, 9 insertions(+), 8 deletions(-) > > diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c > b/drivers/net/af_xdp/rte_eth_af_xdp.c > index 74ffa4511284..70abc14fa753 100644 > --- a/drivers/net/af_xdp/rte_eth_af_xdp.c > +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c > @@ -860,7 +860,7 @@ eth_stats_get(struct rte_eth_dev *dev, struct > rte_eth_stats *stats) > struct pkt_rx_queue *rxq; > struct pkt_tx_queue *txq; > socklen_t optlen; > - int i, ret; > + int i; > > for (i = 0; i < dev->data->nb_rx_queues; i++) { > optlen = sizeof(struct xdp_statistics); > @@ -876,13 +876,12 @@ eth_stats_get(struct rte_eth_dev *dev, struct > rte_eth_stats *stats) > stats->ibytes += stats->q_ibytes[i]; > stats->imissed += rxq->stats.rx_dropped; > stats->oerrors += txq->stats.tx_dropped; > - ret = getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, > - XDP_STATISTICS, &xdp_stats, &optlen); > - if (ret != 0) { > - AF_XDP_LOG(ERR, "getsockopt() failed for > XDP_STATISTICS.\n"); > - return -1; > - } > - stats->imissed += xdp_stats.rx_dropped; > + > + /* The socket fd is not valid in secondary process */ > + if (rte_eal_process_type() != RTE_PROC_SECONDARY && > + getsockopt(xsk_socket__fd(rxq->xsk), SOL_XDP, > + XDP_STATISTICS, &xdp_stats, &optlen) == 0) > + stats->imissed += xdp_stats.rx_dropped; > > stats->opackets += stats->q_opackets[i]; > stats->obytes += stats->q_obytes[i]; > @@ -1799,7 +1798,9 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device > *dev) > AF_XDP_LOG(ERR, "Failed to probe %s\n", name); > return -EINVAL; > } > + /* TODO: reconnect socket from primary */ > eth_dev->dev_ops = &ops; > + eth_dev->device = &dev->device; > rte_eth_dev_probing_finish(eth_dev); > return 0; > } > -- > 2.30.2
On Mon, 20 Sep 2021 13:23:57 +0000
"Loftus, Ciara" <ciara.loftus@intel.com> wrote:
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Stephen Hemminger
> > Sent: Friday 3 September 2021 17:15
> > To: dev@dpdk.org
> > Cc: Stephen Hemminger <stephen@networkplumber.org>;
> > stable@dpdk.org; xiaolong.ye@intel.com
> > Subject: [dpdk-dev] [PATCH] net/af_xdp: fix support of secondary process
> >
> > Doing basic operations like info_get or get_stats was broken
> > in af_xdp PMD. The info_get would crash because dev->device
> > was NULL in secondary process. Fix this by doing same initialization
> > as af_packet and tap devices.
> >
> > The get_stats would crash because the XDP socket is not open in
> > primary process. As a workaround don't query kernel for dropped
> > packets when called from secondary process.
> >
> > Note: this does not address the other bug which is that transmitting
> > in secondary process is broken because the send() in tx_kick
> > will fail because XDP socket fd is not valid in secondary process.
>
> Hi Stephen,
>
> Apologies for the delayed reply, I was on vacation.
>
> In the Bugzilla report you suggest we:
> "mark AF_XDP as broken in with primary/secondary
> and return an error in probe in secondary process".
> I agree with this suggestion. However with this patch we still permit secondary, and just make sure it doesn't crash for get_stats. Did you change your mind?
> Personally, I would prefer to have primary/secondary either working 100% or else not allowed at all by throwing an error during probe. What do you think? Do you have a reason/use case to permit secondary processes despite some features not being available eg. full stats, tx?
>
> Thanks,
> Ciara
There are two cases where secondary is useful even if send/receive can't work from secondary process.
The pdump and proc-info applications can work with these patches.
I am using XDP over pdump as an easy way to get packets into the code for testing.
The flag in the documentation doesn't have a "limited" version.
If you want, will send another patch to disable secondary support.
Supporting secondary, means adding a mechanism to pass the socket around.
> > On Mon, 20 Sep 2021 13:23:57 +0000 > "Loftus, Ciara" <ciara.loftus@intel.com> wrote: > > > > -----Original Message----- > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Stephen Hemminger > > > Sent: Friday 3 September 2021 17:15 > > > To: dev@dpdk.org > > > Cc: Stephen Hemminger <stephen@networkplumber.org>; > > > stable@dpdk.org; xiaolong.ye@intel.com > > > Subject: [dpdk-dev] [PATCH] net/af_xdp: fix support of secondary > process > > > > > > Doing basic operations like info_get or get_stats was broken > > > in af_xdp PMD. The info_get would crash because dev->device > > > was NULL in secondary process. Fix this by doing same initialization > > > as af_packet and tap devices. > > > > > > The get_stats would crash because the XDP socket is not open in > > > primary process. As a workaround don't query kernel for dropped > > > packets when called from secondary process. > > > > > > Note: this does not address the other bug which is that transmitting > > > in secondary process is broken because the send() in tx_kick > > > will fail because XDP socket fd is not valid in secondary process. > > > > Hi Stephen, > > > > Apologies for the delayed reply, I was on vacation. > > > > In the Bugzilla report you suggest we: > > "mark AF_XDP as broken in with primary/secondary > > and return an error in probe in secondary process". > > I agree with this suggestion. However with this patch we still permit > secondary, and just make sure it doesn't crash for get_stats. Did you change > your mind? > > Personally, I would prefer to have primary/secondary either working 100% > or else not allowed at all by throwing an error during probe. What do you > think? Do you have a reason/use case to permit secondary processes despite > some features not being available eg. full stats, tx? > > > > Thanks, > > Ciara > > There are two cases where secondary is useful even if send/receive can't > work from secondary process. > The pdump and proc-info applications can work with these patches. > > I am using XDP over pdump as an easy way to get packets into the code for > testing. > > The flag in the documentation doesn't have a "limited" version. > If you want, will send another patch to disable secondary support. Thanks for explaining. Since there are use cases for secondary, even if the functionality is limited, I don't think it should be disabled. Since we can't flag it as 'limited' in the feature matrix, could you please add a note about the send/receive limitation in the AF_XDP PMD documentation in a v2? There are already a number of limitations listed, which you can add to. Thanks, Ciara > > Supporting secondary, means adding a mechanism to pass the socket > around.
On Mon, 20 Sep 2021 15:09:36 +0000
"Loftus, Ciara" <ciara.loftus@intel.com> wrote:
> >
> > On Mon, 20 Sep 2021 13:23:57 +0000
> > "Loftus, Ciara" <ciara.loftus@intel.com> wrote:
> >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Stephen Hemminger
> > > > Sent: Friday 3 September 2021 17:15
> > > > To: dev@dpdk.org
> > > > Cc: Stephen Hemminger <stephen@networkplumber.org>;
> > > > stable@dpdk.org; xiaolong.ye@intel.com
> > > > Subject: [dpdk-dev] [PATCH] net/af_xdp: fix support of secondary
> > process
> > > >
> > > > Doing basic operations like info_get or get_stats was broken
> > > > in af_xdp PMD. The info_get would crash because dev->device
> > > > was NULL in secondary process. Fix this by doing same initialization
> > > > as af_packet and tap devices.
> > > >
> > > > The get_stats would crash because the XDP socket is not open in
> > > > primary process. As a workaround don't query kernel for dropped
> > > > packets when called from secondary process.
> > > >
> > > > Note: this does not address the other bug which is that transmitting
> > > > in secondary process is broken because the send() in tx_kick
> > > > will fail because XDP socket fd is not valid in secondary process.
> > >
> > > Hi Stephen,
> > >
> > > Apologies for the delayed reply, I was on vacation.
> > >
> > > In the Bugzilla report you suggest we:
> > > "mark AF_XDP as broken in with primary/secondary
> > > and return an error in probe in secondary process".
> > > I agree with this suggestion. However with this patch we still permit
> > secondary, and just make sure it doesn't crash for get_stats. Did you change
> > your mind?
> > > Personally, I would prefer to have primary/secondary either working 100%
> > or else not allowed at all by throwing an error during probe. What do you
> > think? Do you have a reason/use case to permit secondary processes despite
> > some features not being available eg. full stats, tx?
> > >
> > > Thanks,
> > > Ciara
> >
> > There are two cases where secondary is useful even if send/receive can't
> > work from secondary process.
> > The pdump and proc-info applications can work with these patches.
> >
> > I am using XDP over pdump as an easy way to get packets into the code for
> > testing.
> >
> > The flag in the documentation doesn't have a "limited" version.
> > If you want, will send another patch to disable secondary support.
>
> Thanks for explaining. Since there are use cases for secondary, even if the functionality is limited, I don't think it should be disabled.
> Since we can't flag it as 'limited' in the feature matrix, could you please add a note about the send/receive limitation in the AF_XDP PMD documentation in a v2? There are already a number of limitations listed, which you can add to.
>
> Thanks,
> Ciara
>
> >
> > Supporting secondary, means adding a mechanism to pass the socket
> > around.
Looking at this in more detail, and my recommendation is:
For 21.11 release (and mark it as Fixes so it gets backported). Have the driver
return -ENOTSUP in secondary process.
For 22.03 add real secondary support using the rte_mp_msg to pass necessary
state to secondary process. Includes socket (and other memory regions?).
The pdump and proc-info cases are only useful for developer testing, and there are
other ways to do the same thing.
> > > > >
> > > > > Doing basic operations like info_get or get_stats was broken
> > > > > in af_xdp PMD. The info_get would crash because dev->device
> > > > > was NULL in secondary process. Fix this by doing same initialization
> > > > > as af_packet and tap devices.
> > > > >
> > > > > The get_stats would crash because the XDP socket is not open in
> > > > > primary process. As a workaround don't query kernel for dropped
> > > > > packets when called from secondary process.
> > > > >
> > > > > Note: this does not address the other bug which is that transmitting
> > > > > in secondary process is broken because the send() in tx_kick
> > > > > will fail because XDP socket fd is not valid in secondary process.
> > > >
> > > > Hi Stephen,
> > > >
> > > > Apologies for the delayed reply, I was on vacation.
> > > >
> > > > In the Bugzilla report you suggest we:
> > > > "mark AF_XDP as broken in with primary/secondary
> > > > and return an error in probe in secondary process".
> > > > I agree with this suggestion. However with this patch we still permit
> > > secondary, and just make sure it doesn't crash for get_stats. Did you
> change
> > > your mind?
> > > > Personally, I would prefer to have primary/secondary either working
> 100%
> > > or else not allowed at all by throwing an error during probe. What do you
> > > think? Do you have a reason/use case to permit secondary processes
> despite
> > > some features not being available eg. full stats, tx?
> > > >
> > > > Thanks,
> > > > Ciara
> > >
> > > There are two cases where secondary is useful even if send/receive can't
> > > work from secondary process.
> > > The pdump and proc-info applications can work with these patches.
> > >
> > > I am using XDP over pdump as an easy way to get packets into the code
> for
> > > testing.
> > >
> > > The flag in the documentation doesn't have a "limited" version.
> > > If you want, will send another patch to disable secondary support.
> >
> > Thanks for explaining. Since there are use cases for secondary, even if the
> functionality is limited, I don't think it should be disabled.
> > Since we can't flag it as 'limited' in the feature matrix, could you please add
> a note about the send/receive limitation in the AF_XDP PMD documentation
> in a v2? There are already a number of limitations listed, which you can add
> to.
> >
> > Thanks,
> > Ciara
> >
> > >
> > > Supporting secondary, means adding a mechanism to pass the socket
> > > around.
>
> Looking at this in more detail, and my recommendation is:
> For 21.11 release (and mark it as Fixes so it gets backported). Have the driver
> return -ENOTSUP in secondary process.
>
> For 22.03 add real secondary support using the rte_mp_msg to pass
> necessary
> state to secondary process. Includes socket (and other memory regions?).
>
> The pdump and proc-info cases are only useful for developer testing, and
> there are
> other ways to do the same thing.
+1 that sounds reasonable to me.
Thanks,
Ciara
On 9/23/2021 1:33 PM, Loftus, Ciara wrote: >>>>>> >>>>>> Doing basic operations like info_get or get_stats was broken >>>>>> in af_xdp PMD. The info_get would crash because dev->device >>>>>> was NULL in secondary process. Fix this by doing same initialization >>>>>> as af_packet and tap devices. >>>>>> >>>>>> The get_stats would crash because the XDP socket is not open in >>>>>> primary process. As a workaround don't query kernel for dropped >>>>>> packets when called from secondary process. >>>>>> >>>>>> Note: this does not address the other bug which is that transmitting >>>>>> in secondary process is broken because the send() in tx_kick >>>>>> will fail because XDP socket fd is not valid in secondary process. >>>>> >>>>> Hi Stephen, >>>>> >>>>> Apologies for the delayed reply, I was on vacation. >>>>> >>>>> In the Bugzilla report you suggest we: >>>>> "mark AF_XDP as broken in with primary/secondary >>>>> and return an error in probe in secondary process". >>>>> I agree with this suggestion. However with this patch we still permit >>>> secondary, and just make sure it doesn't crash for get_stats. Did you >> change >>>> your mind? >>>>> Personally, I would prefer to have primary/secondary either working >> 100% >>>> or else not allowed at all by throwing an error during probe. What do you >>>> think? Do you have a reason/use case to permit secondary processes >> despite >>>> some features not being available eg. full stats, tx? >>>>> >>>>> Thanks, >>>>> Ciara >>>> >>>> There are two cases where secondary is useful even if send/receive can't >>>> work from secondary process. >>>> The pdump and proc-info applications can work with these patches. >>>> >>>> I am using XDP over pdump as an easy way to get packets into the code >> for >>>> testing. >>>> >>>> The flag in the documentation doesn't have a "limited" version. >>>> If you want, will send another patch to disable secondary support. >>> >>> Thanks for explaining. Since there are use cases for secondary, even if the >> functionality is limited, I don't think it should be disabled. >>> Since we can't flag it as 'limited' in the feature matrix, could you please add >> a note about the send/receive limitation in the AF_XDP PMD documentation >> in a v2? There are already a number of limitations listed, which you can add >> to. >>> >>> Thanks, >>> Ciara >>> >>>> >>>> Supporting secondary, means adding a mechanism to pass the socket >>>> around. >> >> Looking at this in more detail, and my recommendation is: >> For 21.11 release (and mark it as Fixes so it gets backported). Have the driver >> return -ENOTSUP in secondary process. >> >> For 22.03 add real secondary support using the rte_mp_msg to pass >> necessary >> state to secondary process. Includes socket (and other memory regions?). >> >> The pdump and proc-info cases are only useful for developer testing, and >> there are >> other ways to do the same thing. > > > +1 that sounds reasonable to me. > This patch superseded by following which disables the secondary process support: https://patches.dpdk.org/project/dpdk/patch/20210930134604.32585-1-ciara.loftus@intel.com/