DPDK usage discussions
* mbufs getting reused despite nonzero refcnt
@ 2024-11-10 16:23 Alan Beadle
  2024-11-10 17:12 ` Stephen Hemminger
  0 siblings, 1 reply; 4+ messages in thread
From: Alan Beadle @ 2024-11-10 16:23 UTC (permalink / raw)
  To: users

Hi everyone,

I am using DPDK to send two-way traffic between a pair of machines. My
application has local readers, remote acknowledgments, as well as
automatic retries when a packet is lost. For these reasons I am using
rte_mbuf_refcnt_update() to prevent the NIC from freeing the packet
and recycling the mbuf before my local readers are done and the remote
reader has acknowledged the message. I was advised to do this in an
earlier thread on this mailing list.

However, this does not seem to be working. After running my app for
a while and exchanging about 1000 messages in this way, my queue of
unacknowledged mbufs gets corrupted. The mbufs attached to my queue
seem to contain data for newer messages than what is supposed to be
in them, and in some cases contain a totally different type of packet
(an acknack, for instance). As a result, retries of those messages
fail to send the correct data and my application gets stuck.

I have ensured that the refcount is not reaching 0. Each new mbuf
immediately has the refcnt incremented by 1. I was concerned that
retries might need the refcnt bumped again, but if I bump the refcount
every time I resend a specific mbuf to the NIC, the refcounts just
keep getting higher. So it looks like re-bumping it on a resend is not
necessary.

I have ruled out other possible explanations. The mbufs are being
reused by rte_pktmbuf_alloc. I even tried playing with the EAL
settings related to the number of mbuf descriptors and saw my changes
directly correlate with how long it takes this problem to occur. How
do I really prevent the driver from reusing packets that I still might
need to resend?

Thanks in advance,
-Alan


* Re: mbufs getting reused despite nonzero refcnt
  2024-11-10 16:23 mbufs getting reused despite nonzero refcnt Alan Beadle
@ 2024-11-10 17:12 ` Stephen Hemminger
  2024-11-10 17:31   ` Alan Beadle
  0 siblings, 1 reply; 4+ messages in thread
From: Stephen Hemminger @ 2024-11-10 17:12 UTC (permalink / raw)
  To: Alan Beadle; +Cc: users


Which driver? It could be a driver bug.

Also, you should be able to trace the mbuf functions, either with
rte_trace or with some other facility.


* Re: mbufs getting reused despite nonzero refcnt
  2024-11-10 17:12 ` Stephen Hemminger
@ 2024-11-10 17:31   ` Alan Beadle
  2024-11-12 13:02     ` Alan Beadle
  0 siblings, 1 reply; 4+ messages in thread
From: Alan Beadle @ 2024-11-10 17:31 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: users

I'm using the vfio-pci module with Intel X550-T2 NICs. I believe this
means it will use the ixgbe driver? To be honest, I am a bit confused
about the use of drivers in DPDK. I am using the first setup that I
got to work and send/receive packets. Additional tips would be greatly
appreciated. After loading the vfio-pci module I run dpdk-devbind.py
--bind vfio-pci 65:00.1 and then I just use the standard DPDK API
calls in my app. I was meaning to revisit this once my app was more
complete.



* Re: mbufs getting reused despite nonzero refcnt
  2024-11-10 17:31   ` Alan Beadle
@ 2024-11-12 13:02     ` Alan Beadle
  0 siblings, 0 replies; 4+ messages in thread
From: Alan Beadle @ 2024-11-12 13:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: users

Is there anything in the usage I described in my previous email which
might explain this problem? Is there anything else wrong with what I'm
doing driver-wise?

