* [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Paul Emmerich @ 2015-05-11 0:14 UTC (permalink / raw)
To: dev

Hi,

this is a follow-up to my post from 3 weeks ago [1]. I'm starting a new
thread here since I now have a completely new test setup for improved
reproducibility.

Background for anyone who didn't catch my last post: I'm investigating a
performance regression in my packet generator [2] that occurs when
upgrading from DPDK 1.7.1 to 1.8 or 2.0. DPDK 1.7.1 is about 25% faster
than 2.0 in my application. I suspected that this is due to the new
2-cacheline mbufs, which I have now confirmed with a bisect.

My old test setup was based on the l2fwd example, required an external
packet generator, and was kind of hard to reproduce. I built a simple tx
benchmark application that simply sends nonsensical packets with a
sequence number as fast as possible on two ports with a single core. You
can download the benchmark app at [3].

Hardware setup:

  CPU: E5-2620 v3 underclocked to 1.2 GHz
  RAM: 4x 8 GB 1866 MHz DDR4 memory
  NIC: X540-T2

Baseline test results:

  DPDK     simple tx    full-featured tx
  1.7.1    14.1 Mpps    10.7 Mpps
  2.0.0    11.0 Mpps     9.3 Mpps

DPDK 1.7.1 is 28%/15% faster than 2.0 with simple/full-featured tx in
this benchmark.

I then did a few runs of git bisect to identify commits that caused a
significant drop in performance. You can find the script that I used to
quickly test the performance of a version at [4].

  Commit                                     simple    full-featured
  7869536f3f8edace05043be6f322b835702b201c   13.9      10.4
  (mbuf: flatten struct vlan_macip)

The commit log explains that there is a perf regression and that it
cannot be avoided in order to stay future-compatible. The log claims
< 5%, which is consistent with my test results (the old code is 4%
faster). I guess that is okay and cannot be avoided.

  Commit                                     simple    full-featured
  08b563ffb19d8baf59dd84200f25bc85031d18a7   12.8      10.4
  (mbuf: replace data pointer by an offset)

This affects the simple tx path significantly. This performance
regression is probably simply caused by the (temporarily) disabled
vector tx code that is mentioned in the commit log. Not investigated
further.

  Commit                                     simple    full-featured
  f867492346bd271742dd34974e9cf8ac55ddb869   10.7       9.1
  (mbuf: split mbuf across two cache lines)

This one is the real culprit. The commit log does not mention any
performance evaluation, and a quick scan of the mailing list also
doesn't reveal any evaluation of the impact of this change. It looks
like the main problem for tx is that the mempool pointer is in the
second cacheline.

I think the new mbuf structure is too bloated. It forces you to pay for
features that you don't need or don't want. I understand that it needs
to support all possible filters and offload features. But it's kind of
hard to justify a 25% difference in performance for a framework that
sets performance above everything (does it? I picked that up from the
discussion in the "Beyond DPDK 2.0" thread).

I've counted 56 bytes in use in the first cacheline in v2.0.0. Would it
be possible to move the pool pointer and tx offload fields into the
first cacheline? We would just need to free up 8 bytes. One candidate
would be the seqn field: does it really have to be in the first cache
line? Another candidate is the size of the ol_flags field: do we really
need 64 flags? Sharing bits between rx and tx worked fine.
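For orientation, here is a simplified sketch of the layout in question;
the field order and widths below are approximated from v2.0.0, not the
exact definition in rte_mbuf.h:

#include <stdint.h>

struct rte_mempool;  /* opaque here */

/* Simplified sketch of the v2.0.0 split layout (approximation). */
struct mbuf_sketch {
    /* ---- first cache line: mostly rx-oriented fields ---- */
    void     *buf_addr;
    uint64_t  buf_physaddr;
    uint16_t  buf_len;
    uint16_t  data_off;
    uint16_t  refcnt;
    uint8_t   nb_segs;
    uint8_t   port;
    uint64_t  ol_flags;      /* 64 offload flag bits */
    uint32_t  packet_type;
    uint32_t  pkt_len;
    uint16_t  data_len;
    uint16_t  vlan_tci;
    uint32_t  hash_rss;
    uint32_t  seqn;          /* candidate to move out? */
    /* ---- second cache line: what tx and free need ---- */
    struct rte_mempool *pool __attribute__((aligned(64)));
    struct mbuf_sketch *next;     /* touched even for 1-segment mbufs */
    uint64_t  tx_offload;         /* l2/l3/l4 lengths for checksums */
};

The point of the sketch: everything the tx/free path has to touch
(pool, next, tx_offload) sits behind the cache line boundary.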
I naively tried to move the pool pointer into the first cache line in
the v2.0.0 tag and the performance actually decreased; I'm not yet sure
why this happens. There are probably assumptions about the cacheline
locations and prefetching in the code that would need to be adjusted.

Another possible solution would be a more dynamic approach to mbufs:
the mbuf struct could be made configurable to fit the requirements of
the application. This would probably require code generation or a lot
of ugly preprocessor hacks and would add a lot of complexity to the
code. The question is whether DPDK really values performance above
everything else.

Paul

P.S.: I'm kind of disappointed by the lack of regression tests for
performance. I think that such tests should be an integral part of a
framework with the explicit goal to be fast. For example, the main page
at dpdk.org claims a performance of "usually less than 80 cycles" for
an rx or tx operation. This claim is no longer true :( Touching the
layout of a core data structure like the mbuf shouldn't be done without
carefully evaluating the performance impacts. But this discussion
probably belongs in the "Beyond DPDK 2.0" thread.

P.P.S.: Benchmarking an rx-only application (e.g. traffic analysis)
would also be interesting, but that's not really on my todo list right
now. Mixed rx/tx like forwarding is also affected, as discussed in my
last thread [1].

[1] http://dpdk.org/ml/archives/dev/2015-April/016921.html
[2] https://github.com/emmericp/MoonGen
[3] https://github.com/emmericp/dpdk-tx-performance
[4] https://gist.github.com/emmericp/02c5885908c3cb5ac5b7
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Luke Gorrie @ 2015-05-11 9:13 UTC (permalink / raw)
To: Paul Emmerich; +Cc: dev

Hi Paul,

On 11 May 2015 at 02:14, Paul Emmerich <emmericp@net.in.tum.de> wrote:
> Another possible solution would be a more dynamic approach to mbufs:

Let me suggest a slightly more extreme idea for your consideration.
This method can easily do > 100 Mpps with one very lightly loaded core.
I don't know if it works for your application or not, but I share it
just in case.

Background: load generators are specialist applications and can benefit
from specialist transmit mechanisms.

You can instruct the NIC to send up to 32K packets with one operation:
load the address of a descriptor list into the TDBA register (Transmit
Descriptor Base Address). The descriptor list is a simple series of
64-bit values: addr0, flags0, addr1, flags1, ... etc. It is easy to
construct by hand.

The NIC can also be made to replay the packets in a loop. You just have
to periodically reset the DMA cursor to make all the packets valid
again. That is a simple register poke: TDT = TDH-1.

We do this routinely when we want to generate a large amount of traffic
with few resources, typically when generating load using spare capacity
of a device under test. (I have sample code, but it is not based on
DPDK; a rough sketch of the register pokes follows below.)

If you want all of your packets to be unique then you have to be a bit
more clever. For example, you could poll to see the DMA progress: let
half the packets be sent, then rewrite those while the other half are
sent, and so on. Kind of like the way video games tracked the progress
of the display scan beam to update parts of the frame buffer that were
not being DMA'd.

This method may impose other limitations that are not acceptable for
your application, of course. But if not, it can drastically reduce the
number of instructions and cache footprint required to generate load.
You don't have to touch mbufs or descriptors at all. You just update
the payload and update the DMA register every millisecond or so.

Cheers,
-Luke
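A rough sketch of the register-level technique described above, for an
82599/X540-class NIC. The register offsets are taken from the datasheet
for TX queue 0, but the BAR mapping helper and all setup details are
illustrative assumptions, not a real driver or DPDK API:

#include <stdint.h>

/* Assumed to point at the NIC's memory-mapped BAR0 (hypothetical). */
extern volatile uint8_t *bar0;

#define REG32(off) (*(volatile uint32_t *)(bar0 + (off)))

/* 82599/X540 TX queue 0 register offsets (verify against datasheet). */
enum { TDBAL = 0x6000, TDBAH = 0x6004, TDLEN = 0x6008,
       TDH   = 0x6010, TDT   = 0x6018 };

/* Legacy-format descriptor: a 64-bit buffer address plus 64 bits of
 * packed length/command/status flags, i.e. the addr0, flags0, ... pairs. */
struct tx_desc {
    uint64_t addr;
    uint64_t flags;
};

/* Point the NIC at a pre-built ring of n descriptors. */
static void arm_static_ring(uint64_t ring_phys, uint32_t n)
{
    REG32(TDBAL) = (uint32_t)ring_phys;
    REG32(TDBAH) = (uint32_t)(ring_phys >> 32);
    REG32(TDLEN) = n * sizeof(struct tx_desc);
    REG32(TDT)   = n - 1;   /* descriptors up to the tail are valid */
}

/* Periodically re-arm so the NIC replays the ring: TDT = TDH - 1. */
static void rearm_ring(uint32_t n)
{
    uint32_t tdh = REG32(TDH);
    REG32(TDT) = (tdh + n - 1) % n;
}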
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Paul Emmerich @ 2015-05-11 10:16 UTC (permalink / raw)
To: Luke Gorrie; +Cc: dev

Hi Luke,

thanks for your suggestion. I actually looked at how your packet
generator in SnabbSwitch works before, and it's quite clever. But
unfortunately that's not what I'm looking for.

I'm looking for a generic solution that works with whatever NIC is
supported by DPDK, and I don't want to write NIC-specific transmit
logic. I don't want to maintain, test, or debug drivers. That's why I
chose DPDK in the first place.

The DPDK drivers (used to) hit a sweet spot for performance. I can
usually load about two 10 Gbit/s ports from a reasonably sized CPU core
without worrying about writing my own device drivers*. This allows for
packet generation at interesting packet rates on low-end servers (e.g.
servers with Xeon E3 1230 v2 CPUs and dual-port NICs). Servers with
more ports usually also have the necessary CPU power to handle them.

I also don't want to be limited to packet generation in the long run.
For example, I have a student who is working on an IPsec offloading
application and another student working on a proof-of-concept router.

Paul

*) Yes, I still need some NIC-specific low-level code (timestamping)
and a small patch in the DPDK drivers (a flag to disable CRC offloading
on a per-packet basis) for some features of my packet generator.
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Paul Emmerich @ 2015-05-11 22:32 UTC (permalink / raw)
To: dev

Paul Emmerich:
> I naively tried to move the pool pointer into the first cache line in
> the v2.0.0 tag and the performance actually decreased, I'm not yet sure
> why this happens. There are probably assumptions about the cacheline
> locations and prefetching in the code that would need to be adjusted.

This happens because the next pointer in the mbuf is touched almost
everywhere, even for mbufs with only one segment, because it is used to
determine whether there is another segment (instead of using the
nb_segs field); a sketch of the pattern follows below.

I guess a solution for me would be to use a custom layout that is
optimized for tx. I can shrink ol_flags to 32 bits and move the seqn
and hash fields to the second cache line. A quick-and-dirty test shows
that this even gives me slightly higher performance than DPDK 1.7 in
the full-featured tx path.

This is probably going to break the vector rx/tx path, but I can't use
that anyway since I always need offloading features (timestamping and
checksums).

I'll have to see how this affects the rx path. But I value tx
performance over rx performance; my rx logic is usually very simple.

This solution is kind of ugly. I would prefer to be able to use an
unmodified version of DPDK :/

By the way, I think there is something wrong with this assumption in
commit f867492346bd271742dd34974e9cf8ac55ddb869:

> The general approach that we are looking to take is to focus the first
> cache line on fields that are updated on RX, so that receive only deals
> with one cache line.

I think this might be wrong due to the next pointer. I'll probably
build a simple rx-only benchmark in a few weeks or so. I suspect that
it will also be significantly slower, but that should be fixable.

Paul
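The pattern in question, as an illustrative sketch (not the actual DPDK
free path): a cleanup loop that walks the chain via next reads the
second cache line for every mbuf, while a loop keyed on nb_segs can stay
in the first cache line for single-segment packets.

#include <rte_mbuf.h>

/* Walks via 'next': always reads cache line 1, even for one segment. */
static inline void free_via_next(struct rte_mbuf *m)
{
    while (m != NULL) {
        struct rte_mbuf *next = m->next;   /* cache line 1 */
        rte_pktmbuf_free_seg(m);
        m = next;
    }
}

/* Keyed on 'nb_segs' (cache line 0): only touches 'next' when a
 * second segment actually exists. */
static inline void free_via_nb_segs(struct rte_mbuf *m)
{
    uint8_t segs = m->nb_segs;             /* cache line 0 */
    while (segs-- > 0) {
        struct rte_mbuf *next = (segs > 0) ? m->next : NULL;
        rte_pktmbuf_free_seg(m);
        m = next;
    }
}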
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Paul Emmerich @ 2015-05-11 23:18 UTC (permalink / raw)
To: dev

Found a really simple solution that almost restores the original
performance: just add a prefetch on alloc. For some reason, I assumed
that this was already done, since the troublesome commit I investigated
mentioned something about prefetching... I guess the commit referred to
the hardware prefetcher in the CPU.

Adding an explicit prefetch command in the mbuf alloc function (the
idea is sketched below) gives a throughput of 12.7/10.35 Mpps in my
benchmark with the simple/full-featured tx path.

DPDK 1.7.1 was at 14.1/10.7 Mpps. I guess I can live with that, since
I'm primarily interested in the full-featured path and the drop from
10.7 to ~10.4 was due to another change.

Patch: https://github.com/dpdk-org/dpdk/pull/2
I also sent an email to the mailing list.

I think that the rx path could also benefit from prefetching somewhere.

Paul
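The idea behind the patch, as a sketch modelled on the v2.0.0 alloc
helper (the actual diff in the pull request above may differ):

#include <rte_mbuf.h>
#include <rte_prefetch.h>

/* Sketch: start pulling in the mbuf's second cache line right after
 * dequeuing from the pool, so the later writes in rte_pktmbuf_reset()
 * (next and tx_offload live there) don't stall on a cache miss. */
static inline struct rte_mbuf *raw_alloc_with_prefetch(struct rte_mempool *mp)
{
    void *mb = NULL;

    if (rte_mempool_get(mp, &mb) < 0)
        return NULL;
    rte_prefetch0((char *)mb + RTE_CACHE_LINE_SIZE);
    return (struct rte_mbuf *)mb;
}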
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Marc Sune @ 2015-05-12 0:28 UTC (permalink / raw)
To: dev

On 12/05/15 01:18, Paul Emmerich wrote:
> Found a really simple solution that almost restores the original
> performance: just add a prefetch on alloc. For some reason, I assumed
> that this was already done since the troublesome commit I investigated
> mentioned something about prefetching... I guess the commit referred
> to the hardware prefetcher in the CPU.
>
> Adding an explicit prefetch command in the mbuf alloc function gives a
> throughput of 12.7/10.35 Mpps in my benchmark with the
> simple/full-featured tx path.
>
> DPDK 1.7.1 was at 14.1/10.7 Mpps. I guess I can live with that, since
> I'm primarily interested in the full-featured path and the drop from
> 10.7 to ~10.4 was due to another change.

Maybe a stupid question: does the performance of v1.7.1 also improve if
you backport this patch to it?

Marc

> Patch: https://github.com/dpdk-org/dpdk/pull/2
> I also sent an email to the mailing list.
>
> I also think that the rx-path could also benefit from prefetching
> somewhere.
>
> Paul
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Marc Sune @ 2015-05-12 0:38 UTC (permalink / raw)
To: dev

On 12/05/15 02:28, Marc Sune wrote:
> On 12/05/15 01:18, Paul Emmerich wrote:
>> Found a really simple solution that almost restores the original
>> performance: just add a prefetch on alloc. [...]
>>
>> DPDK 1.7.1 was at 14.1/10.7 Mpps. I guess I can live with that, since
>> I'm primarily interested in the full-featured path and the drop from
>> 10.7 to ~10.4 was due to another change.
>
> Maybe a stupid question: does the performance of v1.7.1 also improve
> if you backport this patch to it?

Self-answered: the cacheline split was done in 1.8, so it is indeed a
stupid question.

Marc
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Ananyev, Konstantin @ 2015-05-13 9:03 UTC (permalink / raw)
To: Paul Emmerich; +Cc: dev

Hi Paul,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Paul Emmerich
> Sent: Tuesday, May 12, 2015 12:19 AM
> To: dev@dpdk.org
> Subject: Re: [dpdk-dev] TX performance regression caused by the mbuf
> cacheline split
>
> Found a really simple solution that almost restores the original
> performance: just add a prefetch on alloc. For some reason, I assumed
> that this was already done since the troublesome commit I investigated
> mentioned something about prefetching... I guess the commit referred to
> the hardware prefetcher in the CPU.
>
> Adding an explicit prefetch command in the mbuf alloc function gives a
> throughput of 12.7/10.35 Mpps in my benchmark with the
> simple/full-featured tx path.
>
> DPDK 1.7.1 was at 14.1/10.7 Mpps. I guess I can live with that, since
> I'm primarily interested in the full-featured path and the drop from
> 10.7 to ~10.4 was due to another change.
>
> Patch: https://github.com/dpdk-org/dpdk/pull/2
> I also sent an email to the mailing list.
>
> I also think that the rx-path could also benefit from prefetching
> somewhere.

Before starting to discuss your findings, there is one thing in your
test app that looks strange to me: you use BATCH_SIZE==64 for TX
packets, but your mempool cache_size==32. This is not really a good
choice, as it means that on each iteration your mempool cache will be
exhausted and you'll end up doing ring_dequeue(). I'd suggest you use
something like '2 * BATCH_SIZE' for the mempool cache size; that should
improve your numbers (at least it did for me).

About the patch: so from what you are saying, the reason for the drop
is not actually the TX path, but rte_pktmbuf_alloc() ->
rte_pktmbuf_reset(). That makes sense: pktmbuf_reset() now has to
update 2 cache lines instead of one. On the other hand,
rte_pktmbuf_alloc() was never considered a fast path (our RX/TX
routines don't use it), so we never put a big effort into optimising
it.

Though, I am really not a big fan of manual prefetching. Its particular
behaviour may vary from one CPU to another, and its real effect is sort
of hard to predict; in some cases it can even cause a performance
degradation. For example, on my IVB box your patch didn't show any
difference at all. So I think that 'prefetch' should be used only when
it really gives a great performance boost and the same results can't be
achieved by other methods. For that particular case, at the very least
that 'prefetch' should be moved from __rte_mbuf_raw_alloc() to
rte_pktmbuf_alloc(), to avoid any negative impact on the RX path.

Though, I suppose that scenario might be improved without manual
'prefetch', by reordering the code a bit. Below are 2 small patches
that introduce rte_pktmbuf_bulk_alloc() and modify your test app to use
it. Could you give it a try and see whether it helps to close the gap
between 1.7.1 and 2.0?

I don't have a box with the same CPU off-hand, but on my IVB box the
results are quite promising: at 1.2 GHz, for simple_tx there is
practically no difference in results (-0.33%); for full_tx the drop is
reduced to 2%. That's comparing DPDK 1.7.1 + test app with
cache_size=2*batch_size vs latest DPDK + test app with
cache_size=2*batch_size and bulk_alloc.
Thanks
Konstantin

patch1:

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ab6de67..23d79ca 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -810,6 +810,45 @@ static inline struct rte_mbuf *rte_pktmbuf_alloc(struct rte_mempool *mp)
 	return (m);
 }
 
+static inline int
+rte_pktmbuf_bulk_alloc(struct rte_mempool *mp, struct rte_mbuf **m, uint32_t n)
+{
+	int32_t rc;
+	uint32_t i;
+
+	rc = rte_mempool_get_bulk(mp, (void **)m, n);
+
+	if (rc == 0) {
+		i = 0;
+		switch (n % 4) {
+		while (i != n) {
+		case 0:
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(m[i]) == 0);
+			rte_mbuf_refcnt_set(m[i], 1);
+			rte_pktmbuf_reset(m[i]);
+			i++;
+		case 3:
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(m[i]) == 0);
+			rte_mbuf_refcnt_set(m[i], 1);
+			rte_pktmbuf_reset(m[i]);
+			i++;
+		case 2:
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(m[i]) == 0);
+			rte_mbuf_refcnt_set(m[i], 1);
+			rte_pktmbuf_reset(m[i]);
+			i++;
+		case 1:
+			RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(m[i]) == 0);
+			rte_mbuf_refcnt_set(m[i], 1);
+			rte_pktmbuf_reset(m[i]);
+			i++;
+		}
+		}
+	}
+
+	return rc;
+}
+
 /**
  * Attach packet mbuf to another packet mbuf.
  *

patch2:

diff --git a/main.c b/main.c
index 2aa9fcf..749c52c 100644
--- a/main.c
+++ b/main.c
@@ -71,7 +71,7 @@ static struct rte_mempool* make_mempool() {
 	static int pool_id = 0;
 	char pool_name[32];
 	sprintf(pool_name, "pool%d", __sync_fetch_and_add(&pool_id, 1));
-	return rte_mempool_create(pool_name, NB_MBUF, MBUF_SIZE, 32,
+	return rte_mempool_create(pool_name, NB_MBUF, MBUF_SIZE, 2 * BATCH_SIZE,
 		sizeof(struct rte_pktmbuf_pool_private),
 		rte_pktmbuf_pool_init, NULL,
 		rte_pktmbuf_init, NULL,
@@ -113,13 +113,21 @@ static uint32_t send_pkts(uint8_t port, struct rte_mempool* pool) {
 	// alloc bufs
 	struct rte_mbuf* bufs[BATCH_SIZE];
 	uint32_t i;
+	int32_t rc;
+
+	rc = rte_pktmbuf_bulk_alloc(pool, bufs, RTE_DIM(bufs));
+	if (rc < 0) {
+		RTE_LOG(ERR, USER1,
+			"%s: rte_pktmbuf_alloc(%zu) returns error code: %d\n",
+			__func__, RTE_DIM(bufs), rc);
+		return 0;
+	}
+
 	for (i = 0; i < BATCH_SIZE; i++) {
-		struct rte_mbuf* buf = rte_pktmbuf_alloc(pool);
-		rte_pktmbuf_data_len(buf) = 60;
-		rte_pktmbuf_pkt_len(buf) = 60;
-		bufs[i] = buf;
+		rte_pktmbuf_data_len(bufs[i]) = 60;
+		rte_pktmbuf_pkt_len(bufs[i]) = 60;
 		// write seq number
-		uint64_t* pkt = rte_pktmbuf_mtod(buf, uint64_t*);
+		uint64_t* pkt = rte_pktmbuf_mtod(bufs[i], uint64_t*);
 		pkt[0] = seq++;
 	}
 	// send pkts
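For readers unfamiliar with the construct in patch1 above: the
interleaved switch/while is a Duff's-device-style unrolling that handles
the n % 4 remainder by jumping into the middle of the loop body. A
plain, unoptimized equivalent, as a sketch for clarity (not part of the
submitted patch):

#include <rte_mbuf.h>

/* Functionally equivalent to patch1's rte_pktmbuf_bulk_alloc(),
 * without the manual 4x unrolling. */
static inline int
bulk_alloc_plain(struct rte_mempool *mp, struct rte_mbuf **m, uint32_t n)
{
    uint32_t i;
    int rc = rte_mempool_get_bulk(mp, (void **)m, n);

    if (rc != 0)
        return rc;
    for (i = 0; i != n; i++) {
        RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(m[i]) == 0);
        rte_mbuf_refcnt_set(m[i], 1);
        rte_pktmbuf_reset(m[i]);
    }
    return 0;
}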
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Paul Emmerich @ 2016-02-15 19:15 UTC (permalink / raw)
To: Ananyev, Konstantin; +Cc: dev

Hi,

here's a kind of late follow-up. I've only recently found the need to
seriously address DPDK 2.x support in MoonGen, mostly for better
support of XL710 NICs (which I still dislike, but people are using
them...).

On 13.05.15 11:03, Ananyev, Konstantin wrote:
> Before starting to discuss your findings, there is one thing in your
> test app that looks strange to me: you use BATCH_SIZE==64 for TX
> packets, but your mempool cache_size==32. This is not really a good
> choice, as it means that on each iteration your mempool cache will be
> exhausted and you'll end up doing ring_dequeue(). I'd suggest you use
> something like '2 * BATCH_SIZE' for the mempool cache size; that
> should improve your numbers (at least it did for me).

Thanks for pointing that out. However, my real app did not have this
bug, and I also saw the performance improvement there.

> Though, I suppose that scenario might be improved without manual
> 'prefetch', by reordering the code a bit. Below are 2 small patches
> that introduce rte_pktmbuf_bulk_alloc() and modify your test app to
> use it. Could you give it a try and see whether it helps to close the
> gap between 1.7.1 and 2.0?

The bulk_alloc patch is great and helps. I'd love to see such a
function in DPDK. I agree that this is a better solution than
prefetching. I also can't see a difference with/without prefetching
when using bulk alloc.

Paul
* Re: [dpdk-dev] TX performance regression caused by the mbuf cacheline split

From: Olivier MATZ @ 2016-02-19 12:31 UTC (permalink / raw)
To: Paul Emmerich, Ananyev, Konstantin; +Cc: dev

Hi Paul,

On 02/15/2016 08:15 PM, Paul Emmerich wrote:
> The bulk_alloc patch is great and helps. I'd love to see such a
> function in DPDK.

A patch has been submitted by Huawei. I guess it will be integrated
soon. See http://dpdk.org/dev/patchwork/patch/10122/

Regards,
Olivier