Hi Andrew,

This is the output that I see in the terminal when this failure occurs,
after the test agent binaries build and the test engine starts:

Platform default build - pass
Simple RCF consistency check succeeded
--->>> Starting Logger...done
--->>> Starting RCF...rcf_net_engine_connect(): Connection timed out iol-dts-tester.dpdklab.iol.unh.edu:23571

Then, it hangs here until I kill the "te_rcf" and "te_tee" processes. I
let it hang for around 9 minutes.

On the tester host (which appears to be the Peer agent), there are four
processes that I see running, which look like the test agent processes.

ta.Peer is an empty file. I've attached the log.txt from this run.

- Adam

On Thu, Aug 24, 2023 at 4:22 AM Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:

> Hi Adam,
>
> Yes, TE_RCFUNIX_TIMEOUT is in seconds. I've double-checked that it goes to
> 'copy_timeout' in ts-conf/rcf.conf.
> The description in doc/sphinx/pages/group_te_engine_rcf.rst says that
> copy_timeout is in seconds, and the implementation in lib/rcfunix/rcfunix.c
> passes the value to select() tv_sec. Theoretically select() could be
> interrupted by a signal, but I think it is unlikely here.
>
> I'm not sure that I understand what you mean by RCF connection timeout.
> Does it happen on TE startup when RCF starts test agents? If so,
> TE_RCFUNIX_TIMEOUT could help. Or does it happen when tests are in
> progress, e.g. in the middle of a test? If so, TE_RCFUNIX_TIMEOUT is
> unrelated and most likely either the host with the test agent dies or the
> test agent itself crashes. It would be easier for me to classify it if you
> share the text log (log.txt, full or just the corresponding fragment with
> some context). Also, the content of the ta.DPDK or ta.Peer file, depending
> on which agent has problems, could shed some light. The corresponding files
> contain stdout/stderr of test agents.
>
> Andrew.
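One way to narrow down the rcf_net_engine_connect() timeout reported above
is to check, from the host that runs the test engine, whether the agent's
TCP port is reachable at all while the agent processes are running. The
following is a minimal sketch in Python and is not part of TE: the function
name can_connect is made up for illustration, and the only inputs are a
host and port such as the ones printed in the terminal output.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s.

    Hypothetical helper for illustration; it only checks plain TCP
    reachability and knows nothing about the RCF protocol itself.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, can_connect("iol-dts-tester.dpdklab.iol.unh.edu", 23571) could
be called while the engine is stuck at "Starting RCF". A False result would
point towards a firewall, routing or listening-port problem rather than
towards the TE_RCFUNIX_TIMEOUT setting, although this is only a rough
indicator.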
>
> On 8/23/23 17:45, Adam Hassick wrote:
>
> Hi Andrew,
>
> I've set up a test rig repository here, and have created configurations
> for our development testbed based off of the examples.
> We've been able to get the test suite to run manually on Mellanox CX5
> devices once.
> However, we are running into an issue where, when RCF starts, the RCF
> connection times out very frequently. We aren't sure why this is the case.
> It works sometimes, but most of the time when we try to run the test
> engine, it encounters this issue.
> I've tried changing the RCF port by setting "TE_RCF_PORT=<port number>"
> and rebooting the testbed machines. Neither seems to fix the issue.
>
> It also seems like the timeout takes far longer than 60 seconds, even when
> running "export TE_RCFUNIX_TIMEOUT=60" before I try to run the test suite.
> I assume the unit for this variable is seconds?
>
> Thanks,
> Adam
>
> On Mon, Aug 21, 2023 at 10:19 AM Adam Hassick wrote:
>
>> Hi Andrew,
>>
>> Thanks, I've cloned the example repository and will start setting up a
>> configuration for our development testbed today. I'll let you know if I
>> run into any difficulties or have any questions.
>>
>> - Adam
>>
>> On Sun, Aug 20, 2023 at 4:40 AM Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
>>
>>> Hi Adam,
>>>
>>> I've published https://github.com/ts-factory/ts-rigs-sample. Hopefully
>>> it will help to define your test rigs and successfully run some tests
>>> manually. Feel free to ask any questions and I'll answer here and try to
>>> update the documentation.
>>>
>>> Meanwhile I'll prepare the missing bits for steps (2) and (3).
>>> Hopefully everything is in place for step (4), but we need to do steps
>>> (2) and (3) first.
>>>
>>> Andrew.
>>>
>>> On 8/18/23 21:40, Andrew Rybchenko wrote:
>>>
>>> Hi Adam,
>>>
>>> > I've conferred with the rest of the team, and we think it would be
>>> > best to move forward with mainly option B.
>>>
>>> OK, I'll provide the sample on Monday for you.
>>> It is almost ready right now, but I need to double-check it before
>>> publishing.
>>>
>>> Regards,
>>> Andrew.
>>>
>>> On 8/17/23 20:03, Adam Hassick wrote:
>>>
>>> Hi Andrew,
>>>
>>> I'm adding the CI mailing list to this conversation. Others in the
>>> community might find this conversation valuable.
>>>
>>> We do want to run testing on a regular basis. The Jenkins integration
>>> will be very useful for us, as most of our CI is orchestrated by Jenkins.
>>> I've conferred with the rest of the team, and we think it would be best
>>> to move forward with mainly option B.
>>> If you would like to know anything about our testbeds that would help
>>> you with creating an example ts-rigs repo, I'd be happy to answer any
>>> questions you have.
>>>
>>> We have multiple test rigs (we call these "DUT-tester pairs") that we
>>> run our existing hardware testing on, with differing network hardware and
>>> CPU architecture. I figured this might be an important detail.
>>>
>>> Thanks,
>>> Adam
>>>
>>> On Thu, Aug 17, 2023 at 11:44 AM Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> wrote:
>>>
>>>> Greetings Adam,
>>>>
>>>> I'm happy to hear that you're trying to bring it up.
>>>>
>>>> As I understand it, the final goal is to run it on a regular basis. So,
>>>> we need to do it properly from the very beginning.
>>>> Bringing up all features consists of 4 steps:
>>>>
>>>> 1. Create a site-specific repository (we call it ts-rigs) which contains
>>>> information about test rigs and other site-specific information like
>>>> where to send mails, where to store logs etc. It is required for manual
>>>> execution as well, since the test rigs description is essential. I'll
>>>> return to the topic below.
>>>>
>>>> 2. Set up log storage for automated runs. Basically it is disk space
>>>> plus an apache2 web server with a few CGI scripts which help a lot to
>>>> save disk space.
>>>>
>>>> 3. Set up the Bublik web application, which provides a web interface to
>>>> view testing results.
>>>> Same as https://ts-factory.io/bublik
>>>>
>>>> 4. Set up Jenkins to run tests regularly, save logs in log storage
>>>> (2) and import them into Bublik (3).
>>>>
>>>> We spent the last few months on our homework to make it simpler to
>>>> bring up automated execution using Jenkins -
>>>> https://github.com/ts-factory/te-jenkins
>>>> Corresponding bits in dpdk-ethdev-ts will be available tomorrow.
>>>>
>>>> Let's return to step (1).
>>>>
>>>> Unfortunately there is no publicly available example of the ts-rigs
>>>> repository since sensitive site-specific information is located there.
>>>> But I'm ready to help you to create it for UNH. I see two options here:
>>>>
>>>> (A) I'll ask questions and, based on your answers, will create the first
>>>> draft with my comments.
>>>>
>>>> (B) I'll make a template/example ts-rigs repo, publish it and you'll
>>>> create UNH ts-rigs based on it.
>>>>
>>>> Of course, I'll help to debug and finally bring it up in any case.
>>>>
>>>> (A) is a bit simpler for me and you, but (B) is a bit more generic and
>>>> will help other potential users to bring it up.
>>>> We can combine (A)+(B), i.e. start from (A). What do you think?
>>>>
>>>> Thanks,
>>>> Andrew.
>>>>
>>>> On 8/17/23 15:18, Konstantin Ushakov wrote:
>>>>
>>>> Greetings Adam,
>>>>
>>>> Thanks for contacting us. I'm copying Andrew, who would be happy to help.
>>>>
>>>> Thanks,
>>>> Konstantin
>>>>
>>>> On 16 Aug 2023, at 21:50, Adam Hassick wrote:
>>>>
>>>> Greetings Konstantin,
>>>>
>>>> I am in the process of setting up the DPDK Poll Mode Driver test suite
>>>> as an addition to our testing coverage for DPDK at the UNH lab.
>>>>
>>>> I have some questions about how to set the test suite arguments.
>>>>
>>>> I have been able to configure the Test Engine to connect to the hosts
>>>> in the testbed. The RCF, Configurator, and Tester all begin to run,
>>>> however the prelude of the test suite fails to run.
>>>>
>>>> https://ts-factory.io/doc/dpdk-ethdev-ts/index.html#test-parameters
>>>>
>>>> The documentation mentions that there are several test parameters for
>>>> the test suite, like for the IUT test link MAC, etc. These seem like they
>>>> would need to be set somewhere to run many of the tests.
>>>>
>>>> I see in the Test Engine documentation, there are instructions on how
>>>> to create new parameters for test suites in the Tester configuration, but
>>>> I can't find anything in the user guide or in the Tester guide on how to
>>>> set the arguments for the parameters when running the test suite. I'm not
>>>> sure if I need to write my own Tester config, or if I should be setting
>>>> these in some other way.
>>>>
>>>> How should these values be set?
>>>>
>>>> I'm also not sure which environment variables/arguments are strictly
>>>> necessary and which are optional.
>>>>
>>>> Regards,
>>>> Adam
>>>>
>>>> --
>>>> *Adam Hassick*
>>>> Senior Developer
>>>> UNH InterOperability Lab
>>>> ahassick@iol.unh.edu
>>>> iol.unh.edu
>>>> +1 (603) 475-8248