[ImageJ-devel] Jenkins' jocoserious jeremiads

Melissa Linkert melissa at glencoesoftware.com
Tue Feb 7 08:58:53 CST 2012


Hi Johannes (and other Jenkins enthusiasts),

> > I have no idea why that job was taking so long, but I've now killed all
> > of the Bio-Formats testing jobs in the queue for tonight so that all of
> > the daily build jobs can run.  I'll investigate what the cause might
> > have been in the morning, as I do not have enough coffee in front of me
> > at the moment to reliably diagnose the problem.
> 
> It looked to me as if Jenkins was stuck somewhere. The progress which
> supposedly advances whenever a test was performed did not advance at all
> in the half hour I looked.
> 
> So I guess that there was something really awkward, such as an OOM
> infinite loop, or some such.
> 
> In hindsight, I should have fired up JVisualVM and looked at the threads,
> so that's what I will do next time when it happens again.

In looking at the logs for:

http://dev.loci.wisc.edu:8080/job/Bio-Formats%20full%20repository%20data%20test/256/

there are a *lot* of these exceptions:

https://gist.github.com/1759740

(in short: parsing of an XML file fails because the connection is
reset while retrieving the corresponding DTD)

I can only reproduce these locally if I completely switch off all network
connections while the XML file is being parsed, which suggests to me that
the Jenkins server suffered network connectivity issues at some point on
the evening of February 2.  Logging stopped at 20:48:07 on February 2, at
a point which suggests that an infinite loop was entered in the same XML
parsing/DTD retrieval logic that caused the exception above.

For now what I've done is:

  0) Filed a ticket to fix up Bio-Formats' XML parsing to either not fetch
     the DTD at all, or fail gracefully if it is not retrieved in some
     specific amount of time (I haven't decided which yet):

     http://trac.openmicroscopy.org.uk/ome/ticket/8012

  1) Disabled the 'Bio-Formats full repository data test' job until further
     notice.
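
To make the two options in (0) concrete, here is a minimal sketch.  It
assumes the XML is parsed through a standard JAXP DocumentBuilder; the
class and member names (DtdResolvers, SKIP_DTD, withTimeout) are
hypothetical and are not taken from the Bio-Formats code, which may well
parse its XML differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.net.URLConnection;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class DtdResolvers {

  /** Option A: never fetch the DTD; hand the parser an empty document instead. */
  static final EntityResolver SKIP_DTD = new EntityResolver() {
    public InputSource resolveEntity(String publicId, String systemId) {
      return new InputSource(new StringReader(""));
    }
  };

  /**
   * Option B: fetch the DTD, but bound the connect and each blocking read.
   * Assumes systemId is an absolute URL.  A timeout surfaces as an
   * IOException from parse() instead of a hang.
   */
  static EntityResolver withTimeout(final int millis) {
    return new EntityResolver() {
      public InputSource resolveEntity(String publicId, String systemId)
        throws IOException
      {
        URLConnection conn = URI.create(systemId).toURL().openConnection();
        conn.setConnectTimeout(millis);
        conn.setReadTimeout(millis);
        InputSource source = new InputSource(conn.getInputStream());
        source.setSystemId(systemId);
        return source;
      }
    };
  }

  /** Parses the given XML string without validation, using the given resolver. */
  static Document parse(String xml, EntityResolver resolver) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(false);
    DocumentBuilder builder = factory.newDocumentBuilder();
    builder.setEntityResolver(resolver);
    return builder.parse(new InputSource(new StringReader(xml)));
  }

  public static void main(String[] args) throws Exception {
    // The DTD URL is deliberately unreachable; SKIP_DTD never contacts it.
    String xml =
      "<!DOCTYPE root SYSTEM \"http://example.invalid/missing.dtd\">"
      + "<root><child/></root>";
    Document doc = parse(xml, SKIP_DTD);
    System.out.println(doc.getDocumentElement().getNodeName());
  }
}
```

Note that the read timeout in option B limits each individual read rather
than the total download time, so a very slow but steady connection could
still take a while.  Option A avoids the network entirely, at the cost of
losing any entity definitions or default attribute values that the DTD
would have supplied.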

Hopefully resolving (0) will be sufficient for now, given that everything
under the "Bio-Formats" tab will eventually be moved to OME's Jenkins
server.  If, however, my assumptions are correct and there are indeed
connectivity issues on the current server, then it's quite possible that
other (non-Bio-Formats) jobs will get stuck at some point as well.

Of course, if anyone has a different idea of what is going on (or a better
idea of how to solve the problem) then I would be happy to hear it.

Regards,
-Melissa

On Tue, Feb 07, 2012 at 05:10:45AM +0100, Johannes Schindelin wrote:
> Hi Melissa,
> 
> On Mon, 6 Feb 2012, Melissa Linkert wrote:
> 
> > > it seems that Jenkins is busy with testing Bio-Formats... for more
> > > than 3 days now...
> > 
> > Thanks for pointing this out.
> > 
> > I have no idea why that job was taking so long, but I've now killed all
> > of the Bio-Formats testing jobs in the queue for tonight so that all of
> > the daily build jobs can run.  I'll investigate what the cause might
> > have been in the morning, as I do not have enough coffee in front of me
> > at the moment to reliably diagnose the problem.
> 
> It looked to me as if Jenkins was stuck somewhere. The progress which
> supposedly advances whenever a test was performed did not advance at all
> in the half hour I looked.
> 
> So I guess that there was something really awkward, such as an OOM
> infinite loop, or some such.
> 
> In hindsight, I should have fired up JVisualVM and looked at the threads,
> so that's what I will do next time when it happens again.
> 
> Ciao,
> Dscho



