[ImageJ-devel] Tons of "back to normal" Jenkins messages

Johannes Schindelin schindelin at wisc.edu
Sat May 11 12:01:42 CDT 2013


Hi all,

Those of you who are subscribed to imagej-builds will have received a
flurry of "Build is back to normal" messages. While this is soothing, the
absence of mails reporting that the builds had failed might be discomforting.

Those failure mails were not sent for the same reason the builds failed in
the first place: Jenkins' JVM ran out of PermSpace (what the JVM itself
reports as "PermGen space").

So what is PermSpace? This is the area of memory in which the JVM stores
things that are supposed never to be unloaded, such as class definitions.
In Java 1.7, you can explicitly close a URLClassLoader that is no longer
required, but in Java 1.6 you cannot. There is a workaround: the garbage
collector can be told to unload class definitions that fell out of use
once their corresponding class loaders have been garbage collected.
Contrary to what I believed, this functionality is opt-in, not opt-out, in
Java 1.6.
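
To illustrate the Java 1.7 part, here is a minimal sketch (not the code
used on Jenkins). The class and method names are made up for this example.
It relies only on URLClassLoader implementing Closeable since Java 1.7.
Note that close() releases the loader's resources, but the class
definitions themselves can only go away once the loader is unreachable and
has been garbage collected.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical helper, for illustration only.
final class ClosableLoaderSketch {

    /** Loads a class through a short-lived loader and returns the class name. */
    static String loadAndRelease(URL[] classPath, String className)
            throws IOException, ClassNotFoundException {
        // try-with-resources calls loader.close() on exit (Java 1.7 and later).
        // After close(), the loader cannot load further classes and its open
        // JAR files are released.
        try (URLClassLoader loader = new URLClassLoader(classPath)) {
            Class<?> clazz = loader.loadClass(className);
            return clazz.getName();
        }
    }
}
```

With an empty class path, the loader simply delegates to its parent, so
something like loadAndRelease(new URL[0], "java.lang.String") is enough to
try it out.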

But why did Jenkins run out of PermSpace to begin with?

Alas, this is my fault. In my endeavor to show the changes of the
Stable-Fiji job (i.e. the changes in the uploaded files), I had to fake a
Subversion changelog (the Changes in Jenkins are tied very tightly to an
SCM, even if just parsing the persisted list of changes). To make that
happen, I had to instantiate a class loader because the Groovy Postbuild
script generating those Changes did not have a default class loader that
knew about the Subversion ChangeLog classes.

So I punted and built a URLClassLoader myself, every time the Stable-Fiji
job ran, which apparently was quite, quite often.

I should have researched a little more back then and found the place in
Jenkins where it loads its plugins itself. I did that now and fixed the
Stable-Fiji job.
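
The difference between the two approaches can be sketched roughly like
this. It is not the actual Groovy Postbuild script, and all names are
hypothetical. The real fix reuses the class loader Jenkins already has for
its plugins, while the sketch just caches one loader to show the idea of
"create once, reuse" versus "create on every run".

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration of the leaky and the reusing pattern.
final class SharedLoaderSketch {
    private static URLClassLoader shared;

    /** Leaky pattern: every call creates a new loader, and every loader
     *  defines its own copies of the classes it loads. */
    static ClassLoader freshLoader(URL[] classPath) {
        return new URLClassLoader(classPath);
    }

    /** Reuse pattern: the loader is created on the first call only, so its
     *  classes are defined once no matter how often the job runs. */
    static synchronized ClassLoader sharedLoader(URL[] classPath) {
        if (shared == null) {
            shared = new URLClassLoader(classPath);
        }
        return shared;
    }
}
```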

However, I could be all wrong, and the PermSpace issue might not be caused
by that URLClassLoader thingie at all, but rather by something as mundane
as the mere fact that we run a gazillion jobs on Jenkins. Well, maybe not a
gazillion. At the moment the tally is 170. Quite something, still.

So let's be wary. If we find that Jenkins is stuck (i.e. its CPU load on
dev is 99-100%), let's first inspect the respective /proc/<pid>/fd/ to
find the log (probably already deleted by the logrotator) and cat it into
a file for future inspection. If the issue is PermSpace again, we might
need to simply increase the PermSpace with -XX:MaxPermSize=<memory> and/or
require the surefire plugin to fork by passing the
-Dsurefire.forkMode=once property (since a couple of recent unit tests in
ImageJ/Fiji actually use custom ClassLoaders).
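
To check whether PermSpace is really what is filling up, something along
these lines might help. It is only a sketch with made-up names. It uses the
standard java.lang.management API and reports on the JVM it runs in, so it
would have to be executed inside the Jenkins JVM. The pool names it matches
are an assumption based on HotSpot: names ending in "Perm Gen" on Java
1.6/1.7, and "Metaspace" on later versions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ClassMetadataPools {

    /** Assumption: HotSpot pool names ("PS Perm Gen", "CMS Perm Gen", "Metaspace"). */
    static boolean isClassMetadataPool(String poolName) {
        return poolName.contains("Perm Gen") || poolName.contains("Metaspace");
    }

    /** Returns one line per matching pool of the current JVM, with usage in bytes. */
    static List<String> report() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!isClassMetadataPool(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() is -1 when the pool has no defined maximum.
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                    + " max=" + usage.getMax());
        }
        return lines;
    }
}
```

If "used" sits close to "max" for such a pool, raising -XX:MaxPermSize (or
finding the leaking class loaders) is the likely next step.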

All the current work on this issue can be found here:

http://trac.imagej.net/ticket/1863

The bug is marked as resolved because I expect the Stable-Fiji
configuration to have been the culprit, but if the issue arises again,
please reopen that ticket instead of adding a new one.

Thank you for your attention,
Dscho


