[ImageJ-devel] Lock-free bit fields, was Re: What's left to do for ImgLib2

Thu Oct 30 05:49:43 CDT 2014

Hi Tobias,

first of all: I am really happy that this discussion is now open, enabling
us to benefit from the entire expertise available in the ImageJ universe.

I would like to use the opportunity to provide a bit of background for
those readers who did not benefit from the extensive in-person discussions
at the hackathon, in particular because there is no public summary
available yet:

At the hackathon [*1*] the main goal was to get ImgLib2 out of beta (and
there was a *lot* of progress in that regard), and in the process a couple
of last-minute changes were introduced, amongst others a memory
optimization of the bit type containers. In particular, it changed from:

https://github.com/imglib/imglib2/blob/ffeee80%5E/core/src/main/java/net/imglib2/type/logic/BitType.java

to

https://github.com/imglib/imglib2/blob/fc0d3ebcd/src/main/java/net/imglib2/type/logic/BitType.java

That is, the BitAccess was replaced by a LongAccess. The BitAccess was
implemented by

https://github.com/imglib/imglib2/blob/ffeee80%5E/core/src/main/java/net/imglib2/img/basictypeaccess/array/BitArray.java

Unfortunately, after upgrading imagej-ops to the ImgLib2 release, the
regression tests started failing, and Tobias offered an explanation:

On Thu, 30 Oct 2014, Tobias Pietzsch wrote:

> The test uses
> 		final Img<BitType> out = bitmap();
> 
> I bet that for the new BitType from the Fractions branch nobody
> considered the possibility that two cursors might simultaneously write
> to bits of the same underlying long value.

As the problem is intermittent and changes between test runs even on the
same computer, this is quite likely, especially given that the original
BitArray used locking explicitly:

https://github.com/imglib/imglib2/blob/ffeee80%5E/core/src/main/java/net/imglib2/img/basictypeaccess/array/BitArray.java#L85-L91

That the BitType is now thread-unsafe is therefore a regression introduced
just recently.

> One solution would be to use locks to synchronize all accesses to the
> underlying long[] array (this is for BitType, Unsigned12BitType, etc).
> However, I fear that this will slow down things considerably.

I agree that this would slow down operations as compared to the current
code (at the price of correctness *grins*), but it would not slow down
operations as compared to the BitArray which was used previously.

> Is anyone familiar enough with the Java Memory Model to provide an
> educated guess as to whether a lock-free approach would be feasible?

The best online resource on this issue I am aware of is
http://www.vogella.com/tutorials/JavaConcurrency/article.html#nonblocking
(while the best offline resource is "Java Concurrency In Practice").

It talks about the most common lock-free primitive, available in Java
since version 5.0: compare-and-swap (CAS). Unfortunately, this technique
would require us to switch to a different data type, as the operation is
not available on primitive types, let alone primitive type arrays.

Theoretically, we could paper over this issue by using the Unsafe class.
However, this class is marked as internal API for a good reason, and it
would not be advisable to make use of it to work around a fundamental
design issue.

> By lock-free I mean setting the value and then checking whether the
> value is actually what was expected (and if not, retry).

A naïve implementation of this technique could easily result in a very
nasty ping-pong effect: if one thread tries to clear a bit and the next
thread tries to set it, it is very to run into a trap when not leaving a
way for one thread to win.

The safest way to resolve this issue is to reinstate the lock-on-write
method that was in place earlier, i.e. surround the lines

https://github.com/imglib/imglib2/blob/fc0d3ebcd9256e60961180952c0383c47e63d111/src/main/java/net/imglib2/type/logic/BitType.java#L133-L137

with a `synchronized (dataAccess) { ... }` guard.

An alternative might be to give up lock-freedom in favor of wait-freedom
[*2*]. In this regard, a more performant version might be to change

	// Clear the bits first, then or the masked value
	if ( value )
		dataAccess.setValue(i1, (dataAccess.getValue(i1) | (1l << shift) ) );
	else
		dataAccess.setValue(i1, (dataAccess.getValue(i1) & ~(1l << shift)) );

to use Optimistic Concurrency Control [*3*]:

	final long original = dataAccess.getValue(i1);
	if ( value ) {
		final long newValue = original | (1l << shift);
		dataAccess.setValue(i1, newValue);
		if ( newValue != dataAccess.getValue( i1 ) ) {
			synchronized (dataAccess) {
				dataAccess.setValue( i1, dataAccess.getValue(i1) | (1l << shift) );
			}
		}
	} else {
		final long newValue = original & ~(1l << shift);
		dataAccess.setValue(i1, newValue);
		if ( newValue != dataAccess.getValue( i1 ) ) {
			synchronized (dataAccess) {
				dataAccess.setValue( i1, dataAccess.getValue(i1) & ~(1l << shift) );
			}
		}
	}

However, apart from being ugly, it is a little bit too complicated to be
verified as correct easily even by myself.

As ImgLib2 has yet to attract any concurrency expert, I would be *really*
reluctant to destabilize ImgLib2 at this criticial time, and would rather
leave this for a future improvement at a time when we have concurrency
experts in our ranks.

Correctness trumps speed.

Ciao,
Johannes

Footnote *1*: The best public information so far is:
http://imagej.net/pipermail/imagej-devel/2014-October/002280.html

Footnote *2*:
https://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms

Footnote *3*: https://en.wikipedia.org/wiki/Optimistic_concurrency_control