[ImageJ-devel] [fiji-devel] Lock-free bit fields, was Re: What's left to do for ImgLib2

Thu Oct 30 13:44:06 CDT 2014

Hi Tobias,

woudn't the easiest be to rename the current BitType instances as 
Unsynchronized+name, and then the actual class extend the corresponding 
unsynchronized class, with one method overriden to synchronize access to 
the pixels?

This way one gets both: the default is safe (synchronized), and if one 
knows what one is doing, one can get the Unsynchronized* version to 
avoid the cost.

Albert

On 10/30/2014 02:39 PM, Tobias Pietzsch wrote:
> Hi Stephan,
>
> Getting the ‘unsafe’ interval for a specific location is certainly possible. But how would that be effectively used in an algorithm if the interval changes from location to location?
> Alternatively, RandomAccessibles and IterableIntervals could offer methods to chop them up into ‘safe’ parts for multithreading. However there are many different ’safe' subdivision and it depends on the algorithm which one is preferrable. Also these subdivisions (as well as the ‘unsafe’ interval) would need to be propagated correctly through Views and RealViews which might get rather involved.
> I’m happy to discuss ideas in this direction, but I don’t think it is a viable short-term solution.
>
> For practical reasons, I would stick with “You are safe as long as multiple threads write to different pixels”.
> This is the contract that we have been implicitly following all along. A lot of code relies on it. Even if we come up with a nice alternative, we do not have the man-power to fix all code that relies on the old contract and that we would break along the way. Therefore my preferred short-term solution is to synchronize( dataAccess ){…} the fractioned-type writes, as Johannes suggested.
>
> best regards,
> Tobias
>
> On 30 Oct 2014, at 18:57, Stephan Saalfeld <saalfelds at janelia.hhmi.org> wrote:
>
>> Hi Tobias,
>>
>> I agree that the constraint is easier if the fraction reduces to an
>> integer.  However, it's not exactly the same for fraction=1 or
>> fraction>1 either.  It would be great if we could identify a common
>> scheme that covers all cases without much interference.
>>
>> Is may be a disk-based, memory cached CellImg the same thing as a
>> fractioned NativeImg?  Writing into different pixels in the same cell
>> may lead to confusing results when written to disk.
>>
>> What about a method in RandomAccess that returns an `unsafe' interval
>> for its location?  Generally, that would be (1^n), in case of fraction
>> types, it would be the box surrounding all pixels served by the same
>> primitive type (which is horrible at the end of a row or cell-row where
>> pixels in the next row are affected), and in case of cached cells it
>> could be the cell.
>>
>> With a method of this flavor, we can make educated decisions on
>> construction time of the multi-threaded code that, internally, would not
>> synchronize, i.e. be fast.
>>
>> Best,
>> Stephan
>>
>>
>>
>>
>> On Thu, 2014-10-30 at 18:29 +0100, Tobias Pietzsch wrote:
>>> Hi Stephan,
>>>
>>> I think it would be nice to have getLock() but I also think it will be rarely needed in practice.
>>>
>>> We must be careful not to conflate two problems here:
>>>
>>> The first one is that writes to e.g. ComplexType are not atomic and therefore strange things may happen if two ComplexTypes are used that actually refer to the same ComplexType pixel value in the image.
>>> As Albert suggested, algorithms that need this feature need to take special care to synchronize access.
>>> However, for many parallelizable algorithms this is not actually a problem. In most image-to-image operations (e.g. FFT, convolution, etc…) every output pixel is written only once by only one thread. Threads maybe read the same input pixels, but reading is fine.
>>> The getLock() method would be a welcome addition for those algorithms that do not follow this pattern and need to synchronize.
>>>
>>> The second problem is different. For BitType, writes to BitType pixels at different locations in the image may influence each other. And this should be avoided by default in my opinion.
>>>
>>> I think: “You are safe as long as multiple threads write to different pixels” is a good contract to have.
>>> Diverging from that with BitType, Unsigned12BitType, etc would add overhead for many algorithms that is in most cases not required (e.g. for FloatType, ComplexDoubleType, etc. the synchronization overhead would be wasted).
>>>
>>> best regards,
>>> Tobias
>>>
>>>
>>>
>>> On 30 Oct 2014, at 16:18, Stephan Saalfeld <saalfelds at janelia.hhmi.org> wrote:
>>>
>>>> Thanks for the articles!
>>>>
>>>> I have more comments on the matter.  In fact, all types have the same
>>>> problem.  Even for a non-native ComplexType read and write would not be
>>>> atomic and thus not thread-safe.  The problem is that, for non-native
>>>> types, it is sufficient for multi-threaded code to synchronize on the
>>>> type instance itself.  For native types (e.g. ComplexDoubleType) and for
>>>> other proxy mechanisms such as Composites or ReadWriteConverters, this
>>>> doesn't work.  How about a getLock() (or getMonitor()) method as part of
>>>> Type whose purpose is to return a lock that enables synchronization on
>>>> that particular's type content.  Should that lock be constant for a
>>>> type's lifetime?  Proxy types for which access is atomic could return
>>>> themselves, just as Types that actually contain their content.
>>>>
>>>> I like Tobias' proposal with a Hash of locks for NativeTypes, something
>>>> similar is necessary for other writable proxies.
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>>
>>>>
>>>> On Thu, 2014-10-30 at 14:51 +0100, Adrian Daerr wrote:
>>>>> Hi,
>>>>>
>>>>>>> By lock-free I mean setting the value and then checking whether the
>>>>>>> value is actually what was expected (and if not, retry).
>>>>>>
>>>>>> A naïve implementation of this technique could easily result in a very
>>>>>> nasty ping-pong effect: if one thread tries to clear a bit and the next
>>>>>> thread tries to set it, it is very to run into a trap when not leaving a
>>>>>> way for one thread to win.
>>>>>>
>>>>>> The safest way to resolve this issue is to reinstate the lock-on-write
>>>>>> method that was in place earlier
>>>>> [..]
>>>>>>
>>>>>> An alternative might be to give up lock-freedom in favor of wait-freedom
>>>>>> [*2*]. In this regard, a more performant version might be
>>>>> [..]
>>>>>> to use Optimistic Concurrency Control [*3*]:
>>>>>
>>>>>> 	final long original = dataAccess.getValue(i1);
>>>>>> 	if ( value ) {
>>>>>> 		final long newValue = original | (1l << shift);
>>>>>> 		dataAccess.setValue(i1, newValue);
>>>>>> 		if ( newValue != dataAccess.getValue( i1 ) ) {
>>>>>> 			synchronized (dataAccess) {
>>>>>> 				dataAccess.setValue( i1, dataAccess.getValue(i1) | (1l << shift) );
>>>>>> 			}
>>>>>> 		}
>>>>>> 	}
>>>>> [snip]
>>>>>
>>>>> Hum, I do not if this is really a comparable situation, but it looks a
>>>>> lot like the double-checked locking (DCL) idiom, which is broken [1].
>>>>>
>>>>> FWIW,
>>>>> cheers and good luck,
>>>>> Adrian
>>>>>
>>>>>
>>>>> [1]
>>>>> TL;DR : You cannot have as-if-serial semantics across threads unless you
>>>>> use synchronized.
>>>>>
>>>>> "Double-checked locking: Clever, but broken
>>>>> Do you know what synchronized really means?" By Brian Goetz
>>>>> http://www.javaworld.com/article/2074979/java-concurrency/double-checked-locking--clever--but-broken.html
>>>>>
>>>>> and its follow-up article
>>>>>
>>>>> "Can double-checked locking be fixed?
>>>>> No matter how you rig it, double-checked locking still fails" (also by
>>>>> Brian Goetz)
>>>>> http://www.javaworld.com/article/2075306/java-concurrency/can-double-checked-locking-be-fixed-.html
>>>>
>>>
>>
>