Double-checked locking without volatile

First things first: what you are trying to do is dangerous at best. I get a bit nervous when people try to cheat with finals. The Java language provides volatile as the go-to tool for inter-thread consistency. Use it.

Anyhow, the relevant approach is described in "Safe Publication and Initialization in Java" as:

public class FinalWrapperFactory {
  private FinalWrapper wrapper;

  public Singleton get() {
    FinalWrapper w = wrapper;
    if (w == null) { // check 1
      synchronized(this) {
        w = wrapper;
        if (w == null) { // check 2
          w = new FinalWrapper(new Singleton());
          wrapper = w;
        }
      }
    }
    return w.instance;
  }

  private static class FinalWrapper {
    public final Singleton instance;
    public FinalWrapper(Singleton instance) {
      this.instance = instance;
    }
  }
}

In layman's terms, it works like this. synchronized yields the proper synchronization when we observe wrapper as null -- in other words, the code would be obviously correct if we dropped the first check altogether and extended synchronized to the entire method body. final in FinalWrapper guarantees that if we saw a non-null wrapper, it is fully constructed and all Singleton fields are visible -- this recovers from the racy read of wrapper.

Note that it carries the FinalWrapper in the field, not the value itself. If instance were published without the FinalWrapper, all bets would be off (in layman's terms, that's premature publication). This is why your Publisher.publish is dysfunctional: just putting the value through a final field, reading it back, and publishing it unsafely is not safe -- it's very similar to writing out the naked instance.

Also, you have to be careful to make a "fallback" read under the lock when you discover the null wrapper, and to use the value from that read. Doing a second (third) read of wrapper in the return statement would also ruin the correctness, setting you up for a legitimate race.
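To make that last point concrete, here is a minimal sketch of the broken variant, where the return statement re-reads the field instead of using a local. The class names BrokenFactory, FinalWrapper and Singleton (with its value field) are stand-ins for illustration, not code from the question.

```java
// Hypothetical stand-in for the real payload.
class Singleton {
  final int value = 42;
}

class FinalWrapper {
  final Singleton instance;

  FinalWrapper(Singleton instance) {
    this.instance = instance;
  }
}

// BROKEN on purpose: do not copy this.
class BrokenFactory {
  private FinalWrapper wrapper; // deliberately not volatile

  Singleton get() {
    if (wrapper == null) { // racy read #1
      synchronized (this) {
        if (wrapper == null) { // read under the lock
          wrapper = new FinalWrapper(new Singleton());
        }
      }
    }
    // Racy read #2. The memory model permits this read to observe null even
    // though racy read #1 observed non-null, so this line may throw a
    // NullPointerException under concurrency.
    return wrapper.instance;
  }
}
```

A single-threaded call such as new BrokenFactory().get() will behave fine, which is exactly what makes this mistake hard to catch in testing. The version in the answer avoids it by reading wrapper into the local w once and returning w.instance.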

EDIT: That entire thing, by the way, says that if all the fields of the object you are publishing are final, you may cut out the middleman of FinalWrapper and publish the instance itself.
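A minimal sketch of that shortcut, assuming a payload whose fields are all final and whose constructor does not leak this. ImmutableSingleton and DirectFactory are hypothetical names used only for this example.

```java
// Hypothetical payload: every field is final and set in the constructor,
// and `this` does not escape during construction.
class ImmutableSingleton {
  final String name;
  final int size;

  ImmutableSingleton(String name, int size) {
    this.name = name;
    this.size = size;
  }
}

class DirectFactory {
  private ImmutableSingleton instance; // no volatile, no wrapper

  ImmutableSingleton get() {
    ImmutableSingleton s = instance; // single racy read into a local
    if (s == null) {
      synchronized (this) {
        s = instance; // fallback read under the lock
        if (s == null) {
          s = new ImmutableSingleton("demo", 3);
          instance = s;
        }
      }
    }
    return s; // return the local, never re-read the field
  }
}
```

The shape is the same as FinalWrapperFactory.get(); the final fields of the payload now play the role that FinalWrapper.instance played. If the payload later gains a non-final field, this stops being safe.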

EDIT 2: See also LCK10-J, "Use a correct form of the double-checked locking idiom", and the discussion in the comments there.


In short

The version of the code without volatile or the wrapper class depends on the memory behavior of the underlying system that the JVM is running on.

The version with the wrapper class is a known alternative, the Initialization on Demand Holder design pattern. It relies on the ClassLoader contract that any given class is loaded at most once, on first access, and in a thread-safe way.

The need for volatile

Most of the time, developers think of code execution as the program being loaded into main memory and executed directly from there. In reality, there are a number of hardware caches between main memory and the processor cores. The problem arises because threads may run on separate processors, each with its own independent copy of the variables in scope; while we like to think of field as a single location, the reality is more complicated.

To run through a simple (though perhaps verbose) example, consider a scenario with two threads and a single level of hardware caching, where each thread has its own copy of field in that cache. So already there are three versions of field: one in main memory, one in thread A's cache, and one in thread B's cache. I'll refer to these as fieldM, fieldA, and fieldB respectively.

  1. Initial state
    fieldM = null
    fieldA = null
    fieldB = null
  2. Thread A performs the first null-check, finds fieldA is null.
  3. Thread A acquires the lock on this.
  4. Thread B performs the first null-check, finds fieldB is null.
  5. Thread B tries to acquire the lock on this but finds that it's held by thread A. Thread B sleeps.
  6. Thread A performs the second null-check, finds fieldA is null.
  7. Thread A assigns fieldA the value fieldType1 and releases the lock. Since field is not volatile this assignment is not propagated out.
    fieldM = null
    fieldA = fieldType1
    fieldB = null
  8. Thread B awakes and acquires the lock on this.
  9. Thread B performs the second null-check, finds fieldB is null.
  10. Thread B assigns fieldB the value fieldType2 and releases the lock.
    fieldM = null
    fieldA = fieldType1
    fieldB = fieldType2
  11. At some point, the writes to cache copy A are synched back to main memory.
    fieldM = fieldType1
    fieldA = fieldType1
    fieldB = fieldType2
  12. At some later point, the writes to cache copy B are synched back to main memory overwriting the assignment made by copy A.
    fieldM = fieldType2
    fieldA = fieldType1
    fieldB = fieldType2

As one of the commenters on the question mentioned, using volatile ensures writes are visible. I don't know the mechanism used to ensure this -- it could be that changes are propagated out to each copy, it could be that the copies are never made in the first place and all accesses of field are against main memory.
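For reference, a minimal sketch of the conventional double-checked locking form with volatile. FieldType and VolatileFactory are stand-in names for illustration, and the mutable counter field is only there to show that the payload does not need to be immutable in this variant.

```java
// Hypothetical payload with ordinary, non-final state.
class FieldType {
  int counter = 7;
}

class VolatileFactory {
  // volatile: a read that sees the reference written below also sees
  // everything the writing thread did before that write.
  private volatile FieldType field;

  FieldType get() {
    FieldType result = field; // one volatile read on the fast path
    if (result == null) {
      synchronized (this) {
        result = field; // re-check under the lock
        if (result == null) {
          result = new FieldType();
          field = result; // volatile write publishes the object
        }
      }
    }
    return result;
  }
}
```

Reading the field into a local keeps the fast path to a single volatile read once the object has been created.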

One last note on this: I mentioned earlier that the results are system dependent. This is because different underlying systems may take less optimistic approaches to their memory model and treat all memory shared across threads as volatile, or may apply a heuristic to decide whether a particular reference should be treated as volatile, at the cost of the performance of synching to main memory. This can make testing for these problems a nightmare: not only do you have to run a large enough sample to try to trigger the race condition, you might also happen to be testing on a system that is conservative enough never to trigger it.

Initialization on Demand Holder

The main thing I wanted to point out here is that this works because we're essentially sneaking a singleton into the mix. The ClassLoader contract means that while there can be many instances of Class, there can be only a single instance of Class<A> for any type A, and that class is initialized lazily, on first reference. In fact, you can think of the static fields in a class's definition as fields of a singleton associated with that class, with elevated member access between that singleton and instances of the class.
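A minimal sketch of the holder idiom, for illustration. LazyService, Holder and the created counter are hypothetical names; the counter exists only to make the laziness observable and is not part of the pattern.

```java
class LazyService {
  // Demonstration only: counts constructor calls. It is only ever
  // incremented during Holder's class initialization, which runs once.
  static int created = 0;

  private LazyService() {
    created++;
  }

  // Holder is not initialized until getInstance() first touches it.
  // Class initialization is performed once and is thread-safe, so no
  // explicit locking or volatile is needed here.
  private static class Holder {
    static final LazyService INSTANCE = new LazyService();
  }

  static LazyService getInstance() {
    return Holder.INSTANCE;
  }
}
```

Note that this idiom only fits a true per-class singleton. It does not cover a lazily initialized instance field, which is what the per-object double-checked locking variants are for.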


Quoting The "Double-Checked Locking is Broken" Declaration mentioned by @Kicsi, the very last section is:

Double-Checked Locking Immutable Objects

If Helper is an immutable object, such that all of the fields of Helper are final, then double-checked locking will work without having to use volatile fields. The idea is that a reference to an immutable object (such as a String or an Integer) should behave in much the same way as an int or float; reading and writing references to immutable objects are atomic.

(emphasis is mine)

Since FieldHolder is immutable, you indeed don't need the volatile keyword: other threads will always see a properly-initialized FieldHolder. As far as I understand it, the FieldType will thus always be initialized before it can be accessed from other threads through FieldHolder.

However, proper synchronization remains necessary if FieldType is not immutable. Consequently, I'm not sure you would gain much by avoiding the volatile keyword.

If it is immutable though, then you don't need the FieldHolder at all, following the above quotation.