Review: Java Concurrency in Practice

I was recently assigned to a new project at work that requires some concurrent programming. I’ve long put off investing in any formal Java programming texts, partly out of thriftiness and partly because none of the professional programming I’ve done to date required formal study of the relevant texts to avoid writing completely incorrect code. Usually in line-of-business application development, a sub-optimal solution is not completely incorrect; at worst, it wastes CPU cycles. However, when it comes to concurrent programming, there are three options:

It’s not thread-safe and very obviously not so. Nobody has any misconceptions about whether this routine will perform properly in a concurrent context.
It’s definitely thread-safe, and you know you can trust it. The set of such programs is actually fairly small; you usually trust that your Java Virtual Machine is bug-free, and that its standard libraries are bug-free. Beyond that it’s a bit of a toss-up.
It may or may not be thread-safe, and even if its author believes it’s thread-safe, it may well not be, in subtle ways that are hard to reason about.

The problem with concurrent programming is that even the smallest mistake may cause byzantine complications of horrifying consequence, which are difficult or impossible to reason about clearly after the fact. Compared with costing my firm millions of dollars in lost revenue, paying $20 for a reference book seems like the right thing to do. So I went down to amazon.com and bought myself a copy of Java Concurrency in Practice, henceforth referred to as “JCP”.

I had high expectations, because this book is reputed to be the bible of writing safe concurrent programs, and is practically considered required reading for many jobs. There’s even a Rich Hickey video wherein he describes how JCP is a shocking read for those who’ve never picked it up, and echoes the requirement to read it.

Review

Suffice it to say, this book deserves the reputation it has. Brian Goetz manages to be interesting while covering extremely dry material, and he neither skimps on information nor belabours the point. The book is helpfully divided into four parts, which I found to be a useful demarcation:

“Fundamentals”, which should be required reading for all Java programmers.
“Structuring Concurrent Applications”, useful for defining application-wide or framework-wide concurrency concepts.
“Liveness, Performance, and Testing”, which should be read by all Java programmers, but is less critical than reading the fundamentals.
“Advanced Topics”, which can be considered an expansion of “Fundamentals”; should be read by anyone writing concurrent libraries or algorithms, but is less critical reading than “Fundamentals” for general-purpose concepts.

As a languages nerd who also takes a keen interest in low-level details, I basically jumped straight from “Fundamentals” to “Advanced Topics” before working my way back through the other chapters. That said, the order is less important than ensuring the right concepts are learned.

Before I opened this book, I didn’t realize what a can of worms the Java Memory Model (JMM for short) can be. The JMM is a bit devilish in that it makes all sorts of wonderful guarantees about the performance-safety tradeoff of concurrent Java programs, with the caveat that this tradeoff relies on tricky contextual semantics. If a Java programmer doesn’t know about the contract of the JMM, s/he may violate concurrency safety in subtle ways, which are never flagged by the compiler. Only by learning the fundamentals can one safeguard themselves against doing terrible things; even having read it, it’s still all-too-easy to make critical mistakes.
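To make the "never flagged by the compiler" point concrete, here is a minimal sketch of a visibility problem and its fix. The Worker and VisibilityDemo names are hypothetical and are not taken from the book. Remove the volatile keyword and the code still compiles cleanly, but as far as I understand the JMM, the loop would then be allowed to never notice that stop() was called.

```java
class Worker implements Runnable {
  // volatile makes the write in stop() visible to the loop in run().
  // Without it, the JMM permits the loop to spin forever on a stale value.
  private volatile boolean running = true;

  void stop() {
    running = false;
  }

  @Override
  public void run() {
    while (running) {
      // Busy work would go here.
    }
  }
}

class VisibilityDemo {
  // Starts a worker, asks it to stop, and reports whether it terminated.
  static boolean startAndStop() {
    Worker worker = new Worker();
    Thread thread = new Thread(worker);
    thread.start();
    worker.stop();
    try {
      thread.join(2000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !thread.isAlive();
  }
}
```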

This was an easy enough read for a Saturday afternoon to get a firm feel for the basics, and it changed my life (okay fine, my Java programming life). “Recommend” is not a strong enough term; I will chase you down in the street and chuck this book at your head if you tell me that you haven’t read it but believe you can write thread-safe Java programs.

Appendix: Unsafe idioms that don’t look obviously wrong

The book has a number of helpfully demarcated examples, indicating safe, dodgy, and then not-so-obviously wrong ways to do things. Many of these are not obvious at first glance, so I thought I’d enumerate them for reference.

Double-checked locking

There’s a common but incredibly unsafe idiom in Java code, to do the following:

public class DoubleCheckedLocking {
  private static Resource resource;

  public static Resource getInstance() {
    if (resource == null) {
      synchronized (DoubleCheckedLocking.class) {
        if (resource == null)
          // BAD! Without a memory barrier, the object will not be
          // safe to read on threads other than the constructing thread.
          resource = new Resource();
      }
    }
    return resource;
  }
}

This idiom can be made safe by introducing a volatile boolean flag that acts as a memory barrier, as done in the implementation behind Guava’s Suppliers.memoize(...) function:

static class NonSerializableMemoizingSupplier<T> implements Supplier<T> {
  volatile Supplier<T> delegate;
  volatile boolean initialized;
  // "value" does not need to be volatile; visibility piggy-backs
  // on volatile read of "initialized".
  transient T value;

  NonSerializableMemoizingSupplier(Supplier<T> delegate) {
    this.delegate = checkNotNull(delegate);
  }

  @Override
  public T get() {
    // A 2-field variant of Double Checked Locking.
    if (!initialized) {
      synchronized (this) {
        if (!initialized) {
          T t = delegate.get();
          value = t;
          initialized = true;
          return t;
        }
      }
    }
    return value;
  }
  // ...
}
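For the static singleton case specifically, JCP also describes the lazy initialization holder class idiom, which avoids explicit locking altogether. Below is a minimal sketch of it; the ResourceFactory and Holder names are my own, and Resource is an empty stand-in for whatever is expensive to construct.

```java
// Hypothetical stand-in for the expensive resource.
class Resource {
  Resource() {
    // Expensive setup would go here.
  }
}

class ResourceFactory {
  private ResourceFactory() {}

  // The JVM does not initialize Holder until getInstance() first touches it.
  // Class initialization is thread-safe, so INSTANCE is created exactly once
  // and is safely published to every thread that reads it.
  private static class Holder {
    static final Resource INSTANCE = new Resource();
  }

  static Resource getInstance() {
    return Holder.INSTANCE;
  }
}
```

The trade-off is that this only works for static state; for per-instance lazy values, something like the two-field variant above is still needed.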

Improper atomicity delegation

Many developers mistakenly rely upon individual atomic operations and forget that a class is only thread-safe if its compound operations are atomic as a whole; atomicity of the independent components is not enough. JCP has a great example of what not to do: a “NumberRange” class which is intended to be thread-safe for testing whether an integer is in a prescribed range. (These days most people know to do this using immutable objects, but the mistake is still easy to miss.)

public class NumberRange {
  private final AtomicInteger lower = new AtomicInteger(0);
  private final AtomicInteger upper = new AtomicInteger(0);

  public void setLower(int i) {
    lower.set(i);
  }
  public void setUpper(int i) {
    upper.set(i);
  }
  public boolean isInRange(int i) {
    // BAD! The two "get" calls happen separately
    // and the range can change between getting one
    // and getting the other.
    return (i >= lower.get() && i <= upper.get());
  }
}

A safer (and more performant, saner) way is to follow the approach of the Kotlin standard library’s equivalent class, IntRange. This is safe because the bounds are fixed at construction: they can’t change in the middle of the operation, and an immutable object is safely visible to any thread that obtains a reference to it.

public class IntRange(start: Int, endInclusive: Int) /* ... */ {
  override val start: Int get() = first
  override val endInclusive: Int get() = last
  override fun contains(value: Int) = first <= value && value <= last
}
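If the range still has to be updatable, one option in Java is to keep both bounds in a single immutable object and swap the whole object atomically. This is only a sketch under that assumption; the Bounds and SafeNumberRange names are hypothetical, and the combined setRange method replaces the separate setters from the broken version.

```java
import java.util.concurrent.atomic.AtomicReference;

// Immutable snapshot of both bounds.
final class Bounds {
  final int lower;
  final int upper;

  Bounds(int lower, int upper) {
    if (lower > upper) {
      throw new IllegalArgumentException("lower must not exceed upper");
    }
    this.lower = lower;
    this.upper = upper;
  }

  boolean contains(int i) {
    return i >= lower && i <= upper;
  }
}

class SafeNumberRange {
  private final AtomicReference<Bounds> bounds =
      new AtomicReference<>(new Bounds(0, 0));

  // Both bounds are replaced in one atomic reference write.
  void setRange(int lower, int upper) {
    bounds.set(new Bounds(lower, upper));
  }

  boolean isInRange(int i) {
    // A single read of the reference yields a consistent pair of bounds.
    return bounds.get().contains(i);
  }
}
```

Because isInRange reads the reference exactly once, it can never observe the lower bound of one range paired with the upper bound of another.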

Leaking synchronized resources to unsynchronized contexts

This one took me by surprise, even though it should not have. It’s possible to accidentally “leak” a guarded mutable resource in an unsafe way, via things as innocuous as String concatenation:

public class HiddenIterator {
  private final Set<Integer> set = new HashSet<>();

  public synchronized void add(Integer i) { set.add(i); }
  public synchronized void remove(Integer i) { set.remove(i); }
  public void doStuff() {
    // Do some stuff...

    // BAD! Concatenating the set calls toString on it,
    // which requires iterating it, which is not thread-safe
    // because we allow modification in the middle of iteration!
    System.out.println("DEBUG: Current set is " + set);
  }
}

The answer here is that everything a class does with its internal state needs to be guarded by the same lock, including otherwise innocuous-looking methods like equals, hashCode, and toString!
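One way to repair the example, sketched here with hypothetical names (GuardedSet, describe), is to build the string while holding the same lock that add and remove use, and to print only the resulting snapshot.

```java
import java.util.HashSet;
import java.util.Set;

class GuardedSet {
  private final Set<Integer> set = new HashSet<>();

  synchronized void add(Integer i) {
    set.add(i);
  }

  synchronized void remove(Integer i) {
    set.remove(i);
  }

  // Iterates the set under the same lock as add/remove, so no other
  // thread can modify it while the string is being built.
  synchronized String describe() {
    return set.toString();
  }

  void doStuff() {
    // Only the immutable String snapshot leaves the synchronized region.
    System.out.println("DEBUG: Current set is " + describe());
  }
}
```

Printing happens outside the lock on purpose, so slow I/O does not block threads that want to add or remove elements.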

March 17, 2018
