I recently had the drive/opportunity to deep-dive on how Clojure’s namespaces
function and how they provide a simple abstraction using the concept of Clojure’s
“Vars”. Here is a deep-dive on how they work. This is a two-part
series. The next part of the series is available at
Clojure and the Esoteric Mysteries of Namespaces.
Vars: A Simplified Model of Variables
One of Clojure’s essential motivations is to provide a hosted runtime for
easily concurrent programs, wherein most of the challenges of locking and
thread-safety are provided “for free” (at least in the sense of
the programmer not having to worry about these low-level concepts). To that end,
Clojure implements its variables differently than most other languages.
In your typical programming runtime, variables describe locations in memory
containing primitive or structured data. These can be anything from primitive
types such as integers and strings to structured types. This closely
reflects how the machine itself views data (as locations in memory containing
raw values), but presents challenges for concurrent programming:
- Global variables are easy enough for reading, but if an update to a global
variable needs to occur, a lot of locking needs to occur.
- If multiple parallel threads of execution want to read or write variables
(whether local or global) concurrently, some threads may have stale data.
- If structured data like a struct or a class object is updated, and the update
is not done atomically (in one clean shot), some threads of execution can read
the data in an inconsistent state.
When writing concurrent programs, there is a tension between the need for
variables to be accessed or even updated from multiple isolated contexts
(threads) concurrently, which requires indirection and locking, and the need for
them to perform efficiently, which suffers from indirection. Generally, there is
an implicit assumption that concurrency is only appropriate in programs which
can tolerate a minor amount of locking and indirection.
Clojure directly addresses this tension through the use of a clever data
structure called the Var, defined in Clojure’s own source code. In a nutshell,
the Var works as follows:
- All “variables” in the Clojure runtime are instances of Var.
- Vars support two modes of operation, one fast and global, the other slower
and thread-local.
- Ordinarily, a Var object contains a value and some basic locking primitives.
Whenever a Clojure program asks for a value (from a Var) which has not
been modified, that value is dereferenced cheaply, with a quick return path.
- The basic value, called the “root” value, can be atomically
swapped out for another value at any time. This is considered a global
update. In practice it is rarely required.
- A Var can be declared in advance as being dynamic. If and when a Var
is dynamic, a local thread of execution may begin declaring local
overrides for the value of the Var. This is called a binding.
Within those thread-local overrides, the value of the Var can be easily
tweaked using functions like var-set.
- The moment that any thread anywhere in the program begins binding on a
Var, that Var globally switches from its fast-lookup
execution mode to a dynamic, thread-local stack-based lookup mode.
This is a one-way only change, and cannot (currently) be reverted.
- This thread-local stack-based lookup mode allows any thread to create a
stack of alternate values “on top of” the global definition.
From within any thread of execution that has local bindings, only those
local bindings are seen. The stack can be made larger by successive calls
to binding, and the stack shrinks whenever a binding is exited. A
dynamically bound value can be modified without changing the
stack size by atomically swapping the value at the top of the stack.
- Even once a Var has all bindings in all threads eliminated, it is still
stuck in a slower, dynamic, thread-local mode of operation. This simplifies
program execution (because otherwise, safely deciding when all threads have
abandoned their stacks is quite challenging).
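The stack behaviour described above can be sketched in a few lines. This is only an illustration under my own assumptions: `*level*` and `observed` are hypothetical names, and the sketch shows the visible behaviour of nested `binding` forms and `set!`, not how the runtime implements them.

```clojure
;; *level* is a hypothetical dynamic Var used only for this illustration.
(def ^:dynamic *level* :root)

(def observed
  (binding [*level* :outer]                 ; push one frame on this thread's stack
    (let [outer-before *level*
          inner        (binding [*level* :inner]   ; push a second frame
                         *level*)
          outer-after  *level*]             ; the inner frame has been popped
      (set! *level* :outer-modified)        ; swap the value at the top of the stack
      [outer-before inner outer-after *level*])))

;; observed => [:outer :inner :outer :outer-modified]
;; *level*  => :root (all frames are popped, and the root value never changed)
```

Note that `set!` replaced the value in the outer frame without adding a frame, and that none of this touched the root value.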
This is a lot to grasp, so some examples may be useful.
Non-Dynamic Var Usage
Although this looks like a vanilla variable declaration in any programming
language, it actually creates an instance of Var:
user=> (def my-variable 5)
#'user/my-variable
The output #'user/my-variable is a bit deep. It means the following:
- The fully qualified name of this variable is “user/my-variable”. It
means that the Var my-variable lives within the user namespace.
- The #' prefix is a Clojure shorthand meaning that the “value”
referenced is, in fact, the Var reference (the box containing the value 5),
not the value itself (which is 5).
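To make the box-versus-contents distinction concrete, here is a small sketch. The names `boxed-number`, `the-var` and `the-value` are hypothetical and exist only for this example; `var`, `deref`, `var?` and `meta` are standard `clojure.core` operations.

```clojure
;; boxed-number is a hypothetical Var used only for this illustration.
(def boxed-number 5)

(def the-var   (var boxed-number))  ; identical to writing #'boxed-number
(def the-value (deref the-var))     ; identical to @#'boxed-number

(var? the-var)              ; => true, this is the box
(var? the-value)            ; => false, this is what was inside the box
(= the-var #'boxed-number)  ; => true, #' is reader shorthand for (var ...)
(= the-value boxed-number)  ; => true, a bare symbol evaluates to the contents

;; The Var also carries its own name and home namespace as metadata.
(:name (meta the-var))      ; => boxed-number
(:ns (meta the-var))        ; => the namespace object the Var was defined in
```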
Issuing Global Var Updates
Continuing the example above, we can atomically and globally swap the value
of my-variable by taking the existing Var and telling it to safely replace
the old value with the new one.
;; inc is short for increment by 1
user=> (alter-var-root #'my-variable inc)
This update is global and atomic: every thread that subsequently reads my-variable sees the new value.
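One way to convince yourself that a root update is visible everywhere is to read the Var from a second thread. The sketch below assumes hypothetical names (`counter`, `returned`, `seen-on-other-thread`) and uses a plain Java `Thread` plus a `promise` to carry the result back.

```clojure
;; counter is a hypothetical non-dynamic Var used only for this illustration.
(def counter 5)

;; alter-var-root applies the function to the root value and returns the result.
(def returned (alter-var-root #'counter inc))

;; Read the Var from a brand-new thread and hand the value back via a promise.
(def seen-on-other-thread
  (let [result (promise)]
    (.start (Thread. #(deliver result counter)))
    @result))

;; returned             => 6
;; counter              => 6
;; seen-on-other-thread => 6
```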
Creating a Dynamic Var
If a Var is not marked as dynamic, it cannot be given thread-local bindings.
We can achieve dynamism simply by annotating the Var at declaration time:
user=> (def ^:dynamic new-var 0)
In addition to being able to alter the root value of the Var, we may also
create thread-local bindings (entering the second mode of operation):
;; We shadow the original definition, but the original is still there somewhere
user=> (binding [new-var new-var] (var-set #'new-var (inc new-var)) new-var)
Within this thread-local context, we were able to (locally) replace one value
with another. Globally, however, the value stayed the same.
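The thread-local nature of a binding is easiest to see by reading the same Var from two threads at once. This is a sketch with hypothetical names (`*setting*`, `read-on-new-thread`, `snapshot`). It deliberately uses a raw `Thread` rather than `future`, because `future` conveys the caller's bindings to the new thread and would hide the effect.

```clojure
;; *setting* is a hypothetical dynamic Var used only for this illustration.
(def ^:dynamic *setting* 0)

(defn read-on-new-thread
  "Reads *setting* from a fresh thread that has no bindings of its own."
  []
  (let [result (promise)]
    (.start (Thread. #(deliver result *setting*)))
    @result))

(def snapshot
  (binding [*setting* 100]
    {:this-thread  *setting*               ; sees the thread-local override
     :other-thread (read-on-new-thread)})) ; sees only the root value

;; snapshot  => {:this-thread 100, :other-thread 0}
;; *setting* => 0 once the binding form has exited
```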
We cannot, however, attempt to call var-set or the like on the global Var
outside of a binding. var-set and its ilk can only modify values at the top of
a non-empty stack of thread-local modifications.
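A short sketch of that restriction follows. The names `*flag*`, `outcome` and `inside-binding` are hypothetical. The exception type caught here matches what I have seen current Clojure versions throw, so treat it as an observation rather than a documented guarantee.

```clojure
;; *flag* is a hypothetical dynamic Var used only for this illustration.
(def ^:dynamic *flag* :root)

;; With no thread-local binding in place, var-set has nothing to modify.
(def outcome
  (try
    (var-set #'*flag* :changed)
    (catch IllegalStateException _
      :rejected)))

;; Inside a binding, the same call succeeds and changes the top of the stack.
(def inside-binding
  (binding [*flag* :local]
    (var-set #'*flag* :changed)
    [(thread-bound? #'*flag*) *flag*]))

;; outcome                 => :rejected
;; inside-binding          => [true :changed]
;; (thread-bound? #'*flag*) => false out here
;; *flag*                  => :root
```

`thread-bound?` is a handy guard if a function needs to know whether calling `var-set` or `set!` would be legal on the current thread.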
I learned quite a bit about this, but primarily by reading the source code of
Clojure. I’ll compile a list of references below, all within the source of