Purity injection in Elixir

If you came to Elixir from Ruby, like I did, you have probably been looking for a way to do dependency injection in Elixir. I know I did. I also know it's not that simple, and I was never really satisfied with the results. After a few years of looking at different options I arrived at a perhaps surprising conclusion: I don't need it (at least not most of the time).

How so?

Dependency injection is especially useful in testing. It lets you swap a dependency that is slow or unstable for something faster and more predictable. Is DI the only means of doing it, though?

Thinking (more) functional

Elixir is a functional language, even if your Haskell friends do not necessarily agree with this. In recent months, I have begun to understand how helpful it is to step back from object-oriented thinking (inherited - pun intended - from Ruby) and attempt to approach problems from a more functional perspective.

In functional languages there's always talk about pure vs impure functions. Ideally, all functions should be pure, but software exists in a very impure context, so it's not feasible. But that does not mean we should give up on purity. On the contrary, it's generally a good idea to avoid letting impurity seep into every part of our code and to attempt to write as much pure code as we can.

Just for the sake of alignment, I'm talking about pure functions, which are functions with the following properties:

  • For the same set of arguments they always return the same value
  • They don't mutate any kind of global state (i.e. they produce no side effects)
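
To make the two properties concrete, here is a minimal sketch. The module and function names (PurityDemo, total/1, lucky_total/1) are made up for illustration and do not come from the project described later.

```elixir
defmodule PurityDemo do
  # Pure: the result depends only on the argument and nothing outside is touched.
  def total(items) do
    Enum.reduce(items, 0, fn %{price: price, quantity: quantity}, acc ->
      acc + price * quantity
    end)
  end

  # Impure: the result also depends on the random number generator,
  # so the same argument can produce a different value on each call.
  def lucky_total(items) do
    total(items) + :rand.uniform(10)
  end
end
```

Calling PurityDemo.total/1 twice with the same list always gives the same number, while PurityDemo.lucky_total/1 offers no such guarantee.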

We will mostly concentrate on the first point. It seems simple. After all, if you want to calculate the total price of an order from its items, you just do some multiplication and addition - that's pure. But there are always some dark corners of the codebase where this is much, much harder. Now let's examine one of these.

Generating order number

In one of the projects I worked on, we had to write an order number generator. For each incoming order, it was supposed to create a new string of letters and digits that is:

  • Unique for a given tenant
  • Human-friendly (i.e. no "O vs 0")
  • Hard to guess (i.e. no monotonic sequences)

The first attempt at an implementation looked like this:

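defmodule OrderNumber do
  # Needed for the from/2 query macro
  import Ecto.Query, only: [from: 2]

  def generate(tenant_id) do
    # 4 random bytes encode to a 7-character Base32 string
    candidate =
      :crypto.strong_rand_bytes(4)
      |> Base.encode32(padding: false)
      |> replace_ambiguous_characters()

    query =
      from o in Order,
      where: o.tenant_id == ^tenant_id and o.order_id == ^candidate

    # Retry with a fresh candidate if this number is already taken
    if Repo.exists?(query), do: generate(tenant_id), else: candidate
  end
end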

This looks good, until you try to test it. Basically, the only thing you can test here is that it returns some sort of 7-character string devoid of the letters O and I (they are converted to the digits 0 and 1 respectively by the replace_ambiguous_characters/1 function). But is it a good test? Are Os and Is missing because we have replaced them or because they weren't included in the initial random string? We need more control over the execution to make the tests more reliable.
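
The post never shows replace_ambiguous_characters/1, so the following is only a guess at its shape, assuming it does nothing more than map O to 0 and I to 1. It is wrapped in a hypothetical AmbiguousChars module so that it runs on its own. The last lines illustrate the weakness of the test: an input without O or I passes through untouched, so a broken helper would go unnoticed.

```elixir
defmodule AmbiguousChars do
  # ASSUMPTION: the real helper is not shown in the post; this version
  # only swaps the letters O and I for the digits 0 and 1.
  def replace_ambiguous_characters(string) do
    string
    |> String.replace("O", "0")
    |> String.replace("I", "1")
  end
end

# Four zero bytes encode to a Base32 string made only of the letter A,
# so the helper has nothing to replace here.
no_ambiguity = Base.encode32(<<0, 0, 0, 0>>, padding: false)
IO.inspect(AmbiguousChars.replace_ambiguous_characters(no_ambiguity) == no_ambiguity)
```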

In classic OOP-minded dependency injection we would try to pass in some dependencies, i.e. nouns. In this case, the candidates are the random number generator and the repository. Let's try this:

defmodule OrderNumber do
  def generate(tenant_id, rng \\ :crypto, repo \\ Repo) do
    candidate = 
      rng.strong_rand_bytes(4) 
      |> Base.encode32(padding: false) 
      |> replace_ambiguous_characters()

    query = 
      from o in Order, 
      where: o.tenant_id == ^tenant_id and o.order_id == ^candidate

    if repo.exists?(query), do: generate(tenant_id, rng, repo), else: candidate
  end
end

Okay, this wasn't that bad. But now we need to craft some special modules to pass in as dependencies in tests. Elixir does not make it easy for us. You basically have to define them upfront - and the worst part: you have to name them.
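
For instance, test doubles for the two injected modules might look roughly like this. FakeRng and FakeRepo are hypothetical names; each has to be a full module exposing exactly the function that generate/3 calls.

```elixir
# Hypothetical stand-in for :crypto, returning zero bytes instead of random ones
defmodule FakeRng do
  def strong_rand_bytes(n), do: :binary.copy(<<0>>, n)
end

# Hypothetical stand-in for Repo, reporting that no order exists yet
defmodule FakeRepo do
  def exists?(_query), do: false
end

# In a test they would be passed positionally:
#   OrderNumber.generate(tenant_id, FakeRng, FakeRepo)
```

A test that needs a different answer, say a repository that reports a collision, needs yet another named module.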

What if instead of injecting nouns (dependencies) we inject verbs (functions)? After all, functions should be first-class citizens in functional code. While I know there are some discussions about the dot-notation ruining it for Elixir, we can still do it relatively easily. Let's see this in action.

defmodule OrderNumber do
  def generate(tenant_id, opts \\ []) do
    generate_random = Keyword.get(opts, :generate_random, fn -> 
      :crypto.strong_rand_bytes(4) 
      |> Base.encode32(padding: false) 
    end)

    check_existence = Keyword.get(opts, :check_existence, fn tenant_id, candidate ->
      from(o in Order, 
      where: o.tenant_id == ^tenant_id and o.order_id == ^candidate)
      |> Repo.exists?()
    end)

    candidate = 
      generate_random.()
      |> replace_ambiguous_characters()

    if check_existence.(tenant_id, candidate),
      do: generate(tenant_id, opts), else: candidate
  end
end

With that, we can easily test if the ambiguous characters (O, I) are replaced by simply passing a function returning a test-worthy string:

test "replace Os and Is" do
  generator = fn -> "ABCO0I1" end
  assert OrderNumber.generate(@tenant_id, generate_random: generator) == "ABC0011"
end

It is a bit trickier with the remaining condition - to generate the candidate again if the order number is already taken. We can solve it in two ways: create an existence checker that returns true when called the first time and false the second time, or create a "random" generator that returns the values we want in sequence. Either way, we need to introduce a "controlled impurity", i.e. the function will modify external state, but unlike a database, this state will be local to the test run. Personally I strongly favour the second option, as it tests cooperation between the two injected functions as well.

test "generate another candidate when first one is already taken" do
  # The agent holds the queue of "random" values still to be handed out
  {:ok, agent} = Agent.start_link(fn -> ["ABC123", "XYZ555"] end)
  # Pop the next value atomically on each call
  generator = fn ->
    Agent.get_and_update(agent, fn [next | rest] -> {next, rest} end)
  end

  exists = fn _, number -> number == "ABC123" end

  assert OrderNumber.generate(
    @tenant_id, 
    generate_random: generator, 
    check_existence: exists
  ) == "XYZ555"
end
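
For comparison, the first option (an existence checker that answers true once and false afterwards) could be sketched as below. FlipOnce.checker/0 is a hypothetical helper, not part of the original code; it keeps a call counter in an Agent.

```elixir
defmodule FlipOnce do
  # Hypothetical helper: builds a two-argument checker that reports
  # "already taken" on its first call only.
  def checker do
    {:ok, calls} = Agent.start_link(fn -> 0 end)

    fn _tenant_id, _candidate ->
      # Return true when the counter is still 0, then increment it
      Agent.get_and_update(calls, fn n -> {n == 0, n + 1} end)
    end
  end
end
```

Passed as check_existence: FlipOnce.checker(), it forces exactly one retry. It never looks at the candidate, though, which is why it says nothing about how the two injected functions cooperate.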

The question remains: is OrderNumber.generate/2 pure now? Since we inverted the control, it depends on the caller. By default it is not pure, because it calls the random number generator and the database. However, by passing in pure (or "controlled impure") functions as opts we can make it pure, which is super-useful for testing.
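
The effect can be observed in a standalone sketch. OrderNumberSketch is a hypothetical, Ecto-free copy of the module: the default existence check is replaced by a stand-in that always answers false, and replace_ambiguous_characters/1 is assumed to map only O and I.

```elixir
defmodule OrderNumberSketch do
  def generate(tenant_id, opts \\ []) do
    generate_random = Keyword.get(opts, :generate_random, &default_random/0)
    # ASSUMPTION: stand-in default; the real module queries the database here.
    check_existence =
      Keyword.get(opts, :check_existence, fn _tenant_id, _candidate -> false end)

    candidate = replace_ambiguous_characters(generate_random.())

    if check_existence.(tenant_id, candidate),
      do: generate(tenant_id, opts),
      else: candidate
  end

  defp default_random do
    :crypto.strong_rand_bytes(4) |> Base.encode32(padding: false)
  end

  # ASSUMPTION: the original helper is not shown in the post.
  defp replace_ambiguous_characters(string) do
    string |> String.replace("O", "0") |> String.replace("I", "1")
  end
end

# With pure functions injected, repeated calls with the same arguments agree.
pure_opts = [
  generate_random: fn -> "ABCO0I1" end,
  check_existence: fn _tenant_id, _candidate -> false end
]

first = OrderNumberSketch.generate(:tenant, pure_opts)
second = OrderNumberSketch.generate(:tenant, pure_opts)
IO.inspect(first == second)
```

Without the options, every call draws fresh random bytes, so the same comparison would almost always fail.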

Final touches

Just for the sake of code readability, I recommend making a few more changes to the OrderNumber module. The generate function does not need to include the meaty details of calling the database or generating random strings, so we can extract these as private functions. With that, the main function looks like this:

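defmodule OrderNumber do
  import Ecto.Query, only: [from: 2]

  def generate(tenant_id, opts \\ []) do
    generate_random = Keyword.get(opts, :generate_random, &generate_random/0)
    already_taken? = Keyword.get(opts, :check_existence, &already_taken?/2)

    candidate =
      generate_random.()
      |> replace_ambiguous_characters()

    if already_taken?.(tenant_id, candidate),
      do: generate(tenant_id, opts), else: candidate
  end

  # Default impure implementations, extracted from the previous version
  defp generate_random do
    :crypto.strong_rand_bytes(4) |> Base.encode32(padding: false)
  end

  defp already_taken?(tenant_id, candidate) do
    from(o in Order,
      where: o.tenant_id == ^tenant_id and o.order_id == ^candidate)
    |> Repo.exists?()
  end
end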

With that, low-level concerns such as the database structure or the chosen RNG method are hidden, and the function itself is much better at simply telling what it does, step by step.

Summary

By adjusting our mindset to stop thinking about dependencies and start thinking about behaviours (functions), we were able to extract the impure parts of the number generator function. Then, by making them injectable, we turned the impure function into one that accepts purity injected from tests, making it essentially pure and thus much easier to test. We also didn't need any fancy tool like Mox, Mimic or Rewire to define replacement modules for us. The code is hopefully understandable and uses only built-in Elixir idioms, without macros.