Better Caching in Elixir with Nebulex

May 16, 2019

In my last post I wrote about using ETS for caching in your Elixir app. ETS remains a popular approach to in-memory caching for simple, single-node applications, but it will eventually lead you down a rabbit hole of confusing bugs and unexpected results once you begin scaling your app beyond a single node.

This doesn’t mean that ETS can’t work in distributed scenarios at all; it can. But the legwork required is substantial, and the chance of not implementing things the right way™️ is very high, which is a risk you don’t want to take while scaling your app. The general consensus is to use :mnesia in RAM-only mode (check out Memento!), but even then it’s up to the developer to handle some implementation details (like network partitions).

Nebulex

If it were not for Nebulex, I would probably be writing a post about implementing distributed caches using Mnesia/Memento. But it exists, and it implements multi-level caching for distributed systems, supporting a wide range of scenarios and custom strategies while still giving excellent reliability and support out of the box. For an in-depth guide to Nebulex, I suggest this excellent article on Erlang Battleground. In this post, I’ll only cover how to get started quickly with Nebulex for a multi-node application.
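Everything that follows assumes Nebulex is already listed as a dependency of the project. A minimal sketch of the `mix.exs` entry is below; the version requirement is my assumption (the `set/3` and `flush/0` calls used later belong to the 1.x series), so pin whichever release matches the API you are following.

```elixir
# mix.exs
defp deps do
  [
    # Assumed version requirement; adjust it to the release you actually target
    {:nebulex, "~> 1.0"}
  ]
end
```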

We’ll be defining a two-level cache (a local cache on each node, with a distributed cache on top of it), so let’s add configs for both:

# config/config.exs

config :my_app, MyApp.Cache.Local,
  gc_interval: 86_400

config :my_app, MyApp.Cache.Distributed,
  local: MyApp.Cache.Local,
  node_selector: Nebulex.Adapters.Dist

Once we have that out of the way, we can define our Cache module:

defmodule MyApp.Cache do
  defmodule Local do
    use Nebulex.Cache, otp_app: :my_app, adapter: Nebulex.Adapters.Local
  end

  defmodule Distributed do
    use Nebulex.Cache, otp_app: :my_app, adapter: Nebulex.Adapters.Dist
  end
end
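Both caches are processes, so they need to be started before anything can be read or written. A minimal sketch of wiring them into the supervision tree follows. The `MyApp.Application` and `MyApp.Supervisor` names are assumptions, and the child specs are spelled out by hand so the sketch relies only on the `start_link/1` function that `use Nebulex.Cache` generates.

```elixir
defmodule MyApp.Application do
  use Application

  def start(_type, _args) do
    children = [
      # Start the local cache first; the distributed cache stores its data in it
      cache_spec(MyApp.Cache.Local),
      cache_spec(MyApp.Cache.Distributed)
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  # Hypothetical helper: builds an explicit child spec for a cache module
  defp cache_spec(cache) do
    %{id: cache, start: {cache, :start_link, [[]]}, type: :supervisor}
  end
end
```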

Caching Helpers

I like to go a step further and add some functions to hide the internal API calls, as well as some helpers from my previous post, to make the Cache interface much more pleasant to use:

defmodule MyApp.Cache do
  @default_ttl 5 * 60


  # Cache Levels
  # ------------


  defmodule Local do
    use Nebulex.Cache, otp_app: :my_app, adapter: Nebulex.Adapters.Local
  end

  defmodule Distributed do
    use Nebulex.Cache, otp_app: :my_app, adapter: Nebulex.Adapters.Dist
  end


  # Public API
  # ----------


  @doc """
  Takes a resolver function whose value is only cached if it
  returns an `{:ok, any()}` tuple
  """
  def resolve(type, key, opts \\ [], resolver) when is_function(resolver, 0) do
    Distributed.transaction(fn ->
      case get(type, key) do
        nil ->
          with {:ok, result} <- resolver.() do
            {:ok, set(type, key, result, opts)}
          end

        result ->
          {:ok, result}
      end
    end)
  end


  @doc "Get an item from the cache"
  def get(type, key), do: Distributed.get(name(type, key))


  @doc "Put an item in the cache"
  def set(type, key, value, opts \\ []) do
    name = name(type, key)
    opts = Keyword.put_new(opts, :ttl, @default_ttl)

    Distributed.set(name, value, opts)
  end


  @doc "Delete an item from the cache"
  def delete(type, key), do: Distributed.delete(name(type, key))


  @doc "Clear all cached items"
  defdelegate flush(), to: Distributed


  # Private Helpers
  # ---------------


  defp name(type, key), do: "#{type}:#{key}"
end

get/2 and set/4 are pretty straightforward:

# Put in cache with a timeout of 10 minutes
{:ok, result} = XYZ.expensive_process(user, args)
Cache.set(:an_expensive_call, user.id, result, ttl: 10 * 60)

# Get it back
Cache.get(:an_expensive_call, user.id)
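The remaining helpers, delete/2 and flush/0, work the same way. A short sketch, assuming the application is running, `MyApp.Cache` is aliased as `Cache`, and `user` is the same hypothetical struct as above:

```elixir
alias MyApp.Cache

# Evict a single entry, e.g. after the underlying data has changed
Cache.delete(:an_expensive_call, user.id)

# A miss comes back as nil
nil = Cache.get(:an_expensive_call, user.id)

# Drop everything that is cached
Cache.flush()
```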

But most of the time you want to cache the result of a piece of code as soon as it runs, and skip running it altogether while a valid entry is still in the cache. That’s where the resolve/4 function above comes in. Suppose you often have to make a web request to fetch a company’s list of users from Slack, a list that might change only once every few days:

def fetch_users_from_slack(company) do
  # Argument order is (type, key, opts, resolver); cache the users for one hour
  Cache.resolve(:slack_users, company.id, [ttl: 60 * 60], fn ->
    with {:ok, response} <- SlackAPI.get("users.list", company.token) do
      {:ok, response["users"]}
    end
  end)
end
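To make the behaviour of the resolve helper concrete, here is a small sketch you could paste into an IEx session. It assumes the application is running with the `MyApp.Cache` module defined earlier; the `:demo` type and `"answer"` key are made up for illustration. Each pattern match doubles as a check of the value I would expect, given that a cache miss returns `nil` and the 1.x `set/3` returns the stored value.

```elixir
alias MyApp.Cache

# A failing resolver: the error tuple is passed through and nothing is stored
{:error, :timeout} = Cache.resolve(:demo, "answer", fn -> {:error, :timeout} end)
nil = Cache.get(:demo, "answer")

# A successful resolver: the value is stored (with the default TTL) and returned
{:ok, 42} = Cache.resolve(:demo, "answer", fn -> {:ok, 42} end)
42 = Cache.get(:demo, "answer")

# While the entry is valid, the resolver is not invoked at all
{:ok, 42} = Cache.resolve(:demo, "answer", fn -> raise "should not be called" end)
```

One caveat with this design: a miss is detected by comparing against `nil`, so a resolver that returns `{:ok, nil}` will be re-run on every call.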

You could accomplish a lot more with the resolve/4 helper, easily caching different parts of your application. Combined with Nebulex’s multi-level caching framework in distributed applications, this makes the whole process a breeze.