Sheharyar Naseer

Writing a Simple Blockchain in Elixir


With all the hype about Bitcoin and other cryptocurrencies in recent years, I wanted to understand their underlying technology, i.e. Blockchain, better. So of course, what better way to learn than write my own. After reading up a bit on the internals of Bitcoin, the basic structure of blockchains and related data structures such as Merkle Trees, I decided to write a very simple version in Elixir.

I went with Elixir for a couple of reasons:

  1. To get better at it
  2. See how Erlang’s distributed model could help me with replication
  3. I just love the damn language too much

The original blockchain I wrote is a bit complicated since I end up adding too many features (I wanted to experiment with node synchronization, RPC calls, cryptographic validations using Ecto, and file storage as an application). So for the purposes of this article, we are going to write a blockchain that simply stores some text in each block (but you can check out the actual project here).

The Basics

In the simplest of terms, a blockchain is basically a list of “blocks” where each block object/struct stores a cryptographic hash of the previous block. A blockchain is valid only if the hash of each block in the chain matches the stored hash in the next one. Cryptographically, this means that if the tiniest bit of data changes in any block, the hashes of all consequent blocks would change drastically and as a result invalidate the blockchain. So, a blockchain must always be append-only, i.e. no previous data can be changed or removed.

Structure

Each block in the chain must store a couple of items; such as the data being stored and the timestamp it was created on. We also need to store the has of the previous block, and for good measure let’s store the hash of the current block as well. Although we should also use Public-Key Cryptography and store the signature of the author (which can be verified later), we’re going to ignore that part and focus on understanding the basics (maybe we can do that in another blog post in the future). Our struct should look something like this:

%Block{
  data: some_text,
  timestamp: naive_datetime,
  prev_hash: hash_of_the_previous_block,
  hash: hash_of_the_current_block,
}

So, let’s create a Block module, with two methods; new/2 that takes the data that’s going to go in the block along with the hash of the previous block, and a zero/0 method that builds the first “zero” block in a blockchain:

defmodule Block do
  defstruct [:data, :timestamp, :prev_hash, :hash]


  @doc "Build a new block for given data and previous hash"
  def new(data, prev_hash) do
    %Block{
      data: data,
      prev_hash: prev_hash,
      timestamp: NaiveDateTime.utc_now,
    }
  end


  @doc "Build the initial block of the chain"
  def zero do
    %Block{
      data: "ZERO_DATA",
      prev_hash: "ZERO_HASH",
      timestamp: NaiveDateTime.utc_now,
    }
  end

end

Cryptography

Since blockchains involve a lot of cryptography, it’s better to have a separate module responsible for it and provide a uniform interface (so we can change the implementation in the future if we want without breaking the functionality). We also need to encode our data in some format before we hash it. You can go with anything you want, but we’re going to use JSON. Start by adding the poison dependency in your mix.exs file:

{:poison, "~> 3.1"}

And then create a Crypto module with the helper hashing functions. We’ll use the built in SHA256 hashing in Erlang’s :crypto module:

defmodule Crypto do

  # Specify which fields to hash in a block
  @hash_fields [:data, :timestamp, :prev_hash]


  @doc "Calculate hash of block"
  def hash(%{} = block) do
    block
    |> Map.take(@hash_fields)
    |> Poison.encode!
    |> sha256
  end


  @doc "Calculate and put the hash in the block"
  def put_hash(%{} = block) do
    %{ block | hash: hash(block) }
  end


  # Calculate SHA256 for a binary string
  defp sha256(binary) do
    :crypto.hash(:sha256, binary) |> Base.encode16
  end

end

We were very careful to only encode only the main 3 fields here (without the hash field of the block) using the Map.take/2 function, and then calculate their SHA256. We also added a helper method put_hash/1 to directly put the hash in it. If we were using public-key cryptography and storing the author’s signature in the blocks as well, we would have created sign/1 and verify/1 methods for that purpose, using the RsaEx library or Erlang’s built-in :public_key module.

We’ll also add a valid? method in our Block module to check if the hash of block is valid, and matches the previous block’s hash. When only the block itself is given, it calculates the hash of the block and compares it with the stored value. But when the previous block is also given, it compares the value of the previous block’s hash to the one stored in the prev_hash field:

defmodule Block do

  # Existing stuff...


  @doc "Check if a block is valid"
  def valid?(%Block{} = block) do
    Crypto.hash(block) == block.hash
  end

  def valid?(%Block{} = block, %Block{} = prev_block) do
    (block.prev_hash == prev_block.hash) && valid?(block)
  end

end

Bringing it Together

Finally, let’s build the Blockchain module that contains all blocks in order and provides a neat interface to insert new blocks and calculate their hashes. Ideally, the blockchain should be stored on the disk (using a database, for example) and the module providing a quick interface to it. Even if we decide to not persist the chain on the disk, we can use GenServers or Mnesia tables to maintain the state. But again, for the purposes of this post, we won’t manage the state and just interact with it as a list.

defmodule Blockchain do
  @doc "Create a new blockchain with a zero block"
  def new do
    [ Crypto.put_hash(Block.zero) ]
  end


  @doc "Insert given data as a new block in the blockchain"
  def insert(blockchain, data) when is_list(blockchain) do
    %Block{hash: prev} = hd(blockchain)

    block =
      data
      |> Block.new(prev)
      |> Crypto.put_hash

    [ block | blockchain ]
  end


  @doc "Validate the complete blockchain"
  def valid?(blockchain) when is_list(blockchain) do
    zero = Enum.reduce_while(blockchain, nil, fn prev, current ->
      cond do
        current == nil ->
          {:cont, prev}

        Block.valid?(current, prev) ->
          {:cont, prev}

        true ->
          {:halt, false}
      end
    end)

    if zero, do: Block.valid?(zero), else: false
  end
end

The new/0 method returns a new blockchain, with a zero block as its only contents. The insert/2 method creates a new block for the given data, hashes it and inserts it into the blockchain, returning the updated state. Finally, the most interesting method here, valid?/1, goes through the entire blockchain in reverse order (most recent block to the oldest one) using Enum.reduce_while/3, validating blocks in pairs. If at any point the internal Block.valid? returns false, the entire method does too.

Running it

We don’t have fancy blockchain network running right now, so we’ll simply jump in iex and test it out there:

# Create a new Blockchain
iex(1)> chain = Blockchain.new
[%Block{data: "ZERO_DATA",
  hash: "BDC59FEE8F9239579CB4A29828F5F19B081EF3D34A99331E898B22CDDA7922F0",
  prev_hash: "ZERO_HASH", timestamp: ~N[2018-01-14 10:25:25.936002]}]


# Insert new data as a block
iex(2)> chain = Blockchain.insert(chain, "MESSAGE 1")
[%Block{data: "MESSAGE 1",
  hash: "A14F863D347E37B54DD68BB12251A74FE06EA0623DC40856EBC81942B63001FE",
  prev_hash: "BDC59FEE8F9239579CB4A29828F5F19B081EF3D34A99331E898B22CDDA7922F0",
  timestamp: ~N[2018-01-14 10:26:40.121790]},
 %Block{data: "ZERO_DATA",
  hash: "BDC59FEE8F9239579CB4A29828F5F19B081EF3D34A99331E898B22CDDA7922F0",
  prev_hash: "ZERO_HASH", timestamp: ~N[2018-01-14 10:25:25.936002]}]


# Insert a few more blocks
iex(3)> chain = chain |> Blockchain.insert("MESSAGE 2") |> Blockchain.insert("MESSAGE 3")
[%Block{data: "MESSAGE 3",
  hash: "3708929579029151347B9C588E79EDEEF4E9911579AF9CF5BD2D4437B0FFBEDF",
  prev_hash: "FA609D2A255FD56B90D5F10108B804DF4BF42CA257CDECC291698783DDC74CB0",
  timestamp: ~N[2018-01-14 10:27:07.874248]},
 %Block{data: "MESSAGE 2",
  hash: "FA609D2A255FD56B90D5F10108B804DF4BF42CA257CDECC291698783DDC74CB0",
  prev_hash: "A14F863D347E37B54DD68BB12251A74FE06EA0623DC40856EBC81942B63001FE",
  timestamp: ~N[2018-01-14 10:27:07.874129]},
 %Block{data: "MESSAGE 1",
  hash: "A14F863D347E37B54DD68BB12251A74FE06EA0623DC40856EBC81942B63001FE",
  prev_hash: "BDC59FEE8F9239579CB4A29828F5F19B081EF3D34A99331E898B22CDDA7922F0",
  timestamp: ~N[2018-01-14 10:26:40.121790]},
 %Block{data: "ZERO_DATA",
  hash: "BDC59FEE8F9239579CB4A29828F5F19B081EF3D34A99331E898B22CDDA7922F0",
  prev_hash: "ZERO_HASH", timestamp: ~N[2018-01-14 10:25:25.936002]}]


# Validate the Blockchain
iex(4)> Blockchain.valid?(chain)
true

Concluding Notes

That’s it for our simple blockchain. I hope that gave you a somewhat basic idea of how a blockchain works. Of course there’s a lot more we can do, such as:

  • Use some kind of data store to maintain state
  • Or better yet, use Ecto (with or without a database) to take advantage of its changesets and validations
  • Employ Public-Key Cryptography and sign blocks before hashing them
  • Implement a custom encoding scheme in a certain order to always guarantee the same hash for a given map (For example, Maps in Elixir are unordered and the hash generated would almost never be the same as one generated in another language for the same values)

With the powerful Erlang Virtual Machine, we can also do some fancy stuff:

  • Have a distributed network of nodes running the blockchain
  • Maintaining Blockchain state in Mnesia across many nodes
  • Broadcast new blocks to the entire network
  • Perform state consensus and implement conflict resolution

Here are some other blockchain & elixir related resources and projects: