If there were a contest for the single most beloved person in the functional programming galaxy, Joe Armstrong would have effortlessly won first prize. For decades, he showed the world that the principles behind functional programming were the key to resilient, concurrent, and highly available systems. And he showed it in the best possible way, one that most probably made Pastor Manul Laphroaig very proud: with an astonishingly serious “PoC” called Erlang.
That is why this month’s Vidéothèque pick is his presentation titled “Systems that run forever self-heal and scale” at the 2013 Lambda Jam, a conference that featured an impressive schedule with talks by Ola Bini, Chris Ford, Dave Thomas (not the Pragmatic Dave), Bartosz Milewski, Adam Granicz, Steve Vinoski, and Gerald Jay Sussman.
As you most probably know by now, Joe Armstrong (1950-2019) is best known for his work on Erlang, a massively concurrent functional programming language, coupled with BEAM, the virtual machine that enabled telcos, starting in the 1990s, to serve millions of users with complex systems in the most efficient and resilient way.
Kids, this was 20 years before Go or Kubernetes were even conceived. Erlang enabled companies to scale their services in ways that remain impressive even by today’s standards, running on hardware vastly underpowered compared to that of our modern world of 2026.
Joe kicked off his acclaimed book “Programming Erlang” (released in 2007 by The Pragmatic Programmers, with a second edition published in 2013) with a clear disclaimer:
In many places we’ll be extolling the virtues of functional programming. Functional programming forbids code with side effects. Side effects and concurrency don’t mix. You can have sequential code with side effects, or you can have code and concurrency that is free from side effects. You have to choose. There is no middle way.
And he goes on to explain on page 44 (mind you, this was written in 2007) what “functional” means in this context:
Erlang is a functional programming language. Among other things this means that funs can be used as arguments to functions and that functions (or funs) can return funs.
Functions that return funs, or functions that can accept funs as their arguments, are called higher-order functions. We’ll see a few examples of these in the next sections.
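To make the quoted definition concrete, here is a minimal sketch of both flavors of higher-order function. It is written in Python rather than Erlang, and the names `apply_twice` and `make_adder` are hypothetical illustrations, not examples taken from the book.

```python
from typing import Callable


def apply_twice(f: Callable[[int], int], x: int) -> int:
    """Higher-order: accepts a function as an argument."""
    return f(f(x))


def make_adder(n: int) -> Callable[[int], int]:
    """Higher-order: returns a newly built function."""
    def add(x: int) -> int:
        return x + n
    return add


add3 = make_adder(3)           # a function produced by another function
result = apply_twice(add3, 10)  # a function passed to another function

# The built-in map is itself higher-order: it takes a function and a sequence.
doubled = list(map(lambda x: 2 * x, [1, 2, 3]))
```

The point of the quote is that functions are ordinary values: they can be stored in variables, passed around, and produced on demand, exactly like numbers or strings.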
I stress the fact that this was written in 2007 because, barely 20 years ago, as we were entering the “plateau of productivity” in the hype cycle of Object-Oriented Programming, we were also watching Twitter (the original name of a decadent social network still active as this article goes to press) greet its users with “fail whales” on its home page while millions of them tried to read the tweets on their (not yet algorithm-driven) feeds. The time was ripe for a new paradigm… but, apparently unbeknownst to the Twitter team, it already existed.
Interestingly, the WhatsApp developer team knew that they needed something else to build a system potentially usable by billions of simultaneous users, so they chose Erlang, and boy did that work well. More on that later.
In his 2013 conference talk, Joe Armstrong started with a simple observation: the real world is parallel. Boom. It turns out that Erlang processes are the perfect way to model such a world: they can be thought of as a group of people communicating by message passing.
Let us remember what Alan Kay said about messaging in 1998:
The big idea is “messaging” – that is what the kernal of Smalltalk/Squeak is all about (and it’s something that was never quite completed in our Xerox PARC phase). The Japanese have a small word – ma – for “that which is in between” – perhaps the nearest English equivalent is “interstitial”. The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be.
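The “people communicating by message passing” picture can be sketched in a few lines. This is a loose Python analogy, not Erlang: a thread with a `queue.Queue` mailbox stands in for an Erlang process, and the message shapes (`"add"`, `"get"`, `"stop"`) are made up for this illustration. Python threads share memory and are far heavier than Erlang processes, so only the shape of the idea carries over.

```python
import queue
import threading


def counter(mailbox: queue.Queue) -> None:
    """A 'process' that keeps its state private and reacts only to messages."""
    total = 0
    while True:
        message = mailbox.get()
        if message[0] == "add":
            total += message[1]
        elif message[0] == "get":
            # message[1] is the mailbox of whoever asked for the total
            message[1].put(total)
        elif message[0] == "stop":
            return


counter_mailbox: queue.Queue = queue.Queue()
worker = threading.Thread(target=counter, args=(counter_mailbox,))
worker.start()

# Nobody touches `total` directly; they can only send messages.
for n in (1, 2, 3):
    counter_mailbox.put(("add", n))

reply_mailbox: queue.Queue = queue.Queue()
counter_mailbox.put(("get", reply_mailbox))
result = reply_mailbox.get(timeout=5)

counter_mailbox.put(("stop",))
worker.join(timeout=5)
```

Notice that all the design effort goes into the messages, which is Kay’s point: what the counter does internally is nobody else’s business.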
So here we have a programming language that allows you to model a world with certain characteristics that Joe describes around minute 05:30: many computers distributed all over the place, working concurrently, detecting their own failures and repairing them as soon as possible, and even featuring a radical concept called live code upgrades.
Sounds familiar? Any parallels with Kubernetes are just a coincidence. In Erlang, there is no such thing as an atomic update of the “stop it, upgrade, restart” kind (06:50): Erlang applications are continuously partially upgrading themselves whenever needed.
Needless to say, this was beyond impressive in the mid-1990s, but Sun had a bigger marketing budget than Ericsson. Insert sad face emoticon here.
Erlang was designed from the ground up for “5 nines reliability” (07:10) because of a simple observation: it is much better to design a system for 10 million users and scale it down to 10,000 than to design it for 10 users and scale it up to 10,000. The result is that by 2013 there was a 50% chance that a smartphone’s traffic went through Erlang on its way to the mobile Internet.
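For a sense of what “5 nines” buys you, here is a back-of-the-envelope calculation in Python. It assumes the usual reading of the term (99.999% availability) and a 365-day year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


three_nines = downtime_minutes_per_year(3)  # 99.9% availability
five_nines = downtime_minutes_per_year(5)   # 99.999% availability
```

Under these assumptions, five nines leaves a budget of a little over five minutes of downtime per year, against almost nine hours for three nines.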
But Erlang is just a piece of the whole architectural cake, albeit a critical one. Joe goes on to elaborate on the patterns required for system consistency and fault tolerance, distributed consensus (23:36), and the evolution from Lamport’s Paxos to Ongaro and Ousterhout’s Raft. He also mentions six rules for fault tolerance (30:00), applicable to any massively distributed system running on your nearest Kubernetes cluster nowadays: process isolation, concurrency, failure detection, fault identification, live code upgrade, and stable storage.
Legendary computer scientist Jim Gray wrote a widely quoted paper in 1986 titled “Why Do Computers Stop and What Can Be Done About It?” where he said:
The top priority for improving system availability is to reduce administrative mistakes by making self-configured systems with minimal maintenance and minimal operator interaction.
(…)
As with hardware, the key to software fault-tolerance is to hierarchically decompose large systems into modules, each module being a unit of service and a unit of failure. A failure of a module does not propagate beyond the module.
Now you understand why your Kubernetes Deployment YAML contains a readiness and a liveness probe, for example. You are welcome.
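Gray’s idea of a module as “a unit of failure”, combined with the failure detection rule from Joe’s list, can be sketched as a tiny supervisor loop. This is an illustrative Python toy, not Erlang/OTP’s actual supervisor: `supervise`, `flaky_worker`, and the restart limit are all hypothetical names and choices.

```python
from typing import Callable, List


def supervise(start_worker: Callable[[], str], max_restarts: int, log: List[str]) -> str:
    """Run a worker; if it crashes, record the failure and start a fresh one."""
    restarts = 0
    while True:
        try:
            return start_worker()
        except Exception as error:
            # Failure detection: the crash is caught here and does not
            # propagate to the rest of the program.
            log.append(f"worker crashed: {error}")
            if restarts >= max_restarts:
                raise RuntimeError("restart limit reached") from error
            restarts += 1


# Simulation only: a worker that fails on its first two runs.
attempts = {"count": 0}


def flaky_worker() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError(f"attempt {attempts['count']} failed")
    return "done"


crash_log: List[str] = []
outcome = supervise(flaky_worker, max_restarts=5, log=crash_log)
```

The worker makes no attempt to recover by itself; detecting the failure and deciding what to do about it is somebody else’s job, which is roughly what a liveness probe delegates to the cluster.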
Towards minute 46 of the video, Joe goes on to talk about Erlang in detail, enumerating some success stories: Mnesia, CouchDB, Riak, ejabberd, RabbitMQ, and, yes, of course, WhatsApp, snapped up by Facebook in 2014 for a hefty 19 billion dollars of pocket money. And then he goes on to discuss the future of Erlang as seen from the perspective of 2013: Elixir, a programming language based on BEAM and OTP but offering an admittedly much friendlier syntax (inspired by Ruby) with interesting metaprogramming possibilities.
Watch this month’s Vidéothèque entry, “Systems that run forever self-heal and scale,” by Joe Armstrong, on YouTube. Then keep binge-watching with “Erlang: The Movie,” a 1990 short nowadays available on YouTube and the Internet Archive, showing a demo of a bug-fixing session on a live Erlang system. Let us repeat for the people in the back: 1990.