Issue #43: Types

It Is Not The Types That Will Help You

We go back and forth, if you will pardon the pun, on whether programs have types or not. In the very early days of digital computers you had digits (as likely to be decimal or octal as binary), and that was your lot. But then still in the very early days of digital computers you had LISP, which has two types: atoms and lists. In the sort of early days of digital computers you got BASIC which has a whopping three types: numbers, strings, and arrays.

Early proponents of types as useful tools came from the C programming community. In the August 1983 Byte magazine, Johnson and Kernighan advocate for C’s type system as a tool for creating “higher-level models”, programs that act on representations of ideas rather than on bytes and words.

Of course, this was 36 issues after the Smalltalk edition of Byte, which had told the world about a very expressive modelling system for higher-level abstractions that had exactly four types: Objects, Classes (which worked like Objects), Small Integers, and Symbols. Later, the Self language would demonstrate that this was one too many and that Classes are not actually needed.

See, types themselves are not actually higher-level models, they are constraints on the programs you are allowed to write. There is an uncountably large (though not infinite, assuming you have a practical computer) number of computer programs you could write. Of these, nearly all of them are wrong, in that if you want a program that does something, these programs do not do that. But, good news, some of them are right! You just have to find the right ones.

Imposing types in particular places like the parameters to functions or the values of temporary variables constrains the programs you create to ones that satisfy the limitations of the type. No longer can your program put any old value into the rax register then call Employee#give_raise: now it must ensure that that value is a pointer in memory to an instance of a Money class with a defined currency and positive amount. The compiler says so!

Now there are a whole lot of incorrect programs you can no longer write. There are a whole lot of correct programs you can no longer write too. We all hope that there is still at least one correct program available, and that it is easier to search this space to find it.

But you can use types as much as you like and still not make life any easier for yourself. Let us start with a program that does not use types. It is a calculator for the nth Fibonacci number: put an index in byte 25,600 of your computer’s memory, and it will write the appropriate entry from the Fibonacci sequence into byte 25,603.

This program is correct. I know that because I designed it to be so, and also I tested it both on paper and on a real computer. It is written in machine code for the Motorola 6809 CPU, but to aid readability I present it here in Motorola assembly language.

; push the direct page onto the stack
PSHS DP
; set a new direct page
LDA #64
TFR A, DP
; load sequence number from memory
LDA 0
; if A = 0 then the answer is 0
CMPA #0
BNE #6
STA 03
; pull the old direct page from the stack
PULS DP
; return
RTS
; jumping is easier when you know instructions have even addresses
NOP
; if A = 1 then the answer is 1
CMPA #1
BNE #6
STA 03
PULS DP
RTS
NOP
; initialise scratch variables: byte 1 = 0
LDB #0
STB 01
; byte 2 = 1
LDB #1
STB 02
; loop starts here
; B = byte 2 + byte 1
; note: byte 2 already in B
ADDB 01
; byte 3 = B
STB 03
; copy byte 2 to byte 1
LDB 02
STB 01
; copy byte 3 to byte 2
LDB 03
STB 02
; decrement A
SUBA #1
; if A != 1 then jump back to loop
CMPA #0
BNE #4
; return: result is in byte 3 (and also 2)
PULS DP
RTS
NOP
BRA #-24

When I keep making assertions like “it is correct” and “it works”, I of course mean within a limited realm. The calculation of the Fibonacci sequence values occurs in the B register of the 6809 which is 8 bits wide, so only the first 13 values of the sequence can be calculated correctly before overflow occurs. I could have used the D register for 16 bits of Fibonacci goodness, but then I would have had to store the counter outside the A register (maybe in memory pointed to by X) and would have needed four scratch bytes, not two. Software is all about compromises.

Anyway, let us imagine that our hero has learned that types make it easy to “reason about” code, and decides to make a typed version of the Fibonacci program. So they start to design appropriate types:

struct MC6809 {
    union {
        struct {
            uint8_t a;
            uint8_t b;
        } ab;
        uint16_t d;
    };
    union {
        uint8_t cc;
        struct {
            uint8_t e:1;
            uint8_t f:1;
            uint8_t h:1;
            uint8_t i:1;
            uint8_t n:1;
            uint8_t z:1;
            uint8_t v:1;
            uint8_t c:1;
        } flags;
    } cc;
    uint16_t x;
    uint16_t y;
    uint16_t u;
    uint16_t s;
    uint16_t pc;
};

Well, we know that the program works on a 6809 CPU, why not take advantage of that knowledge? We continue:

typedef uint8_t memory[65536];

struct computer {
    memory RAM;
    struct MC6809 CPU;
};

And write some functions that use these types correctly:

struct computer tick(struct computer startState) {
    uint8_t instruction = startState.RAM[startState.CPU.pc];
    return dispatch(instruction, startState);
}

This even looks like it is going to return a new value rather than mutating state, which we all know is an important part of writing a correct program!

Eventually the program is complete, and we can see the entry point:

int main(int argc, char **argv) {
    struct computer emptyMachine = {0};
    uint8_t fibonacci[] = {
        0x34, 0x08, 0x86, 0x64, 0x1f, 0x8b, // ...
    };
    struct computer loaded = loadProgram(emptyMachine, 0x6404, fibonacci);
    loaded.CPU.pc = 0x6404;
    loaded.RAM[0x6400] = atoi(argv[1]);
    struct computer finalState = runUntilHalt(loaded);
    printf("%d Fibonacci number is %d\n", 
           finalState.RAM[0x6400],
           finalState.RAM[0x6403]);
}

This program correctly calculates the Fibonacci number; at least as correctly as the original one did. But the type annotations do not give that away. What the type annotations tell us is that we are emulating a Motorola 6809…hopefully that array of bytes we load into it is the correct program!

In fact given many of the fancy type systems on the market, the Fibonacci program has a pretty boring type. Here is one choice in the TypeScript language:

function fibonacci(n: bigint): bigint;

We know that we are going to get an integer out. Of course we knew that on the 6809 because there is no floating point unit!

A programmer with an expensive type system does not write better programs, any more than a novice guitarist becomes a stadium-grade shredder by buying a fancy guitar.

Cover photo by the author.

Graham is a senior Research Software Engineer at Oxford University. He got hooked on making quality software in front of a NeXT TurboStation Color, and still has a lot to learn.