Stack And Register Machines Python

Feb 19th, 2021 - written by Kimserey with .

Last week we looked at CPython and mentioned that the interpretor was a stack based virtual machine. In today’s post we will go into more details why we mentioned that the VM was stack based by looking at the dissassembled Bytecode and compare it with a assembly code runnable on a register machine.

Stack Machine

We start first by creating a simple addition function:

1
2
3
4
5
6
def addition():
    a = 100
    b = 21
    c = 11
    c = a + b
    return 0

We can then use dis, a package coming built into the standard library, allowing us to dissassemble the Bytecode created by the compilation of Python:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
>>> import dis
>>> dis.dis(addition)
  2           0 LOAD_CONST               1 (100)
              2 STORE_FAST               0 (a)

  3           4 LOAD_CONST               2 (21)
              6 STORE_FAST               1 (b)

  4           8 LOAD_CONST               3 (11)
             10 STORE_FAST               2 (c)

  5          12 LOAD_FAST                0 (a)
             14 LOAD_FAST                1 (b)
             16 BINARY_ADD
             18 STORE_FAST               2 (c)

  6          20 LOAD_CONST               4 (0)
             22 RETURN_VALUE
>>>

We can see that our addition function creates a set of instructions. As our code is extremely simplem, the instructions are composed of only LOAD_CONST, STORE_FAST, LOAD_FAST and BINARY_ADD.

We can find the definition on the official documentation:

1
2
3
4
5
6
7
8
9
10
11
LOAD_CONST(consti)
  Pushes co_consts[consti] onto the stack.

STORE_FAST(var_num)
  Stores TOS into the local co_varnames[var_num].

LOAD_FAST(var_num)
  Pushes a reference to the local co_varnames[var_num] onto the stack.

BINARY_ADD
  Implements TOS = TOS1 + TOS.

We see that LOAD_CONST instruction will push co_consts[consti] onto the stack. STORE_FAST will pop the top of stack (TOS) and store it into co_varnames[var_num]. LOAD_FAST will push a reference to co_varnames[var_num] onto the stack, so a previous value which we have stored via STORE_FAST. And finally BINARY_ADD would do TOS1 + TOS, hence popping the first top of stack, then popping a second one while adding them up.

For example:

  • LOAD_CONST 1 (100) indicates that we load the constant co_consts[1] which contains 100,
  • STORE_FAST 0 (a) indicates that we store the TOS onto co_varnames[0] which refers to a,
  • LOAD_FAST 1 (b) indicates that we load from co_varnames[1] onto the stack.

co_consts and co_varnames are PythonObject that respectively keep track of constants and variables. Their definitions can be found in code.h header of CPython implementation.

Knowing that we can see the execution of BINARY_ADD on the following lines:

1
2
3
4
5          12 LOAD_FAST                0 (a)
           14 LOAD_FAST                1 (b)
           16 BINARY_ADD
           18 STORE_FAST               2 (c)

It loads onto the stack a and b via LOAD_FAST and apply the addition BINARY_ADD by popping the two values from the top of stack.

This is a simple example but we can see howe the whole evaluation revolves around pushing and popping value onto or out of the stack and this is where the stack-based attribute for the virtual machine is derived from.

Register Machine

As opposed to the Python stack based VM, our x86 processors operate with a register machine. Let’s start by recreating our addition example in C++:

1
2
3
4
5
6
7
8
9
10
int main()
{
    int a = 100;
    int b = 21;
    int c = 11;

    c = a + b;

    return 0;
}

We can then compile it:

1
❯ g++ -o addition addition.cpp

And we can dissassemble it using otool -tv (I am using otool as my current computer is a Mac).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
❯ otool -tv addition
addition:
(__TEXT,__text) section
_main:
0000000100003f80        pushq   %rbp
0000000100003f81        movq    %rsp, %rbp
0000000100003f84        xorl    %eax, %eax
0000000100003f86        movl    $0x0, -0x4(%rbp)
0000000100003f8d        movl    $0x64, -0x8(%rbp)
0000000100003f94        movl    $0x15, -0xc(%rbp)
0000000100003f9b        movl    $0xb, -0x10(%rbp)
0000000100003fa2        movl    -0x8(%rbp), %ecx
0000000100003fa5        addl    -0xc(%rbp), %ecx
0000000100003fa8        movl    %ecx, -0x10(%rbp)
0000000100003fab        popq    %rbp
0000000100003fac        retq

Here we can see how the instruction set differs from the Python instruction set (as expected). The very first four lines are setup instructions:

1
2
3
4
0000000100003f80        pushq   %rbp
0000000100003f81        movq    %rsp, %rbp
0000000100003f84        xorl    %eax, %eax
0000000100003f86        movl    $0x0, -0x4(%rbp)

Straigth away we see the registers appearing with %rbp; the register base pointer, and %eax a general purpose register. Registers are data storage located inside the processor providing quick access for logical operations.

Then the next three lines are the assignments of a, b and c:

1
2
3
0000000100003f8d        movl    $0x64, -0x8(%rbp)
0000000100003f94        movl    $0x15, -0xc(%rbp)
0000000100003f9b        movl    $0xb, -0x10(%rbp)

Here movl will move the values into address space in the registers. Those lines correspond to:

1
2
3
int a = 100;
int b = 21;
int c = 11;

where we have the hexadecimal values for:

  • 0x64 for 100,
  • 0x15 for 21,
  • 0xb for 11.

So 100 is moved to address -0x8(%rbp), where we will be able to grab it later, and same for 21 in -0xc(%rbp) and 11 in -0x10(%rbp).

Follwing the assignment, we have the addition:

1
2
3
0000000100003fa2        movl    -0x8(%rbp), %ecx
0000000100003fa5        addl    -0xc(%rbp), %ecx
0000000100003fa8        movl    %ecx, -0x10(%rbp)

which corresponds to:

1
c = a + b;

We grab -0x8(%rbp) from %rbp register containing 100 move it into %ecx, a general purpose register, then add -0xc(%rbp) which contains 21 and finally move the result from %ecx to -0x10(%rbp) on the register. We see how a register machine opperates by moving bytes into different registers to execute operations as opposed with a stack machine which employs a stack to operate.

And that concludes today’s post!

Conclusion

In today’s post we looked into the differences between a stack based machine and register machine. We looked into the stacked based machine by digging into the Python Bytecode and inspecting a simple addition function where we learnt how the Bytecode was read by the machine. We then looked at how register machine worked by looking at the assembly code generated from a C++ program for an x86 processor where we saw how the usage of registers was done. I hope you liked this post and I see you on the next one!

External Sources

Designed, built and maintained by Kimserey Lam.