Feb 19th, 2021 - written by Kimserey with .
Last week we looked at CPython and mentioned that the interpretor was a stack based virtual machine. In today’s post we will go into more details why we mentioned that the VM was stack based by looking at the dissassembled Bytecode and compare it with a assembly code runnable on a register machine.
We start first by creating a simple addition function:
1
2
3
4
5
6
def addition():
a = 100
b = 21
c = 11
c = a + b
return 0
We can then use dis
, a package coming built into the standard library, allowing us to dissassemble the Bytecode created by the compilation of Python:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
>>> import dis
>>> dis.dis(addition)
2 0 LOAD_CONST 1 (100)
2 STORE_FAST 0 (a)
3 4 LOAD_CONST 2 (21)
6 STORE_FAST 1 (b)
4 8 LOAD_CONST 3 (11)
10 STORE_FAST 2 (c)
5 12 LOAD_FAST 0 (a)
14 LOAD_FAST 1 (b)
16 BINARY_ADD
18 STORE_FAST 2 (c)
6 20 LOAD_CONST 4 (0)
22 RETURN_VALUE
>>>
We can see that our addition
function creates a set of instructions. As our code is extremely simplem, the instructions are composed of only LOAD_CONST
, STORE_FAST
, LOAD_FAST
and BINARY_ADD
.
We can find the definition on the official documentation:
1
2
3
4
5
6
7
8
9
10
11
LOAD_CONST(consti)
Pushes co_consts[consti] onto the stack.
STORE_FAST(var_num)
Stores TOS into the local co_varnames[var_num].
LOAD_FAST(var_num)
Pushes a reference to the local co_varnames[var_num] onto the stack.
BINARY_ADD
Implements TOS = TOS1 + TOS.
We see that LOAD_CONST
instruction will push co_consts[consti]
onto the stack. STORE_FAST
will pop the top of stack (TOS) and store it into co_varnames[var_num]
. LOAD_FAST
will push a reference to co_varnames[var_num]
onto the stack, so a previous value which we have stored via STORE_FAST
. And finally BINARY_ADD
would do TOS1 + TOS
, hence popping the first top of stack, then popping a second one while adding them up.
For example:
LOAD_CONST 1 (100)
indicates that we load the constant co_consts[1]
which contains 100
,STORE_FAST 0 (a)
indicates that we store the TOS onto co_varnames[0]
which refers to a
,LOAD_FAST 1 (b)
indicates that we load from co_varnames[1]
onto the stack.co_consts
and co_varnames
are PythonObject
that respectively keep track of constants and variables. Their definitions can be found in code.h
header of CPython implementation.
Knowing that we can see the execution of BINARY_ADD
on the following lines:
1
2
3
4
5 12 LOAD_FAST 0 (a)
14 LOAD_FAST 1 (b)
16 BINARY_ADD
18 STORE_FAST 2 (c)
It loads onto the stack a
and b
via LOAD_FAST
and apply the addition BINARY_ADD
by popping the two values from the top of stack.
This is a simple example but we can see howe the whole evaluation revolves around pushing and popping value onto or out of the stack and this is where the stack-based attribute for the virtual machine is derived from.
As opposed to the Python stack based VM, our x86 processors operate with a register machine. Let’s start by recreating our addition example in C++:
1
2
3
4
5
6
7
8
9
10
int main()
{
int a = 100;
int b = 21;
int c = 11;
c = a + b;
return 0;
}
We can then compile it:
1
❯ g++ -o addition addition.cpp
And we can dissassemble it using otool -tv
(I am using otool
as my current computer is a Mac).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
❯ otool -tv addition
addition:
(__TEXT,__text) section
_main:
0000000100003f80 pushq %rbp
0000000100003f81 movq %rsp, %rbp
0000000100003f84 xorl %eax, %eax
0000000100003f86 movl $0x0, -0x4(%rbp)
0000000100003f8d movl $0x64, -0x8(%rbp)
0000000100003f94 movl $0x15, -0xc(%rbp)
0000000100003f9b movl $0xb, -0x10(%rbp)
0000000100003fa2 movl -0x8(%rbp), %ecx
0000000100003fa5 addl -0xc(%rbp), %ecx
0000000100003fa8 movl %ecx, -0x10(%rbp)
0000000100003fab popq %rbp
0000000100003fac retq
Here we can see how the instruction set differs from the Python instruction set (as expected). The very first four lines are setup instructions:
1
2
3
4
0000000100003f80 pushq %rbp
0000000100003f81 movq %rsp, %rbp
0000000100003f84 xorl %eax, %eax
0000000100003f86 movl $0x0, -0x4(%rbp)
Straigth away we see the registers appearing with %rbp
; the register base pointer, and %eax
a general purpose register. Registers are data storage located inside the processor providing quick access for logical operations.
Then the next three lines are the assignments of a
, b
and c
:
1
2
3
0000000100003f8d movl $0x64, -0x8(%rbp)
0000000100003f94 movl $0x15, -0xc(%rbp)
0000000100003f9b movl $0xb, -0x10(%rbp)
Here movl
will move the values into address space in the registers. Those lines correspond to:
1
2
3
int a = 100;
int b = 21;
int c = 11;
where we have the hexadecimal values for:
0x64
for 100
,0x15
for 21
,0xb
for 11
.So 100
is moved to address -0x8(%rbp)
, where we will be able to grab it later, and same for 21
in -0xc(%rbp)
and 11
in -0x10(%rbp)
.
Follwing the assignment, we have the addition:
1
2
3
0000000100003fa2 movl -0x8(%rbp), %ecx
0000000100003fa5 addl -0xc(%rbp), %ecx
0000000100003fa8 movl %ecx, -0x10(%rbp)
which corresponds to:
1
c = a + b;
We grab -0x8(%rbp)
from %rbp
register containing 100
move it into %ecx
, a general purpose register, then add -0xc(%rbp)
which contains 21
and finally move the result from %ecx
to -0x10(%rbp)
on the register. We see how a register machine opperates by moving bytes into different registers to execute operations as opposed with a stack machine which employs a stack to operate.
And that concludes today’s post!
In today’s post we looked into the differences between a stack based machine and register machine. We looked into the stacked based machine by digging into the Python Bytecode and inspecting a simple addition function where we learnt how the Bytecode was read by the machine. We then looked at how register machine worked by looking at the assembly code generated from a C++ program for an x86 processor where we saw how the usage of registers was done. I hope you liked this post and I see you on the next one!