Jun 18th, 2021 - written by Kimserey.
When working with cryptographic algorithms and hashes, it’s quite common to operate at the bit and byte level. For those situations, Python provides functionality to convert `int` to `bytes` and vice versa, as well as bitwise operators to operate on bits. In today’s post we will look at the different bitwise operators available, with examples.
In Python 3.x, bitwise operators can only be used on integers, not directly on `bytes` objects. But in most cases, methods take in and return the `bytes` type. To get to the bits, we can either select a single byte in the `bytes` or directly convert the whole `bytes` to an `int`.
>>> value = b"12"
Here we create a bytes literal. Bytes literals only accept ASCII characters, and since ASCII is a subset of `utf-8`, the value holds the `utf-8` encoding of `"12"`. If you have doubts about how `utf-8` works, you can refer to my previous post on Unicode explained.
The little `b` in front of the string indicates that the data is `bytes`, and the string that we see is the `utf-8` encoded value of `"12"`.
With the bytes, we can convert to `int` by either indexing a single byte of the `bytes` or converting the whole value with `int.from_bytes`:
>>> value[0]
49
>>> int.from_bytes(value, byteorder="big")
12594
Taking `value[0]` returns the decimal value of the character `1` in `utf-8` encoding. It is `49` because the Unicode code point for `1` is `0x31`, which is `0011 0001` in binary, hence `2^0 + 2^4 + 2^5 = 49`.
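As a quick check of the arithmetic above, the sketch below compares indexing, slicing, `ord` and iteration on the same value. The variable names are only illustrative and are not part of the original post.

```python
value = b"12"

# Indexing a bytes object returns an int, while slicing returns bytes.
first = value[0]           # 49
first_slice = value[0:1]   # b'1'

# ord() gives the code point of the character "1", which matches the byte.
code_point = ord("1")      # 49

# Iterating over bytes yields one int per byte.
all_bytes = list(value)    # [49, 50]

# 0011 0001 -> 2^0 + 2^4 + 2^5
manual = 2**0 + 2**4 + 2**5  # 49
```

The difference between indexing and slicing matters here: only the indexed form gives an `int` that bitwise operators accept.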
We specify the `byteorder` as big-endian for the conversion to an integer.
The endianness defines the order of the bytes. With `big` (big-endian), the leftmost byte, the one at the lowest index in the byte array, is the most significant byte of the word. Endianness matters for `int.from_bytes` and `int.to_bytes`: using a different byte order in each direction gives back the bytes in reversed order.
For example, if we were to convert `12594` back with `little` after previously having converted with `big`, we’ll end up with `b"21"`:
>>> (12594).to_bytes(2, byteorder="little")
b'21'
>>> (12594).to_bytes(2, byteorder="big")
b'12'
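To make the effect of the byte order explicit, here is a small round-trip sketch. It only uses `int.from_bytes` and `int.to_bytes` as shown above; the variable names are illustrative.

```python
value = b"12"

# Converting with the same byte order in both directions round-trips.
as_int = int.from_bytes(value, byteorder="big")             # 12594
same = as_int.to_bytes(len(value), byteorder="big")         # b'12'

# Mixing byte orders reverses the bytes.
swapped = as_int.to_bytes(len(value), byteorder="little")   # b'21'

# Reading the same bytes as little-endian gives a different integer:
# 0x32 * 256 + 0x31 = 12849
as_little = int.from_bytes(value, byteorder="little")
```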
With the integer value, we can then represent it in multiple formats: binary, octal and hexadecimal:
>>> bin(int.from_bytes(value, byteorder="big"))
'0b11000100110010'
>>> oct(int.from_bytes(value, byteorder="big"))
'0o30462'
>>> hex(int.from_bytes(value, byteorder="big"))
'0x3132'
`0b` identifies values displayed as binary (base 2, digits 0 or 1), `0o` identifies octal (base 8, digits from 0 to 7), and `0x` identifies hexadecimal (base 16, digits from 0 to F).
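`bin`, `oct` and `hex` always include the prefix and never pad with leading zeros. As a complementary sketch, the built-in `format` function gives control over both, and `int` with an explicit base parses the strings back. The variable names are illustrative.

```python
n = int.from_bytes(b"12", byteorder="big")  # 12594

# format() without the "#" flag omits the 0b/0o/0x prefix.
# "016b" pads the binary form with zeros to 16 digits (two full bytes).
padded_bin = format(n, "016b")     # '0011000100110010'
plain_hex = format(n, "x")         # '3132'
prefixed_hex = format(n, "#06x")   # '0x3132'

# int() with an explicit base parses the strings back to an integer.
from_bin = int("0b11000100110010", 2)  # 12594
from_hex = int("3132", 16)             # 12594
```

Padding to a multiple of 8 digits makes the byte boundaries visible: `00110001` is `0x31` and `00110010` is `0x32`.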
Now that we know how to convert from bytes to binary and display the binary format, we can start looking at bitwise operators.
The bitwise operators available in Python are:
Operator | Definition |
---|---|
<< | Bitwise left shift |
>> | Bitwise right shift |
& | Bitwise AND |
\| | Bitwise OR |
^ | Bitwise XOR |
~ | Bitwise complement (NOT) |
The left `<<` and right `>>` bit shifts are useful to move bits to the left or to the right. For example:
>>> 0b0101 << 1
10
>>> bin(0b0101 << 1)
'0b1010'
We shifted the bits by one position. Because integer literals can be written interchangeably in decimal, binary or hexadecimal notation, we can also specify the shift amount in hex notation:
>>> 0b0101 << 0x0f
163840
>>> bin(0b0101 << 0x0f)
'0b101000000000000000'
>>> 0x0f
15
We shifted the bits to the left 15 times.
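The examples above only show the left shift. The sketch below, with illustrative variable names, adds the right shift and shows how both relate to powers of two.

```python
x = 0b0101  # 5

# Left shift by n multiplies by 2**n.
left = x << 3           # 40 (0b101000)

# Right shift by n is a floor division by 2**n; the low bits drop off.
right = 0b1010 >> 1     # 5 (0b101)
dropped = 0b0101 >> 1   # 2 (0b10), the lowest 1 bit is discarded

# Python ints are unbounded, so a left shift never overflows.
big = 1 << 100
```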
The `&` operator performs a bitwise AND as follows:
>>> 0b1001 & 0b1000
8
>>> bin(0b1001 & 0b1000)
'0b1000'
This can be useful to select parts of some bytes while masking the rest, for example:
>>> bin(0b10011111 & 0xf0)
'0b10010000'
With `& 0xf0`, we do an AND with `1111 0000`, essentially masking out the lower 4 bits.
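Building on the mask above, a common pattern is to combine `&` with a shift to pull out each half of a byte, or to test a single bit. This is a minimal sketch with illustrative names.

```python
byte = 0b10011111

high_masked = byte & 0xF0           # 0b10010000 -> 144
low_nibble = byte & 0x0F            # 0b1111 -> 15
high_nibble = (byte & 0xF0) >> 4    # 0b1001 -> 9

# Testing a single bit: is bit 7 (the most significant of the byte) set?
bit7_set = (byte & (1 << 7)) != 0   # True
```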
Similarly, the `|` operator performs a bitwise OR:
>>> 0b1001 | 0b1000
9
>>> bin(0b1001 | 0b1000)
'0b1001'
OR is useful in situations where we want to build up a value from smaller parts. For example, we can concatenate two 4-bit words by combining a left shift with `|`:
>>> bin(0b1010 << 4 | 0b1111)
'0b10101111'
We concatenated our first word `1010` with `1111` by left shifting it by 4 positions and executing an OR.
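The same shift-and-OR idea scales from 4-bit words to whole bytes. As a sketch, the hypothetical helper `combine_big_endian` below (not a standard library function) folds bytes into an integer and gives the same result as `int.from_bytes` with `byteorder="big"`.

```python
def combine_big_endian(data: bytes) -> int:
    """Hypothetical helper: fold bytes into an int, most significant first."""
    result = 0
    for b in data:
        # Make room for the next byte, then OR it into the low 8 bits.
        result = (result << 8) | b
    return result


combined = combine_big_endian(b"12")  # 12594, i.e. (0x31 << 8) | 0x32
```

In practice `int.from_bytes` is the simpler choice; the loop is only meant to show what the conversion does at the bit level.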
The `^` operator executes an exclusive OR (XOR):
>>> bin(0b1001 ^ 0b1000)
'0b1'
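XOR comes up often when working with bytes because applying the same value twice restores the original. The sketch below uses a hypothetical helper `xor_bytes` (not part of the standard library) and an arbitrary example key to XOR two byte strings element by element.

```python
def xor_bytes(left: bytes, right: bytes) -> bytes:
    """Hypothetical helper: XOR two equal-length byte strings."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    # Iterating over bytes yields ints, which support ^ directly.
    return bytes(a ^ b for a, b in zip(left, right))


message = b"12"
key = b"\x0f\xf0"                   # arbitrary example value
masked = xor_bytes(message, key)    # bytes 0x3e, 0xc2
restored = xor_bytes(masked, key)   # b'12'
```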
And lastly, `~` returns the complement:
>>> bin(~0b00000010)
'-0b11'
>>> ~0b10
-3
The complement of `0b00000010` being `-0b11` might not be what we expected. Python integers have arbitrary precision and behave as if they were stored in two’s complement with an infinite number of sign bits, so `~x` always equals `-x - 1`, hence the `-` sign in front of `0b`.
For an 8-bit NOT, we would have expected `0b11111101`, and we can actually get that value by masking the result to 8 bits with `~0b00000010 & 0xff`:
>>> ~0b00000010 & 0xff
253
>>> bin(~0b00000010 & 0xff)
'0b11111101'
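The masking trick generalises to any width. As a sketch, the hypothetical helper `bitwise_not` below (the name and the default width of 8 are assumptions for illustration) builds the mask from the number of bits wanted.

```python
def bitwise_not(value: int, width: int = 8) -> int:
    """Hypothetical helper: NOT of value limited to `width` bits."""
    mask = (1 << width) - 1   # width=8 -> 0xff
    return ~value & mask


eight_bit = bitwise_not(0b00000010)          # 253
sixteen_bit = bitwise_not(0b10, width=16)    # 65533

# Without a mask, ~x is always -x - 1.
identity_holds = ~0b10 == -0b10 - 1          # True
```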
And that concludes today’s post!
Today we looked at how we can manipulate bits and bytes in Python. We started by looking at how to convert bytes into integers and how to display their binary, octal and hexadecimal representations. We then moved on to each of the bitwise operators provided in Python, with examples of where they can be used. I hope you liked this post and I’ll see you in the next one!