Binary arithmetic
Binary addition
     1111
   1001011
 + 1100101
────────────
 1 0110000
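In this case, the leftmost column produced an extra carry, so the true result needs one more bit than the operands have. When the operands already fill the register (e.g., a whole byte), that extra carry means the result is too big to fit.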
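The column-by-column procedure can be sketched in Python. This is only an illustration: the function name `add_binary` and the `width` parameter are made up for this example, not part of any library.

```python
def add_binary(a: str, b: str, width: int) -> tuple[str, int]:
    """Add two bit strings column by column.

    Returns (sum bits truncated to `width`, carry out of the top column).
    """
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    out = []
    # Work from the rightmost (least significant) column to the left.
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        out.append(str(total % 2))  # digit written under the column
        carry = total // 2          # carry into the next column
    return "".join(reversed(out)), carry


bits, carry = add_binary("1001011", "1100101", width=7)
print(carry, bits)  # 1 0110000
```

A final carry of 1 is the "extra carry" from the example: the sum did not fit in `width` bits.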
Binary subtraction
Subtract-with-borrow, but see also the negate-then-add method below.
Subtraction follows a similar pattern, but with “borrowing” instead of carrying. E.g.,
  110110
- 100001
────────────
0 - 1 = -1, so we borrow a 1 from the next column (i.e., we are doing 10 - 1 = 1):
      1
  110110
- 100001
────────────
       1
In the next column the borrowed 1 has to be paid back, so 1 - 1 (the borrow) - 0 = 0:
      1
  110110
- 100001
────────────
      01
1 - 0 = 1:
      1
  110110
- 100001
────────────
     101
0 - 0 = 0:
      1
  110110
- 100001
────────────
    0101
1 - 0 = 1:
      1
  110110
- 100001
────────────
   10101
And 1 - 1 = 0 (we could drop the leading 0 in the answer):
      1
  110110
- 100001
────────────
  010101
It’s possible to end up with an extra “borrow”, indicating underflow.
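A small Python sketch of subtract-with-borrow, mirroring the worked example. The name `sub_binary` and the `width` parameter are illustrative only.

```python
def sub_binary(a: str, b: str, width: int) -> tuple[str, int]:
    """Compute a - b on bit strings, column by column.

    Returns (difference bits truncated to `width`, final borrow).
    A final borrow of 1 indicates underflow (b was larger than a).
    """
    a, b = a.zfill(width), b.zfill(width)
    borrow = 0
    out = []
    for x, y in zip(reversed(a), reversed(b)):
        diff = int(x) - int(y) - borrow
        borrow = 1 if diff < 0 else 0  # borrow from the next column
        out.append(str(diff % 2))      # -1 becomes 1, -2 becomes 0
    return "".join(reversed(out)), borrow


print(sub_binary("110110", "100001", width=6))  # ('010101', 0)
print(sub_binary("0001", "0010", width=4))      # ('1111', 1)
```

The second call shows the extra borrow: 1 - 2 cannot be represented as an unsigned 4-bit value.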
Binary to decimal, and the reverse
To binary: repeatedly divide by two and record each remainder; the remainders, read from last to first, are the binary digits. E.g., for 1234:
Input | Remainder | Binary |
---|---|---|
1234 | 0 | __________0 |
617 | 1 | _________10 |
308 | 0 | ________010 |
154 | 0 | _______0010 |
77 | 1 | ______10010 |
38 | 0 | _____010010 |
19 | 1 | ____1010010 |
9 | 1 | ___11010010 |
4 | 0 | __011010010 |
2 | 0 | _0011010010 |
1 | 1 | 10011010010 |
To decimal: multiply each bit by its power of two and add up the results. E.g., 1011 = 1×8 + 0×4 + 1×2 + 1×1 = 11.
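Both directions can be sketched in a few lines of Python. The function names `to_binary` and `to_decimal` are made up for this example.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to a bit string by repeated division by 2."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)
        bits.append(str(remainder))  # remainders come out low bit first
    return "".join(reversed(bits))


def to_decimal(bits: str) -> int:
    """Convert a bit string to an integer by summing bit * 2**position."""
    return sum(int(bit) << i for i, bit in enumerate(reversed(bits)))


print(to_binary(1234))            # 10011010010
print(to_decimal("10011010010"))  # 1234
```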
Two’s complement representation
Represent the negation of a value by:
Flipping all the bits
Adding 1
E.g., 00110110 negated gives
00110110
11001001 Flip all bits
11001010 Add 1
Negative values will always have the high bit set.
Addition/subtraction can be done normally. (To do subtraction, just negate the second operand and then add.)
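As a rough illustration, here is two's complement negation on fixed-width values in Python. Python integers are unbounded, so the sketch masks to `width` bits; the helper names are hypothetical.

```python
def negate(value: int, width: int = 8) -> int:
    """Two's complement negation: flip all bits, then add 1 (mod 2**width)."""
    mask = (1 << width) - 1
    flipped = ~value & mask
    return (flipped + 1) & mask


def to_signed(value: int, width: int = 8) -> int:
    """Interpret a width-bit pattern as a signed number."""
    return value - (1 << width) if value & (1 << (width - 1)) else value


x = 0b00110110
print(format(negate(x), "08b"))  # 11001010
print(to_signed(negate(x)))      # -54

# Subtraction as negate-then-add: 54 - 33
print((0b00110110 + negate(0b00100001)) & 0xFF)  # 21
```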
Registers, and their uses
Syscall register use
64-bits | Low 32-bits | Low 16-bits | Low 8-bits | Comment |
---|---|---|---|---|
rax | eax | ax | al | Accumulator; syscall code and return |
rbx | ebx | bx | bl | Base |
rcx | ecx | cx | cl | Count (syscall clobbered) |
rdx | edx | dx | dl | Dword accum.; 3rd syscall arg. |
rsi | esi | si | sil | Source index; 2nd syscall arg. |
rdi | edi | di | dil | Dest. index; 1st syscall arg. |
rbp | ebp | bp | bpl | Stack base pointer |
rsp | esp | sp | spl | Stack pointer |
r8 | r8d | r8w | r8b | 5th syscall arg. |
r9 | r9d | r9w | r9b | 6th syscall arg. |
r10 | r10d | r10w | r10b | 4th syscall arg. |
r11 | r11d | r11w | r11b | (syscall clobbered) |
… | … | … | … | |
r15 | r15d | r15w | r15b | |
The first four registers allow access to their second byte (the high byte of the word-sized register): ah, bh, ch, dh. These cannot be mixed with any of the newer registers (e.g., mov r15b, ah is invalid).
C-style functions
Registers:
Register | Use |
---|---|
rax | Return value |
rbx | Callee-preserved |
rcx | 4th argument |
rdx | 3rd argument |
rsi | 2nd argument |
rdi | 1st argument |
rbp | Callee-preserved |
rsp | Stack pointer |
r8 | 5th argument |
r9 | 6th argument |
r10 | Temporary (caller-preserved) |
r11 | Temporary (caller-preserved) |
r12-r15 | Callee-preserved |
Stack (rsp) must be aligned to a multiple of 16 before any call. Immediately after function entry the stack is at a multiple of 16 plus 8 (the call pushed the 8-byte return address), so usually we can just do either
sub rsp, 8
or
push rbp
mov rbp, rsp
Either way, we have to undo the process (add rsp, 8 or pop rbp) before returning with ret.
Arithmetic operations
add, sub, inc, dec, mul and div (and imul and idiv) and their register usage.
mul rm ; Multiply rax by rm, store the 128-bit product into rdx:rax
div rm ; Divide rdx:rax by rm, store the quotient into rax and the remainder into rdx
Note that div uses rdx:rax as a 128-bit dividend; if you are not using the full 128 bits, you should zero rdx before the operation (for idiv, sign-extend rax into rdx with cqo instead). mul does not read rdx, but it overwrites it with the high half of the product.
Division stores the quotient into rax and the remainder (modulo) into rdx.
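To make the rdx:rax convention concrete, here is a Python model of the unsigned 64-bit mul and div. It is a sketch of the register semantics only; the function names and the (rdx, rax) return order are choices made for this example.

```python
MASK64 = (1 << 64) - 1


def mul(rax: int, rm: int) -> tuple[int, int]:
    """Model of `mul rm`: rdx:rax = rax * rm. Returns (rdx, rax)."""
    product = rax * rm
    return product >> 64, product & MASK64


def div(rdx: int, rax: int, rm: int) -> tuple[int, int]:
    """Model of `div rm`: divide the 128-bit rdx:rax by rm. Returns (rdx, rax)."""
    dividend = (rdx << 64) | rax
    quotient, remainder = divmod(dividend, rm)
    if quotient > MASK64:
        # The real instruction raises a divide error in this case.
        raise OverflowError("quotient does not fit in rax")
    return remainder, quotient


print(mul(1 << 63, 4))  # (2, 0): the product spills into rdx
print(div(0, 17, 5))    # (2, 3): remainder 2 in rdx, quotient 3 in rax
print(div(1, 17, 5))    # stale rdx changes the answer
```

The last call shows why rdx must be zeroed first: a leftover 1 in rdx makes the dividend 2^64 + 17 rather than 17.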
Comparisons
cmp – basically just a sub which discards the results (but keeps the flags).
test – just an and which discards the results (but keeps the flags). Mostly useful for testing individual bits.
Flags and their meanings: CF, OF, SF, ZF
CF – Carry flag, set if addition/subtraction generated an extra carry/borrow. Indicates overflow for unsigned arithmetic operations. Meaningless if the operands are signed.
OF – Overflow flag, set if addition/subtraction generated an overflow when interpreted as signed. (I.e., set if the sign of the result is not what it should be.) Meaningless for unsigned operations.
SF – Sign flag, copy of the high bit of the result.
ZF – Zero flag, set if the result = 0.
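The flag definitions can be checked with a small Python model of an 8-bit subtraction. This is a sketch: `flags_after_sub` is a made-up helper, and the operands are assumed to be passed as unsigned bit patterns in the range 0..2^width - 1.

```python
def flags_after_sub(a: int, b: int, width: int = 8) -> dict[str, int]:
    """Flags produced by computing a - b on width-bit values."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    result = (a - b) & mask
    return {
        # Borrow out of the top bit: unsigned a was smaller than b.
        "CF": int(a < b),
        # Signed overflow: operands had different signs and the
        # result's sign differs from a's sign.
        "OF": int(((a ^ b) & (a ^ result) & sign) != 0),
        "SF": int((result & sign) != 0),
        "ZF": int(result == 0),
    }


print(flags_after_sub(5, 5))     # ZF set
print(flags_after_sub(1, 2))     # CF and SF set (result is 0xFF)
print(flags_after_sub(0x80, 1))  # OF set: -128 - 1 does not fit in 8 bits
```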
Condition codes
a, b – “above” and “below” for unsigned comparisons.
a means CF == 0 and ZF == 0 (if a > b then a - b > 0 with no borrow).
ae means CF == 0, ZF ignored.
be means CF == 1 or ZF == 1.
b means CF == 1.
l, g – “less than”, “greater than” for signed comparisons.
l means SF != OF.
le means SF != OF or ZF == 1.
g means SF == OF and ZF == 0.
ge means SF == OF.
e, ne – “equal”, “not-equal”.
e means ZF == 1 (a - a = 0) and ne means ZF == 0.
There are condition codes for each of the flags. E.g., ns means SF == 0.
There are negated forms of all the conditions. E.g., nae means “not above-or-equal” and is equivalent to b.
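As a sanity check, the standard flag tests behind each condition code can be modelled in Python and compared against ordinary unsigned and signed comparisons. The helpers `cmp8` and `taken` are hypothetical names for this sketch, which only covers 8-bit operands.

```python
def cmp8(a: int, b: int) -> dict[str, bool]:
    """Flags after `cmp a, b` on 8-bit patterns (computes a - b, discards it)."""
    r = (a - b) & 0xFF
    return {
        "CF": a < b,
        "ZF": r == 0,
        "SF": bool(r & 0x80),
        "OF": bool((a ^ b) & (a ^ r) & 0x80),
    }


CONDITIONS = {
    "a":  lambda f: not f["CF"] and not f["ZF"],
    "ae": lambda f: not f["CF"],
    "b":  lambda f: f["CF"],
    "be": lambda f: f["CF"] or f["ZF"],
    "g":  lambda f: f["SF"] == f["OF"] and not f["ZF"],
    "ge": lambda f: f["SF"] == f["OF"],
    "l":  lambda f: f["SF"] != f["OF"],
    "le": lambda f: f["SF"] != f["OF"] or f["ZF"],
    "e":  lambda f: f["ZF"],
    "ne": lambda f: not f["ZF"],
}


def taken(cc: str, a: int, b: int) -> bool:
    """Would `cmp a, b` followed by `jCC` jump?"""
    return bool(CONDITIONS[cc](cmp8(a & 0xFF, b & 0xFF)))


# 0xFF is 255 unsigned but -1 signed, so it is both "above" and "less than" 1.
print(taken("a", 0xFF, 1), taken("l", 0xFF, 1))  # True True
```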
Jumps and branches
jmp target: unconditional jump
jCC target: conditional jump, replace CC with condition code
loop target: decrement rcx, jump to target if not 0.
Memory operands and arrays/strings
[displacement + scale*offset + base]
displacement is an immediate address (typically a label).
scale is 1, 2, 4 or 8. If omitted, 1 is assumed.
offset is the offset register.
base is the base register.
Memory-memory operations are generally forbidden.
lea reg, mem computes the effective address of mem (i.e., does the math) and then stores the address, not the value in memory, into reg.
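The address arithmetic that lea performs can be written out directly. This Python sketch uses a made-up function name and made-up addresses purely for illustration.

```python
def effective_address(displacement: int = 0, base: int = 0,
                      offset: int = 0, scale: int = 1) -> int:
    """Compute displacement + scale*offset + base, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (displacement + scale * offset + base) & ((1 << 64) - 1)


# Hypothetical array of qwords at address 0x1000: element 3 is 24 bytes in.
print(hex(effective_address(displacement=0x1000, offset=3, scale=8)))  # 0x1018

# A byte inside a record: base pointer 0x2000, record index 2 (4 bytes each), field at +1.
print(hex(effective_address(displacement=1, base=0x2000, offset=2, scale=4)))  # 0x2009
```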
String operations
String operations implicitly use [rsi] (source) and [rdi] (destination) as their operands.
Instruction | Description |
---|---|
lodsb | Load byte [rsi] into al |
stosb | Write byte from al into byte [rdi] |
movsb | Copy byte from [rsi] to [rdi] |
cmpsb | Compare [rsi] with [rdi] and update flags |
scasb | Compare al with [rdi] and update flags |
Replace b with w for word-sized, d for dword, etc.
All of these implicitly advance rdi and rsi (if used) by the operand size. The registers are incremented as long as the direction flag is clear, which is the normal state.
Repetition prefixes:
rep – Repeat rcx many times. Can be used with lodsb, stosb, movsb.
repe/repne – Repeat while equal/not equal, at most rcx many times. Can be used with cmpsb and scasb.
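A rough Python model of two common combinations, rep movsb and repne scasb, with memory represented as a `bytearray`. The function names and the tuple return values are assumptions of this sketch, and it only models the forward direction (direction flag clear).

```python
def rep_movsb(memory: bytearray, rsi: int, rdi: int, rcx: int):
    """Copy rcx bytes from [rsi] to [rdi]. Returns the final (rsi, rdi, rcx)."""
    while rcx != 0:
        memory[rdi] = memory[rsi]
        rsi += 1
        rdi += 1
        rcx -= 1
    return rsi, rdi, rcx


def repne_scasb(memory: bytearray, rdi: int, rcx: int, al: int):
    """Scan for al starting at [rdi], at most rcx bytes.

    Returns the final (rdi, rcx, zf). zf starts as False here; the real
    instruction leaves the flags untouched when rcx is already 0.
    """
    zf = False
    while rcx != 0:
        zf = memory[rdi] == al  # the comparison sets ZF
        rdi += 1
        rcx -= 1
        if zf:                  # repne stops once the bytes are equal
            break
    return rdi, rcx, zf


mem = bytearray(b"hello\x00" + bytes(10))
print(rep_movsb(mem, rsi=0, rdi=6, rcx=5))      # (5, 11, 0)
print(bytes(mem[6:11]))                         # b'hello'
print(repne_scasb(mem, rdi=0, rcx=16, al=0))    # (6, 10, True)
```

In the last call rdi ends one past the matching byte, so the string length is 16 - 10 - 1 = 5.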
Structures and alignments
struc/endstruc – Shortcut for defining a bunch of equ definitions. E.g.,
struc thing
a: resb 1
b: resb 1
c: resw 1
d: resd 1
e: resq 1
endstruc
defines the following constants:
thing: equ 0
a: equ 0
b: equ 1
c: equ 2
d: equ 4
e: equ 8
thing_size: equ 16
To be C-compatible, the elements of a structure must be aligned (placed in memory at a multiple of their size). So a qword member must start at a multiple of 8. Padding can be added with extra resb reservations or with the align directive.
Instances of structures must be placed in memory at the structure's alignment, which is the size of its largest element. E.g., the above structure would have 8-byte alignment.
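One way to cross-check such offsets against what a C compiler would use is Python's `ctypes` module, which lays out `Structure` fields with the platform's native C rules. The class names below are made up; the field types are chosen to mirror the byte, byte, word, dword, qword members of the example.

```python
import ctypes


class Thing(ctypes.Structure):
    # Mirrors: a: byte, b: byte, c: word, d: dword, e: qword
    _fields_ = [
        ("a", ctypes.c_uint8),
        ("b", ctypes.c_uint8),
        ("c", ctypes.c_uint16),
        ("d", ctypes.c_uint32),
        ("e", ctypes.c_uint64),
    ]


class Padded(ctypes.Structure):
    # A byte followed directly by a qword forces padding before the qword.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("value", ctypes.c_uint64),
    ]


for name, _ in Thing._fields_:
    print(name, getattr(Thing, name).offset)
print("thing_size", ctypes.sizeof(Thing))

print("Padded.value offset", Padded.value.offset)
print("Padded size", ctypes.sizeof(Padded))
```

On a typical x86-64 system the second structure places `value` at offset 8, which corresponds to seven padding bytes after `flag`.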
Floating-point operations
Floating point registers are xmm0 through xmm15. Operations are suffixed with their operand size: ss for single-precision (float), sd for double-precision (double).
Use movss, movsd to move float values into/out of registers. There are no float immediates; store floating point constants in the .data section and then movs* them into a register.
addss dest, src ; dest += src (float)
addsd dest, src ; dest += src (double)
subss dest, src ; dest -= src (float)
subsd dest, src ; dest -= src (double)
mulss dest, src ; dest *= src (float)
mulsd dest, src ; dest *= src (double)
divss dest, src ; dest /= src (float)
divsd dest, src ; dest /= src (double)
All of these are also available in three-operand forms:
vaddss dest, src1, src2 ; dest = src1 + src2
vaddsd dest, src1, src2 ; dest = src1 + src2
vsubss dest, src1, src2 ; dest = src1 - src2
vsubsd dest, src1, src2 ; dest = src1 - src2
vmulss dest, src1, src2 ; dest = src1 * src2
vmulsd dest, src1, src2 ; dest = src1 * src2
vdivss dest, src1, src2 ; dest = src1 / src2
vdivsd dest, src1, src2 ; dest = src1 / src2
Comparisons use ucomiss, ucomisd which update the flags as if for an unsigned comparison.
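The practical difference between the ss and sd forms is precision. Python floats are doubles, so single precision can be imitated by rounding through a 4-byte encoding with the `struct` module. The helper names `to_single`, `addss` and `addsd` are borrowed from the mnemonics for this sketch only; this models the rounding, not the instructions themselves.

```python
import struct


def to_single(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def addss(dest: float, src: float) -> float:
    """Single-precision add: operands and result are rounded to float."""
    return to_single(to_single(dest) + to_single(src))


def addsd(dest: float, src: float) -> float:
    """Double-precision add: Python floats already behave this way."""
    return dest + src


tiny = 2.0 ** -30
print(addss(1.0, tiny) == 1.0)  # True: tiny is below single precision at 1.0
print(addsd(1.0, tiny) == 1.0)  # False: a double still has room for it
print(to_single(0.1) == 0.1)    # False: 0.1 rounds differently as a float
```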
Bitwise operations
and, or, not, xor, andn (AND with the first source operand inverted: dest = ~src1 & src2).
Apart from not, these set flags, so they can be used for various purposes.
Shifts and rotates
shl – Shift left, fill in low bits with 0
shr – Shift right, fill in high bits with 0
sar – Shift arithmetic right, for signed values, fill high bits with copies of the existing sign bit.
rol, ror – Rotate left/right.
The shift/rotate amount can either be an immediate or the cl register.
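The difference between the shifts and rotates is easiest to see on a small fixed width. This Python sketch models them on 8-bit values; the functions are named after the mnemonics for convenience, and the shift count `n` is assumed to be between 0 and 7.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def shl(v: int, n: int) -> int:
    return (v << n) & MASK                 # bits shifted past the top are lost


def shr(v: int, n: int) -> int:
    return (v & MASK) >> n                 # zeros enter from the top


def sar(v: int, n: int) -> int:
    v &= MASK
    if v & SIGN:
        v -= 1 << WIDTH                    # reinterpret as a negative number
    return (v >> n) & MASK                 # Python's >> copies the sign bit


def rol(v: int, n: int) -> int:
    return ((v << n) | (v >> (WIDTH - n))) & MASK


def ror(v: int, n: int) -> int:
    return ((v >> n) | (v << (WIDTH - n))) & MASK


v = 0b10010110
for op in (shl, shr, sar, rol, ror):
    print(op.__name__, format(op(v, 1), "08b"))
# shl 00101100
# shr 01001011
# sar 11001011
# rol 00101101
# ror 01001011
```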