A C/C++ struct is in reality nothing more than several data values stored together in memory in a known arrangement. If we want to interoperate with C/C++ programs which use structures, we’ll need to see how to form the assembly equivalents.

An example structure:

struct thing { 
    double a;  // 8 bytes
    char   b;  // 1 byte
    int    c;  // 4 bytes
    char*  d;  // 8 bytes    
};

If we sum up the sizes of the members of this structure, we get 21 bytes. However, if you compile this structure and then cout << sizeof(thing), C++ will report that its size is 24 bytes. Where did the extra 3 bytes come from? The answer has to do with structure layout, particularly alignment and structure packing.

Remember that the CPU can perform memory accesses faster if they are aligned to multiples of certain powers of 2 (typically 4 or 8 bytes, i.e., 32 or 64 bits). In order to allow optimized moves, structure members are typically aligned, rather than being packed in as tightly as possible. This leads to some extra space, in the form of padding bytes, being added to many structures.

If we create an instance of thing and then examine the addresses of its members, we can deduce where, within those 24 bytes, each of the structure’s elements is located:

struct thing { 
    double a;  // (char*)&x.a == (char*)&x
    char   b;  // (char*)&x.b == (char*)&x + 8
    int    c;  // (char*)&x.c == (char*)&x + 12  (3 padding bytes after b)
    char*  d;  // (char*)&x.d == (char*)&x + 16
};

thing x;

In C++, this also holds, but only for a subset of structs and classes: those that are POD (“Plain Old Data”) types. A POD type is one that

Note that this does allow POD types to have (non-virtual) methods, and to use inheritance. POD types are completely compatible between C++ and C, and, with some care, with assembly.

As usual, the Sys V C ABI defines how the elements of a structure are to be packed/aligned in memory. The rules for alignment are actually specified based on the data type, and are not particular to structures (i.e., every int in memory should be aligned to a multiple of 4 bytes, not just those in structures). A summary would be:

Type                      Size (bytes)    Alignment (bytes)
char                      1               1
short                     2               2
int, float                4               4
long, double, pointers    8               8

Structures in assembly

We can “build” a structure by just arranging things in memory to conform to the structure layout. For example, to build an instance of the thing structure on the stack, we could do

sub rsp, 24         ; Make room for the struct 
mov [rsp],      a   ; offset 0
mov [rsp + 8],  b   ; offset 8
mov [rsp + 12], c   ; offset 12 (after 3 padding bytes)
mov [rsp + 16], d   ; offset 16

and then the address of the structure is rsp. (Here a, b, c, and d stand for the member values, e.g., registers of the appropriate size.) This is obviously tedious and error-prone. A better option is to use Yasm’s macros for building structures, struc and endstruc. To mirror the above structure in assembly, we’d use

struc thing
    a:      resq    1
    b:      resb    1
            resb    3 ; 3 Padding bytes
    c:      resd    1  
    d:      resq    1
endstruc

(The resb, resw, resd, and resq directives reserve a certain number of bytes, words, dwords, or qwords, respectively.)

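This implicitly defines, via equ, six constants:

thing        = 0     (start of the structure)
a            = 0     (offset of a)
b            = 8     (offset of b)
c            = 12    (offset of c)
d            = 16    (offset of d)
thing_size   = 24    (total size of the structure)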

Note that these are file-global constants, which means that the names a, thing, etc. cannot be used for labels or other constants anywhere else in the same file. If this is a problem, you can use local labels .a, .b, etc. for the member names; the offsets are then referred to as thing.a, thing.b, etc.

Instead of manually adding the padding bytes, we can also use alignb to request a specific alignment of subsequent data. alignb n adds padding bytes to the current section until the current address $ is a multiple of n, so we would add alignb directives before the elements:

struc thing
            alignb  8   ; Does nothing, already aligned
    a:      resq    1
            alignb  1   ; Does nothing, already aligned
    b:      resb    1
            alignb  4   ; Advance to multiple of 4
    c:      resd    1 
            alignb  8   ; Advance to multiple of 8 
    d:      resq    1
endstruc

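Note that the first, second, and fourth alignbs do not add any padding at all: a starts at offset 0, b only needs 1-byte alignment, and d lands at offset 16, which is already a multiple of 8. Only the third adds padding (3 bytes, moving c from offset 9 to offset 12). It’s safe to add extra alignbs, because they will not insert any padding unless it is needed.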

alignb fills the unused space with 0s.

To instantiate a structure in the .data section, use istruc, at and iend:

my_thing:   istruc thing
    at a,   dq      0.0     ; a = 0.0
    at b,   db      '!'     ; b = '!'
    at c,   dd      -12     ; c = -12
    at d,   dq      0       ; d = nullptr
iend

The at macro advances to the correct offset within the structure. Fields within an istruc must be given in exactly the same order as in the original struc.

Note that istruc/iend can only be used to declare instances in the .data section, i.e., as globals. To create an instance on the stack, we would first reserve the correct amount of space (the stack grows downward, so we subtract):

sub rsp, thing_size

and then populate it relative to rsp:

mov qword [rsp + a], 0      ; a = 0.0 (all-zero bit pattern)
mov byte  [rsp + b], '!'
mov dword [rsp + c], -12
mov qword [rsp + d], 0

(After the sub, rsp points at the beginning of the structure, so each member is found at rsp plus its offset.)

Stack offset               Member           Value
rsp + 0                    a                0.0
rsp + 8                    b                ‘!’
(rsp + 9) to (rsp + 11)    padding bytes
rsp + 12                   c                -12
rsp + 16                   d                0
rsp + 24                   previous top of stack

Function calling convention for structures

Structures which are passed as pointers are passed as 64-bit qword addresses, as usual. What about passing a structure directly, by value:

void f(thing x);

How would we call f from assembly? To pass a structure by value to a function, there are a number of different rules, mostly depending on the size of the structure and its members. The underlying theme is that a structure is passed by “decomposing” it into its members and passing them individually in registers, except that members smaller than a qword may be combined in a single register.

Thus, to figure out how to pass a larger-than-8-bytes structure to a function, we must build a “map” of the structure layout in groupings of 8 bytes, examine the fields in each grouping, and classify each group:

Qword    Fields           Class
0        a                SSE (xmm register)
1        b, padding, c    INTEGER (GP register)
2        d                INTEGER (GP register)

Note that this map is only used to assign registers if the structure is at most 16 bytes (two qwords). A larger structure is passed in registers only if its first qword is SSE class and every remaining qword continues the same vector value (e.g., a single __m256). thing is 24 bytes and mixes classes, so despite the map above the whole structure is passed in memory, i.e., copied onto the stack. A structure that fits in 16 bytes, such as structparm in the example below, is passed in registers according to its map.

Structures are never broken up between registers and the stack; if any part of a structure (or > qword value in general) cannot be passed in registers, then the entire value is passed on the stack. The exceptions are “spilled” structures (where the whole structure could be passed in registers, but there aren’t enough available), and sub-structures.

Returning structure values

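Return values are classified according to the above process, except that only two GP registers (rax and rdx), and two xmm registers (xmm0 and xmm1) are available. If the structure does not fit into these, then it is returned in memory: the caller reserves space for the result and passes its address as a hidden first argument in rdi, and the function returns that same address in rax.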

Example

Here’s a rather complex example, borrowed from the Sys-V ABI specification:

// Note: sizeof(structparm) == 16
struct structparm {
    int a, b;
    double d;
};

structparm s;
int e, f, g, h, i, j, k;
double m, n;

structparm func(int e, int f, 
                structparm s, 
                int g, int h, 
                double m, 
                double n, 
                int i, int j, int k);

How will the registers and stack be set up for this function call?

GP registers       FP registers    Stack
rdi: e             xmm0: s.d       0: j
rsi: f             xmm1: m         8: k
rdx: s.a, s.b      xmm2: n
rcx: g
r8:  h
r9:  i

How will the return value be represented?

GP registers          FP registers    Stack
rax: ret.a, ret.b     xmm0: ret.d     None

Handling Signals

Currently, if we perform an integer division by 0, our program simply crashes with a “floating-point exception” (the name is historical; a floating-point division by 0 normally just produces infinity or NaN). To avoid this, we have to install a signal handler to catch the SIGFPE signal, which is sent to our process on a divide-by-zero error. This requires setting up a sigaction structure and passing it to the C-library sigaction() function.

Note that trying to “recover” from SIGFPE is very dangerous in general; you have no idea what your program was doing, so the only reason to catch this signal is to make your program exit in some particular way, e.g., after writing some information to a log. The only safe thing to do is to end the program. Hence, instead of handling SIGFPE, we’ll handle a signal which is safe to resume from: SIGWINCH, sent when the size of the text console changes (e.g., resize your PuTTY window and SIGWINCH is sent to the process you are running).

Signals

Signals are one of the ways in which a Unix-based operating system communicates with the processes running on it. Signals can be broken into those which can be caught by our processes, and those which are not catchable, and of the former, those which have some default behavior, and those which do nothing if not caught:

Signal             Catchable?    Default behavior
SIGINT (Ctrl-C)    Yes           Terminate process
SIGKILL            No            Terminate process
SIGTERM            Yes           Terminate process
SIGSEGV            Yes           Terminate process (invalid memory access, e.g., null-pointer dereference)
SIGFPE             Yes           Terminate process (arithmetic error, div. by 0)
SIGHUP             Yes           Terminate process (“hangup”)
SIGWINCH           Yes           Nothing (“window change”)

Signals are sent to a process asynchronously; this means that if a signal handler is installed, it may be triggered at any point. It will appear to our program that the signal handler function was called, e.g., in the middle of an operation. Hence, signal handlers have to be very careful, as the state of the program is unknown. In particular, signal handlers should not call standard library functions other than the few that are async-signal-safe, such as _exit() or write() (it’s OK to call printf for testing or experimentation, but this should never be done in production). The typical behavior for a signal handler is to set some (global) variables and then return, if the signal is not a fatal one. For fatal signals, the only real option is to clean up and then exit.

It’s possible to send and receive custom signals (SIGUSR1 and SIGUSR2); by default, these terminate the process if they are not caught.

Signals are a fairly simple communication method: with the basic handler interface used here, you can’t attach information to signals, and you also can’t determine which process sent a signal to yours. They’re mostly used for simple “event notifications”: “something has happened”.

Catching signals

To catch a signal, we can use one of two mechanisms. The first is the signal function:

#include <stdio.h>
#include <signal.h>

volatile sig_atomic_t window_resized = 0;  // Written by the handler, read by main

void my_handler(int sig) {
    window_resized = 1;
}

int main() {
    if(signal(SIGWINCH, my_handler) == SIG_ERR) {
        return 1; // Handler could not be attached
    }

    // Wait for window resizes
    while(1) { 
        if(window_resized) {
            printf("Window resized!\n");
            window_resized = 0;
        }
    }

    return 0;
}

The signal function takes two parameters: a signal constant and a pointer to a handler function. Every handler function should have the prototype void handler(int sig), where the parameter will be the number of the signal that was caught (this allows a single handler function to be associated with different signals while still distinguishing them).

The behavior of the signal function is not completely specified, in particular, if a signal is caught while a handler is executing. Hence, the second method is preferred, which uses the sigaction structure and function:

#include <stdio.h>
#include <signal.h>

volatile sig_atomic_t window_resized = 0;  // Written by the handler, read by main

void my_handler(int sig) {
    window_resized = 1;
}

int main() {
    struct sigaction act;
    act.sa_handler = my_handler;    // Handler function
    sigemptyset(&act.sa_mask);      // Signals to block while running handler
    act.sa_flags = SA_RESTART;      // Flags

    if(sigaction(SIGWINCH, &act, NULL) != 0) {
        return 1; // Could not register handler
    }

    // Wait for window resizes
    while(1) { 
        if(window_resized) {
            printf("Window resized!\n");
            window_resized = 0;
        }
    }

    return 0;
}

The sigaction structure defined in signal.h looks like this:

struct sigaction
{
    handler_t sa_handler;           // Function pointer (8 bytes) 
    unsigned long int sa_mask[16];  // Signal mask      (16*8 = 128 bytes)
    int sa_flags;                   // Flags            (4 bytes)

    // ... Other members
};

The size of the structure as a whole is 152 bytes (!).

The assembly structure definition corresponding to this would be

struc sigaction_t
    sa_handler:     resq 1
    sa_mask:        resq 16
    sa_flags:       resd 1
                    resb 12 ; Padding/other members
endstruc

Fortunately for us, the structure is passed as a pointer, so we don’t have to worry about all the crazy structure-passing rules. We can just allocate a global instance of the structure and pass the address of that.

section .data

SIGWINCH:       equ         28
SA_RESTART:     equ         268435456
msg:            db          "Window resized!", 10, 0    ; 10 = newline ("\n" is not an escape in a quoted string)

window_resized: dq          0

action: istruc sigaction_t
    at sa_handler,  dq              my_handler
    at sa_mask,     times 16 dq     0
    at sa_flags,    dd              SA_RESTART
                    times 12 db     0 
iend

my_handler must be a C-compatible function taking a single int parameter:

section .text

my_handler:
    push rbp
    mov rbp, rsp

    mov qword [window_resized], 1

    pop rbp
    ret

The most complex part is main, as it has to set up the signal handler and then loop waiting for signals:

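extern sigaction
extern printf
global main

main:
    push rbp
    mov rbp, rsp

    ; Install signal handler: sigaction(SIGWINCH, &action, NULL)
    mov rdi, SIGWINCH
    mov rsi, action
    mov rdx, 0          ; Third argument (old action) = NULL
    call sigaction

    cmp eax, 0          ; sigaction returns an int, so test eax
    je .continue

    ; Couldn't register handler, return 1
    mov rax, 1
    pop rbp
    ret

    ; Loop forever
.continue:

    cmp qword [window_resized], 1
    jne .continue

    mov rdi, msg
    mov rax, 0          ; printf is variadic: al = number of xmm registers used
    call printf
    mov qword [window_resized], 0
    jmp .continue

    ; Never reached: the loop above does not exit
    pop rbp
    mov rax, 0
    ret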