Programmers are conditioned to think of memory as a simple array of bytes, with each basic data type occupying one or more of those bytes. However, the processor does not read from or write to memory one byte at a time. Instead, modern CPUs access memory in chunks of 2, 4, 8, 16, or even 32 bytes at a time, with 32-bit and 64-bit instruction set architectures (ISAs) being the most common. Because of how memory is organized in the system, the address of each chunk should be a multiple of its size. An address that satisfies this requirement is said to be aligned.

The difference between how high-level programmers think of memory and how modern processors actually work with it matters for both application correctness and performance. For example, if you don't understand the address alignment issues in your software, the following situations are all possible:
- your software will run slower
- your application will lock up/hang
- your operating system can crash
- your software will silently fail, yielding incorrect results
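The definition of an aligned address given above can be checked directly in code. The following is a minimal sketch, not a definitive implementation; the helper name is_aligned is hypothetical and the check assumes the alignment is a power of two, which holds for the chunk sizes mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when the address held in `ptr`
// is a multiple of `alignment`.
// Assumption: `alignment` is a non-zero power of two (1, 2, 4, 8, ...).
inline bool is_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const auto address = reinterpret_cast<std::uintptr_t>(ptr);
    // For a power of two, the low bits of a multiple are all zero.
    return (address & (alignment - 1)) == 0;
}
```

For instance, given a buffer that starts on an 8-byte boundary, the address 4 bytes into it is aligned for a 4-byte access but not for an 8-byte one.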
The C++ language provides a set of fundamental types of various sizes. To
make manipulating variables of these types fast, the generated object code will try to
use CPU instructions that read/write the whole data type at once. This in turn means
that the variables of these types should be placed in memory in a way that makes their
addresses suitably aligned. As a result, besides size, each fundamental type has another
property: its alignment requirement. It may seem that the fundamental type’s alignment
is the same as its size. This is not generally the case since the most suitable CPU
instruction for a particular type may only be able to access a part of its data at a
time. For example, a 32-bit x86 GNU/Linux machine may only be able to read at most 4
bytes at a time, so a 64-bit long long type will have a size of 8 and an alignment of 4.
The following table shows the size and alignment (in bytes) of the basic native data
types in C/C++ on both 32-bit x86 and 64-bit x86-64 GNU/Linux machines.
Type | 32-bit x86 GNU/Linux size | 32-bit x86 GNU/Linux alignment | 64-bit x86 GNU/Linux size | 64-bit x86 GNU/Linux alignment |
---|---|---|---|---|
bool | 1 | 1 | 1 | 1 |
char | 1 | 1 | 1 | 1 |
short int | 2 | 2 | 2 | 2 |
int | 4 | 4 | 4 | 4 |
long int | 4 | 4 | 8 | 8 |
long long int | 8 | 4 | 8 | 8 |
float | 4 | 4 | 4 | 4 |
double | 8 | 4 | 8 | 8 |
long double | 12 | 4 | 16 | 16 |
void* | 4 | 4 | 8 | 8 |
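The values in the table can be reproduced on your own machine with the standard sizeof and alignof operators. The sketch below is one way to do that; TypeLayout, layout_of, print_row, and print_layout_table are hypothetical names introduced here, and the numbers printed depend on the compiler and target, so they may differ from the table.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Hypothetical record holding the two properties discussed above.
struct TypeLayout {
    std::size_t size;
    std::size_t alignment;
};

// Collects the size and alignment requirement of T at compile time.
template <typename T>
constexpr TypeLayout layout_of() {
    return {sizeof(T), alignof(T)};
}

// Prints one row in the same "Type | Size | Alignment" shape as the table.
template <typename T>
void print_row(const char* name) {
    constexpr TypeLayout layout = layout_of<T>();
    // A type's size is always a whole multiple of its alignment,
    // otherwise consecutive array elements could not all be aligned.
    assert(layout.size % layout.alignment == 0);
    std::cout << name << " | " << layout.size << " | "
              << layout.alignment << '\n';
}

// Prints a row for every type listed in the table.
void print_layout_table() {
    print_row<bool>("bool");
    print_row<char>("char");
    print_row<short int>("short int");
    print_row<int>("int");
    print_row<long int>("long int");
    print_row<long long int>("long long int");
    print_row<float>("float");
    print_row<double>("double");
    print_row<long double>("long double");
    print_row<void*>("void*");
}
```

Calling print_layout_table() from main prints one row per type; compiling the same file for a 32-bit and a 64-bit target is a quick way to compare the two halves of the table.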
With GCC, you can use __attribute__ ((aligned(X))) to change the default alignment of a
variable, a structure/class, or a structure field, where X is measured in bytes.
For example, the following declaration causes the compiler to allocate the global
variable x on a 16-byte boundary.
int x __attribute__ ((aligned (16))) = 0;
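To see the effect of the declaration above, the address of x can be inspected at run time. This is a sketch that assumes a GCC-compatible compiler (the attribute syntax is a GNU extension); the variable y and the helper address_is_multiple_of are hypothetical additions, with y using the standard C++11 alignas specifier for comparison.

```cpp
#include <cassert>
#include <cstdint>

// The declaration from the text: x is placed on a 16-byte boundary.
int x __attribute__ ((aligned (16))) = 0;

// Hypothetical counterpart using the standard C++11 specifier.
alignas(16) int y = 0;

// Hypothetical helper: true when the address is a multiple of `n`.
bool address_is_multiple_of(const void* ptr, std::uintptr_t n) {
    assert(n != 0);
    return reinterpret_cast<std::uintptr_t>(ptr) % n == 0;
}
```

Both address_is_multiple_of(&x, 16) and address_is_multiple_of(&y, 16) should hold, while sizeof(x) remains sizeof(int): the attribute moves the variable, it does not enlarge it.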
The __attribute__ ((aligned (X))) attribute does not change the size of a variable it is
applied to, but it may change the memory layout of a structure by inserting padding
between its members. As a result, the size of the structure may change. If you don't
specify the alignment factor in an aligned attribute, the compiler automatically sets
the alignment for the declared variable or field to the largest alignment used for any
data type on the target machine you are compiling for. Doing this can often make copy
operations more efficient, because the compiler can use whatever instructions copy the
biggest chunks of memory when performing copies to or from the variables or fields that
you have aligned this way. The aligned attribute can only increase the alignment and can
never decrease it; GCC requires the packed attribute for that. The C++ offsetof macro
(declared in <cstddef>) can be used to determine the offset of each member within a
structure, which shows where padding was inserted, while the alignof operator reports
the alignment requirement of a type.
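The padding effect described above can be observed with a pair of small structures. The following sketch assumes a GCC-compatible compiler; the types Plain and Widened and the function check_widened_at_runtime are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Default layout: the compiler pads after `tag` so that `value`
// starts at a multiple of alignof(int).
struct Plain {
    char tag;
    int value;
};

// Same members, but `value` is forced onto a 16-byte boundary.
struct Widened {
    char tag;
    int value __attribute__ ((aligned (16)));
};

static_assert(offsetof(Plain, value) % alignof(int) == 0,
              "value is placed at a multiple of its alignment");
static_assert(sizeof(int) == 4 ? offsetof(Widened, value) == 16 : true,
              "15 bytes of padding follow the 1-byte tag");
static_assert(alignof(Widened) == 16,
              "the structure inherits its strictest member alignment");
static_assert(sizeof(Widened) % 16 == 0,
              "the size is rounded up to a multiple of the alignment");

// Confirms at run time that the member really lands on a 16-byte boundary.
void check_widened_at_runtime() {
    Widened w{};
    const auto address = reinterpret_cast<std::uintptr_t>(&w.value);
    assert(address % 16 == 0);
}
```

With a 4-byte int, Widened occupies 32 bytes even though its members hold only 5 bytes of data, which illustrates how the attribute leaves sizeof(int) untouched but changes the size of the enclosing structure.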