Rust to assembly: Arrays, Tuples, Box, and Option handling
We have already seen how Rust handles enums under the hood. We also looked at the code generation for the Box smart pointer. Here we put these items together in a Rust example that shows how arrays, tuples, the Option enum, and Box smart pointer allocations are handled at the assembly level.
Code example
We will be dissecting the assembly generated for the following code. The code is annotated to explain the Rust code.
// A named tuple with two fields.
pub struct Coordinate(f64, f64);
// A named tuple that contains two named tuples.
pub struct Line(Coordinate, Coordinate);
// This function takes an optional named tuple and returns an optional Box smart pointer containing four
// coordinates.
pub fn make_quad_coordinates(maybe_coordinate: Option<Coordinate>) -> Option<Box<[Coordinate; 4]>> {
// Check if the Option uses the Some variant. If it does, then extract the contents of the tuple
// into x and y variables. If a None variant is used, the function simply returns.
// The question mark is syntactic sugar for matching and extracting from a Some and returning on None.
let Coordinate(x, y) = maybe_coordinate?;
// Create a new Box smart pointer containing four coordinates. Note that this involves memory
// allocation on the heap.
Some(Box::new([
Coordinate(x, y),
Coordinate(-x, -y),
Coordinate(-x, y),
Coordinate(x, -y),
]))
}
// This function takes an optional named tuple and returns an optional tuple by value.
pub fn cross_lines_from_quad_coordinates(
maybe_coordinate: Option<Coordinate>,
) -> Option<(Line, Line)> {
// Pattern match and extract the contents of the array if the Option uses the Some variant.
// Return None if the Option uses the None variant.
let [a, b, c, d] = *make_quad_coordinates(maybe_coordinate)?;
// Form two lines from four coordinates and return them as a tuple.
// The tuple is wrapped in a Some variant.
Some((Line(a, b), Line(c, d)))
}
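The comments above describe the question mark as syntactic sugar for a match that returns early on None. A minimal sketch of that equivalence follows. The two helper functions are hypothetical and are not part of the example being dissected; they only exist to put the two spellings side by side.

```rust
pub struct Coordinate(f64, f64);

// Hypothetical helper: unwraps the Option with the question mark operator.
fn sum_with_question_mark(maybe_coordinate: Option<Coordinate>) -> Option<f64> {
    let Coordinate(x, y) = maybe_coordinate?;
    Some(x + y)
}

// Hypothetical helper: the same logic with the match written out by hand.
fn sum_with_match(maybe_coordinate: Option<Coordinate>) -> Option<f64> {
    let Coordinate(x, y) = match maybe_coordinate {
        Some(coordinate) => coordinate,
        None => return None, // Early return, just like the question mark.
    };
    Some(x + y)
}

fn main() {
    // Both spellings agree on the Some case...
    assert_eq!(sum_with_question_mark(Some(Coordinate(1.0, 2.0))), Some(3.0));
    assert_eq!(sum_with_match(Some(Coordinate(1.0, 2.0))), Some(3.0));
    // ...and on the None case.
    assert_eq!(sum_with_question_mark(None), None);
    assert_eq!(sum_with_match(None), None);
}
```

In the assembly that follows, this early return shows up as the compare against the discriminator followed by a conditional jump to the exit point.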
Assembly code for make_quad_coordinates
pub fn make_quad_coordinates(maybe_coordinate: Option<Coordinate>) -> Option<Box<[Coordinate; 4]>> {
let Coordinate(x, y) = maybe_coordinate?;
Some(Box::new([
Coordinate(x, y),
Coordinate(-x, -y),
Coordinate(-x, y),
Coordinate(x, -y),
]))
}
The assembly code for the make_quad_coordinates function is shown below. The code has been annotated to explain the mapping to Rust code. Key points in the generated code are:
- Since the function returns a Box smart pointer, the assembly code allocates memory on the heap. __rust_alloc is used to allocate memory on the heap.
- If the heap allocation fails, the function calls alloc::alloc::handle_alloc_error, which never returns. The ud2 instruction placed after the call marks unreachable code and raises an invalid opcode exception if it is ever executed.
- Rust enums typically result in generating a discriminant value that is used to select the variant. The Rust code generator optimizes the Option<Box> implementation by using the NULL pointer as the discriminant value.
Knowing the memory layout of several data types used in the code makes the assembly easier to follow.
Representation of Option<Coordinate>
The memory layout of the Option<Coordinate> type is shown below. Byte offset 0 is the discriminant used to distinguish between the variants Some and None. The Coordinate tuple is stored in the next two entries.
Byte offset | None | Some |
---|---|---|
0 | Discriminator (0) | Discriminator (1) |
8 | | f64 |
16 | | f64 |
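The table can be checked against the compiler with std::mem::size_of. This is a small sketch, not a guarantee: Rust does not promise a particular layout for default-representation types, so the figures in the comments are what we would expect on x86-64, which is the target of the assembly in this article.

```rust
use std::mem::size_of;

#[allow(dead_code)] // The fields are never read in this sketch.
pub struct Coordinate(f64, f64);

fn main() {
    // Two f64 fields: expected to be 16 bytes on x86-64.
    println!("Coordinate: {} bytes", size_of::<Coordinate>());
    // Every f64 bit pattern is a valid value, so the compiler cannot hide
    // the None case inside the payload. A separate 8-byte aligned
    // discriminator is added: expected to be 24 bytes on x86-64.
    println!("Option<Coordinate>: {} bytes", size_of::<Option<Coordinate>>());
}
```

The extra 8 bytes are the reason the generated code reads the tuple fields at offsets 8 and 16 rather than 0 and 8.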
Representation of Option<Box<[Coordinate; 4]>>
The memory layout of the Option<Box<[Coordinate; 4]>> type is shown below. Two memory locations are involved. The first is the pointer to the array of coordinates. The second is the array of Coordinate objects.
Option<Box> on the stack
The Rust code generator optimizes the Option<Box<T>> type to a single pointer on the stack. The pointer works both as a pointer to the array of coordinates and as the discriminator. If the pointer is NULL, the Option variant is None. A nonzero pointer indicates that the Option variant is Some.
Byte offset | None | Some |
---|---|---|
0 | The Box pointer is NULL | The Box pointer contains a valid address that points to the heap. |
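The single-pointer representation can be observed from Rust itself. The sketch below relies on the null pointer optimization that the standard library documents for Option<Box<T>>. The QuadBox alias and the none_as_bits helper are hypothetical names introduced only for this illustration.

```rust
use std::mem::{size_of, transmute};

#[allow(dead_code)] // The fields are never read in this sketch.
pub struct Coordinate(f64, f64);

// Hypothetical alias used only to keep the lines short.
type QuadBox = Box<[Coordinate; 4]>;

// Hypothetical helper: reinterpret a None as a pointer-sized integer
// so that its bit pattern can be inspected.
fn none_as_bits() -> usize {
    let none: Option<QuadBox> = None;
    // Relies on Option<Box<T>> being pointer sized, with None stored as
    // the all-zero (NULL) pattern.
    unsafe { transmute::<Option<QuadBox>, usize>(none) }
}

fn main() {
    // No extra discriminator: the Option is the same size as the bare Box.
    assert_eq!(size_of::<Option<QuadBox>>(), size_of::<QuadBox>());
    assert_eq!(size_of::<QuadBox>(), size_of::<usize>());
    // None is the NULL pointer.
    assert_eq!(none_as_bits(), 0);
    println!("Option<Box<[Coordinate; 4]>>: {} bytes", size_of::<Option<QuadBox>>());
}
```

This is why the None path in the assembly only has to zero rax before returning.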
[Coordinate; 4] array on the heap
The [Coordinate; 4] array is allocated on the heap. The heap pointer is stored in the Box pointer. The Box pointer points to the memory shown below. The array contains four Coordinate objects.
Byte offset | Content |
---|---|
0 | Coordinate |
16 | Coordinate |
32 | Coordinate |
48 | Coordinate |
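The size and alignment that the generated code passes to __rust_alloc can be reproduced with std::alloc::Layout. This is a sketch under the assumption of an x86-64 target; the values in the comments are what we expect there, not something the language guarantees for every platform.

```rust
use std::alloc::Layout;

#[allow(dead_code)] // The fields are never read in this sketch.
pub struct Coordinate(f64, f64);

fn main() {
    // Layout describes the size and alignment the allocator is asked for
    // when Box::new places this array on the heap.
    let layout = Layout::new::<[Coordinate; 4]>();
    // Expected on x86-64: 4 entries * 16 bytes = 64 bytes, aligned to 8.
    println!("size = {}, align = {}", layout.size(), layout.align());
}
```

These two numbers correspond to the values loaded into edi and esi just before the call to __rust_alloc.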
IEEE 754 floating point standard
Sign | Exponent | Fraction (Mantissa) |
---|---|---|
bit 63 | bits 62 to 52 | bits 51 to 0 |
1 bit | 11 bits | 52 bits |
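Because the sign lives in bit 63, negating an f64 only requires flipping that one bit. The generated code does this with an xorps against the constant 0x8000000000000000. The sketch below does the same thing in Rust; negate_with_xor and SIGN_MASK are hypothetical names used for illustration.

```rust
// Mask with only bit 63 (the IEEE 754 sign bit) set.
const SIGN_MASK: u64 = 0x8000_0000_0000_0000;

// Hypothetical helper: negate a double by toggling its sign bit.
fn negate_with_xor(value: f64) -> f64 {
    f64::from_bits(value.to_bits() ^ SIGN_MASK)
}

fn main() {
    assert_eq!(negate_with_xor(1.5), -1.5);
    assert_eq!(negate_with_xor(-2.0), 2.0);
    // Zero keeps its magnitude but gains a sign: 0.0 becomes -0.0.
    assert!(negate_with_xor(0.0).is_sign_negative());
    println!("{} -> {}", 1.5, negate_with_xor(1.5));
}
```

The exponent and fraction bits are untouched, so the magnitude is preserved exactly.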
; The address of Option<Coordinate> is passed in the rdi register.
; The representation of Option<Coordinate> is shown in the table above.
.LCPI0_0:
.quad 0x8000000000000000 ; Constants used for flipping the sign bit...
.quad 0x8000000000000000 ; ...of a 64-bit floating point number.
example::make_quad_coordinates:
sub rsp, 40 ; Reserve space for local variables.
cmp qword ptr [rdi], 0 ; Check if the discriminator is set to 0 (None case)
je .LBB0_1 ; If the discriminator is 0, jump to the exit point.
; == Processing Some case of Option ==
movsd xmm0, qword ptr [rdi + 8] ; Get the first entry from the Coordinate tuple
; (Note the 8-byte offset is needed to skip the discriminator)
movaps xmmword ptr [rsp + 16], xmm0 ; Save it into the x local variable on the stack.
movsd xmm0, qword ptr [rdi + 16] ; Get the second entry from the Coordinate tuple
movaps xmmword ptr [rsp], xmm0 ; Save it into the y local variable on the stack.
; Requesting memory allocation for Option<Box<[Coordinate; 4]>>.
; The array has 4 entries, and each entry needs 16 bytes (for the two f64s)
; This adds up to a total of 64 bytes.
; Note that no space is needed for the Option discriminator as all 0s (NULL)
; can be used to represent the None condition. Any nonzero heap address signifies
; the Some condition.
mov edi, 64 ; Request 64 bytes of memory
mov esi, 8 ; Alignment is set to 8 bytes
call qword ptr [rip + __rust_alloc@GOTPCREL] ; Request memory allocation. The result is returned in rax.
test rax, rax ; Check if the memory allocation returned an all zero address (NULL)
je .LBB0_5 ; If yes, jump to the out of memory error handling
movaps xmm0, xmmword ptr [rip + .LCPI0_0] ; Load the constant with the most significant bit set
movaps xmm2, xmmword ptr [rsp + 16] ; Load the x local variable and keep it xmm2 for future use.
movaps xmm1, xmm2 ; Copy x into xmm1 as well.
xorps xmm1, xmm0 ; Flip the sign bit of x to store -x in xmm1
movaps xmm3, xmmword ptr [rsp] ; Load local variable y into xmm3
xorps xmm0, xmm3 ; Flip the sign bit of y to store -y in xmm0
; rax contains the pointer to the allocated heap memory for Option<Box<[Coordinate; 4]>>
movsd qword ptr [rax], xmm2 ; Copy x to the heap
movsd qword ptr [rax + 8], xmm3 ; Copy y to the heap
movsd qword ptr [rax + 16], xmm1 ; Copy -x to the heap
movsd qword ptr [rax + 24], xmm0 ; Copy -y to the heap
movsd qword ptr [rax + 32], xmm1 ; Copy -x to the heap
movsd qword ptr [rax + 40], xmm3 ; Copy y to the heap
movsd qword ptr [rax + 48], xmm2 ; Copy x to the heap
movsd qword ptr [rax + 56], xmm0 ; Copy -y to the heap
add rsp, 40 ; Pop off the local variables from the stack
ret ; Return the result in rax
; == Processing None case of Option ==
; Because Rust optimizes Option<Box>, the None case is signaled by an all-zero (NULL) pointer
; returned to the caller. There is no discriminator needed.
.LBB0_1:
xor eax, eax ; Zero eax. A 32-bit register write also clears the upper 32 bits,
; so the whole of rax is 0.
add rsp, 40 ; Pop off the local variables
ret ; Return NULL to signal None
; == Error processing (memory allocation failed) ==
.LBB0_5:
mov edi, 64 ; Size of the failed allocation (64 bytes)
mov esi, 8 ; Alignment of the failed allocation (8-byte)
call qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL] ; Call alloc error handler
ud2 ; handle_alloc_error never returns, so this point is unreachable. ud2 raises an invalid opcode exception if it is ever executed.
Assembly code for cross_lines_from_quad_coordinates
pub fn cross_lines_from_quad_coordinates(
maybe_coordinate: Option<Coordinate>,
) -> Option<(Line, Line)> {
let [a, b, c, d] = *make_quad_coordinates(maybe_coordinate)?;
Some((Line(a, b), Line(c, d)))
}
The assembly code for cross_lines_from_quad_coordinates really surprised us. We were expecting to see a heap allocation for the value returned from the call to the make_quad_coordinates function. Since the Box was going to be consumed in the function, we were also expecting to see a de-allocation of the heap memory before the function returns. What we see instead is very efficient generated code that inlines make_quad_coordinates and eliminates the Box altogether, saving a memory allocation and de-allocation.
The key points in the generated assembly code are:
- The compiler inlines the make_quad_coordinates function. This results in deep optimization of the code.
- The compiler eliminates the Box allocation and de-allocation.
- The generated code also optimizes the memory writes by joining together two 64-bit writes into a single 128-bit write.
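One way to picture the effect of these optimizations is to write by hand a version of the function that never creates the Box. The sketch below is our own reading of what the optimized assembly computes, not something the compiler emits as source. The function cross_lines_without_box and the flatten helper are hypothetical names.

```rust
pub struct Coordinate(f64, f64);
pub struct Line(Coordinate, Coordinate);

// Hypothetical hand-written equivalent: builds the two lines directly,
// with no intermediate heap-allocated array.
pub fn cross_lines_without_box(maybe_coordinate: Option<Coordinate>) -> Option<(Line, Line)> {
    let Coordinate(x, y) = maybe_coordinate?;
    Some((
        Line(Coordinate(x, y), Coordinate(-x, -y)),
        Line(Coordinate(-x, y), Coordinate(x, -y)),
    ))
}

// Hypothetical helper: list the eight f64 values in memory order.
fn flatten(lines: &(Line, Line)) -> [f64; 8] {
    let (Line(a, b), Line(c, d)) = lines;
    [a.0, a.1, b.0, b.1, c.0, c.1, d.0, d.1]
}

fn main() {
    let lines = cross_lines_without_box(Some(Coordinate(1.0, 2.0))).unwrap();
    // Order: x, y, -x, -y, -x, y, x, -y
    assert_eq!(flatten(&lines), [1.0, 2.0, -1.0, -2.0, -1.0, 2.0, 1.0, -2.0]);
    assert!(cross_lines_without_box(None).is_none());
}
```

The order of the eight values matches the order of the stores at offsets 8 through 64 in the annotated assembly.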
Understanding the representation of Option<(Line, Line)> will assist in keeping track of the flow of the assembly code.
Representation of Option<(Line, Line)>
Byte offset | None | Some |
---|---|---|
0 | Discriminator (0) | Discriminator (1) |
8 | | tuple.0 First Coordinate (x): f64 |
16 | | tuple.0 First Coordinate (y): f64 |
24 | | tuple.0 Second Coordinate (x): f64 |
32 | | tuple.0 Second Coordinate (y): f64 |
40 | | tuple.1 First Coordinate (x): f64 |
48 | | tuple.1 First Coordinate (y): f64 |
56 | | tuple.1 Second Coordinate (x): f64 |
64 | | tuple.1 Second Coordinate (y): f64 |
.LCPI1_0:
.quad 0x8000000000000000 ; Constants used for flipping the sign bit...
.quad 0x8000000000000000 ; ...of a 64-bit floating point number.
example::cross_lines_from_quad_coordinates:
mov rax, rdi ; The caller passes the address of the return value in rdi. Copy it to rax.
cmp qword ptr [rsi], 0 ; Check if the discriminator is set to 0 (None case)
je .LBB1_1 ; If the discriminator is 0, jump to the exit point.
; The compiler has inlined the make_quad_coordinates function.
movsd xmm0, qword ptr [rsi + 8] ; Use xmm0 as x
movsd xmm1, qword ptr [rsi + 16] ; Use xmm1 as y
movaps xmm2, xmmword ptr [rip + .LCPI1_0] ; Load the constant with the most significant bit set
movaps xmm3, xmm0 ; Copy x to xmm3
xorps xmm3, xmm2 ; Negate x by toggling the sign bit. xmm3 now contains -x.
xorps xmm2, xmm1 ; Negate y by toggling the sign bit. xmm2 now contains -y.
; Prepare the return value in the memory pointed by rax
movsd qword ptr [rax + 8], xmm0 ; Copy x for the tuple.0 First Coordinate(x) to the return value
movsd qword ptr [rax + 16], xmm1 ; Copy y for the tuple.0 First Coordinate(y) to the return value
movaps xmm4, xmm3 ; Move -x into xmm4 (lower 64-bits)
movlhps xmm4, xmm2 ; Copy -y into xmm4 (upper 64-bits)
movups xmmword ptr [rax + 24], xmm4 ; tuple.0 Second Coordinate(x, y): Write a 128-bit word to the
; memory corresponding to x and y
; Note that little-endian ordering will result in -x (the lower 64 bits) being written at
; rax+24 address. -y (the upper 64 bits) will be written at rax+32 address.
movlps qword ptr [rax + 40], xmm3 ; tuple.1 First Coordinate (x): -x written to memory.
movsd qword ptr [rax + 48], xmm1 ; tuple.1 First Coordinate (y): y written to memory.
movsd qword ptr [rax + 56], xmm0 ; tuple.1 Second Coordinate (x): x written to memory.
movlps qword ptr [rax + 64], xmm2 ; tuple.1 Second Coordinate (y): -y written to memory.
mov ecx, 1 ; Set the discriminator for Some (32 bit write)
mov qword ptr [rax], rcx ; Write the discriminator to memory (rcx is the 64-bit register that contains ecx)
ret ; Return the result in rax
.LBB1_1:
xor ecx, ecx ; None case (Discriminator is set to 0)
mov qword ptr [rax], rcx ; Write the None discriminator
ret