C to assembly: loops, structs and arrays
The previous article covered the C calling convention, frame pointers and the resulting assembly code. This article focuses on code generation for loops, structure access and array indexing.
Code generation for a "while" loop
The following example shows the code generated for a simple while loop. The function does not use a frame pointer because it has no local variables. Since the FRAME_POINTER register is not used, parameters are accessed directly as offsets from the STACK_POINTER register. The offsets 12 and 16 in the listing are consistent with two 4-byte saved registers and a 4-byte return address sitting between STACK_POINTER and the first parameter.
Code generation for a while loop
void copy_string(const char *source, char *destination)
{
// Save registers on stack so that they can be used in this function
PUSH R1
PUSH R2
// Move destination to register R1 (Direct offset from STACK_POINTER)
MOVE 16(STACK_POINTER), R1
// Move source to address register R2 (Direct offset from STACK_POINTER)
MOVE 12(STACK_POINTER), R2
while(*destination++ = *source++);
// Define the label L1 for beginning of loop
// Copy the byte from source to destination and increment pointers
L1 MOVE_BYTE (R2), (R1)
ADD #1, R1
ADD #1, R2
// Branch if not equal to zero. If the value moved to the destination
// is not zero, branch to L1 (i.e. stay in the while loop)
// Exit while loop as soon as a zero is moved into the destination
BRANCH_IF_NOT_EQUAL L1
}
// Restore the old values of R1 and R2 by popping them off the stack
POP R2
POP R1
// Return back to caller
RETURN_FROM_SUBROUTINE
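The one-line while loop packs a load, a store, two increments and a test into a single expression. As a rough sketch, the same function can be written with one C statement per pseudo-assembly instruction. The names copy_string_expanded and check_copy_string_expanded are hypothetical and not part of the original listing. The sketch also assumes, as the listing does, that the branch tests the byte that was moved and that the two ADD instructions leave that result intact.

```c
#include <assert.h>
#include <string.h>

/* Same effect as: while (*destination++ = *source++);
 * written with one C statement per pseudo-assembly instruction. */
void copy_string_expanded(const char *source, char *destination)
{
    char moved;
    do {
        moved = *source;       /* L1: MOVE_BYTE (R2), (R1) reads the byte ... */
        *destination = moved;  /* ... and writes it to the destination */
        destination++;         /* ADD #1, R1 */
        source++;              /* ADD #1, R2 */
    } while (moved != 0);      /* BRANCH_IF_NOT_EQUAL L1 */
}

/* Hypothetical self-check, not part of the original listing. */
void check_copy_string_expanded(void)
{
    char buffer[8];
    memset(buffer, 'x', sizeof buffer);
    copy_string_expanded("abc", buffer);
    assert(strcmp(buffer, "abc") == 0);  /* the terminator was copied too */
    assert(buffer[4] == 'x');            /* nothing written past the terminator */
}
```

Note that the terminating zero is copied before the loop exits, which is why the test sits at the bottom of the loop.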
Code generation for a "for" loop
The example below shows the code generated for a for loop. Note that the loop condition is tested at the bottom of the loop, after the increment.
Code generation for a for loop
for (i=0; i < 100; i++)
{
. . .
}
// Data Register R2 is used to implement i.
// Set R2 to zero (i=0)
MOVE #0, R2
L1
. . .
// Increment i for (i++)
ADD #1, R2
// Check for the for loop exit condition ( i < 100)
COMPARE #100, R2
// Branch to the beginning of for loop if less than flag is set
BRANCH_IF_LESS_THAN L1
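The listing above tests the loop condition only at the bottom, so the body always runs at least once. That is safe here because the bounds are the constants 0 and 100. The sketch below compares the two loop shapes for a variable limit. The function names are hypothetical and the iteration counter is added purely for illustration.

```c
#include <assert.h>

/* The for loop as written in C: condition tested before every iteration. */
int iterations_top_tested(int limit)
{
    int iterations = 0;
    for (int i = 0; i < limit; i++) {
        iterations++;
    }
    return iterations;
}

/* The shape of the generated code: condition tested after the increment. */
int iterations_bottom_tested(int limit)
{
    int iterations = 0;
    int i = 0;               /* MOVE #0, R2 */
    do {
        iterations++;        /* loop body */
        i++;                 /* ADD #1, R2 */
    } while (i < limit);     /* COMPARE, then BRANCH_IF_LESS_THAN L1 */
    return iterations;
}

/* Hypothetical self-check, not part of the original listing. */
void check_loop_shapes(void)
{
    assert(iterations_top_tested(100) == 100);
    assert(iterations_bottom_tested(100) == 100);  /* same when the loop runs */
    assert(iterations_top_tested(0) == 0);
    assert(iterations_bottom_tested(0) == 1);      /* differs for an empty loop */
}
```

When the limit is not known at compile time, a compiler would typically have to keep an initial test in front of the bottom-tested loop.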
Code generation for structure access
This section covers code generation for C structure access. The example fills in a message structure. The function has no LINK and UNLK instructions because its only local variable, p_msg, is held in a register, so no space needs to be allocated for local variables on the stack.
Code generation for structure access
typedef struct
{
UChar type;
UChar sub_type;
UShort source_id;
} MessageHeader;
typedef struct
{
MessageHeader header;
UChar status;
UChar reason;
UShort padding;
} StatusMessage;
void send_status_message()
{
// Save Address Register R1 on stack. Register R1 will be used for p_msg
PUSH R1
StatusMessage *p_msg;
p_msg = malloc(sizeof(StatusMessage));
// Push the size of StatusMessage on the stack
PUSH #8
// Call malloc
JUMP_TO_SUBROUTINE _malloc
// Pop the parameter off the stack
ADD #4, STACK_POINTER
// _malloc returns in R0, this value is copied to R1
MOVE R0, R1
p_msg->header.type = STATUS_MESSAGE;
// Type is at offset 0 in the structure
MOVE_BYTE #100, (R1)
p_msg->header.sub_type = 0;
// Clear sub_type at offset 1
MOVE_BYTE #0, 1(R1)
p_msg->header.source_id = TASK_ID;
// source_id is at offset 2 and involves a word access
MOVE_WORD #0x0A22, 2(R1)
p_msg->status = SUCCESS;
// Status field is at offset 4
MOVE_BYTE #1, 4(R1)
p_msg->reason = NOTHING;
// Reason field is at offset 5
MOVE_BYTE #1, 5(R1)
send_message(STATUS_TASK_ID, p_msg);
// Push p_msg on to the stack
PUSH R1
// Push STATUS_TASK_ID on to the stack
PUSH #0x0B23
// Invoke send_message
JUMP_TO_SUBROUTINE _send_message
// Pop the parameters off the stack
ADD #8, STACK_POINTER
}
// Restore Address Register R1
POP R1
RETURN_FROM_SUBROUTINE
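The offsets used in the listing (0, 1, 2, 4 and 5) and the size of 8 pushed for malloc follow from the structure layout. The sketch below checks them with offsetof. It assumes that UChar is an 8-bit and UShort a 16-bit unsigned type, which the article does not state explicitly. The function check_status_message_layout is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  UChar;   /* assumed: 8-bit unsigned */
typedef uint16_t UShort;  /* assumed: 16-bit unsigned */

typedef struct
{
    UChar type;
    UChar sub_type;
    UShort source_id;
} MessageHeader;

typedef struct
{
    MessageHeader header;
    UChar status;
    UChar reason;
    UShort padding;
} StatusMessage;

/* Hypothetical check that the layout matches the offsets in the listing. */
void check_status_message_layout(void)
{
    size_t header_at = offsetof(StatusMessage, header);

    assert(header_at + offsetof(MessageHeader, type) == 0);       /* (R1)  */
    assert(header_at + offsetof(MessageHeader, sub_type) == 1);   /* 1(R1) */
    assert(header_at + offsetof(MessageHeader, source_id) == 2);  /* 2(R1) */
    assert(offsetof(StatusMessage, status) == 4);                 /* 4(R1) */
    assert(offsetof(StatusMessage, reason) == 5);                 /* 5(R1) */
    assert(sizeof(StatusMessage) == 8);                           /* PUSH #8 */
}
```

The explicit padding field keeps the structure size at 8 bytes regardless of whether the compiler would have padded it on its own.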
Code generation for array indexing
The code below shows an instance of array indexing. The generated code is inefficient because every access multiplies the index by the structure size. This overhead can be reduced by making the size of the structure a power of 2, i.e. 2, 4, 8, 16 etc. In such cases the compiler would replace the multiply with a shift instruction.
Code generation for array indexing
typedef struct
{
ULong a;
UShort b;
} ABStructure;
for (i=0; i < 100; i++)
{
abstructure[i].a = 0;
. . .
}
// Code for i = 0, R2 is used for i
MOVE #0, R2
// Move index i into R3
L1 MOVE_WORD R2, R3
// Multiply index with size of ABStructure to determine the offset for the indexed
// structure
MULTIPLY_WORD #6, R3
// Base address of abstructure is in R5, add the offset in R3 to obtain the address
// of the indexed structure entry and set it to zero
MOVE #0, (R5, R3)
. . .
// i++
ADD #1, R2
// For loop comparison (i < 100)
COMPARE #100, R2
// If less than 100, branch to loop start at L1
BRANCH_IF_LESS_THAN L1
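The offset arithmetic behind the multiply-versus-shift remark can be sketched as follows. The element sizes are passed as plain numbers rather than taken from sizeof, because the 6-byte size used in the listing assumes a target with 2-byte alignment. Many other ABIs would pad ABStructure to 8 bytes. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element `index` for an arbitrary element size: needs a multiply. */
size_t element_offset_multiply(size_t index, size_t element_size)
{
    return index * element_size;     /* MULTIPLY_WORD #6, R3 in the listing */
}

/* Offset of element `index` when the element size is 2^log2_size: a shift. */
size_t element_offset_shift(size_t index, unsigned log2_size)
{
    return index << log2_size;
}

/* Hypothetical self-check, not part of the original listing. */
void check_element_offsets(void)
{
    assert(element_offset_multiply(5, 6) == 30);  /* 6-byte structure */
    assert(element_offset_shift(5, 3) == 40);     /* 8-byte structure */
    assert(element_offset_shift(5, 3) == element_offset_multiply(5, 8));
}
```

Padding the structure from 6 to 8 bytes trades memory for a cheaper address computation, so it is worth doing only when the array is indexed often.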
Most compilers will optimize the above code by incrementing a pointer in the loop instead of recomputing the indexed address. The equivalent C code and the generated assembly code are shown below. This optimization removes the multiply or shift from every iteration, which speeds up array indexing in a loop.
Code generation for array indexing (optimized)
// Effective C Code
ABStructure *ptr = &abstructure[0];
for (i=0; i < 100; i++)
{
ptr->a = 0;
. . .
ptr++;
}
// Store the base address of the array in Address Register R5
MOVE #_abstructure, R5
// Code for i = 0, R2 is used for i
MOVE #0, R2
// Code corresponding to ptr->a = 0;
L1 MOVE #0, (R5)
. . .
// Code corresponding to ptr++
// Add size of ABStructure to address in R5 (ptr)
ADD #6, R5
// i++
ADD #1, R2
// For loop comparison (i < 100)
COMPARE #100, R2
// If less than 100, branch to loop start at L1
BRANCH_IF_LESS_THAN L1
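The indexed loop and the pointer-incrementing loop have the same effect, which can be checked with a small sketch. The fixed-width field types, the ENTRY_COUNT constant and the function names are assumptions made for illustration. The field written is the first member of the structure, as in the listing where the store goes to offset 0.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint32_t a;   /* assumed 32-bit, standing in for the first field */
    uint16_t b;   /* assumed 16-bit, standing in for the second field */
} ABStructure;

#define ENTRY_COUNT 100

/* Indexed form: the address is recomputed from i on every iteration. */
void clear_a_indexed(ABStructure *array, int count)
{
    for (int i = 0; i < count; i++) {
        array[i].a = 0;
    }
}

/* Pointer form: the address advances by one element per iteration. */
void clear_a_pointer(ABStructure *array, int count)
{
    ABStructure *ptr = &array[0];
    for (int i = 0; i < count; i++) {
        ptr->a = 0;
        ptr++;
    }
}

/* Hypothetical check: returns 1 if both loops leave identical arrays. */
int both_loops_agree(void)
{
    ABStructure first[ENTRY_COUNT];
    ABStructure second[ENTRY_COUNT];

    for (int i = 0; i < ENTRY_COUNT; i++) {
        first[i].a = second[i].a = (uint32_t)(i + 1);
        first[i].b = second[i].b = (uint16_t)i;
    }
    clear_a_indexed(first, ENTRY_COUNT);
    clear_a_pointer(second, ENTRY_COUNT);

    for (int i = 0; i < ENTRY_COUNT; i++) {
        if (first[i].a != 0 || second[i].a != 0) return 0;
        if (first[i].b != (uint16_t)i || second[i].b != (uint16_t)i) return 0;
    }
    return 1;
}
```

Writing the pointer form by hand is rarely necessary with an optimizing compiler, but it shows what the generated code is doing.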
