Assembly and ISA Notes

ARM Cortex-M0+ Device

About ARM and Thumb Mode

ARM and Thumb are two different instruction sets supported by ARM cores with a “T” in their name. For instance, the ARM7TDMI supports Thumb mode. ARM instructions are 32 bits wide, and Thumb instructions are 16 bits wide. Thumb code is smaller, and it can run faster if the target has slow memory. The Cortex-M0+ executes Thumb instructions only.

Subroutines

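Control is transferred from the calling program to a subroutine with BL, which stores the return address in the link register (LR), and back again with BX LR.

Alternatively, if the subroutine pushed LR onto the stack on entry, it can return by popping that saved value into the PC.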

Branch Instructions

BL instruction

The format of the BL instruction is:

BL{Condition} Destination Address

BL is another jump instruction, but before jumping it saves the return address (the address of the instruction that follows the BL) in register R14, the link register. The subroutine returns by loading the contents of R14 back into the PC. This instruction is the basic and most commonly used means of calling a subroutine. For example:

BL label ; jump unconditionally to label and save the return address in R14

BLX instruction

The format of the BLX instruction is:

BLX Destination Address

The BLX instruction jumps from ARM code to the destination address specified in the instruction, switches the processor to the Thumb state, and saves the return address in register R14. When a subroutine uses the Thumb instruction set and its caller uses the ARM instruction set, BLX therefore performs both the call and the change of processor state.

The subroutine returns by copying the value in register R14 to the PC.

BX instruction

The format of BX instruction is:

bx{Condition} Destination Address

The BX instruction jumps to the target address specified in the instruction. The instruction at the destination address can be either an ARM instruction or a Thumb instruction.

Program Status Register

The Program Status Register (PSR) combines:

Application Program Status Register (APSR)

Interrupt Program Status Register (IPSR)

Execution Program Status Register (EPSR)

The condition flags

The APSR contains the following condition flags:

N Set to 1 when the result of the operation was negative, cleared to 0 otherwise.

Z Set to 1 when the result of the operation was zero, cleared to 0 otherwise.

C Set to 1 when the operation resulted in a carry, cleared to 0 otherwise.

V Set to 1 when the operation caused overflow, cleared to 0 otherwise.
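The flag definitions above can be checked with a small Python model of a 32-bit flag-setting addition. This is only a sketch: `adds_flags` and `MASK32` are names invented for these notes, and the model covers addition only.

```python
MASK32 = 0xFFFFFFFF  # keeps values within 32 bits, like a register

def adds_flags(a, b):
    """Model a 32-bit flag-setting add: return (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b               # unbounded Python sum
    result = full & MASK32     # what a 32-bit register would hold
    flags = {
        "N": (result >> 31) & 1,      # bit 31 of the result
        "Z": int(result == 0),        # result is all zeroes
        "C": int(full > MASK32),      # unsigned carry out of bit 31
        # signed overflow: both operands differ in sign from the result
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags
```

For example, `adds_flags(0x7FFFFFFF, 1)` sets N and V (the largest positive value wraps to a negative one), while `adds_flags(0xFFFFFFFF, 1)` sets Z and C.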

EXPORT and GLOBAL

Exports a symbol so that other object files can reference it, using this syntax:

EXPORT|GLOBAL sym{[type]}

sym is the symbol to be exported. [type], if specified, can be either [DATA] to indicate that the symbol points to data or [FUNC] to indicate that the symbol points to code. GLOBAL is a synonym for EXPORT.

PUSH and POP

Push registers onto, and pop registers off a full-descending stack.

Syntax

PUSH reglist
POP reglist

where:

reglist Is a non-empty list of registers, enclosed in braces. It can contain register ranges. It must be comma separated if it contains more than one register or register range.

Operation

PUSH stores registers on the stack, with the lowest numbered register using the lowest memory address and the highest numbered register using the highest memory address. POP loads registers from the stack, with the lowest numbered register using the lowest memory address and the highest numbered register using the highest memory address.

PUSH uses the value in the SP register minus four as the highest memory address, and POP uses the value in the SP register as the lowest memory address, implementing a full-descending stack. On completion, PUSH updates the SP register to point to the location of the lowest stored value, and POP updates the SP register to point to the location above the highest location loaded.

If a POP instruction includes PC in its reglist, a branch to this location is performed when the POP instruction has completed. Bit[0] of the value read for the PC is used to update the EPSR T-bit. This bit must be 1 to ensure correct operation.
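A toy Python model may help picture the full-descending layout. It is a sketch under assumptions: the `Stack` class, its dictionary memory, and the starting address are invented for illustration. Registers are identified by number (LR is 14, PC is 15).

```python
class Stack:
    """Toy full-descending stack of 32-bit words, keyed by byte address."""

    def __init__(self, sp):
        self.sp = sp     # address of the most recently pushed word
        self.mem = {}    # byte address -> 32-bit word

    def push(self, regs):
        """regs maps register number -> value."""
        self.sp -= 4 * len(regs)               # SP ends at the lowest stored word
        for i, r in enumerate(sorted(regs)):   # lowest register, lowest address
            self.mem[self.sp + 4 * i] = regs[r]

    def pop(self, reg_numbers):
        """Load the listed registers; return register number -> value."""
        out = {}
        for i, r in enumerate(sorted(reg_numbers)):
            out[r] = self.mem[self.sp + 4 * i]
        self.sp += 4 * len(out)                # SP ends above the highest word read
        return out
```

Pushing registers 0, 4 and 14 and then popping 0, 4 and 15 returns the value that was in LR as the new PC, which is the usual way to return from a subroutine that saved LR on entry.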

MOV and MVN

MOVS Rd, #imm

where:

S Is an optional suffix. If S is specified, the condition code flags are updated on the result of the operation.

Rd Is the destination register.

imm Is any value in the range 0-255.

The MOVS instruction performs the same operation as the MOV instruction, but also updates the N and Z flags.

LDR & STR

Load and Store with register offset.

Syntax

LDR Rt, [Rn, Rm]
LDR<B|H> Rt, [Rn, Rm]
LDR<SB|SH> Rt, [Rn, Rm]
STR Rt, [Rn, Rm]
STR<B|H> Rt, [Rn, Rm]

where:

Rt Is the register to load or store.

Rn Is the register on which the memory address is based.

Rm Is a register containing a value to be used as the offset.

ANDS ORRS EORS BICS

ANDS {Rd,} Rn, Rm
ORRS {Rd,} Rn, Rm
EORS {Rd,} Rn, Rm
BICS {Rd,} Rn, Rm

where:

Rd Is the destination register.

Rn Is the register holding the first operand and is the same as the destination register.

Rm Is the register holding the second operand.

Operation

The AND, EOR, and ORR instructions perform bitwise AND, exclusive OR, and inclusive OR operations on the values in Rn and Rm.

The BIC instruction performs an AND operation on the bits in Rn with the logical negation of the corresponding bits in the value of Rm.

The condition code flags are updated on the result of the operation.
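The four operations can be sketched in Python on 32-bit values. The function names simply mirror the mnemonics and are not a real API; the model returns only the result plus the N and Z flags.

```python
MASK32 = 0xFFFFFFFF  # keeps values within 32 bits, like a register

def _with_flags(result):
    """Return (result, N, Z) for a 32-bit result."""
    result &= MASK32
    return result, (result >> 31) & 1, int(result == 0)

def ands(rn, rm):
    return _with_flags(rn & rm)      # bitwise AND

def orrs(rn, rm):
    return _with_flags(rn | rm)      # inclusive OR

def eors(rn, rm):
    return _with_flags(rn ^ rm)      # exclusive OR

def bics(rn, rm):
    return _with_flags(rn & ~rm)     # clear the bits of rn that are set in rm
```

For example, `bics(0xFF, 0x0F)` gives `0xF0`, and `eors(x, x)` always gives zero with Z set.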

ASR, LSL, LSR, and ROR

Arithmetic Shift Right, Logical Shift Left, Logical Shift Right, and Rotate Right.

Syntax

ASRS {Rd,} Rm, Rs
ASRS {Rd,} Rm, #imm
LSLS {Rd,} Rm, Rs
LSLS {Rd,} Rm, #imm
LSRS {Rd,} Rm, Rs
LSRS {Rd,} Rm, #imm
RORS {Rd,} Rm, Rs

where:

Rd Is the destination register. If Rd is omitted, it is assumed to take the same value as Rm.

Rm Is the register holding the value to be shifted.

Rs Is the register holding the shift length to apply to the value in Rm.

imm Is the shift length. The range of shift length depends on the instruction:

ASR shift length from 1 to 32

LSL shift length from 0 to 31

LSR shift length from 1 to 32.
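The difference between the shift types is easiest to see on concrete values. The following Python sketch models the result only (the carry flag is ignored), and the function names are just the mnemonics in lower case.

```python
MASK32 = 0xFFFFFFFF  # keeps values within 32 bits, like a register

def lsl(value, n):
    """Logical shift left: zeroes enter from the right."""
    return (value << n) & MASK32

def lsr(value, n):
    """Logical shift right: zeroes enter from the left."""
    return (value & MASK32) >> n

def asr(value, n):
    """Arithmetic shift right: copies of the sign bit enter from the left."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a negative Python int
    return (value >> n) & MASK32

def ror(value, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    n %= 32
    value &= MASK32
    return ((value >> n) | (value << (32 - n))) & MASK32
```

With the input `0x80000000`, `lsr(..., 4)` gives `0x08000000` while `asr(..., 4)` gives `0xF8000000`, because ASR preserves the sign.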

ADDS

ADD{S} {Rd,} Rn, <Rm|#imm>

Where:

S Causes an ADD or SUB instruction to update flags

When the optional Rd register specifier is omitted, it is assumed to take the same value as Rn, for example ADDS R1,R2 is identical to ADDS R1,R1,R2.

SXT and UXT

Sign extend and Zero extend.

Suppose an integer register is 32 bits. When a value is loaded from memory with fewer than 32 bits, the remaining bits must be assigned.

Sign extension is used for signed loads of bytes (8 bits, using the LDRSB instruction) and halfwords (16 bits, using the LDRSH instruction). Sign extension replicates the most significant bit loaded into the remaining bits.

Zero extension is used for unsigned loads of bytes (LDRB) and halfwords (LDRH). Zeroes are filled in the remaining bits.

Syntax

SXTB Rd, Rm
SXTH Rd, Rm
UXTB Rd, Rm
UXTH Rd, Rm

where:

Rd Is the destination register.

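Rm Is the register holding the value to be extended.

Operation

These instructions extract bits from the value in Rm:

SXTB extracts bits[7:0] and sign extends to 32 bits.

UXTB extracts bits[7:0] and zero extends to 32 bits.

SXTH extracts bits[15:0] and sign extends to 32 bits.

UXTH extracts bits[15:0] and zero extends to 32 bits.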

Restrictions

In these instructions, Rd and Rm must only specify R0-R7.
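As a rough illustration, the four extend operations can be written in Python on 32-bit values. The function names are the mnemonics in lower case and exist only for this sketch.

```python
def uxtb(x):
    """Zero extend the low byte."""
    return x & 0xFF

def uxth(x):
    """Zero extend the low halfword."""
    return x & 0xFFFF

def sxtb(x):
    """Sign extend the low byte: copy bit 7 into bits 8..31."""
    x &= 0xFF
    return x | 0xFFFFFF00 if x & 0x80 else x

def sxth(x):
    """Sign extend the low halfword: copy bit 15 into bits 16..31."""
    x &= 0xFFFF
    return x | 0xFFFF0000 if x & 0x8000 else x
```

For the input `0x12345680`, `uxtb` gives `0x00000080` and `sxtb` gives `0xFFFFFF80`, since bit 7 of the low byte is set.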

BNE

BNE (short for "Branch if Not Equal") is the mnemonic for a machine language instruction which branches, or "jumps", to the address specified if, and only if the zero flag is clear.

BMI

BMI (short for "Branch if MInus") is the mnemonic for a machine language instruction which branches, or "jumps", to the address specified if, and only if the negative flag is set.

BPL

BPL (short for "Branch if PLus") is the mnemonic for a machine language instruction which branches, or "jumps", to the address specified if, and only if the negative flag is clear.

BCC

BCC (short for "Branch if Carry is Clear") is the mnemonic for a machine language instruction which branches, or "jumps", to the address specified if, and only if the carry flag is clear.

BEQ

BEQ (short for "Branch if EQual") is the mnemonic for a machine language instruction which branches, or "jumps", to the address specified if, and only if the zero flag is set.
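The five branch conditions above reduce to simple tests on the flags. A minimal Python sketch, assuming the flags are held in a dictionary with keys N, Z and C (a representation chosen only for this example):

```python
# Each mnemonic maps to a predicate over the flags.
CONDITIONS = {
    "BEQ": lambda f: f["Z"] == 1,   # equal: zero flag set
    "BNE": lambda f: f["Z"] == 0,   # not equal: zero flag clear
    "BMI": lambda f: f["N"] == 1,   # minus: negative flag set
    "BPL": lambda f: f["N"] == 0,   # plus: negative flag clear
    "BCC": lambda f: f["C"] == 0,   # carry clear
}

def branch_taken(mnemonic, flags):
    """Return True if the named conditional branch would be taken."""
    return CONDITIONS[mnemonic.upper()](flags)
```

After an operation whose result is zero, for instance, `branch_taken("BEQ", flags)` is true and `branch_taken("BNE", flags)` is false.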

Accessing variables in C

int p, k; // signed integer (32-bit) variables
int w[10]; // array of 10 (32-bit) integers
p = k; // copy k to p

The assignment p = k corresponds to the following assembly:

ldr r0,=k ; r0 = address of variable k
ldr r1,[r0] ; read value of k from memory into r1
ldr r0,=p ; r0 = address of p
str r1,[r0] ; write value in r1 to variable at memory address p

A link register is a special-purpose register which holds the address to return to when a function call completes. This is more efficient than the more traditional scheme of storing return addresses on a call stack, sometimes called a machine stack. The link register does not require the writes and reads of the memory containing the stack, which can save a considerable percentage of execution time with repeated calls of small subroutines.

PUSH and POP

Examples

PUSH {R0,R4-R7} ; Push R0,R4,R5,R6,R7 onto the stack
PUSH {R2,LR} ; Push R2 and the link-register onto the stack
POP {R0,R6,PC} ; Pop r0,r6 and PC from the stack, then branch to
; the new PC.

Assembly - NASM

$

$ is the address of the current position before emitting the bytes (if any) for the line it appears on.

$ - msg is like doing here - msg, i.e. the distance in bytes between the current position (at the end of the string) and the start of the string.

$$

$$ is the address of the start of the current section.

Data

What does the 'h' suffix mean?

In x86 assembly, what does an h suffix on numbers represent?

For example:

sub CX, 13h

The h stands for hexadecimal, so 13h is 0x13 (19 in decimal).

Registers

Modern (i.e. 386 and beyond) x86 processors have eight 32-bit general purpose registers, as depicted in Figure 1. The register names are mostly historical. For example, EAX used to be called the accumulator since it was used by a number of arithmetic operations, and ECX was known as the counter since it was used to hold a loop index. Whereas most of the registers have lost their special purposes in the modern instruction set, by convention two are reserved for special purposes: the stack pointer (ESP) and the base pointer (EBP).

For the EAX, EBX, ECX, and EDX registers, subsections may be used. For example, the least significant 2 bytes of EAX can be treated as a 16-bit register called AX. The least significant byte of AX can be used as a single 8-bit register called AL, while the most significant byte of AX can be used as a single 8-bit register called AH. These names refer to the same physical register. When a two-byte quantity is placed into DX, the update affects the value of DH, DL, and EDX. These sub-registers are mainly hold-overs from older, 16-bit versions of the instruction set. However, they are sometimes convenient when dealing with data that are smaller than 32 bits (e.g. 1-byte ASCII characters).

When referring to registers in assembly language, the names are not case-sensitive. For example, the names EAX and eax refer to the same register.

AX is the 16 lower bits of EAX. AH is the 8 high bits of AX (i.e. the bits 8-15 of EAX) and AL is the least significant byte (bits 0-7) of EAX as well as AX.

Example (Hexadecimal digits):

EAX: 12 34 56 78
AX: 56 78
AH: 56
AL: 78
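The same split can be reproduced with masks and shifts. This Python sketch uses hypothetical helper names (`sub_registers`, `write_ax`) and treats EAX as a plain 32-bit integer.

```python
def sub_registers(eax):
    """Return the AX, AH and AL views of a 32-bit EAX value."""
    ax = eax & 0xFFFF                 # low 16 bits of EAX
    return {
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,       # bits 8-15 of EAX
        "AL": ax & 0xFF,              # bits 0-7 of EAX
    }

def write_ax(eax, value16):
    """Writing AX replaces the low 16 bits and leaves the upper 16 alone."""
    return (eax & 0xFFFF0000) | (value16 & 0xFFFF)
```

Calling `sub_registers(0x12345678)` yields AX = 0x5678, AH = 0x56 and AL = 0x78, matching the example above, and `write_ax(0x12345678, 0xABCD)` yields 0x1234ABCD.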

INT (x86 instruction)

INT is an assembly language instruction for x86 processors that generates a software interrupt. It takes the interrupt number formatted as a byte value.[1]

When written in assembly language, the instruction is written like this:

INT X

where X is the software interrupt that should be generated (0-255).

Depending on the context, compiler, or assembler, a software interrupt number is often given as a hexadecimal value, sometimes with the prefix 0x or the suffix h. For example, INT 13H generates software interrupt 0x13 (19 in decimal), causing the function pointed to by the 20th vector in the interrupt table to be executed, which is typically the BIOS disk services handler.
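The numbering can be checked with a few lines of Python. This is a sketch under one assumption: the CPU is in real mode, where the interrupt table starts at address 0 and each vector occupies 4 bytes. The helper names are invented for illustration.

```python
def parse_number(text):
    """Accept '13h', '0x13' or plain decimal '19' and return an int."""
    t = text.strip().lower()
    if t.endswith("h"):
        return int(t[:-1], 16)    # h suffix: hexadecimal
    if t.startswith("0x"):
        return int(t, 16)         # 0x prefix: hexadecimal
    return int(t)                 # otherwise decimal

def vector_info(n):
    """Return the 1-based position and table offset of interrupt n."""
    if not 0 <= n <= 255:
        raise ValueError("interrupt number must fit in one byte")
    return {"ordinal": n + 1, "ivt_offset": n * 4}   # 4 bytes per vector (real mode)
```

So `vector_info(parse_number("13h"))` reports the 20th vector at offset 0x4C, and interrupt 10h is the 17th vector.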

INT 10H

INT 10h, INT 10H or INT 16 is shorthand for BIOS interrupt call 10hex, the 17th interrupt vector in an x86-based computer system. The BIOS typically sets up a real mode interrupt handler at this vector that provides video services. Such services include setting the video mode, character and string output, and graphics primitives (reading and writing pixels in graphics mode).

To use this call, load AH with the number of the desired subfunction, load other required parameters in other registers, and make the call. INT 10h is fairly slow, so many programs bypass this BIOS routine and access the display hardware directly. Setting the video mode, which is done infrequently, can be accomplished by using the BIOS, while drawing graphics on the screen in a game needs to be done quickly, so direct access to video RAM is more appropriate than making a BIOS call for every pixel.

Defining Strings

The convention is to declare strings as null-terminated, which means we always declare the last byte of the string as 0, as follows:

my_string:
    db 'Booting OS',0

When later iterating through a string, perhaps to print each of its characters in turn, we can easily determine when we have reached the end.
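The loop a print routine would run can be sketched in Python, with a bytes object standing in for memory. `read_string` is a hypothetical helper, not part of any boot code.

```python
def read_string(memory, start):
    """Collect characters from memory[start] up to, not including, the 0 byte."""
    chars = []
    i = start
    while memory[i] != 0:          # the terminating 0 marks the end
        chars.append(chr(memory[i]))
        i += 1
    return "".join(chars)

# 'Booting OS',0 followed by unrelated bytes that must not be printed
memory = b"Booting OS\x00unrelated"
```

Here `read_string(memory, 0)` returns `"Booting OS"` and stops before the bytes that follow the terminator.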

Function Calls

At the CPU level a function is nothing more than a jump to the address of a useful routine then a jump back again to the instruction immediately following the first jump.

The caller code could store the correct return address (i.e. the address immediately after the call) in some well-known location, then the called code could jump back to that stored address. The CPU keeps track of the current instruction being executed in the special register ip (instruction pointer), which, sadly, we cannot access directly. However, the CPU provides a pair of instructions, call and ret, which do exactly what we want: call behaves like jmp but additionally, before actually jumping, pushes the return address on to the stack; ret then pops the return address off the stack and jumps to it.

When we call a function, such as a print function, within our assembly program, internally that function may alter the values of several registers to perform its job (indeed, with registers being a scarce resource, it will almost certainly do this), so when our program returns from the function call it may not be safe to assume, say, the value we stored in dx will still be there.

It is often sensible (and polite), therefore, for a function immediately to push any registers it plans to alter onto the stack and then pop them off again (i.e. restore the registers’ original values) immediately before it returns. Since a function may use many of the general purpose registers, the CPU implements two convenient instructions, pusha and popa, that conveniently push and pop all registers to and from the stack respectively.

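Example:

some_function:
    pusha           ; push all register values onto the stack
    mov bx, 10
    add bx, 20
    mov ah, 0x0e    ; BIOS teletype output function
    int 0x10        ; print the character in al
    popa            ; restore the original register values
    ret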
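The interplay of call, ret, pusha and popa can be modelled in a few lines of Python. This is a simplified sketch: the `Cpu` class is invented for illustration, it assumes a 3-byte near call in 16-bit code, and it saves the registers as one stack entry where the real pusha pushes each register separately.

```python
class Cpu:
    """Toy model of ip, four registers and a stack."""

    def __init__(self):
        self.ip = 0
        self.stack = []
        self.regs = {"ax": 0, "bx": 0, "cx": 0, "dx": 0}

    def call(self, target, instr_len=3):
        # Assumption: the call instruction itself is instr_len bytes long.
        self.stack.append(self.ip + instr_len)   # push the return address
        self.ip = target                         # then jump

    def ret(self):
        self.ip = self.stack.pop()               # pop the return address and jump

    def pusha(self):
        self.stack.append(dict(self.regs))       # save a copy of all registers

    def popa(self):
        self.regs = self.stack.pop()             # restore the saved copy
```

A function that runs pusha on entry and popa just before ret can change any register internally, and the caller still finds its own values (for example in dx) after the call returns.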

Include Files

After slaving away even on the seemingly simplest of assembly routines, you will likely want to reuse your code in multiple programs. nasm allows you to include external files literally as follows:

%include "my_print_function.asm" ; this will simply get replaced by
                                 ; the contents of the file
...
mov al, 'H' ; Store 'H' in al so our function will print it.
call my_print_function

Important: remember to include subroutines below the hang (i.e. jmp $), so that execution never falls through into them.