Assembly Language Syntax

Michael Kamprath edited this page Aug 1, 2021 · 41 revisions

General Assembler Syntax

  • Each line pertains to at most one instruction or label
  • Whitespace is generally ignored except for the minimal amount required to separate parts of an instruction.
  • Any characters after and including a semicolon ; on any given line are considered to be comments

Numeric Values

Any time a numeric value is to be expressed, whether it is an immediate value or a memory address, it can be written in decimal, hex, or binary form as shown here:

Type      Syntax
Decimal   124
Hex       $7C
Hex       0x7C
Binary    b01111100
Binary    %01111100
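
As a brief illustration, each line in the sketch below stores the same byte. The label name same_value is made up for this sketch, and the .byte directive is described in the Data section.

```asm
; every line below emits the same byte value, 124
same_value:
    .byte 124          ; decimal
    .byte $7C          ; hex, $ prefix
    .byte 0x7C         ; hex, 0x prefix
    .byte b01111100    ; binary, b prefix
    .byte %01111100    ; binary, % prefix
```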

Numeric Expressions

Numeric expressions that can be resolved at compile time are supported. A numeric expression can be composed of any number of explicit numeric values, address labels, constant labels, or numeric operators. The supported operators are:

Operator   Description
+          Addition
-          Subtraction
*          Multiply
/          Divide
&          Bit-wise AND
|          Bit-wise OR
^          Bit-wise XOR
( and )    Expression grouping. Parentheses must be paired.

Note that numeric expressions are not the same thing as an offset for a register indirect addressing mode, though the offset value can be expressed as a numeric expression.
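
A short sketch of expressions in operand positions follows. The mov mnemonic, the a register, and its source-then-destination operand order are assumptions borrowed from the recursion example further down this page; the constant names are made up. Which mnemonics and registers exist depends on the instruction set configuration file. Parentheses are used throughout so the sketch does not rely on any operator precedence.

```asm
entry_size = 4                              ; constant label
table_base = $8000                          ; constant label

    mov table_base + (2 * entry_size), a    ; immediate operand: resolves to $8008
    mov [table_base + entry_size], a        ; indirect operand: the value at address $8004
    mov (entry_size + 1) & $03, a           ; grouping and bit-wise AND: resolves to 1
```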

Label

A label is a string that can be resolved at compile time to a specific numeric value. Labels can be composed of alphanumeric characters and the underscore _. All labels must be distinct.

Address Label

An address label represents a specific address in the byte sequence being assembled. A label does not generate byte code on its own, but can be used as an instruction argument to specify a specific address value. A label's address value is implied by its relative location among the lines to be assembled.

An address label is represented by any alphanumeric character string that immediately precedes a colon :. Only one label is allowed per line.
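
A minimal sketch of how a label's value follows from its position, using the .org and .byte directives described later on this page. The label names first and second are hypothetical.

```asm
.org $0010        ; set the location counter to $0010
first:            ; first resolves to $0010
    .byte 1       ; one byte emitted at $0010
second:           ; second resolves to $0011, one byte after first
    .byte 2
```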

Constant Label

A constant is a special label that has an explicitly assigned numeric value. A constant can be placed anywhere in the assembly code, as its value is set only by the assignment and not by its location. Assignment uses the following syntax based on the = sign:

constant_var = 10204

Constants cannot be assigned a numeric expression; they must be assigned an explicit numeric value.
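
A few more assignments in each numeric form, as a sketch. The constant names are made up for illustration.

```asm
buffer_size = 64          ; decimal value
io_port = $F0             ; hex value
flag_mask = %00001000     ; binary value
; ram_end = $8000 + buffer_size   ; not allowed per the rule above: an expression, not an explicit value
```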

Register Labels

A register label is defined in the instruction set configuration file. It is used to represent hardware registers in operands of instructions. Note that address and constant labels cannot use a string that has been declared a register label.

Instructions

Instructions are converted into byte code. Each instruction is composed of a specific instruction mnemonic and an optional list of operands according to this format:

MNEMONIC [OPERAND1[, OPERAND2[...]]]
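
For instance, these lines taken from the recursion example later on this page fit the format with zero, one, and two operands. The mnemonics themselves come from that example's instruction set and are not built into the assembler.

```asm
    hlt                 ; mnemonic only, no operands
    push a              ; one operand
    mov [sp+2],a        ; two operands separated by a comma
```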

Addressing Modes

BespokeASM supports several addressing mode notations for instruction operands, though the precise meaning of each is defined by the instruction set configuration file and the hardware. Explained here is the nominal application of each addressing mode notation.

Mode               Notation                   Description
Immediate          numeric_expression         A constant value to be used as an operand. The constant value is indicated by a numeric expression.
Indirect           [numeric_expression]       A value that resides at a memory address indicated by a constant value. The constant value memory address is indicated by a numeric expression.
Register           register_label             The value in a specified register. The register is indicated by a register label.
Register Indirect  [register_label + offset]  The specified register contains the memory address where the value is. An optional offset can be provided, which is added to the value in the register to get the memory address of the desired value. The register is indicated by a register label; the offset is a numeric expression that follows the register label, separated from it by a + or - sign.
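
The four notations might look like this in practice. This is only a sketch: it assumes a mov instruction with source-then-destination operands and registers named a, b, and sp, as in the recursion example later on this page. The actual mnemonics, registers, and permitted modes are set by the instruction set configuration file.

```asm
    mov 10, a           ; immediate: the constant 10
    mov [$8000], a      ; indirect: the value stored at address $8000
    mov b, a            ; register: the value held in register b
    mov [sp], a         ; register indirect with no offset: the value at the address in sp
    mov [sp+2], a       ; register indirect: the value at address (sp + 2)
    mov [sp-1], a       ; register indirect with a negative offset
```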

Directives

Directives tell the assembler to do specific things when creating the byte code. Directives start with a period (.). The following directives are supported:

Directive      Description
.org X         Resets the location counter (address) the assembler is using to address X
.fill N, Y     Fills the next N bytes with the byte value Y
.zero N        Shorthand for .fill N, 0
.zerountil X   Fills the next bytes up to and including address X with the value of 0. Will emit nothing if address X is less than the address location of this directive.

Also supported are data directives, described below.
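
The sketch below walks through the directives above, with the resulting address ranges in the comments. The label names are hypothetical, and the addresses follow directly from the descriptions in the table.

```asm
.org $0000              ; location counter starts at $0000
pad_start:
    .fill 4, $FF        ; addresses $0000-$0003 hold $FF
    .zero 4             ; addresses $0004-$0007 hold 0, same as .fill 4, 0
    .zerountil $000F    ; addresses $0008-$000F hold 0, $000F included
.org $8000              ; location counter moves to $8000
ram_start:              ; resolves to $8000
    .byte 0
```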

Data

A data directive allows byte code to be set explicitly. Like an instruction, its relative position in the assembly code defines its memory address, but unlike an instruction, the byte code emitted is directly defined in the assembly code. When paired with a label, a data directive can be used to define variables and other memory blocks.

The data directives have several forms, each indicating how much data is being defined:

Directive  Data Value Size  Data Length  Endian
.byte      1 byte           Variable     N/A
.2byte     2 bytes          Variable     Default
.4byte     4 bytes          Variable     Default

The syntax is simply the directive followed by the data values to be written. More than one value can be provided as a comma separated list of values or labels/constants. The value assembled into the byte code will be masked by the data value size of the directive.
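
To make the masking concrete, here is a small sketch. The constant name wide is made up, and the emitted values in the comments follow from masking each value to the directive's data value size.

```asm
wide = $1234
    .byte wide          ; emits $34: only the low byte fits a 1-byte data value
    .byte 1, 2, 3       ; three bytes from a comma separated list
    .2byte $12345       ; masked to the 2-byte value $2345
```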

The .byte directive can be used to define character strings delineated by a " or '. Quotes and apostrophes within the quoted string should be escaped. The data values generated will be the ASCII values for each character in the string, terminated by a zero value (C-style strings).
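
As a sketch of the string form, the comments below list the bytes each line would produce given the ASCII encoding and zero terminator described above. The label name greeting is hypothetical.

```asm
greeting:
    .byte "Hi"          ; emits $48 $69 $00: ASCII 'H', 'i', then the zero terminator
    .byte 'Hi'          ; the same bytes, using apostrophes as the delimiter
    .byte "It\'s"       ; escaped apostrophe: emits $49 $74 $27 $73 $00
```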

For multi-byte types (.2byte, .4byte, etc), the endian representation of each individual value uses the configured default endianness specified in the instruction set configuration file.

This example pairs each data directive with a label so that the data's address can be used elsewhere in the assembly code:

const_value = $BE

single_bytes:
    .byte $DE
    .byte $AD
    .byte const_value
    .byte $EF
byte_list:
    .byte $DE, 0xAD, const_value, $EF
test_str:
    .byte "It\'s a test string"

int16_value:
	.2byte $dead, $beef

int32_value:
	.4byte $deadbeef
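
For the multi-byte lines in the example above, the byte order in memory depends on the configured default endianness. The sketch below repeats them under hypothetical label names, with the expected byte sequence for each setting in the comments; each value in a list is ordered individually.

```asm
int16_bytes:
    .2byte $dead, $beef   ; little endian: $AD $DE $EF $BE
                          ; big endian:    $DE $AD $BE $EF
int32_bytes:
    .4byte $deadbeef      ; little endian: $EF $BE $AD $DE
                          ; big endian:    $DE $AD $BE $EF
```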

Examples

Ben Eater SAP-1

The following example uses the instruction set for Ben Eater's SAP-1 Breadboard CPU.

; Count by Loop
;
; For the Ben Eater SAP-1 breadboard CPU
;

zero = 0              ; constant value for 0
one = 1               ; constant value for 1

start:
  ldi zero            ; load value of 0 into A
  out                 ; display

add_loop:
  add increment       ; add current value at 0xF to A
  jc increment_step   ; increment the step if overflow
  out                 ; display
  jmp add_loop        ; loop

increment_step:
  lda increment       ; load current increment value
  add one_value       ; add 1 to increment value
  jc restart_loops    ; if it overflows, just reset everything
  sta increment       ; save updated increment value
  jmp start           ; restart counting

restart_loops:
  ldi one             ; load the value of 1 into register A
  sta increment       ; reset the increment value to 1
  jmp start           ; restart counting

one_value:
  .byte 1             ; 1 value needed for incrementing the increment value

increment:
  .byte 1             ; storage for the current increment value

Recursion with Subroutines

Here is an example that employs an instruction set that enables subroutines (call, rts), a stack (push, pop), and indirect addressing modes. It uses 16-bit addressing and little endian byte order. The example configuration file for this instruction set is here. It also assumes a memory map in which $0000 is the start of ROM and $8000 is the start of RAM.

;
; Variables
;

.org $8000           ; variables should be in RAM
n_value:
  .byte 5            ; N value to calculate factorial for

;
; Code
;

.org 0               ; code goes in ROM 
start:
  push [n_value]     ; push the value at n_value onto the stack
  call factorial     ; jump to the factorial subroutine
  out                ; factorial results are in A register. display it
  hlt                ; done

; factorial subroutine
;
; Input:
;   stack - function return pointer
;   stack+2 - The input N value to calculate factorial. A single 8-bit value
;
; Output:
;   A register - the results of the factorial calculation. A single 8-bit value
;
; Registers used: A
;
factorial:
  mov [sp+2],a      ; copy the N value to A register
  je f_stop,1       ; jump to f_stop if A is 1
  sub 1             ; subtract 1 from A to get (N-1)
  push a            ; put the n-1 value on the stack
  call factorial    ; recurse into factorial
  pop               ; remove the (N-1) value from stack
  push [sp+2]       ; push the N value on the stack
  push a            ; push the factorial(n-1) results on stack
  call multiply     ; call multiply subroutine
  pop               ; pop factorial(n-1) from stack
  pop               ; pop N-value from stack
f_stop:
  rts               ; return from subroutine. Register A contains factorial(N)

; multiply subroutine
;
; Input:
;   stack - function return pointer
;   stack+2 - A single 8-bit value to multiply
;   stack+3 - A single 8-bit value to multiply
;
; Output:
;   A register - the results of the multiply calculation. A single 8-bit value
;
; Registers used: A, B, I
;
multiply:
  mov [sp+2],a     ; copy the multiplicand to A
  je m_zero,0      ; jump to zero handler if multiplicand is 0
  mov a,b          ; copy multiplicand to B to set up for add loop
  mov [sp+3],i     ; copy multiplier to I
  dec i            ; decrement I for 0-based loop
  jc m_zero        ; was multiplier zero? If so, carry was set on the dec so jump to m_zero             
m_loop:
  jz m_done        ; jump to done if multiplier counter is now zero
  add b            ; add b to a
  dec i            ; decrement multiplier counter
  jmp m_loop       ; restart addition loop
m_zero:
  mov 0,a          ; set the return value to zero
m_done:
  rts              ; return from subroutine