Instruction Set Configuration File

Michael Kamprath edited this page Jul 10, 2021 · 32 revisions

Description

The instruction set configuration file lets you define the instruction set and assembly language features that BespokeASM will use to assemble byte code. This configuration file can be written in JSON or YAML.

Machine Code Compilation

The purpose of this configuration file is to control how machine code is compiled for the instruction set. BespokeASM has a fixed method for compiling machine code for any given instruction. The standard form of an instruction is:

  MNEMONIC [OPERAND1[, OPERAND2[, ...]]]

Here, each instruction must consist of at least a mnemonic and can optionally have one or more operands.

The machine code generated for an instruction consists of the byte code first, followed by the argument values. The byte code is the value a CPU's instruction register uses to identify which instruction the CPU is executing. It is composed of values specific to the mnemonic and, optionally, to each operand. The size of the packed byte code for a mnemonic and its operands should match the instruction size of the hardware running the machine code. The argument values are used by that instruction as parameters. If more than one operand has an argument value to be placed in the machine code, the argument values are emitted in the same order as the operands. Both the instruction mnemonic and each operand can generate values to be packed into the byte code of an instruction, but only operands can generate argument values.

As an illustrative example, consider this assembly instruction:

  mov a,[$8000]      ; copy value at address $8000 into register A

In this case, the mnemonic mov and the operands a (for register A) and [...] (for an indirect value) all generate values that are used to form the instruction's byte code. The $8000 numeric value is the argument of the [...] operand and follows the instruction byte code in the final machine code. The diagram below illustrates this.

   Byte 0     Byte 1   Byte 2
  ========== ======== ========
  01 001 110 00000000 10000000
  -- --- --- -----------------
   |  |   |          |
   |  |   |          +- The second operand's argument value of $8000 in little endian format
   |  |   +------------ The byte code 110 indicating the second operand ([...])
   |  +---------------- The byte code 001 indicating the first operand (register A)
   +------------------- The byte code 01 indicating the mov mnemonic 
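
The packing shown in the diagram can be reproduced with a short Python sketch. This is not BespokeASM's actual implementation; the function names `pack_fields` and `assemble` are hypothetical, and the sketch assumes that the byte code fills whole bytes and that every argument is byte aligned with a size that is a multiple of 8 bits.

```python
def pack_fields(fields):
    """Pack (value, bit_size) pairs, first field in the most significant bits."""
    packed = 0
    total_bits = 0
    for value, size in fields:
        mask = (1 << size) - 1          # each value is masked to its bit size
        packed = (packed << size) | (value & mask)
        total_bits += size
    return packed, total_bits


def assemble(bytecode_fields, arguments, endian="little"):
    """Emit the byte code first, then each argument in operand order."""
    packed, total_bits = pack_fields(bytecode_fields)
    if total_bits % 8 != 0:
        # Assumption of this sketch: byte code occupies whole bytes.
        raise ValueError("byte code must fill whole bytes in this sketch")
    machine_code = packed.to_bytes(total_bits // 8, "big")
    for value, size in arguments:
        mask = (1 << size) - 1
        machine_code += (value & mask).to_bytes(size // 8, endian)
    return machine_code


# mov a,[$8000]: mnemonic 01, register A 001, indirect operand 110, argument $8000
MOV_A_INDIRECT = assemble(
    bytecode_fields=[(0b01, 2), (0b001, 3), (0b110, 3)],
    arguments=[(0x8000, 16)],
)
print(MOV_A_INDIRECT.hex())  # prints "4e0080"
```

The first byte, 0x4E, is the bit pattern 01 001 110 from the diagram, and the two bytes that follow are $8000 in little endian order.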

Configuration Sections

The configuration has the following main sections:

general

The general section defines the general configuration of BespokeASM and various assembly language features. The supported options are:

Option Key | Value Type | Description
address_size | integer | The number of bits required to represent a memory address.
endian | string | (Optional) Defines the endianness of multibyte values. Allowed values are big and little. If not present, this option defaults to big.
registers | list[string] | (Optional) A list of register labels that will be used in this instruction set. Anything declared as a register label cannot be used as a constant or address label, and anything not declared as a register label cannot be used as a register operand. If not present, no register labels are defined.
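
As a rough illustration, a general section could look like the Python dictionary below, which is serialized to JSON. The specific values (a 16-bit address size and the register names) are made up for the example, and `read_general` is a hypothetical helper that only demonstrates the documented defaults; it is not part of BespokeASM.

```python
import json

# Hypothetical values; only the option keys come from the table above.
general_config = {
    "general": {
        "address_size": 16,
        "endian": "little",
        "registers": ["a", "b", "sp"],
    }
}

ALLOWED_ENDIAN = {"big", "little"}


def read_general(config):
    """Return (endian, registers), applying the documented defaults."""
    general = config["general"]
    if not isinstance(general["address_size"], int):
        raise ValueError("address_size must be an integer")
    endian = general.get("endian", "big")      # defaults to big
    if endian not in ALLOWED_ENDIAN:
        raise ValueError("endian must be 'big' or 'little'")
    registers = general.get("registers", [])   # defaults to no registers
    return endian, registers


# The JSON text printed here is what would be placed in the configuration file.
print(json.dumps(general_config, indent=2))
```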

operand_sets

The operand_sets section allows you to define sets of operands for instructions. An operand set is intended to represent all of the possible operand values for a specific operand position, and defines the byte code and argument values that will be packed when forming the instruction's machine code. Operand sets are defined separately from instructions so that an operand set can be used by more than one instruction. An operand set consists of one or more distinct operands.

The operand_sets section is a dictionary, where each key is the name of an operand set and the value is the configuration of that operand set. The name of the operand set is only used internally within this configuration file and does not directly impact the assembly language that is derived from this configuration file.

operand_sets Items

Each item listed in operand_sets consists of a single element titled operand_values, which contains a dictionary that configures each of the operand variants in this operand set. In this dictionary, the key is the internal name of the operand value item used within this configuration file, and the value is a collection of operand configuration items defined in the table below:

Option Key | Value Type | Description
type | string | Specifies one of the operand addressing modes supported by BespokeASM. The allowed values are:
  • numeric - The Immediate addressing mode
  • indirect_numeric - The Indirect addressing mode
  • register - The Register addressing mode
  • indirect_register - The Register Indirect addressing mode
bytecode | dictionary | (Optional) A dictionary that configures the byte code associated with this operand. If not present, this operand will not generate any byte code. This dictionary contains the following keys:
  • value - integer - The value of the byte code
  • size - integer - The bit size of the byte code. The value will be masked to this bit size.
argument | dictionary | Configures how the operand argument will be emitted into the machine code. Must be present for the numeric and indirect_numeric operand types. Ignored for all other types. The dictionary contains the following keys:
  • size - integer - The bit size for the operand argument. The emitted value will be masked to this bit size.
  • byte_align - boolean - Indicates whether the argument value should be aligned to the next whole byte, or can be packed immediately after the prior section's last bit.
  • endian - string - (Optional) The endianness that should be used for this argument. If not present, the default endianness configured in the general section will be used.
register | string | The assembly code representation of the register value to be used for this operand. Must be one of the register values listed in the registers list of the general section. Must be present for the register and indirect_register operand types. Ignored for all other operand types.
offset | dictionary | Configures the optional offset value for the indirect_register operand type. Ignored for all other types. If not present, no offset is enabled. If offset values are enabled, the compiler still permits omitting the offset for an indirect_register operand, in which case an offset of zero is implied. The dictionary contains the following keys:
  • size - integer - The bit size for the operand offset. The emitted value will be masked to this bit size.
  • byte_align - boolean - Indicates whether the offset value should be aligned to the next whole byte, or can be packed immediately after the prior section's last bit.
  • endian - string - (Optional) The endianness that should be used for this offset. If not present, the default endianness configured in the general section will be used.
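
To make the shape of this section concrete, here is a minimal sketch of an operand_sets dictionary written in Python and serialized to JSON. The operand set names (`destination`, `source`), the operand value names, and the byte code values are all hypothetical; the type names follow the addressing mode list in the table above. The helper `operands_missing_argument` is likewise an illustration of the "argument must be present" rule, not a BespokeASM function.

```python
import json

# Hypothetical operand sets; only the option keys come from the table above.
operand_sets_config = {
    "operand_sets": {
        "destination": {
            "operand_values": {
                "register_a": {
                    "type": "register",
                    "register": "a",  # must appear in general.registers
                    "bytecode": {"value": 1, "size": 3},
                },
            }
        },
        "source": {
            "operand_values": {
                "immediate": {
                    "type": "numeric",
                    "bytecode": {"value": 7, "size": 3},
                    "argument": {"size": 8, "byte_align": True},
                },
                "indirect_address": {
                    "type": "indirect_numeric",
                    "bytecode": {"value": 6, "size": 3},
                    "argument": {"size": 16, "byte_align": True, "endian": "little"},
                },
            }
        },
    }
}

NUMERIC_TYPES = {"numeric", "indirect_numeric"}


def operands_missing_argument(config):
    """List (set name, operand name) pairs that need an argument but lack one."""
    missing = []
    for set_name, operand_set in config["operand_sets"].items():
        for operand_name, operand in operand_set["operand_values"].items():
            if operand["type"] in NUMERIC_TYPES and "argument" not in operand:
                missing.append((set_name, operand_name))
    return missing


print(json.dumps(operand_sets_config, indent=2))
print(operands_missing_argument(operand_sets_config))  # prints []
```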

instructions

The instructions section is where the supported instruction mnemonics are defined. An instruction definition comprises three parts: the mnemonic, the instruction's operands, and the instruction's byte code. This section is a key/value dictionary where each key is the mnemonic string of an instruction and the value is another dictionary that defines the instruction's operand configuration and byte code.

Option Key | Value Type | Description
byte_code | dictionary | A dictionary that describes the byte code that should be emitted to indicate the instruction. The keys and values that must be present are:
  • value - The value of the byte code
  • size - The bit size of the byte code. The value will be masked to this bit size.
operands | dictionary | A dictionary that configures the set of operands that are allowed for this instruction mnemonic. The keys and values used in this dictionary are described in the table below. If not present, the instruction mnemonic is assumed to have no operands.

operands Configuration

Option Key | Value Type | Description
count | integer | The number of operands this mnemonic must have.
operand_sets | dictionary | (Optional) Present if operand sets are used to configure the operands of the mnemonic. Contains the following keys and values:
  • list - A list of names of the operand sets to be used as the operand options for this instruction. The list must have count items, and each position in the list corresponds to the position of the operand.
  • disallowed_pairs - (Optional) A list of operand name tuples that represent combinations of operands from the configured operand sets that the compiler should not permit. Each operand name is a key of the operand_values dictionary for a given operand set. The tuple is expressed as a Python-style list. For example, [a, b] indicates that, for a mnemonic with two operands, the combination of a from the first operand set with b from the second operand set is disallowed.
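
The sketch below shows one way an instructions section might look, again as a Python dictionary that can be serialized to JSON. The mnemonics, byte code values, operand set names (`destination`, `source`), and operand names (`register_a`, `indirect_address`) are hypothetical and assume both operand sets define those operand values. The helper `is_combination_allowed` only illustrates how count and disallowed_pairs could be interpreted; it is not BespokeASM's own logic.

```python
import json

# Hypothetical instructions; only the option keys come from the tables above.
instructions_config = {
    "instructions": {
        # No "operands" key: the mnemonic is assumed to take no operands.
        "nop": {"byte_code": {"value": 0, "size": 8}},
        "mov": {
            "byte_code": {"value": 1, "size": 2},
            "operands": {
                "count": 2,
                "operand_sets": {
                    # One operand set name per operand position.
                    "list": ["destination", "source"],
                    # Forbid a memory-to-memory move in this example.
                    "disallowed_pairs": [["indirect_address", "indirect_address"]],
                },
            },
        },
    }
}


def is_combination_allowed(mnemonic, operand_names, config=instructions_config):
    """Check operand count and the disallowed_pairs list for a mnemonic."""
    operands = config["instructions"][mnemonic].get("operands")
    if operands is None:
        return len(operand_names) == 0
    if len(operand_names) != operands["count"]:
        return False
    disallowed = operands.get("operand_sets", {}).get("disallowed_pairs", [])
    return list(operand_names) not in disallowed


print(json.dumps(instructions_config, indent=2))
print(is_combination_allowed("mov", ["register_a", "indirect_address"]))        # prints True
print(is_combination_allowed("mov", ["indirect_address", "indirect_address"]))  # prints False
```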

Examples

Example configuration files can be found in the examples directory of the BespokeASM repository.
