Two-pass Assembler

An assembler is a translator that converts an assembly-language program into a conventional machine-language program. In the simplest case, the assembler goes through the program one line at a time, generates the machine code for that instruction, and then proceeds to the next instruction. In this way, the entire machine-code program is created.

Consider an assembler instruction like the following:

          JMP  LATER
          ...
          ...
LATER:
This is known as a forward reference: the JMP instruction refers to the label LATER before the label has been defined.

If the assembler is processing the file one line at a time, then it doesn't know where LATER is when it first encounters the jump instruction. So, it doesn't know whether the jump is a short jump, a near jump, or a far jump. These instructions differ considerably in size: they are 2, 3, and 5 bytes long respectively. The assembler would have to guess how far away the target is in order to generate the correct instruction. If the assembler guesses wrong, then the addresses of all labels later in the program would be wrong, and the code would have to be regenerated.
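
The size choice can be sketched in C++ as follows. This is a minimal illustration, not code taken from assembler.cpp: the names `JumpKind`, `JumpEncoding`, and `chooseJump` are hypothetical, and it assumes 16-bit x86 encodings where the target lies in the same segment (so a far jump is never needed) and where a short jump holds a signed 8-bit displacement measured from the end of the 2-byte instruction.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not part of assembler.cpp.
enum class JumpKind { Short, Near };

struct JumpEncoding {
    JumpKind kind;
    int size;  // instruction length in bytes
};

// Pick the smallest jump that can reach targetAddress from jumpAddress.
// Assumption: both addresses are in the same segment, so only the
// 2-byte short form and the 3-byte near form are candidates.
JumpEncoding chooseJump(std::int32_t jumpAddress, std::int32_t targetAddress) {
    // A short jump's displacement is relative to the byte after the
    // 2-byte instruction and must fit in a signed 8-bit value.
    const std::int32_t displacement = targetAddress - (jumpAddress + 2);
    if (displacement >= -128 && displacement <= 127) {
        return {JumpKind::Short, 2};
    }
    return {JumpKind::Near, 3};
}
```

Note that `chooseJump` needs `targetAddress` as an input, which is exactly the value a one-pass assembler does not yet have when the jump is a forward reference.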

Structure of Two-pass Assembler

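The assembler processes the source program in two passes.
Some internal tables and subroutines are used only during Pass 1.
The symbol table (SYMTAB), literal table (LITTAB), and machine-opcode table (MOTTAB) are used by both passes.
The main problem in assembling a program in a single pass involves forward references; with two passes, every label is already recorded in SYMTAB by the time Pass 2 generates code.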
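
The two-pass structure described above might look roughly like the sketch below. It is not the implementation in assembler.cpp. It assumes a toy instruction set in which `NOP` occupies 1 byte and `JMP <label>` always occupies 3 bytes, and the names `SymbolTable`, `instructionSize`, `passOne`, and `passTwo` are hypothetical. Pass 1 only advances a location counter and records label addresses; Pass 2 walks the source again and replaces each label operand with its address.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy symbol table: label name -> address.
using SymbolTable = std::map<std::string, int>;

// Assumed fixed sizes for the toy instruction set.
static int instructionSize(const std::string& mnemonic) {
    if (mnemonic == "NOP") return 1;
    if (mnemonic == "JMP") return 3;
    throw std::runtime_error("unknown mnemonic: " + mnemonic);
}

// Pass 1: track the location counter and record where each label lives.
SymbolTable passOne(const std::vector<std::string>& source) {
    SymbolTable symtab;
    int location = 0;
    for (const std::string& line : source) {
        std::istringstream in(line);
        std::string word;
        if (!(in >> word)) continue;             // blank line
        if (word.back() == ':') {                // "LABEL:" defines a symbol
            symtab[word.substr(0, word.size() - 1)] = location;
            if (!(in >> word)) continue;         // label alone on the line
        }
        location += instructionSize(word);
    }
    return symtab;
}

// Pass 2: walk the source again and resolve label operands via SYMTAB.
// Each output entry has the form "<address>: <mnemonic> [<target address>]".
std::vector<std::string> passTwo(const std::vector<std::string>& source,
                                 const SymbolTable& symtab) {
    std::vector<std::string> listing;
    int location = 0;
    for (const std::string& line : source) {
        std::istringstream in(line);
        std::string word;
        if (!(in >> word)) continue;
        if (word.back() == ':' && !(in >> word)) continue;
        std::ostringstream out;
        out << location << ": " << word;
        std::string operand;
        if (in >> operand) {
            auto it = symtab.find(operand);
            if (it == symtab.end()) {
                throw std::runtime_error("undefined symbol: " + operand);
            }
            out << " " << it->second;            // forward reference resolved
        }
        listing.push_back(out.str());
        location += instructionSize(word);
    }
    return listing;
}
```

Calling `passOne` on the lines `JMP LATER`, `NOP`, `NOP`, `LATER: NOP` records `LATER` at address 5, so `passTwo` can emit the jump with a known target even though the label appears after it in the source.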

Compile and run assembler.cpp:

    g++ -std=c++11 assembler.cpp
    ./a.out