Assembly Language

A programming language is defined by a set of rules that users must adhere to for their programs to be translated correctly. Almost every commercial computer has its own particular assembly language, with the rules for writing programs documented in manuals provided by the computer manufacturer.

Structure of Assembly Language Programs

The basic unit of an assembly language program is a line of code. The specific language is defined by rules that specify the symbols that can be used and how they may be combined to form a line of code. Below are the rules for writing symbolic programs for the basic computer.

Rules of the Language

Each line of an assembly language program is arranged in three columns called fields. These fields specify the following information:

Label Field: This may be empty or may specify a symbolic address.
Instruction Field: This specifies a machine instruction or a pseudoinstruction.
Comment Field: This may be empty or include a comment.

A symbolic address consists of one to three alphanumeric characters. The first character must be a letter; the next two may be letters or numerals. A symbolic address in the label field is terminated by a comma to be recognized as a label by the assembler.
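
The label rule above can be checked mechanically. The following is a minimal Python sketch, not part of any real assembler; the names SYMBOL_RE, is_symbolic_address, and split_label are hypothetical and chosen only for illustration.

```python
import re

# One letter followed by at most two letters or digits, per the rule above.
SYMBOL_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{0,2}")


def is_symbolic_address(text):
    """Return True if text is one to three alphanumerics starting with a letter."""
    return SYMBOL_RE.fullmatch(text) is not None


def split_label(line):
    """Split a line into (label, rest).

    The label is None when the line does not begin with a valid
    symbolic address terminated by a comma.
    """
    head, comma, tail = line.partition(",")
    if comma and is_symbolic_address(head.strip()):
        return head.strip(), tail.strip()
    return None, line.strip()
```

For example, split_label("MIN, DEC 83") separates the label MIN from the instruction field, while a line such as "CMA" is returned with no label.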

Instruction Field Specifications

The instruction field in an assembly language program may specify one of the following items:

Memory-Reference Instruction (MRI): This consists of two or three symbols separated by spaces. The first must be a three-letter symbol defining an MRI operation. The second is a symbolic address. The third symbol, which may or may not be present, is the letter I. If I is absent, the line denotes a direct address instruction; if I is present, it denotes an indirect address instruction.
Non-MRI: This does not have an address part and is recognized by any one of the three-letter symbols for register-reference and input-output instructions.
Pseudoinstruction: This may or may not have an operand.

Example of Instructions

CLA / non-MRI
ADD OPR / direct address MRI
ADD PTR I / indirect address MRI
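
As an illustration of how the three kinds of instruction field might be told apart, here is a small Python sketch. The function name classify and the mnemonic sets are assumptions made for this example. The sets list only the symbols that appear in this section, not the full instruction set of the basic computer.

```python
# Hypothetical mnemonic sets: only symbols that appear in this section.
MRI = {"ADD", "LDA", "STA"}
NON_MRI = {"CLA", "CMA", "INC", "HLT"}
PSEUDO = {"ORG", "END", "DEC", "HEX"}


def classify(field):
    """Return (kind, operation, operand, indirect) for one instruction field."""
    tokens = field.split()
    if not tokens:
        raise ValueError("empty instruction field")
    op = tokens[0]
    if op in MRI:
        # Two symbols: direct address. Three symbols ending in I: indirect.
        if len(tokens) == 2:
            return ("MRI", op, tokens[1], False)
        if len(tokens) == 3 and tokens[2] == "I":
            return ("MRI", op, tokens[1], True)
        raise ValueError("malformed memory-reference instruction: " + field)
    if op in NON_MRI:
        if len(tokens) != 1:
            raise ValueError("non-MRI takes no address part: " + field)
        return ("non-MRI", op, None, False)
    if op in PSEUDO:
        operand = tokens[1] if len(tokens) > 1 else None
        return ("pseudo", op, operand, False)
    raise ValueError("unknown symbol: " + op)
```

Calling classify on the three example lines, with the comment removed, yields a non-MRI, a direct MRI, and an indirect MRI respectively.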

Symbolic Address

A symbolic address in the instruction field specifies the memory location of an operand. This location must be defined somewhere in the program by appearing as a label in the first column. Each symbolic address mentioned in the instruction field must occur again in the label field to allow proper translation to binary code.
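
The requirement that every referenced symbol also appear as a label can be verified before translation. The sketch below is one possible check in Python, assuming comments start with a slash and labels end with a comma as described in this section. The name undefined_symbols and the MRI subset are hypothetical.

```python
# Hypothetical subset of MRI mnemonics, limited to those used in this section.
MRI = {"ADD", "LDA", "STA"}


def undefined_symbols(lines):
    """Return the set of symbolic addresses that are used but never defined as labels."""
    defined, used = set(), set()
    for raw in lines:
        code = raw.split("/", 1)[0].strip()   # drop the comment field
        label, comma, rest = code.partition(",")
        if comma:
            defined.add(label.strip())
            code = rest
        tokens = code.split()
        if len(tokens) >= 2 and tokens[0] in MRI:
            used.add(tokens[1])
    return used - defined
```

An empty result means every symbolic address can be resolved. A non-empty result names the symbols the assembler could not translate.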

Pseudoinstructions

A pseudoinstruction is not a machine instruction but rather an instruction to the assembler providing information about some phase of the translation. Four recognized pseudoinstructions are:

ORG N: The hexadecimal number N is the memory location for the instruction or operand listed on the following line.
END: Denotes the end of the symbolic program.
DEC N: Signed decimal number N to be converted to binary.
HEX N: Hexadecimal number N to be converted to binary.
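
The conversions performed for DEC and HEX operands can be sketched in Python as follows. This assumes a 16-bit word with negative numbers held in 2's complement, which matches the four-hexadecimal-digit contents shown in the translation later in this section. The function names are hypothetical.

```python
WORD_MASK = 0xFFFF  # assumption: 16-bit memory word


def dec_to_word(text):
    """Convert the operand of DEC (signed decimal) to a 16-bit 2's complement word."""
    value = int(text, 10)
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError("decimal operand does not fit in 16 bits: " + text)
    return value & WORD_MASK


def hex_to_word(text):
    """Convert the operand of HEX to a 16-bit word."""
    value = int(text, 16)
    if not 0 <= value <= WORD_MASK:
        raise ValueError("hexadecimal operand does not fit in 16 bits: " + text)
    return value
```

With these definitions, DEC 83 becomes hexadecimal 0053 and DEC -23 becomes FFE9.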

Comments

The third field of a line is reserved for comments. A comment must be preceded by a slash (/) so that the assembler can recognize where the comment field begins. Comments explain the program step by step for the reader and are ignored during translation to binary.

Example Program

Here is an example of an assembly language program:

ORG 100 / Origin of program is location 100
LDA SUB / Load subtrahend to AC
CMA / Complement AC
INC / Increment AC
ADD MIN / Add minuend to AC
STA DIF / Store difference
HLT / Halt computer
MIN, DEC 83 / Minuend
SUB, DEC -23 / Subtrahend
DIF, HEX 0 / Difference stored here
END / End of symbolic program

When translated to binary code and executed, the program subtracts one number from another. The subtraction is performed by adding the minuend to the 2's complement of the subtrahend: CMA followed by INC forms the 2's complement of the value in AC. Here the subtrahend is -23, so its 2's complement is +23, and adding the minuend 83 gives the difference 83 - (-23) = 106 (hexadecimal 006A).
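
The arithmetic can be traced step by step. The following Python sketch mimics the four instructions that operate on AC, assuming a 16-bit accumulator. The names run_subtraction and to_signed are hypothetical helpers for this illustration only.

```python
MASK = 0xFFFF  # assumption: 16-bit accumulator


def to_signed(word):
    """Interpret a 16-bit word as a signed 2's complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def run_subtraction(minuend, subtrahend):
    """Trace LDA SUB, CMA, INC, ADD MIN and return the final AC contents."""
    ac = subtrahend & MASK                  # LDA SUB: load subtrahend
    ac = ~ac & MASK                         # CMA: 1's complement
    ac = (ac + 1) & MASK                    # INC: now the 2's complement
    ac = (ac + (minuend & MASK)) & MASK     # ADD MIN: add minuend
    return ac


difference = run_subtraction(83, -23)
print(format(difference, "04X"), to_signed(difference))
```

For the operands in the example program, AC holds FFE9 after the load, 0016 after CMA, 0017 after INC, and 006A after ADD MIN.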

Translation to Binary

The translation of the symbolic program into binary is done by an assembler. The assembler performs the translation in two scans:

First Scan: Assigns a memory location to each machine instruction and operand, forming a table that defines the hexadecimal value of each symbolic address.
Second Scan: Translates the program by replacing symbols with their machine code binary equivalents using the address symbol table from the first scan.
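
The first scan can be sketched as a loop that keeps a location counter and records each label it meets. This Python sketch is an illustration under stated assumptions, not the assembler's actual implementation: the ORG operand is read as hexadecimal, every line other than ORG and END occupies one memory word, and the name first_pass is hypothetical.

```python
def first_pass(lines):
    """Build the address symbol table: label -> memory location."""
    symbols = {}
    location = 0
    for raw in lines:
        code = raw.split("/", 1)[0].strip()    # drop the comment field
        if not code:
            continue
        label, comma, rest = code.partition(",")
        if comma:
            symbols[label.strip()] = location   # label refers to this word
            code = rest.strip()
        tokens = code.split()
        if tokens and tokens[0] == "ORG":
            location = int(tokens[1], 16)       # assumption: hexadecimal operand
            continue
        if tokens and tokens[0] == "END":
            break
        location += 1                           # one word per instruction or operand
    return symbols


program = [
    "ORG 100 / Origin of program is location 100",
    "LDA SUB / Load subtrahend to AC",
    "CMA / Complement AC",
    "INC / Increment AC",
    "ADD MIN / Add minuend to AC",
    "STA DIF / Store difference",
    "HLT / Halt computer",
    "MIN, DEC 83 / Minuend",
    "SUB, DEC -23 / Subtrahend",
    "DIF, HEX 0 / Difference stored here",
    "END / End of symbolic program",
]
table = first_pass(program)
```

For the example program this assigns MIN, SUB, and DIF to hexadecimal locations 106, 107, and 108.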

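Here is the translation of the above program. Locations and contents are given in hexadecimal. The ORG and END lines are pseudoinstructions, so they occupy no memory location.

Location  Content  Symbolic Program
                   ORG 100
100       2107     LDA SUB
101       7200     CMA
102       7020     INC
103       1106     ADD MIN
104       3108     STA DIF
105       7001     HLT
106       0053     MIN, DEC 83
107       FFE9     SUB, DEC -23
108       0000     DIF, HEX 0
                   END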
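
The contents column can be reproduced with a short sketch of the second scan. The Python code below assumes the symbol values MIN = 106, SUB = 107, and DIF = 108 (hexadecimal) that the address fields in the listing imply. The opcode tables contain only the codes visible in the listing. Placing the indirect bit in the most significant bit of the word is an assumption about the instruction format. The name translate is hypothetical.

```python
# Codes taken from the contents column of the listing (subset only).
MRI_OPCODES = {"ADD": 0x1000, "LDA": 0x2000, "STA": 0x3000}
NON_MRI_CODES = {"CMA": 0x7200, "INC": 0x7020, "HLT": 0x7001}
INDIRECT_BIT = 0x8000  # assumption: indirect flag is the top bit of the word


def translate(field, symbols):
    """Translate one instruction field (label and comment removed) to a 16-bit word."""
    tokens = field.split()
    op = tokens[0]
    if op in MRI_OPCODES:
        word = MRI_OPCODES[op] | (symbols[tokens[1]] & 0x0FFF)
        if len(tokens) == 3 and tokens[2] == "I":
            word |= INDIRECT_BIT
        return word
    if op in NON_MRI_CODES:
        return NON_MRI_CODES[op]
    if op == "DEC":
        return int(tokens[1], 10) & 0xFFFF
    if op == "HEX":
        return int(tokens[1], 16) & 0xFFFF
    raise ValueError("cannot translate: " + field)


symbols = {"MIN": 0x106, "SUB": 0x107, "DIF": 0x108}
fields = ["LDA SUB", "CMA", "INC", "ADD MIN", "STA DIF", "HLT",
          "DEC 83", "DEC -23", "HEX 0"]
words = [format(translate(f, symbols), "04X") for f in fields]
```

Each word is the operation code combined with the 12-bit address looked up in the symbol table, which is why LDA SUB becomes 2107 and STA DIF becomes 3108.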

The translation process involves converting symbolic addresses and instructions into their binary equivalents. This example program demonstrates the systematic approach to translating assembly language into machine code.
