This document is under construction
This project aims to create an assembly SDK focused on retro
computing. It uses the NASM code style as a base, making it easier for
new users coming from x86 to migrate to 8-bit processors.
This set of tools is distributed under the 4-Clause BSD License.
Assembler compatible with NASM style.
Both the source code and the generated binary are composed of
sections (text, data, and bss).
The sections are independent and their contents are grouped
sequentially in the order they are defined in the file, so even if
the section contents are defined interleaved and mixed, the result
will always be grouped by section.
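The grouping behaviour described above can be illustrated with a minimal Python sketch. It is not the assembler's implementation: `group_by_section` and the `(section, content)` pairs are hypothetical names introduced here only to show that interleaved content ends up contiguous per section, in definition order.

```python
def group_by_section(lines):
    """Group (section, content) pairs by section, keeping definition order.

    Hypothetical helper: it only illustrates that interleaved content
    becomes contiguous within each section.
    """
    grouped = {}
    for section, content in lines:
        grouped.setdefault(section, []).append(content)
    return grouped


# Content defined interleaved between the text and data sections.
source = [
    ("text", "mov ax, bx"),
    ("data", "db 1"),
    ("text", "nop"),
    ("data", "db 2"),
]
grouped = group_by_section(source)
print(grouped["text"])
print(grouped["data"])
```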
The assembly source code consists of a few distinct blocks of
information: comments, mnemonics (CPU instructions) with their
arguments, labels, and assembler directives.
Directives accept argument values
in the formats supported by the assembler.
All topics listed above will be explained in more detail later.
Example:
| ; comment: at the end of any line of code a comment is accepted, identified by its
; initial character semicolon ';' followed by free text until the end of the current line
mov ax, bx ; example of mnemonic (CPU instruction) and its arguments
xpto: ; example of simple label with comment (this one)
xpto2: mov bx, cx ; example of label with mnemonic instruction (could be an assembler directive) and comment
db 0, 1, 2 ; example of assembler directive, in syntax it is identical to a mnemonic instruction |
For each supported architecture, an assembler executable is
provided, with the name starting with "hcasm-" followed by the
supported processor.
Example to compile Z80 code (file.s) into an object (file.obj):
| hcasm-z80 -o file.obj file.s |
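There are 2 types of labels: primary labels, and secondary labels, whose
names start with '.' and which belong to the most recently defined
primary label.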
Optionally for better identification, a label definition can be
followed by the character ':' as a separator between the label and
the next mnemonic/assembler directive; however, when using a label
name as a reference to an existing label, the use of the ':'
character is forbidden.
Any label that is not present in the current file but is
referenced in it will be automatically considered as coming from
another object, and will be handled by HC Linker.
Example (Intel 8086/8088 code):
| global primary1 ; exports the label primary1 for use in other objects, its references will be automatically handled by HC Linker
primary1: ; this is a primary label
.secondary1: ; this is a label inside primary1, accessible within primary1 by its direct name (.secondary1) or from outside primary1 using the compound name primary1.secondary1
primary2: ; this is the second primary label of the file, from here on all secondary labels from previous primary labels are no longer accessible by their direct names
.xpto: ; this label is the first secondary label inside primary2, accessible as .xpto from within primary2 or primary2.xpto from anywhere in the file |
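The scoping rule for secondary labels can be sketched in Python. This is an illustration only, not how the assembler resolves names internally; `qualify` and `collect_labels` are hypothetical helpers, and the sketch assumes that a name starting with '.' always belongs to the most recently defined primary label.

```python
def qualify(name, current_primary):
    """Return the compound name for a label.

    Names starting with '.' are secondary labels and get prefixed with
    the primary label currently in scope; other names are returned as-is.
    """
    if name.startswith("."):
        if current_primary is None:
            raise ValueError("secondary label used before any primary label")
        return current_primary + name
    return name


def collect_labels(definitions):
    """Walk label definitions in file order and return their full names."""
    current_primary = None
    full_names = []
    for name in definitions:
        name = name.rstrip(":")        # the trailing ':' is optional
        if not name.startswith("."):
            current_primary = name     # a new primary label opens a new scope
        full_names.append(qualify(name, current_primary))
    return full_names


labels = collect_labels(["primary1:", ".secondary1:", "primary2:", ".xpto:"])
print(labels)
```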
Every constant is declared as a label, following the same rules.
(See Labels)
What differentiates a constant from a label is that, instead of
containing a pointer to its position in the source/binary, a
constant contains a predefined value.
Constants can be either secondary or primary to other constants
or labels, with no distinction in their use in source code;
however, constants are not subject to relocation calculation,
making them useful for fixed memory references.
To declare a constant, simply declare a label and on the same
line after the label use the marker 'equ' followed by the desired
expression.
Example:
| person_struct: equ 11 ; suggestion: primary label as a structure, containing the structure size
.name: equ 0 ; suggestion: in the case of a structure, fields contain their offset from the beginning of the structure, for example the name field is the first field, so it contains offset 0 bytes
.age: equ 10 ; assumed offset of the age field used below (name taken to occupy bytes 0 to 9)
section data ; defines that the code from this line onwards belongs to the data section
person_obj: resb person_struct ; example of reserving an object of type person_struct, using the constant that stores the structure size
section text ; defines that the code from this line onwards belongs to the text section (source code)
mov si, person_obj ; example of indirect use of fields
mov al, [si + person_struct.age] ; example of reading an indirect field, using a pointer (register SI)
mov al, [person_obj + person_struct.age] ; example of reading the field directly without using pointers |
As arguments for directives, values are accepted in the following
formats:
When used as arguments, they must be separated by commas.
Example (in 8086/8088):
| test_const equ 10 ; registers a constant for use in code
mov ax, 123 ; sets a simple value to a register
mov bx, 123 + 23 ; sets a value composed of a mathematical expression to a register
mov cx, [label] ; reads the content of a label into a register
mov dx, [label + test_const - 3] ; reads the content of an address calculated by a mathematical expression containing labels, constants, and values into a register
mov si, label ; sets a pointer to a label in a register
mov ax, [si] ; reads the content of the previously defined pointer into a register
mov ax, [si+test_const] ; reads the content of a pointer with an offset calculated by a mathematical expression into a register
label: ; registers a label for use in code
db "test string",0 ; emits the binary of a string followed by the numeric value zero
db 0b10101010 ; emits a byte
db 0x99 ; emits a byte |
To aid development, the assembler supports several basic
directives based on their NASM equivalents.
The db and dw directives emit binary data directly to the output file, and can
only be used in the text and data sections.
The bytes (db) or words (dw) are emitted in the order in which they are
given as arguments to the directive.
There is no limit on the number of arguments, and they can be
numeric or strings.
| db 0x12, 0x34, 0x56 ; in the output file at this position it will emit the bytes of value 0x12 followed by 0x34 and 0x56
dw 0x1234, 0x5678 ; in the output file at this position it will emit the bytes (words are converted to bytes in little endian format): 0x34, 0x12, 0x78, 0x56
db "ABC",0 ; in the output file at this position it will emit the bytes: 0x41 0x42 0x43 0x00 |
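The byte sequences listed above can be reproduced with a short Python sketch. It only models the encoding described in this section (one byte per db value, little-endian 16-bit words for dw, ASCII for strings); `emit_db` and `emit_dw` are hypothetical helper names, not part of the SDK.

```python
import struct


def emit_db(*args):
    """Emit each argument as bytes: integers as one byte, strings as ASCII."""
    out = bytearray()
    for arg in args:
        if isinstance(arg, str):
            out += arg.encode("ascii")
        else:
            out += struct.pack("<B", arg)
    return bytes(out)


def emit_dw(*args):
    """Emit each integer argument as a 16-bit little-endian word."""
    return b"".join(struct.pack("<H", arg) for arg in args)


print(emit_db(0x12, 0x34, 0x56).hex(" "))   # 12 34 56
print(emit_dw(0x1234, 0x5678).hex(" "))     # 34 12 78 56
print(emit_db("ABC", 0).hex(" "))           # 41 42 43 00
```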
These directives reserve empty space in the output file.
Bytes (rb or resb) or words (rw or resw) are reserved
according to the quantity given as the argument.
| rb 10 ; in the output file it will reserve 10 bytes at the current position
rw 10 ; in the output file it will reserve 10 words (20 bytes) at the current position
resb 10 ; in the output file it will reserve 10 bytes at the current position |
The times directive repeats the content that follows it a defined
number of times in the output file.
| times 5 db 0x12 ; emits in the output file 5 consecutive bytes of value 0x12, resulting in: 0x12 0x12 0x12 0x12 0x12
times 510-$ db 0 ; emits in the output file bytes of value 0 until filling what is missing to position 510, for example if the binary already contains 500 bytes, it will fill the remaining 10 with 0
times 3 db "AB" ; emits in the output file 3 times the binary content equivalent to the string "AB" resulting in: 0x41 0x42 0x41 0x42 0x41 0x42
times 3 nop ; emits the mnemonic instruction nop 3 times in the output binary |
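The padding arithmetic in the `times 510-$ db 0` example can be checked with a small Python sketch. It assumes, as the comment above suggests, that `$` stands for the number of bytes already emitted; `times` here is a hypothetical helper that repeats an already encoded chunk.

```python
def times(count, chunk):
    """Repeat an already encoded chunk `count` times."""
    if count < 0:
        raise ValueError("repeat count must not be negative")
    return chunk * count


binary = bytes(500)                 # pretend 500 bytes were already emitted
position = len(binary)              # assumed meaning of '$' in this sketch
binary += times(510 - position, b"\x00")
print(len(binary))                  # 510

repeated = times(3, b"AB")
print(repeated.hex(" "))            # 41 42 41 42 41 42
```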
The section directive changes the current output section, allowing interleaving of
multiple output sections in the same file.
Supported sections: text, data, bss*
* The BSS section does not support any directive that generates
binary data; only directives that reserve space and labels of
all types are allowed.
| section text ; changes current section to text
nop ; instruction generated inside the text section
section data ; changes current section to data
nop ; instruction generated inside the data section
section bss ; changes current section to bss
rb 123 ; directive reserving space in the bss section |
Merges necessary objects to create executables.
Executables, regardless of output format, are built as follows:
| Linker Executable | Format |
| hclink-bin | Pure binary / Flat Binary Format |
| hclink-rex | Relocatable Executable |
When calling the linker, call the version that corresponds
to the desired format, specifying the output file followed by the objects
and libraries to link.
| hclink-bin -o executable.com -text 0x100 main.obj test.obj runtime.lib |
These constants are automatically added during linking and
editing of objects to generate the final executable.
| Name | Description |
| __text_start__ | Start position of the text section |
| __text_end__ OR _etext | End position of the text section |
| __text_size__ | Size of the text section |
| __data_start__ | Start position of the initialized data section |
| __data_end__ OR _edata | End position of the initialized data section |
| __data_size__ | Size of the initialized data section |
| __bss_start__ | Start position of the uninitialized data section |
| __bss_end__ OR _ebss | End position of the uninitialized data section |
| __bss_size__ | Size of the uninitialized data section |
| _end | End position of all sections of the executable |
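As a rough illustration of how these constants relate to each other, the Python sketch below computes them from a text base address and the three section sizes. Several things are assumptions made for this sketch and are not confirmed by this document: that the sections are laid out contiguously in the order text, data, bss; that each end position is the first address after the section; and that the base corresponds to the value passed with `-text`. `layout_constants` is a hypothetical helper.

```python
def layout_constants(text_base, text_size, data_size, bss_size):
    """Compute section constants for a contiguous text, data, bss layout.

    Assumption: each *_end value is start + size (exclusive end).
    """
    text_end = text_base + text_size
    data_end = text_end + data_size
    bss_end = data_end + bss_size
    return {
        "__text_start__": text_base,
        "__text_end__": text_end,
        "_etext": text_end,
        "__text_size__": text_size,
        "__data_start__": text_end,
        "__data_end__": data_end,
        "_edata": data_end,
        "__data_size__": data_size,
        "__bss_start__": data_end,
        "__bss_end__": bss_end,
        "_ebss": bss_end,
        "__bss_size__": bss_size,
        "_end": bss_end,
    }


constants = layout_constants(0x100, text_size=0x40, data_size=0x10, bss_size=0x20)
print(hex(constants["__data_start__"]))   # 0x140
print(hex(constants["_end"]))             # 0x170
```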
An object is a file made up of individual records, each executing
a command when creating the final executable.
| Size in Bytes | Description |
| 1 | Record Type |
| 1 | Size of Record Content |
| 2 | Value, usually in signed 16-bit integer format |
| 2 | Auxiliary Value, usually in unsigned 16-bit integer format |
| ? | Content of the size defined in the second field of this record (Maximum size of 256 bytes) |
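The record layout in the table above can be read with a few lines of Python. This is a sketch under stated assumptions, not a reference reader: it assumes little-endian byte order for the two 16-bit fields, and the record type codes used in the sample data are made up for the demonstration (the real codes are defined in include/obj.h). `read_records` is a hypothetical helper.

```python
import struct

# type (1 byte), content size (1 byte), value (signed 16-bit),
# auxiliary value (unsigned 16-bit); little-endian is an assumption.
RECORD_HEADER = struct.Struct("<BBhH")


def read_records(data):
    """Yield (type, value, aux, content) tuples from raw object bytes."""
    offset = 0
    while offset < len(data):
        rec_type, size, value, aux = RECORD_HEADER.unpack_from(data, offset)
        offset += RECORD_HEADER.size
        content = data[offset:offset + size]
        if len(content) != size:
            raise ValueError("truncated record content")
        offset += size
        yield rec_type, value, aux, content


# Two sample records with made-up type codes 0x01 and 0x02.
sample = (
    RECORD_HEADER.pack(0x01, 3, -2, 0x1234) + b"abc"
    + RECORD_HEADER.pack(0x02, 0, 0, 0)
)
records = list(read_records(sample))
print(records)
```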
The list below is subject to future additions; as a reference,
the file include/obj.h should be used as the official
basis.
The file contains a header describing the file content, which
must be discarded during loading, and is disregarded in relocation
calculation.
| Size in Bytes | Description |
| 2 | Signature (ASCII: "HC") |
| 2 | Text section size |
| 2 | Data section size |
| 2 | BSS section size |
| 2 | Pointer to the _start function |
| 2 | Relocation table size |
| 3 | Reserved (Set to 0) |
| 1 | CPU identifier (Use the same code as the record, e.g., 8085: 0xf1) |
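The 16-byte header described in the table can be decoded as in the Python sketch below. Little-endian byte order for the 16-bit fields is an assumption of this sketch, and `parse_header` and the `Header` field names are hypothetical names chosen to mirror the table rows.

```python
import struct
from collections import namedtuple

# Signature, text size, data size, BSS size, _start pointer,
# relocation table size, 3 reserved bytes, CPU identifier.
HEADER = struct.Struct("<2sHHHHH3sB")
Header = namedtuple(
    "Header",
    "signature text_size data_size bss_size start reloc_size reserved cpu",
)


def parse_header(data):
    """Decode the header and check the "HC" signature."""
    header = Header(*HEADER.unpack_from(data, 0))
    if header.signature != b"HC":
        raise ValueError("missing HC signature")
    return header


# Sample header built by hand; 0xF1 is the 8085 code quoted in the table.
raw = HEADER.pack(b"HC", 0x40, 0x10, 0x20, 0x0000, 2, b"\x00\x00\x00", 0xF1)
header = parse_header(raw)
print(header.text_size, header.data_size, header.bss_size, hex(header.cpu))
```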
Sections are placed sequentially in the file, following the
order:
The BSS section is allocated during loading, placed after the
Data section.
When loading the sections to the target memory location, the
relocation records must be read according to the format described
below:
| Size in Bytes | Description |
| 2 | Position where the pointer to be relocated is located |
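One plausible way to apply such a relocation table is sketched below in Python. The exact procedure is not spelled out in this text, so everything here is an assumption: each entry is taken to be a 16-bit little-endian offset into the loaded image, and the 16-bit pointer found at that offset is adjusted by adding the load address. `relocate` is a hypothetical helper.

```python
import struct


def relocate(image, reloc_table, load_address):
    """Add load_address to every 16-bit pointer listed in reloc_table."""
    image = bytearray(image)
    for (position,) in struct.iter_unpack("<H", reloc_table):
        (pointer,) = struct.unpack_from("<H", image, position)
        # Wrap at 16 bits, since the pointers are 16-bit values.
        struct.pack_into("<H", image, position, (pointer + load_address) & 0xFFFF)
    return bytes(image)


# The image holds the pointer 0x0004 at offset 1; the table lists that offset.
image = bytes([0x00, 0x04, 0x00, 0x90])
table = struct.pack("<H", 1)
loaded = relocate(image, table, load_address=0x8000)
print(loaded.hex(" "))   # 00 04 80 90
```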
How to perform relocation:
Coordinates actions for compiling and maintaining projects.
To compile a configuration, call hcbuild with the make
action followed by the configuration name.
| hcbuild project.prj make release |
| hcbuild project.prj make debug |
To clean the output files from a previous compilation, call
hcbuild with the clean action followed by the configuration name.
| hcbuild project.prj clean release |
The project file follows a format compatible with the standard INI
configuration file: its content is organized into sections and
variables, and comments start with the character ';'.
| [config] ; Config section contains general variables
dump = yes ; Defines whether the assembler should generate description files of the generated object records (Accepts: yes; no)
verbose = yes ; Defines whether the linker will display the commands being executed
sdk_path = /usr/local/bin/ ; Defines the directory where the SDK is located (Optional)
[files:CPU] ; Files section contains source code files related to the tools of a specific CPU
; in the section name, the cpu name must be included, e.g., Z80 => [files:z80]; 8080 => [files:8080]
file1.s
file2.s
[libs] ; Section contains the list of libraries and external objects used
runtime.lib
[link:CONFIGURATION] ; contains output definitions for a given configuration. Commonly used configurations: release, debug
format = bin ; linker output format
filename = executable.bin ; output file name |
Example of a project for generating a relocatable executable:
| [config]
verbose = yes
[files:z80]
main.s
test.s
[libs]
runtime.lib
[link:release]
format = rex
filename = test.rex
[link:debug]
format = rex
filename = test.rex
symbols = test.sym |
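Because the project file is INI-compatible, it can be inspected with a generic INI reader. The Python sketch below uses the standard `configparser` module on a shortened copy of the project above; it is not how hcbuild itself reads the file, and the parser options are choices made for this sketch (bare file names are treated as keys without values, and only '=' is treated as a delimiter so that names are not split on ':').

```python
import configparser

PROJECT = """\
[config]
verbose = yes

[files:z80]
main.s
test.s

[libs]
runtime.lib

[link:release]
format = rex
filename = test.rex
"""

parser = configparser.ConfigParser(
    allow_no_value=True,              # file lists are bare names without '='
    delimiters=("=",),                # do not split entries on ':'
    inline_comment_prefixes=(";",),   # allow ';' comments after a value
)
parser.optionxform = str              # keep file names case-sensitive
parser.read_string(PROJECT)

sources = list(parser["files:z80"])
link = parser["link:release"]
print(sources)
print(link["format"], link["filename"])
```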
Groups objects into a single library.
The library follows the same format as objects,
simply storing the object records sequentially in the library
file.
When calling the librarian, you must provide the name of the
library to be created/edited, followed by the objects to be
added/replaced.
| hclib library.lib functions_a.obj functions_b.obj |