Introduction to
26th September 2000:
In the first release of this assembler programming tutorial, I missed out on a vital piece of
information. Why the hell would you want to do this in the first place?
That, I cannot answer. Everybody has their reasons. Maybe your boss wants you to make something
that little bit faster and more efficient. Maybe you have been given the source to a program
that you often use, and it is up to you to update it. Maybe you are just doing this 'cos you
think it might be a worthwhile hobby.
One thing that is important is the psychology of it all. Here, in this section, I have examined
my motives, my feelings. I'd never really thought about it before, but it makes for interesting
reading.
I won't repeat it all here, but the two principal documents you should have a look at are:
Now, on with our scheduled documentation.....
If you want to knock up a quick little program, then you can't pick much better than BASIC.
PRINT "HeyRick!"
If you want to write a less-hackable program with large speed benefits, you should pick C or C++. As an illustration of the speed difference, the old CastAVote vote editor (written in BASIC) took around 20 to 30 seconds to delete a vote at the beginning of a full file. The new vote editor hands this task to VoteModule (written in C), which does it in a second.
printf("HeyRick\n");
If you want the ultimate speed and flexibility with the ability to perform low-level hackery, you should be looking to assembler.
In BASIC's built-in assembler:
  ADR     R0, text
  SWI     "OS_PrettyPrint"
  SWI     "OS_NewLine"
  MOV     PC, R14
.text
  EQUS    "HeyRick"
  EQUB    0
  ALIGN
And in a standalone assembler (objasm-style syntax):
  GET     h.SWIs
  AREA    |asm$$code|, CODE, READONLY
  ENTRY
  ADR     r0, text
  SWI     OS_PrettyPrint
  SWI     OS_NewLine
  MOV     pc, r14
text
  DCB     "HeyRick", 0
  ALIGN
As you can see, assembler is much more involved. There are quicker ways to write the above, but it is a reasonable example.
You may know that it is possible to call BASIC functions in assembler, provided you are using the BASIC assembler to compile your code:
MOV R0, #RND(64)
But if you plan to write a fully-assembler application, you'll need to know how to write the RND code.
Let's take that example and expand it into a fully fledged application...
REM Assembler demo #1
:
DIM code% 64
FOR pass%=0 TO 2 STEP 2
P%=code%
[ OPT pass%
  ADR     R0, text
  SWI     "OS_PrettyPrint"
  SWI     "OS_NewLine"
  MOV     PC, R14
.text
  EQUS    "HeyRick"
  EQUB    0
  ALIGN
]
NEXT
CALL code%
Let's work through the program. The first line is "DIM code% 64".
This reserves enough memory for your program. BASIC ensures that your memory begins on a word
boundary.
The next line is "FOR pass%=0 TO 2 STEP 2". This is important, as the
first time through, the assembler cannot resolve all the references. Therefore, in conjunction
with the OPT statement, the code is actually assembled twice.
The first time through, all errors are ignored. Then, once all references should have been set
up, the code is assembled again and the references can be resolved.
Read opt.html for details of the available OPTions.
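To see why two passes are needed, here is a minimal sketch in Python of a toy two-pass assembler. It is not how BASIC's assembler works internally. The `assemble` helper, the tuple-based program format, and the &8000 start address are all assumptions made up for illustration. The only facts borrowed from above are that the location pointer is reset before each pass and that errors are ignored the first time through.

```python
# Toy model of two-pass assembly (illustrative only, not BASIC's real assembler).
# Each item is either ("label", name) or ("instr", mnemonic, label-or-None).
PROGRAM = [
    ("instr", "ADR R0", "text"),           # forward reference: 'text' is defined later
    ("instr", "SWI OS_PrettyPrint", None),
    ("instr", "SWI OS_NewLine", None),
    ("instr", "MOV PC, R14", None),
    ("label", "text"),
]

def assemble(program, start, labels, report_errors):
    """Run one pass over the program. Hypothetical helper for this sketch."""
    pc = start                             # plays the role of P%, reset on every pass
    output = []
    for item in program:
        if item[0] == "label":
            labels[item[1]] = pc           # remember where the label lives
            continue
        _, mnemonic, ref = item
        if ref is None:
            output.append((pc, mnemonic))
        elif ref in labels:
            output.append((pc, f"{mnemonic}, &{labels[ref]:X}"))
        elif report_errors:
            raise NameError(f"unknown label: {ref}")
        else:
            output.append((pc, f"{mnemonic}, ?"))   # first pass: ignore the error
        pc += 4                            # each ARM instruction occupies one word
    return output, labels

START = 0x8000                             # assumed address, stands in for code%
labels = {}
first, labels = assemble(PROGRAM, START, labels, report_errors=False)
second, labels = assemble(PROGRAM, START, labels, report_errors=True)

print(first[0])    # (32768, 'ADR R0, ?')
print(second[0])   # (32768, 'ADR R0, &8010')
```

On the first pass the address of text is unknown when ADR is reached, so a placeholder is emitted. By the second pass the label table is complete and the reference resolves.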
If you want to see the assembly taking place, amend the line to say:
FOR pass%=1 TO 3 STEP 2
The third line, "P%=code%", tells the assembler where to compile the code.
The variable P% is assumed to be the pointer to the code - so you must set it to
point to the start of the memory block before each pass.
Do not forget to set P% if you are cobbling together some test code in, say, a TaskWindow.
P% is likely to be initialised to zero, and trying to compile code over the hardware
vectors is not good for the health of your data, or you...
If you are using offset assembly, P% is set to zero and the pointer to the memory block
is placed in O% instead. This is demonstrated later on.
"[ OPT pass%" is an important line. The opening square bracket denotes that
what follows is assembler code. The "OPT" then specifies
which compilation options are to be used on this pass (refer to
opt.html for details of the options available).
"ADR" is not a real instruction. What it does is place the address of the
specified label into the given register. In this line ("ADR R0, text") it
places the address of text into register zero.
Read this for more on ADR.
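As a rough illustration of what "not a real instruction" means, the Python sketch below works out the real instruction that an ADR could be turned into. It relies on one fact about classic ARM: reading PC gives the address of the current instruction plus 8. The `adr_expansion` helper is a hypothetical name, and the sketch is simplified in that it does not check whether the offset fits an ARM immediate constant.

```python
def adr_expansion(instr_addr, target_addr, reg="R0"):
    """Return the ADD/SUB an ADR could assemble to (simplified sketch)."""
    # On classic ARM, PC reads as the current instruction's address plus 8.
    offset = target_addr - (instr_addr + 8)
    if offset >= 0:
        return f"ADD {reg}, PC, #{offset}"
    return f"SUB {reg}, PC, #{-offset}"

# In the demo program ADR is the first instruction (offset 0) and the
# label 'text' follows four instructions, i.e. 16 bytes, later.
print(adr_expansion(0, 16))   # ADD R0, PC, #8
```

Because the result is relative to PC, the code still works wherever the memory block happens to be.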
Next, two SWIs are called. This is similar to BASIC's SYS command. Firstly
"SWI "OS_PrettyPrint"" prints the text, and automatically wraps it to fit
nicely in the available space. Secondly, "SWI "OS_NewLine""
prints a newline character. Unlike BASIC, this is not implicit.
Many registers exist, and some of them have special functions. This is detailed in
regs.html. However I shall tell you here that register 15 (also
known as "PC") is the program counter. When a program is started, or when a branch
with link occurs, the return address is stored in register 14 - the link register.
Therefore it becomes easier to understand the line "MOV PC, R14". It
places the value of R14 (currently holding the return address back to BASIC) into the program
counter. Hey presto, we are back!
Lastly the block ".text" defines a zero terminated string and
aligns the tail end so it is on a word boundary. A label is defined by a period followed by the
label name. The data follows, and is treated like a regular instruction - thus an assembler
section can contain either active code, or simply statements to load data into the current
memory location.
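The effect of the string block can be sketched in Python as follows. The `equs_equb_align` helper is a hypothetical name. The sketch assumes the string starts on a word boundary (true here, as it follows four one-word instructions) and that any gap left by ALIGN is filled with zero bytes, which is an assumption rather than something stated above.

```python
def equs_equb_align(text):
    """Mimic EQUS text, EQUB 0, ALIGN for a string starting on a word boundary."""
    data = text.encode("ascii") + b"\x00"   # EQUS text, then EQUB 0 as terminator
    padding = -len(data) % 4                # bytes needed to reach a multiple of 4
    return data + bytes(padding), padding   # assumed: the gap is zero-filled

block, padding = equs_equb_align("HeyRick")
print(len(block), padding)                  # 8 0
```

"HeyRick" plus its terminator is exactly eight bytes, so ALIGN has nothing to do in the demo. A string such as "Hello" would need two bytes of padding.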
The closing square bracket marks the end of an assembler section. Unlike in BASIC, a closing statement does not mean the end of the routine. You must explicitly return before closing up, as shown in this example.
The "NEXT" matches the "FOR" (above).
Finally, we "CALL code%", or in other words we branch to the address held in
the variable code% and begin execution there - hence we execute the assembler section.
We DO NOT call P% as it is incremented as each instruction is compiled. Only
code% points to the start of the assembler code.
Try it.
Another thing to note is that assembler is a "compiled language" similar to C, but in the loosest sense of the concept. You cannot usually type assembler at the BASIC prompt and get it to run. Programs are written by the following steps:
There is another way of writing assembler - the APCS specification. This method is used to link assembler with other high-level compiled languages (such as C or Pascal), or to take advantage of Acorn's Desktop Development Environment. You can read about it here, but you are advised to become familiar with assembler before venturing on to APCS.
The last thing to say is the notation that is used in this section. It will be familiar to BASIC coders, but may seem confusing to others...
&   Denotes a hexadecimal number. For example &16F is the number 367 in hex.
    Other ways of denoting hexadecimal are: $16F  0x16F  16Fh  H16F
>>  Binary shift right, has the effect of dividing a number by two for each place
    shifted (that is, by 2 to the power of the shift amount).
    Thus: 12 >> 1 equals 6 and: 128 >> 3 equals 16.
    Work this out in binary if you are unsure.
<<  Binary shift left, has the effect of multiplying a number by two for each place
    shifted (that is, by 2 to the power of the shift amount).
    Thus: 12 << 1 equals 24 and: 3 << 8 equals 768.
%   Denotes a binary number. For example %11010010 is 210 represented in binary.
If you have not come across binary before, you should at least know that a computer represents data (text, pictures, etc) as a series of bytes. These bytes are themselves made up of eight bits. These bits reflect the ON/OFF patterns used by the computer. Binary is a way of representing these patterns in a readable form.
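If you have Python to hand, the examples above can be checked quickly. This is only a sketch for verification. Python writes hexadecimal as 0x... and binary as 0b... rather than & and %, but the shift operators are the same.

```python
hex_value = 0x16F            # &16F in this section's notation
binary_value = 0b11010010    # %11010010 in this section's notation
print(hex_value, binary_value)        # 367 210

print(12 >> 1, 128 >> 3)              # 6 16
print(12 << 1, 3 << 8)                # 24 768

# Shifting by n places divides or multiplies by 2 to the power n.
shift_right_is_divide = (128 >> 3) == 128 // 2 ** 3
shift_left_is_multiply = (3 << 8) == 3 * 2 ** 8
print(shift_right_is_divide, shift_left_is_multiply)   # True True
```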
%   1    1    1    0    0    1    1    1    1    1    1    1
 2048 1024  512  256  128   64   32   16    8    4    2    1
The upper row is an example binary number. The lower row shows the values associated with each bit. To calculate the value of the binary number, simply add it all up! As you can see, bits seven and eight are zero, so skip them when adding...
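The adding-up can be sketched in Python like this. The variable names are my own, and bits are numbered from zero starting at the right hand side, which is the numbering under which bits seven and eight are the zero ones.

```python
bits = "111001111111"        # the example number, without the % prefix

# Pair each bit with its position, counting from 0 at the right hand side.
positions = list(enumerate(reversed(bits)))

value = sum(2 ** position for position, bit in positions if bit == "1")
zero_bits = [position for position, bit in positions if bit == "0"]

print(value)       # 3711
print(zero_bits)   # [7, 8]
```

So 2048 + 1024 + 512 + 64 + 32 + 16 + 8 + 4 + 2 + 1 gives 3711.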
Finally, here's a quick tip for converting binary to hexadecimal.
People are not good at remembering large number sequences (well, except Carol Vorderman) and
trying to get %1011111011101111000100000110 right will prove difficult, especially if you are
trying to remember other stuff at the same time.
You can, however, convert between binary and hex extremely easily, and you only need to count
up to 15 in binary.
Chop the number into groups of four, remembering to begin on the right hand side:
  1011 1110 1110 1111 0001 0000 0110
Then convert each group of four into a single denary number:
  1011 1110 1110 1111 0001 0000 0110
   =11  =14  =14  =15   =1   =0   =6
And convert that denary number into hex:
  1011 1110 1110 1111 0001 0000 0110
    11   14   14   15    1    0    6
    =B   =E   =E   =F   =1   =0   =6
Which tells us that %1011111011101111000100000110 can be better remembered as &BEEF106.
This process works in reverse too.
Unfortunately you cannot stop at the denary version as there is no direct correlation between base 2 or base 16 and base 10. In case you are interested, that number is 200208646. :-)
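The same chop-and-convert procedure, as a small Python sketch. The function name `binary_to_hex` is made up for this example. It pads on the left so that the grouping effectively begins on the right hand side.

```python
def binary_to_hex(bits):
    """Convert a binary string to hex by grouping the bits in fours."""
    # Pad on the left to a multiple of four so grouping starts from the right.
    width = -(-len(bits) // 4) * 4
    padded = bits.zfill(width)
    groups = [padded[i:i + 4] for i in range(0, width, 4)]
    digits = [int(group, 2) for group in groups]        # each group as 0 to 15
    hex_string = "".join("0123456789ABCDEF"[d] for d in digits)
    return groups, digits, hex_string

groups, digits, hex_string = binary_to_hex("1011111011101111000100000110")
print(digits)                 # [11, 14, 14, 15, 1, 0, 6]
print(hex_string)             # BEEF106
print(int(hex_string, 16))    # 200208646
```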