Floating point instructions

 

The BASIC assembler, as standard, has no support for true floating point instructions. You can convert integers to your own implementation-defined 'floating point' representation (most usually fixed point) and perform basic mathematics with it, but you cannot interact with a floating point co-processor and do things the 'native' way.
There are, however, patches which extend the things that the assembler can do - which include FP instructions.

Parts of this documentation have been taken from the ARM Assembler manual.

 

 

The ARM processor can interface with up to sixteen co-processors. The ARM3 and later have virtual co-processors within the ARM to handle internal control functions. But the first co-processor that was available was the floating point processor. This chip handles floating point maths to the IEEE standard.
A standard ARM floating point instruction set has been defined, so that the code may be used across all RISC OS machines. If the actual hardware does not exist, then the instructions are trapped and executed by the floating point emulator module (FPEmulator). The program does not need to know whether or not the FP co-processor is present. The only real difference will be speed of execution.
If you are interested in the co-processor aspect, read the document on co-processor access.

The ARM IEEE FP system has eight high precision FP registers (F0 to F7). The register format is irrelevant because you cannot access these registers directly; a register is only 'visible' when it is transferred to memory or to an ARM register. In memory, an FP register occupies three words, but as the FP system will be reloading its own register, the format of these three words is considered irrelevant.
There is also an FPSR (floating point status register) which, similar to the ARM's own PSR, holds the status information that an application might require. Each of the flags available has a 'trap' which allows the application to enable or disable traps associated with the given error.
The FPSR also allows you to distinguish between different implementations of the FP system.
There may also be an FPCR (floating point control register). This holds information that the application should not access, such as flags to turn the FP unit on and off. Typically, hardware will have an FPCR, software will not.

FP units can be software implementations such as the FPEmulator modules, hardware implementations such as the FP chip (and support code), or a combination of both.
The best example of a 'both' that I can think of is the Warm Silence Software patch that will utilise the 80x87 chip on suitably equipped PC co-processor cards as a floating point processor for ARM FP operations. Talk about resource sharing...!

The results are calculated as though to infinite precision, then rounded to the length required. The rounding may be to nearest, towards +infinity (P), towards -infinity (M), or towards zero (Z). The default is round to nearest; in the case of a tie, the result is rounded to the nearest even value.
The working precision is 80 bits, comprising a 64 bit mantissa, a 15 bit exponent, and a sign bit. Specific instructions that work with single precision may provide better performance in some implementations - notably fully-software-based ones.
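As a sketch (the register choices here are arbitrary), the precision and rounding mode appear as suffixes on the instruction mnemonic, in that order:

```
MUFS   F0, F1, F2    ; multiply, single precision, round to nearest
MUFSP  F0, F1, F2    ; multiply, single precision, round towards +infinity
MUFDM  F0, F1, F2    ; multiply, double precision, round towards -infinity
MUFEZ  F0, F1, F2    ; multiply, extended precision, round towards zero
```

The individual instructions are described later in this document.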

The FPSR contains the necessary status for the FP system. The IEEE flags are always present, but the result flags are only available after an FP compare operation.

Floating point instructions should not be used from SVC mode.

The exception flags byte is the lower byte of the FPSR:

        7 - 5      4      3      2      1      0
FPSR:   Reserved   INX    UFL    OFL    DVZ    IVO
Whenever an exception condition arises, the appropriate cumulative exception flag in bits 0 to 4 will be set to 1. If the relevant trap enable bit is set, then an exception is also delivered to the user's program in a manner specific to the operating system. (Note that in the case of underflow, the state of the trap enable bit determines under which conditions the underflow flag will be set.) These flags can only be cleared by a WFS instruction.

IVO - invalid operation
The IVO flag is set when an operand is invalid for the operation to be performed. Invalid operations include the usual IEEE 754 cases: any operation on a signalling not-a-number, magnitude subtraction of infinities (such as +infinity added to -infinity), 0 multiplied by infinity, 0 divided by 0, infinity divided by infinity, and the square root of a negative number.

DVZ - division by zero
The DVZ flag is set if the divisor is zero and the dividend a finite, non-zero number. A correctly signed infinity is returned if the trap is disabled. The flag is also set for LOG(0) and for LGN(0). Negative infinity is returned if the trap is disabled.

OFL - overflow
The OFL flag is set whenever the destination format's largest number is exceeded in magnitude by what the rounded result would have been were the exponent range unbounded. As overflow is detected after rounding a result, whether overflow occurs or not after some operations depends on the rounding mode.
If the trap is disabled either a correctly signed infinity is returned, or the format's largest finite number. This depends on the rounding mode and floating point system used.

UFL - underflow
Two correlated events contribute to underflow: tininess (the result is smaller in magnitude than the smallest normalised number representable in the destination format) and loss of accuracy (the delivered result differs from what would have been computed were the exponent range unbounded).

The UFL flag is set in different ways depending on the value of the UFL trap enable bit. If the trap is enabled, then the UFL flag is set when tininess is detected regardless of loss of accuracy. If the trap is disabled, then the UFL flag is set when both tininess and loss of accuracy are detected (in which case the INX flag is also set); otherwise a correctly signed zero is returned.
As underflow is detected after rounding a result, whether underflow occurs or not after some operations depends on the rounding mode.

INX - inexact
The INX flag is set if the rounded result of an operation is not exact (different from the value computable with infinite precision), or overflow has occurred while the OFL trap was disabled, or underflow has occurred while the UFL trap was disabled. OFL or UFL traps take precedence over INX.
The INX flag is also set when computing SIN or COS, with the exceptions of SIN(0) and COS(0), which are exact.
The old FPE and the FPPC system may differ in their handling of the INX flag. Because of this inconsistency we recommend that you do not enable the INX trap.

 

Precision is:

   S - single (32 bit)
   D - double (64 bit)
   E - extended (80 bit)
   P - packed decimal (LDF/STF only)

 

Rounding modes are:

   (nothing) - round to nearest (the default)
   P         - round towards plus infinity
   M         - round towards minus infinity
   Z         - round towards zero

 


LDF<condition><precision><fp register>, <address>
Load Floating Point value
The address can be in the forms:

   [<register>]
   [<register>, #<offset>]    pre-indexed; offset is a multiple of 4, from -1020 to +1020
   [<register>, #<offset>]!   pre-indexed, with writeback
   [<register>], #<offset>    post-indexed (writeback is implied)

This call is similar to LDR.
Your assembler may allow literals to be used, such as LDFS F0, [float_value]

 


STF<condition><precision><fp register>, <address>
Store floating point value. The address can be in the forms:

   [<register>]
   [<register>, #<offset>]    pre-indexed; offset is a multiple of 4, from -1020 to +1020
   [<register>, #<offset>]!   pre-indexed, with writeback
   [<register>], #<offset>    post-indexed (writeback is implied)

This call is similar to STR.
Your assembler may allow literals to be used, such as STFED F0, [float_value]
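As a sketch of the LDR/STR-like addressing (the register and label choices are arbitrary), loading and storing might look like:

```
   ADR    R0, workspace
   LDFD   F0, [R0]          ; load a double from the address in R0
   LDFE   F1, [R0, #8]      ; load an extended from R0 + 8
   STFS   F0, [R0, #16]!    ; store a single at R0 + 16, updating R0
   STFD   F1, [R0], #8      ; store a double at R0, then add 8 to R0
```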

 


LFM and SFM
These are similar in idea to LDM and STM, but they will not be described because some versions of FPEmulator do not support them. The FP module in RISC OS 3.1x (2.87) does, as do (I think) later versions. If you know that your software will only operate on a system that supports SFM, then use it. Otherwise you'll need to 'fake' it with a sequence of STFs; likewise, fake LFM with a sequence of LDFs.
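For example, saving and restoring F4 and F5 without SFM/LFM might be sketched like this, assuming R13 is a full descending stack and extended precision (three words per register) is used:

```
   STFE   F5, [R13, #-12]!   ; push F5 (three words)
   STFE   F4, [R13, #-12]!   ; push F4

   LDFE   F4, [R13], #12     ; pop F4
   LDFE   F5, [R13], #12     ; pop F5
```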

 


FLT<condition><precision><rounding> <fp register>, <register>
FLT<condition><precision><rounding> <fp register>, #<value>

Convert integer to floating point, either an ARM register or an absolute value.

 


FIX<condition><rounding> <register>, <fp register>
Convert floating point to integer.

 


WFS<condition> <register>
Write floating point status register with the contents of the ARM register specified.

 


RFS<condition> <register>
Read floating point status register into the ARM register specified.
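For instance, to clear the cumulative exception flags (which, as noted earlier, can only be cleared by a WFS instruction) you could read the status, mask off the exception flags byte, and write it back - a sketch:

```
   RFS    R0               ; read the FPSR into R0
   BIC    R0, R0, #&FF     ; clear the exception flags byte
   WFS    R0               ; write the modified value back
```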

 


WFC<condition> <register>
Write floating point control register with the contents of the ARM register specified.
Supervisor mode only, and only on hardware that supports it.

 


RFC<condition> <register>
Read floating point control register into the ARM register specified.
Supervisor mode only, and only on hardware that supports it.

 

Floating point coprocessor data operations:
The formats of these instructions are:

   <binary op><condition><precision><rounding> <destination fp register>, <fp register>, <fp register or #value>
   <unary op><condition><precision><rounding> <destination fp register>, <fp register or #value>

The #value constants should be 0, 1, 2, 3, 4, 5, 10, or 0.5.
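So, as a sketch, a permitted constant can stand in for the second operand:

```
   ADFD   F0, F0, #1       ; F0 = F0 + 1
   MUFD   F1, F1, #0.5     ; halve F1
   RSFD   F2, F2, #10      ; F2 = 10 - F2 (reverse subtract)
```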


The binary operations are...
ADF - Add
DVF - Divide
FDV - Fast Divide - only defined to work with single precision
FML - Fast Multiply - only defined to work with single precision
FRD - Fast Reverse Divide - only defined to work with single precision
MUF - Multiply
POL - Polar Angle
POW - Power
RDF - Reverse Divide
RMF - Remainder
RPW - Reverse Power
RSF - Reverse Subtract
SUF - Subtract


The unary operations are...
ABS - Absolute Value
ACS - Arc Cosine
ASN - Arc Sine
ATN - Arc Tangent
COS - Cosine
EXP - Exponential (e to the power of the operand)
LOG - Logarithm to base 10
LGN - Logarithm to base e
MVF - Move
MNF - Move Negated
NRM - Normalise
RND - Round to integral value
SIN - Sine
SQT - Square Root
TAN - Tangent
URD - Unnormalised Round


CMF<condition><precision><rounding> <fp register 1>, <fp register 2>
Compare FP register 2 with FP register 1.
The variant CMFE compares with exception.


CNF<condition><precision><rounding> <fp register 1>, <fp register 2>
Compare FP register 2 with the negative of FP register 1.
The variant CNFE compares with exception.

Compares are provided with and without the exception that could arise if the numbers are unordered (ie one or both of them is not-a-number). To comply with IEEE 754, the CMF instruction should be used to test for equality (ie when a BEQ or BNE is used afterwards) or to test for unorderedness (in the V flag). The CMFE instruction should be used for all other tests (BGT, BGE, BLT, BLE afterwards).
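A sketch of both forms (the labels are illustrative):

```
   CMF    F0, F1           ; safe for equality and unordered tests
   BEQ    are_equal
   BVS    unordered        ; V set means an operand was not-a-number

   CMFE   F0, F1           ; use the exception form for ordering tests
   BGT    f0_greater
```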

 

When the AC bit in the FPSR is clear, the ARM flags N, Z, C, V refer to the following after compares:
N = Less than
Z = Equal
C = Greater than, or equal
V = Unordered


And when the AC bit is set, the flags refer to:
N = Less than
Z = Equal
C = Greater than, or equal, or unordered
V = Unordered

 

In APCS code with objasm, to store a floating point value you would use the directive DCF, appending 'S' for single precision or 'D' for double.
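A sketch of how this might look in an objasm source (the label names are illustrative):

```
      LDFD   F0, pi                 ; PC-relative load of the constant
      MOVS   PC, R14

pi    DCFD   3.141592653589793      ; double precision value
half  DCFS   0.5                    ; single precision value
```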

 

 

Here is a brief example. We MUL two numbers, but use the floating point unit instead of the ARM's multiplication instruction. This could be modified to multiply two floating point numbers, and give a floating point response, but as it is only a short example, it will simply use two integers.

REM >fpmul
REM
REM Short example to multiply two integers via the
REM floating point unit. Totally pointless, but...

DIM code% 20

FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%

   .multiply
     FLTS   F0, R0
     FLTS   F1, R1
     FMLS   F2, F0, F1
     FIX    R0, F2

     MOVS   PC, R14
  ]
NEXT

INPUT "First number  : "one%
INPUT "Second number : "two%

A% = one%
B% = two%
result% = USR(multiply)

PRINT "The result is "+STR$(result%)
END
There is no option to download this program, as standard BASIC won't touch it. However, you can include FP statements if you can 'build' the instructions.
Alternatively, you could use ExtBASasm by Darren Salt.

This version will work in BASIC:

REM >fpmul
REM
REM Short example to multiply two integers via the
REM floating point unit. Totally pointless, but...

DIM code% 20

FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%

   .multiply
     EQUD   &EE000110   ; FLTS F0, R0
     EQUD   &EE011110   ; FLTS F1, R1
     EQUD   &EE902101   ; FMLS F2, F0, F1
     EQUD   &EE100112   ; FIX  R0, F2

     MOVS   PC, R14
  ]
NEXT

INPUT "First number  : "one%
INPUT "Second number : "two%

A% = one%
B% = two%
result% = USR(multiply)

PRINT "The result is "+STR$(result%)
END

 

One final thing... Remember to use the appropriate precision for what you are doing.

REM >precision
REM
REM Short example to show how data can be 'lost' due
REM to using incorrect precision.

ON ERROR PRINT REPORT$ + " at " + STR$(ERL/10) : END

DIM code% 64

FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%

     EXT 1

   .single_precision
     FLTS   F0, R0
     FIX    R0, F0
     MOV    PC, R14

   .double_precision
     FLTD   F0, R0
     FIX    R0, F0
     MOV    PC, R14

   .doubleext_precision
     FLTE   F0, R0
     FIX    R0, F0
     MOV    PC, R14
  ]
NEXT

A% = &1ffffff

PRINT "Original input is " + STR$~A%
PRINT "Single precision  " + STR$~(USR(single_precision))
PRINT "Double precision  " + STR$~(USR(double_precision))
PRINT "Double extended   " + STR$~(USR(doubleext_precision))
PRINT
END
The result of this program is:
Original input is 1FFFFFF
Single precision  2000000
Double precision  1FFFFFF
Double extended   1FFFFFF
You don't need to use double precision everywhere, though, as it will be that much slower. Simply keep this in mind if you are dealing with large numbers.

 

In order to test the actual speed differences, I wrote a test program:

DIM code% 64

FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [  OPT loop%

     MOV    R0, #23
     MOV    R1, #1<<16

   .timetest
     FLTD   F0, R0
     FLTD   F1, R0
     MUFD   F2, F0, F1
     SUBS   R1, R1, #1
     BNE    timetest

     MOV    PC, R14
  ]
NEXT

t% = TIME
CALL code%
PRINT "That took "+STR$(TIME - t%)+" centiseconds."
END
I tried various precisions, and also the fast multiply. It showed something interesting. So I tried multiplication, and addition. All with the same data (input 23).

 

Here are my results for a million (roughly) convert-and-process operations (ARM710 processor, FPEmulator 4.14):

   Operation        Fast single   Single        Double        Double extended

   Multiplication   1731cs        1755cs        1965cs        1712cs

   Division         2169cs        2169cs        2618cs        2479cs

   Addition         n/a           1684cs        1899cs        1646cs
This seems to show that, on my machine, double extended precision is the fastest for a selection of operations. It is therefore incorrect to simply assume that greater precision takes longer. My suspicion is that the FP system's internal working format is double extended, so working directly in it avoids the cost of converting the value to a different precision.

The moral here? Don't be afraid to experiment...


Return to assembler index
Copyright © 2000 Richard Murray