There are two reasons for that. The first is that the execution time of certain instructions depends on the context of the instruction use. This context may be defined at compile time or at run time.
The execution time of a given instruction is mainly a function of static, compile-time aspects that originate from the Ada code. For example, the execution time of the ABORT_TASK instruction is a linear function of the number of tasks to which it is applied. If a task has dependents, they are recursively aborted in a single operation. This part of the execution time can be fully characterised, as (i) the ATAC algorithms are completely known and defined and (ii) the program complexity is known to the programmer, who is then able to supply the necessary inputs to the execution time computation.
However, the execution time of some instructions depends slightly on the current status of the running system. Later benchmarks will show that variations due to different situations of the system account for less than 1 microsecond in our system. This variation is small enough to be used as a bound for execution time computation and as input to schedulability analysis.
The second reason that makes the measured execution time of each instruction difficult to interpret is that an instruction is very often used in a sequence together with other ATAC instructions and compiler-generated instructions. For example, a SWITCH instruction is inserted between a context save and a context restore, the whole nested in an `if PREEMPT' branch. The compiler-generated code has a more substantial impact on the overall execution time. This should not be blamed on the compiler; it simply reflects the additional operations that must be performed between ATAC operations. It is therefore more realistic to measure whole sequences.
As the measured sequences include non-ATAC code, the results will depend on the way this code has been generated, i.e. on the compiler. A special compiler (called AAA) has been designed by R-Tech AB. This subset compiler does not implement the full Ada language, but it does implement all the features that are necessary to give a true picture of the tasking performance.
The hardware is based on a MA31750 at 10 MHz and ATAC at 20 MHz.
The execution time of each mechanism has been measured with the CPU running at 10 MHz. The ATAC part of the execution time has been measured at no more than 15% of the total time. This makes it possible to estimate the execution time if the CPU were running at 20 MHz.
The following figures are given courtesy of R-Tech AB, 1994.
A synchronisation rendez-vous (normal call with no parameters, caller and receiver with the same priority) takes 13 microseconds (7 microseconds estimated at 20 MHz).
The whole life cycle of a new dependent task (allocation, activation and termination) takes 44 microseconds (25 microseconds estimated at 20 MHz). Note especially that this is a heavyweight task with a complete environment, including even the initialization of symbolic names. The created task has higher priority. The ATAC overhead is much less than 15% in this case, but 15% is used for the estimate.
The interrupt entry time is measured between (i) the transition of an ATAC interrupt line and (ii) the accept of the corresponding higher priority waiting task. It includes the ATAC generating an interrupt to the CPU and the CPU processing the interrupt in a routine that pre-empts the current task and switches to the waiting task, up to the first Ada statement after the accept. This time is 22 microseconds (13 microseconds estimated at 20 MHz).
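The 20 MHz estimates quoted above are consistent with a simple scaling model, sketched here in Python. The model is an assumption and is not stated by R-Tech AB: the ATAC share (taken at its 15% upper bound) is left unchanged because ATAC already runs at 20 MHz, while the CPU share is halved when the CPU clock doubles. The function and parameter names are illustrative only.

```python
def estimate_at_faster_cpu(measured_us, atac_share=0.15, cpu_speedup=2.0):
    """Scale a time measured at 10 MHz to a faster CPU clock.

    Assumed model: the ATAC share of the time does not depend on the CPU
    clock, and the CPU share scales inversely with the CPU clock.
    """
    atac_part = measured_us * atac_share
    cpu_part = measured_us * (1.0 - atac_share)
    return atac_part + cpu_part / cpu_speedup


# Times measured with the CPU at 10 MHz, in microseconds (from the text).
measured = {
    "rendez-vous": 13,
    "task life cycle": 44,
    "interrupt entry": 22,
}

estimates = {name: estimate_at_faster_cpu(t) for name, t in measured.items()}

for name, t in estimates.items():
    print(f"{name}: about {round(t)} microseconds at 20 MHz")
```

Rounded to the nearest microsecond, this model gives 7, 25 and 13 microseconds, which matches the three estimates listed above.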
The related benchmarks have been generated automatically by the R-Tech benchmark synthesizer, and have been used in the characterisation of the first ATAC 1.0. They have also been run by R-Tech AB.
Some basic structures are used:
The benchmarks use these basic structures, with a variable number (1 to 4) of callers, callees and entries. The caller and the callee have the same priority. The benchmarks give an overview of the behaviour of a compiler as regards Ada tasking performance. When plotted as a curve, they show at first glance the range in which a compiler lies. Of course, a hasty conclusion must not be generalized, and a deeper analysis must be performed when evaluating a compiler.
Here is a curve showing the benchmark results (Format: PostScript).
The curve shows the execution time of each construct (SY, PL, ...) instantiated with a different number of callers, receivers, and entries. The upper grey curve represents the execution time when the -atac switch is off, and the lower black curve has been obtained with the ATAC.
Comparing the same targets (MA31750/10MHz), the TLD compiler has improved its performance by an average factor of 10.6 by supporting ATAC. The latest release of the TLD compiler reaches an average factor of 12.
The curves also show the influence of the complexity of the construct on the execution time. For a given construct (e.g. EN-ACC), the curve without ATAC is crenellated, ranging from 215 to 340 microseconds. This spread of 125 microseconds is due to the complexity.
The curve with ATAC is much smoother, ranging from 24 to 32 microseconds, a spread of 8 microseconds. So, not only is the cost of the basic construct lower (24 instead of 215), but the complexity also has a smaller influence (8 instead of 125).
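The comparison can be made explicit with a few lines of arithmetic. The Python sketch below only restates the EN-ACC figures quoted above; it assumes that the ATAC figures are in microseconds like the non-ATAC ones, and the variable names are illustrative.

```python
# EN-ACC execution time range in microseconds, as read from the curves.
without_atac = (215, 340)
with_atac = (24, 32)


def spread(time_range):
    """Difference between the highest and lowest time of a range."""
    low, high = time_range
    return high - low


# How much cheaper the basic construct is with ATAC.
base_ratio = without_atac[0] / with_atac[0]

# How much less the construct complexity influences the time with ATAC.
spread_ratio = spread(without_atac) / spread(with_atac)

print(f"basic construct: {base_ratio:.1f} times faster")
print(f"spread due to complexity: {spread_ratio:.1f} times smaller")
```

With these figures the basic construct is roughly 9 times cheaper, and the spread due to complexity is more than 15 times smaller.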
The consequence of this is improved predictability. Because its curves are smoother, an ATAC system behaves more linearly and is less subject to variation in execution time.
The CPU clock has an influence on the previous curves: the figures of both curves will be reduced if the CPU speed increases. The curve without ATAC will obviously decrease, as only the CPU is involved in the execution time. The figures of the ATAC curve, however, are attributable to both the CPU and ATAC. The CPU contribution is substantially larger than the ATAC part, so the figures of this curve will also decrease if the CPU speed increases. Furthermore, the ATAC clock can also be increased.
The necessary inputs for this are:
This last input includes the hardware description of the platform, and the characterisation of the run-time model and system. Various overhead figures typical of the run-time system must be provided. These parameters include, for example, context switch time and delay queue time, figures which are tied to the properties of the compiler and the semantics of the language.
Knowledge of the run-time system is often not sufficient to associate accurate values with the parameters. This results in a very pessimistic characterisation of the platform, and therefore in an overly large WCET.
Consequently, a system which is schedulable may be considered by the analysis tools as not schedulable because an unnecessary margin has been taken. This margin can be very significant, especially in a pre-emptive system.
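The effect of an overly pessimistic overhead can be illustrated with a small Python sketch. It uses the classic fixed-priority response-time recurrence as an illustrative choice; the text does not say which analysis the tools apply. The task set and both overhead values are hypothetical, and the overhead is simply charged once to every job, which is a simplification.

```python
def response_times(tasks, overhead_us):
    """Fixed-priority response-time analysis with a per-job overhead.

    tasks: list of (period_us, wcet_us) pairs, highest priority first,
           with each deadline equal to the period (hypothetical model).
    overhead_us: run-time system overhead charged to every job.
    Returns one response time per task, or None where the deadline is missed.
    """
    results = []
    for i, (period, wcet) in enumerate(tasks):
        cost = wcet + overhead_us
        r = cost
        while True:
            # Interference from all higher priority tasks released within r.
            # -(-r // p) is an integer ceiling division.
            interference = sum(
                -(-r // p) * (c + overhead_us) for p, c in tasks[:i]
            )
            nxt = cost + interference
            if nxt > period:
                results.append(None)  # deadline missed
                break
            if nxt == r:
                results.append(r)  # fixed point reached
                break
            r = nxt
    return results


# Hypothetical task set: (period, WCET without overhead), in microseconds.
tasks = [(1000, 250), (2000, 500), (4000, 900)]

well_characterised = response_times(tasks, overhead_us=30)
pessimistic = response_times(tasks, overhead_us=200)

print("overhead  30 us:", well_characterised)
print("overhead 200 us:", pessimistic)
```

With the small overhead every task meets its deadline, whereas with the pessimistic overhead the lowest priority task is reported as not schedulable, although the application code is the same in both cases.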
ATAC facilitates a substantial reduction of this margin, and as a consequence the set of parameters has to be reconsidered. Some of the parameters, corresponding to ATAC atomic operations, may be grouped. Others may be given a zero value, as the corresponding ATAC operations are run concurrently with the CPU.
The value given to each parameter will then be either a constant provided by the ATAC characterization, or the result of a formula. The formula will reflect the dependency of the ATAC execution time on the parameters of the Ada program. The part of the execution time which depends on the status of the system at run-time will be given a bound that remains within an acceptable value (typically 1 microsecond).
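As a minimal sketch of such a formula, the Python function below bounds the cost of an operation whose execution time is linear in the number of tasks involved, as described earlier for ABORT_TASK. The two coefficients are hypothetical placeholders, not ATAC characterization data; only the 1 microsecond run-time bound comes from the text.

```python
# Bound for the part that depends on the run-time status of the system,
# as quoted in the text (typically 1 microsecond).
RUNTIME_STATUS_BOUND_US = 1.0


def abort_task_bound_us(n_tasks, base_us, per_task_us):
    """Upper bound for an operation with a cost linear in n_tasks.

    base_us and per_task_us are hypothetical coefficients that would have
    to come from the ATAC characterization of the actual platform.
    """
    if n_tasks < 1:
        raise ValueError("at least one task must be aborted")
    return base_us + per_task_us * n_tasks + RUNTIME_STATUS_BOUND_US


# Example with made-up coefficients: a task with three dependents (4 tasks).
bound = abort_task_bound_us(4, base_us=2.0, per_task_us=0.5)
print(f"bound for aborting 4 tasks: {bound} microseconds")
```

The number of tasks is known from the Ada program, so the programmer can evaluate such a formula before run time and pass the result to the schedulability analysis.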
By providing a precise model of the run-time system, ATAC makes it possible to remove safety margins from the schedulability analysis of an application, and therefore to obtain a less pessimistic and more realistic assessment of the execution time.
The jitter is therefore reduced to practically nothing.
Data structures used by the software Run-time System to maintain tasking are now in the ATAC memory, but can still be accessed at any time. The original data structures can therefore be removed.
The ATAC evaluators generally report a 5% saving.
Interrupt latency is the time between the transition of the interrupt line and the execution of the first instruction of the handler. When ATAC is handling the interrupt, three scenarios may occur:
Priority inversion in the context of interrupt handling is avoided. The first scenario in the previous paragraph shows that it is not possible for an interrupt mapped to a low priority task to disturb a high priority task. The priority inversion time is therefore zero. This is a substantial advantage when the system is subject to bursts of interrupts bound for lower priority tasks.
We have seen that the use of ATAC decreases the cost to perform an Ada real-time construct, and also decreases the cost sensitivity to the complexity of the construct.
The overheads due to timing management are then well characterised and can be provided as input to a schedulability analysis. TLD reports that some of the ATAC-based run-time overheads are reduced by a factor of up to 30. The Worst Case Execution Time (WCET) estimation is therefore lower and much more realistic. The computer's throughput is used much more efficiently.
The use of ATAC can also benefit the cyclic model by reducing the jitter, increasing the time accuracy, improving the maintenance aspects and offering a safe extension to pre-emption. With ATAC, the tasks of the Olympus AOCS, which are mainly cyclic, are activated with respect to each other as intended. The ATAC version of this AOCS shows an exceptional stability.
Introducing ATAC makes the system easier to predict, and there is no priority inversion.
With ATAC, there is no restriction on the use of full Ada83 tasking.
Note that the next version of the chip will include Ada95 support.
Last edited 28 November 1995
All information is provided "as is", there is no warranty that the information is correct or suitable for any purpose, neither implicit nor explicit.
This information does not necessarily reflect the policy of the European Space Agency.
This article may be redistributed provided that the article and this notice remain intact. This article may not under any circumstances be resold or redistributed for compensation of any kind without prior written permission from European Space Agency.