Ada TAsking Coprocessor

Some figures about ATAC performance

About the level of measurement

ATAC performance can be measured at several levels. The first level is the instruction level. In the same way that you measure the execution time of each instruction of a CPU, it is possible to measure the execution time of each ATAC instruction. The results are very accurate and impressive (typically, an ATAC instruction takes on the order of a microsecond to execute, and a whole rendez-vous with call and accept takes some 3 microseconds). But these figures cannot be considered out of their context.

There are two reasons for that. The first is that the execution time of certain instructions depends on the context of the instruction use. This context may be defined at compile time or at run time.

The execution time of a given instruction is mainly a function of compile time static aspects which originate from the Ada code. For example, the execution time of the ABORT_TASK instruction is a linear function of the number of tasks that it is applied on. If a task has dependents, they will be recursively aborted in a single blow. This part of the execution time can be fully characterised, as (i) the ATAC algorithms are completely known and defined and (ii) the program complexity is known by the programmer, who is then able to give the necessary inputs to execution time computation.
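The linear dependency described above can be sketched as follows. This is only an illustration: the task names, the dependency table and the two coefficients (`base_us`, `per_task_us`) are hypothetical placeholders, not ATAC characterisation data.

```python
def count_aborted(task, dependents):
    """Number of tasks aborted when `task` is aborted:
    the task itself plus all of its dependents, recursively."""
    return 1 + sum(count_aborted(d, dependents) for d in dependents.get(task, ()))


def abort_time_us(n_tasks, base_us, per_task_us):
    """Linear cost model: a fixed part plus a part proportional to the
    number of tasks aborted. Both coefficients are placeholders."""
    return base_us + per_task_us * n_tasks


# Hypothetical task hierarchy: "main" has two dependents, one of which has two more.
dependents = {"main": ["a", "b"], "a": ["a1", "a2"]}

n = count_aborted("main", dependents)
estimate = abort_time_us(n, base_us=1.0, per_task_us=0.5)
print(n, estimate)
```

The programmer supplies the task hierarchy (known from the Ada source), and the characterisation supplies the coefficients.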

However, the execution time of some instructions depends slightly on the current status of the running system. Later benchmarks will show that variations due to different situations of the system account for less than 1 microsecond in our system. This variation is small enough to be used as a bound for execution time computation and as input to schedulability analysis.

The second reason that makes the execution time of an individual instruction difficult to interpret is that each instruction is very often used in a sequence together with other ATAC instructions and compiler-generated instructions. For example, a SWITCH instruction is inserted between a context save and a context restore, the whole nested in an `if PREEMPT' branch. The compiler-generated code has a more substantial impact on the overall execution time than the ATAC instructions themselves. This should not be blamed on the compiler; it simply reflects the fact that additional operations must be performed between ATAC operations. It is therefore more realistic to measure whole sequences.

The PIWGs

Here are the raw results

PIWG

The figures for the Ada mechanisms

An adequate level of sequences to be characterized is the level of Ada mechanisms, as they are the basic elements used by the Ada programmers. Select statement, timed and conditional entry calls, tasking attributes evaluation, task creation, activation and termination, semaphore operations and interrupt entries have been measured.

As the measured sequences include non-ATAC code, the results depend on the way this code has been generated, i.e. on the compiler. A special compiler (called AAA) has been designed by R-Tech AB. This subset compiler does not implement the full Ada language, but it does implement all the features that are necessary to give a true picture of the tasking performance.

The hardware is based on a MA31750 at 10 MHz and ATAC at 20 MHz.

The execution time of each mechanism has been measured with the CPU running at 10 MHz. The ATAC part of the execution time has been measured as a maximum of 15% of the total time. This makes it possible to estimate the execution time if the CPU were running at 20 MHz.

The following figures are given courtesy of R-Tech AB, 1994.

A synchronisation rendez-vous (normal call with no parameters, caller and receiver with the same priority) is 13 microseconds. (7 microseconds estimated at 20 MHz)

The whole life cycle of a new dependent task (allocation, activation and termination) is 44 microseconds (25 microseconds estimated at 20 MHz). Note especially that this is a heavyweight task with a complete environment, including initialization of symbolic names. The created task has higher priority. The ATAC overhead is much less than 15% in this case, but 15% is used for the estimate.

The interrupt entry time is measured between (i) the transition of an ATAC interrupt line and (ii) the accept of the corresponding higher priority waiting task. It includes the ATAC generating an interrupt to the CPU and the CPU processing the interrupt in a routine that pre-empts the current task and switches to the waiting task, up to the first Ada statement after the accept. This time is 22 microseconds (13 microseconds estimated at 20 MHz).
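The source does not spell out how the 20 MHz estimates were derived. A minimal sketch, assuming that the ATAC share (15%) is unaffected by the CPU clock and that the remaining CPU share scales inversely with the CPU frequency, reproduces the three quoted estimates after rounding. The function name and its parameters are illustrative only.

```python
def estimate_at_cpu_clock(measured_us, atac_share=0.15,
                          cpu_mhz_old=10.0, cpu_mhz_new=20.0):
    """Estimate the execution time at a new CPU clock.

    Assumption: the ATAC part of the time stays constant, and the CPU
    part is inversely proportional to the CPU clock frequency.
    """
    atac_part = measured_us * atac_share
    cpu_part = measured_us * (1.0 - atac_share)
    return atac_part + cpu_part * cpu_mhz_old / cpu_mhz_new


# Measured values at 10 MHz, taken from the text above (microseconds).
measured = {"rendez-vous": 13, "task life cycle": 44, "interrupt entry": 22}

for name, value in measured.items():
    print(name, round(estimate_at_cpu_clock(value)))
```

Under this assumption each measured time is multiplied by 0.15 + 0.85 / 2 = 0.575.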

The improvements of the ATAC vs the Ada constructs complexity

Benchmarks description

The Ada constructs can be benchmarked in another way. The goal is now to see the influence of the complexity of the Ada program (number of tasks, number of callers, number of open alternatives...) on the execution time.

The related benchmarks have been generated automatically by the R-Tech benchmark synthesizer, and have been used in the characterisation of the first ATAC 1.0. They have also been run by R-Tech AB.

Some basic structures are used:

The benchmarks use these basic structures with a variable number (1 to 4) of callers, callees and entries. Caller and callee have the same priority. The benchmarks give an overview of the behaviour of a compiler as regards Ada tasking performance. When plotted as a curve, they show at a glance the range in which a compiler stays. Of course, hasty conclusions must not be generalized, and a deeper analysis must be performed when evaluating a compiler.

The figures

The TLD compiler release that has been used for the benchmarks is the 93U111. It is not the final version of the compiler, but this release already proves the ATAC benefits.

Here is a curve showing the benchmarks results (Format: PostScript).

The curve shows the execution time of each construct (SY, PL, ...) instantiated with a different number of callers, receivers, and entries. The upper grey curve represents the execution time when the -atac switch is off, and the lower black curve has been obtained with the ATAC.

Comparing the same targets (MA31750/10MHz), the TLD compiler has improved its performance by an average factor of 10.6 by supporting ATAC. The latest release of the TLD compiler reaches an average factor of 12.

The curves also show the influence of the complexity of the construct on the execution time. For a given construct (e.g. EN-ACC), the curve without ATAC shows steps in the range from 215 to 340 microseconds. The deviation due to complexity is 125 microseconds.

The curve with ATAC is much smoother, from 24 to 32 microseconds, an excursion of 8 microseconds. So, not only is the cost of the basic construct lower (24 instead of 215 microseconds), but the complexity also has a smaller influence (8 instead of 125 microseconds).
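The comparison for the EN-ACC construct can be reproduced with a few lines of arithmetic. The ranges are those quoted in the text; the variable and function names are illustrative.

```python
# Execution time ranges for EN-ACC, in microseconds, as quoted in the text.
without_atac = (215, 340)
with_atac = (24, 32)


def excursion(time_range):
    """Spread of the execution time caused by construct complexity."""
    low, high = time_range
    return high - low


base_cost_ratio = without_atac[0] / with_atac[0]
excursion_ratio = excursion(without_atac) / excursion(with_atac)

print(excursion(without_atac), excursion(with_atac))
print(round(base_cost_ratio, 2), excursion_ratio)
```

For this one construct the basic cost drops by a factor of about 9, and the spread caused by complexity drops by a factor of more than 15.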

The consequence of this is an improvement of the predictability. By smoothing the curves, an ATAC system is more linear, less subject to variation in execution time.

The CPU clock has an influence on the previous curves: the figures of both curves will be reduced if the CPU speed increases. The curve without ATAC will obviously decrease, as only the CPU is involved in the execution time. The figures of the ATAC curve, however, have to be attributed to both the CPU and ATAC. The CPU contribution is substantially larger than the ATAC part, so the figures of this curve will also decrease if the CPU speed increases. Furthermore, the ATAC clock can also be increased.

The benefits of ATAC to the schedulability analysis

A study of Hard Real Time theory has been performed in the frame of an ESTEC contract [Hard Real Time Operating System Kernel, 93]. A main point of this study is schedulability analysis, which determines whether a system under design will be schedulable or not.

The necessary inputs for this are:

This last input includes the hardware description of the platform, and the characterisation of the run-time model and system. Various overhead figures typical of the run-time system must be provided. These parameters are, for example, context switch time and delay queue time, figures which are tied to the properties of the compiler and the semantics of the language.

Knowledge of the run-time system is often not sufficient to associate accurate values with the parameters. This results in a very pessimistic characterisation of the platform, and therefore in an excessively large WCET.

Consequently, a system which is in fact schedulable may be considered not schedulable by the analysis tools because an unnecessary margin has been taken. This margin can be very significant, especially in a pre-emptive system.
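The effect of a pessimistic overhead figure can be illustrated with a common textbook formulation of fixed-priority response time analysis in which every task execution is charged two context switches. This is a sketch under stated assumptions, not the method used in the ESTEC study: the task set, the overhead values and the function names are all hypothetical, and deadlines are assumed equal to periods.

```python
def response_time(tasks, i, cs):
    """Worst-case response time of tasks[i], or None if it exceeds the period.

    tasks: list of (wcet, period) tuples sorted by priority, highest first.
    cs:    context switch time; each task execution is charged 2 * cs.
    All times are integers in the same unit (e.g. microseconds).
    """
    c_i, t_i = tasks[i]
    r = c_i + 2 * cs
    while True:
        # Interference from higher priority tasks; -(-r // t) is ceil(r / t).
        interference = sum(-(-r // t_j) * (c_j + 2 * cs) for c_j, t_j in tasks[:i])
        r_next = c_i + 2 * cs + interference
        if r_next == r:
            return r
        if r_next > t_i:
            return None
        r = r_next


def schedulable(tasks, cs):
    """True if every task meets its deadline (deadline = period)."""
    return all(response_time(tasks, i, cs) is not None for i in range(len(tasks)))


# Hypothetical task set: (WCET, period) in microseconds, highest priority first.
tasks = [(200, 1000), (300, 2000), (500, 4000)]

print(schedulable(tasks, cs=20))   # accurately characterised overhead
print(schedulable(tasks, cs=200))  # pessimistic overhead
```

With the small overhead the lowest priority task converges to a response time within its period; with the pessimistic overhead the same task set is rejected, although the application code has not changed.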

ATAC facilitates a substantial reduction of this margin, and as a consequence the set of parameters has to be reconsidered. Some of the parameters, corresponding to ATAC atomic operations, may be grouped. Others may be given a zero value, as the corresponding ATAC operations are run concurrently with the CPU.

Then the value given to each parameter will be either a constant provided by ATAC characterization, or the result of a formula. The formula will reflect the dependency of ATAC execution time on the parameters of the Ada program. The part of the execution time which depends on the status of the system at run-time will be given a bound that remains acceptably small (typically 1 microsecond).

By providing a precise model of the run-time system, ATAC makes it possible to remove safety margins in the schedulability analysis of an application, and therefore to obtain a less pessimistic and more realistic assessment of the execution time.

A list of the ATAC benefits

ATAC offers first of all pure technical performance, but the main advantages must be considered from the application standpoint. The following points have been confirmed by industrial users evaluating ATAC. See the ATAC day executive summary (Format: PostScript) for more details.

ATAC technical performance

Faster execution

The first observation when using the ATAC is the significant reduction of the execution time of the Ada tasking features. Tasking management executes 10 to 12 times faster with ATAC. This results in a reduction of the execution time of the overall application, which of course depends on the amount of tasking in the application.

Accuracy of the delay

With the ATAC, the actual delay matches the value specified in the delay statement much more closely. The error is much lower, and moreover it is constant, and so predictable.

The jitter is therefore reduced to practically nothing.

Code and data size reduction

The code of the run-time system related to tasking can be removed from the application, as it is performed by ATAC.

Data structures used by the software run-time system to maintain tasking are now in the ATAC memory, but can still be accessed at any time. The original data structures can therefore be removed.

The ATAC evaluators generally report a 5% saving.

Interrupt management

The interrupt lock-out is the time between the explicit disabling and re-enabling of interrupts. It is very often the case that an operation that must not be interrupted is executed with one ATAC instruction. The atomic nature of memory bus transactions provides mutual exclusion, and there is no need for interrupt disabling. So in the whole system, interrupt lock-out should be reduced to almost nothing.

Interrupt latency is the time between the transition of the interrupt line and the execution of the first instruction of the handler. When ATAC is handling the interrupt, four scenarios may occur:

  1. the interrupt is mapped to a task which is not waiting at the corresponding accept. Nothing happens.
  2. the interrupt is mapped to a lower priority task than the currently executing one. In this case, ATAC does not pre-empt the CPU, but it just marks the task as ready. The latency can be considered as null.
  3. the interrupt is mapped to a higher priority task. ATAC pre-empts the CPU. The latency to be considered is the time between the transition of the interrupt line and the execution of the first instruction of the receiver task, which is different from the latency defined above.
  4. the interrupt is mapped to a task which is for some reason dead. A tasking error exception is raised.
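The scenarios listed above can be summarised as a small decision function. This is only a model of the described behaviour, not ATAC's actual logic: the names are illustrative, a larger number is assumed to mean a higher priority, and an interrupt bound for a task of equal priority is assumed not to pre-empt, a case the text does not cover.

```python
from enum import Enum


class Action(Enum):
    NOTHING = "nothing happens"
    MARK_READY = "task marked ready, CPU not pre-empted"
    PREEMPT_CPU = "CPU pre-empted, switch to the receiver task"
    TASKING_ERROR = "tasking error exception raised"


def on_interrupt(task_alive, waiting_at_accept, task_priority, running_priority):
    """Return the action taken when an interrupt mapped to a task arrives.

    Assumptions: higher number = higher priority; equal priority is
    treated like lower priority (no pre-emption).
    """
    if not task_alive:
        return Action.TASKING_ERROR          # scenario 4
    if not waiting_at_accept:
        return Action.NOTHING                # scenario 1
    if task_priority > running_priority:
        return Action.PREEMPT_CPU            # scenario 3
    return Action.MARK_READY                 # scenario 2


print(on_interrupt(True, True, task_priority=2, running_priority=5).value)
```

Only the pre-emption branch involves the CPU, which is why interrupts bound for lower priority tasks do not disturb the running task.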

Priority inversion caused by interrupts is avoided. The second scenario in the list above shows that it is not possible for an interrupt mapped to a low priority task to disturb a high priority task. The priority inversion time is therefore zero. This is a substantial advantage when the system is subject to bursts of interrupts bound for lower priority tasks.

The advantages for the application

Predictability improvement

Let us call predictability the ability to know that the system will complete its functions while respecting some timing constraints. It is not the fact that we are able to know exactly what happens at a given moment, or that the timing behaviour can be exactly reproduced at each run (determinism). But it is this level of confidence that we place on the system, knowing that whatever happens in a pre-defined frame, the system will complete successfully. This is linked to schedulability.

We have seen that the use of ATAC decreases the cost to perform an Ada real-time construct, and also decreases the cost sensitivity to the complexity of the construct.

The overheads due to timing management are then well characterised and can be provided as input to a schedulability analysis. TLD reports that some of the ATAC-based run-time overheads are reduced by up to a factor of 30. The Worst Case Execution Time (WCET) estimation is therefore lower and much more realistic. The computer throughput is used much more efficiently.

The use of ATAC can also benefit the cyclic model by reducing the jitter, increasing the time accuracy, improving the maintenance aspects and offering a safe extension to pre-emption. With ATAC, the tasks of the Olympus AOCS, which are mainly cyclic, are activated with respect to each other as intended. The ATAC version of this AOCS shows an exceptional stability.

Introducing ATAC makes the system easier to predict, and there is no priority inversion.

A full Ada83 design is now possible

Several software projects (Sciamachy on-board software, Eureca software porting to Ada) were initially designed using the full Ada tasking. But the compilers' implementations of some Ada real-time features were so poor that these applications had to be redesigned without these features.

With the ATAC, there is no limit to the use of the full Ada83 tasking.

Note that the next version of the chip will include Ada95 support.

ATAC as the basis for a standardisation

ATAC also creates a possibility to standardize Ada tasking primitives. Being a piece of hardware with a well defined and characterised software interface, it is a standard component for tasking management that any compiler can interface with. Moreover, regardless of the compiler, the functional behaviour will be identical. Only the timing will differ, and benchmarks designed once and for all will allow it to be characterised.
Jean-Loup TERRAILLON (jeanloup@wd.estec.esa.nl)

Last edited 28 November 1995


DISCLAIMER

All information is provided "as is", there is no warranty that the information is correct or suitable for any purpose, neither implicit nor explicit.

This information does not necessarily reflect the policy of the European Space Agency.


COPYRIGHT 1995 EUROPEAN SPACE AGENCY. ALL RIGHTS RESERVED.

This article may be redistributed provided that the article and this notice remain intact. This article may not under any circumstances be resold or redistributed for compensation of any kind without prior written permission from European Space Agency.