DRAMA: An Environment for Instrumentation Software

T. J. Farrell, K. Shortridge, J. A. Bailey
Anglo-Australian Observatory, P.O. Box 296, Epping, N.S.W. 2121 Australia.

I. Introduction

For some time now software for AAO instrumentation systems has been built using the Starlink ADAM environment developed in the UK (see, for example, Kelly 1992), with most of the instrumentation software running under ADAM on a central VAX and communicating with embedded instrumentation microprocessors over serial links. With the advent of increasingly powerful microprocessors supporting large address spaces, high level languages, TCP/IP communications and UNIX-like real-time operating systems such as VxWorks, it is now possible to run ADAM-style tasking software on all the processors in an instrumentation system. Of course, this has to be done without compromising the real-time performance of the microprocessor systems. However, if code only need be written once (and if the temptation to add too many incidental features can be resisted), effort can go into making that code efficient. This paper describes the system being developed at AAO for instrument control.

The application of this system to one of the emerging AAO instruments has previously been described by Shortridge et al. (1993). In this paper we concentrate on the detailed structure of the new system, now called DRAMA, and describe its use of `action handler' routines and the possibilities this provides for `inheritance' of code through the use of existing standard action handlers in new tasks.

II. Design Requirements

The basic unit in an ADAM-type environment is the Task. A Task responds to messages from and initiates messages to other Tasks. A Task is normally implemented as a separate process in a multi-process operating system. A Task supports a number of named `actions', and almost everything that happens in an ADAM system is the result of the various component tasks being requested to perform such actions. This gives tasks a simple external interface (completely defined by a description of the various named actions) and the internal details are encapsulated within the individual tasks.

At the AAO we have generally associated an ADAM Task with each instrumentation sub-system. This partitioning was a natural consequence of the physical independence of most of these sub-systems. It has allowed us to develop and test the software for each sub-system without the rest of the system being present.

A "Control Task" is a special Task which is used to co-ordinate the operation of the various sub-system tasks. Complicated interactions between the various sub-systems can be handled by such a task, and it may also provide a user interface.

Additionally, we have defined a set of standard messages to which all our tasks must respond. This is known as our "Generic Instrumentation Task" specification. It ensures the tasks are consistent in the way they provide standard features such as Initialisation, Reset, Simulate or Shutdown commands.

This standard has allowed us to build a powerful "Generic Control Task". This task will automatically load and control a given set of tasks. It is easy to produce modified versions of this Control Task which include the complex interactions required by various specific instrumentation system configurations.

This approach has proved both powerful and flexible -- allowing us to quickly integrate new instrumentation into complex existing systems. Nevertheless, it has several limitations, the most significant being that at present it runs only on VAX/VMS systems. Although Starlink is currently porting ADAM to Unix, it is not yet building a VxWorks version. This gives us the opportunity to experiment with some of the ideas originally raised at a recent ADAM workshop, where possible future modifications to ADAM were discussed. To that extent, DRAMA can be thought of as a test-bed for a future ADAM II design. We have tried to remove, or at least make optional, some of the more cumbersome features of the original ADAM, such as those features of the parameter system which are geared more towards support for data reduction operations. Also, we are explicitly aiming at a variety of systems, including those such as VxWorks where each Task/Process shares the same address space.

The result is called DRAMA -- The "Distributed Real-time AAO Monitor for Astronomy". Its design requirements were as follows:

DRAMA should provide a distributed tasking architecture using a task model similar to ADAM Instrumentation tasks.
DRAMA should be portable across operating systems which implement a tasking model. In particular, we wish to run it under VxWorks, SunOS and VAX/VMS.
DRAMA should be sufficiently fast for all our real time applications.
It should be possible for VMS ADAM tasks to communicate with DRAMA tasks.
It should be possible to avoid excess baggage, such as parameter systems, in tasks which don't need them.
Tasks should always be awake to incoming messages (so they can handle an abort request). They should not block when sending messages.
The Message system should be as reliable as possible, consistent with real time requirements. It should be possible to send DRAMA system messages from interrupt service routines. If a task with which another task is communicating dies, the latter task should be told.

III. DRAMA Components

DRAMA consists of the following packages:

DITS: -- The Distributed Instrumentation Tasking System. DITS ties together the low level systems to provide general techniques and routines for building Instrumentation Tasks. DITS also defines a simple parameter system and provides routines to implement it.
SDS: -- The Self-defining Data System. SDS is used by DITS to move data between machines of various architectures, making allowance for the various machine specific ways of storing numbers and characters. The Task programmer may use SDS directly or via the ARG package supplied as part of SDS.
IMP: -- The Interprocess Message Passing system. IMP is the underlying message system used by DITS. It provides the fastest possible means of sending messages between processes on the same machine while also providing an almost transparent interface for sending messages to other machines across a network.
ERS: -- The DRAMA Error Reporting System provides error reporting in a consistent manner. It is independent of any user interface or message sending mechanism.

IV. Overview of a DRAMA Task

Each DRAMA task responds to a set of specific named actions. A user interface or an action executing in another task (the `requesting task') sends it a message (an `OBEY' message) to initiate one of these actions. During execution of the code associated with the action (the `action routine') the task continually reschedules the action, returning from the action routine to allow the underlying DRAMA system to read any new messages sent to the task. These may be new `OBEY' messages to initiate other, concurrent, actions, or they may be `KICK' messages which provide additional information connected with an already active action. KICK messages are usually sent by the task that originally requested the action. A common use is to cancel the action. If an action routine wants to pass information back to the requesting task it sends a `TRIGGER' message back to that task containing that information. (This then `triggers' a reschedule of the requesting action in the requesting task, so that it can read the information in the message.) A task can have global parameter values associated with it that can be set or read by other tasks, and individual actions can have specific arguments whose values are included in the OBEY messages.

V. A DRAMA Task in More Detail

The main routine of a DRAMA task looks something like this:

    Initialise Dits
    Initialise Parameter System (optional)
    Register Action handlers
    Main Message Loop
    Shutdown Dits

At the line "Main Message Loop", the program loops reading and processing incoming messages. In any interesting system, in addition to housekeeping messages, we will get messages each of which specifies the name of an action.

In the line "Register Action Handlers", each Task action is associated with an element of an array of structures. Each such structure element describes an action, including:

The name of the action (a character string).
A routine to execute when an OBEY message is received.
A routine to execute when a KICK message is received.

An OBEY message contains an action name. When an OBEY message is received, the obey routine registered for that action is activated.
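
As an illustration only, the registration and dispatch steps described above can be sketched in Python. DRAMA itself is written in C, and none of the names below are DITS routines: `ActionEntry`, `Task`, `register`, `dispatch` and the `MOVE` action are hypothetical, and the message is modelled as a plain dictionary rather than an SDS structure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical stand-ins for illustration; these are not DITS routines.
Handler = Callable[[dict], str]


@dataclass
class ActionEntry:
    name: str                        # the name of the action
    obey: Handler                    # run when an OBEY message arrives
    kick: Optional[Handler] = None   # run when a KICK message arrives


class Task:
    def __init__(self) -> None:
        self.actions: Dict[str, ActionEntry] = {}

    def register(self, entries) -> None:
        """Associate each action name with its handlers."""
        for entry in entries:
            self.actions[entry.name] = entry

    def dispatch(self, message: dict) -> str:
        """Process one incoming message, as the main message loop would."""
        entry = self.actions.get(message["action"])
        if entry is None:
            return "unknown action"
        if message["type"] == "OBEY":
            return entry.obey(message)
        if message["type"] == "KICK" and entry.kick is not None:
            return entry.kick(message)
        return "ignored"


def move_wheel(message: dict) -> str:
    return "moving to %d" % message["args"]["position"]


def stop_wheel(message: dict) -> str:
    return "stopping"


task = Task()
task.register([ActionEntry("MOVE", obey=move_wheel, kick=stop_wheel)])
```

With this table in place, an OBEY message naming `MOVE` reaches `move_wheel`, and a later KICK message naming the same action reaches `stop_wheel`.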

The obey routine has a range of facilities at its disposal, most based around the concept of rescheduling. A common example of an action is one that moves a wheel to a specified position. This is normally done by sending a command to hardware and then waiting for an interrupt which indicates either that the move is complete or that there is more work to be done. The complexity arises from two other possibilities: the user may request that the move be aborted, or the interrupt may never arrive. To handle the first of these, the task must wait for more messages. To handle the second it must set up a timeout. The DRAMA reschedule hides the details of doing this. After starting the wheel movement, the action routine tells Dits how to reschedule the action and which routine to invoke on the reschedule, and then it just returns to the "Main Message Loop".

When the reschedule event (interrupt or timeout) occurs, the routine specified for the action is invoked. This sequence can be repeated indefinitely. Aborts are handled by the requesting task sending KICK messages. Timeouts and reschedules are both handled by arranging for a message to be sent to the task at the appropriate time. So the whole system is completely message-driven.

The technique of specifying a new routine to execute the next part of the action allows action implementations to be neatly broken up into separate routines. A task tends to operate as a `state machine', changing state (and registered action routines) in response to external events which give rise to messages sent to the task.
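
The wheel example can be sketched as a small state machine. This is a Python illustration under stated assumptions, not DRAMA code: the event names (`INTERRUPT_MORE`, `INTERRUPT_DONE`, `TIMEOUT`, `KICK`) and the functions are hypothetical, and each stage simply returns the routine to invoke on the next reschedule, or `None` once the action has completed.

```python
def start_move(event, log):
    """First stage: start the hardware, then ask to be rescheduled."""
    log.append("command sent to wheel")
    return wait_for_wheel          # routine to invoke on the next reschedule


def wait_for_wheel(event, log):
    """Later stage: invoked on an interrupt, a timeout or a KICK."""
    if event == "INTERRUPT_MORE":
        log.append("intermediate step done")
        return wait_for_wheel      # more work to do, stay in this state
    if event == "INTERRUPT_DONE":
        log.append("move complete")
    elif event == "TIMEOUT":
        log.append("error: no interrupt from wheel")
    elif event == "KICK":
        log.append("move aborted")
    return None                    # action complete


def run_action(events):
    """Drive one action with a sequence of reschedule events."""
    log = []
    stage = start_move("OBEY", log)
    for event in events:
        if stage is None:
            break                  # action already finished; ignore the rest
        stage = stage(event, log)
    return log
```

Because every event arrives as a message, the abort and the timeout need no special machinery here: they are simply two more events that the current stage has to handle.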

When the action completes, a completion message is sent to the initiator of the original message.

An action can also send messages to other tasks. The target task is known as a subsidiary task and the target action as the subsidiary action. When the subsidiary action completes, a message is sent to the requesting task. This will trigger a reschedule of the initiating action in that task. In addition to completion messages, the subsidiary action may send other messages to the initiating action, such as error and informational messages for the user. These messages can cause a reschedule of the initiating action but are normally just forwarded to the parent of the initiating action and so on until they reach a user interface. The other message which can be sent is a TRIGGER message. TRIGGER messages are intended to allow the subsidiary action to pass information back to its parent during the course of the action; for example, to describe the progress so far.

When a KICK message is received, the corresponding kick routine is activated. KICK messages are intended to affect the way a currently active action is executed.

KICK messages resemble ADAM Cancel messages, although they are more flexible, not being restricted to cancelling the action. A KICK message is defined as a message intended to cause a task to be rescheduled within the context of an already executing action. It can be used to cancel the action or just to give it some more information. KICK and TRIGGER messages allow two-way communication between an action and its parent action.

If actions reschedule frequently then multiple actions (of different names) can be active at any one time. When one returns to be rescheduled, the next one for which a message is available is invoked. So each DRAMA task is running as a cooperative multi-threaded system, each action being a separate thread.
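
The interleaving of actions can be pictured with Python generators, where each `yield` stands for an action routine returning to the message loop. This is a simplified sketch: the action names are hypothetical, and the round-robin order used here is an assumption made for brevity, whereas in DRAMA the next action invoked is the one for which a message is available.

```python
from collections import deque


def action(name, steps, trace):
    """A hypothetical action: each `yield` is a return to the message loop."""
    for step in range(1, steps + 1):
        trace.append("%s step %d" % (name, step))
        yield


def run(actions):
    """Round-robin over the active actions until all have completed."""
    ready = deque(actions)
    while ready:
        current = ready.popleft()
        try:
            next(current)
            ready.append(current)   # rescheduled: goes to the back of the queue
        except StopIteration:
            pass                    # action complete


trace = []
run([action("MOVE", 2, trace), action("EXPOSE", 3, trace)])
```

Neither action blocks the other as long as each one returns promptly, which is the essence of the cooperative scheme.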

The call to register action handlers can be made multiple times with later definitions of the same action name overriding the previous one. We have used this technique to implement a simple spectrograph task. A "Generic Instrumentation Package" implements the actions which should be in all of our tasks (such as INITIALISE, EXIT, SIMULATE_LEVEL etc.). This package is "inherited" by the spectrograph task, which just adds its own specific commands and may choose to override particular ones such as INITIALISE. This makes it very easy to build a task obeying particular standards and to update that task should the standards change (by relinking).
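
The overriding behaviour can be sketched as follows. This is a Python illustration only: the tables, the `register` function and the `EXPOSE` action are hypothetical, and the only property taken from the text is that a later registration of the same action name replaces the earlier one.

```python
def generic_initialise():
    return "generic initialise"


def generic_exit():
    return "generic exit"


def spectrograph_initialise():
    return "spectrograph initialise"


def spectrograph_expose():
    return "expose"


# Hypothetical action tables; later registrations replace earlier ones.
GENERIC_ACTIONS = {"INITIALISE": generic_initialise, "EXIT": generic_exit}
SPECTROGRAPH_ACTIONS = {
    "INITIALISE": spectrograph_initialise,   # overrides the generic version
    "EXPOSE": spectrograph_expose,           # specific to this task
}


def register(table, new_actions):
    """Add actions to the task's table, overriding any with the same name."""
    table.update(new_actions)
    return table


actions = {}
register(actions, GENERIC_ACTIONS)        # "inherit" the generic package
register(actions, SPECTROGRAPH_ACTIONS)   # then add and override
```

After both calls the task still answers `EXIT` with the generic handler, while `INITIALISE` resolves to the spectrograph's own version.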

The global parameter system is separate from the command line arguments. We intend to use parameters for task reconfiguration and to get task status. A task may send messages to set or get the parameters of other tasks. The parameter system is optional; if one is used the standard system may be `inherited' by a task, or a new one may be provided by the task designer. This allows you to write stripped down tasks or, less commonly, to move to a parameter system of your own design.

A major break from the traditional ADAM approach is the separation of the parameter system from command line arguments. Under ADAM, command line arguments are normally associated with parameters. This resulted in command arguments always being passed as character strings, even when both tasks involved in a communication wanted it in another form. It has also led to heavy overheads as command line arguments are parsed for each message received, even when the argument is not actually used.

As part of this break, we have moved away from the ADAM approach where the target task prompts the user for parameter values which have not been set correctly. DRAMA does not use prompting but relies on the user interface having sufficient information to ensure commands passed to a task are complete. We foresee user interfaces which know about the tasks they are to control. This could be done in the case of general user interfaces by loading a task Interface Definition File when they load the task. Under ADAM, this file is only loaded by the task itself. We have already implemented a version of this by using a special ADAM user interface task which loads the interface definition file of a target task. Using the ADAM to DRAMA interface facilities, the target task can be a DRAMA task.

VI. IMP

The Interprocess Message Passing System (IMP) implements a fast, non-blocking, network based message system. It has been designed around a task model which is driven by the arrival of incoming messages.

IMP is necessary as none of the vendor supplied message protocols are both portable enough and always fast enough to do the job. For interprocess communication on the same machine IMP uses shared memory. The message notification mechanism is dependent on the machine in use -- under VxWorks, we use semaphores, under VMS a `hibernate/wake' mechanism, etc. The particular mechanism used is normally the fastest available on the target machine. It is also possible to force an individual task to use a mechanism that is compatible with the X-Windows toolkit. This is generally slower but allows X-Windows based user interfaces to be built.

For networked communication, IMP uses two network communication tasks on each of the machines communicating. One of these is known as the "transmitter", the other as the "receiver". Encapsulating all the details of networked communication in these two tasks makes it easier to provide non-blocking network I/O, and to modify the system if necessary to support other network protocols.

IMP itself sets no limits on the maximum size of a message.

IMP also has task loading and timer support. Having task loading in IMP allows us to ensure appropriate messages are sent if a task fails to load or dies. Timer support is most conveniently provided in IMP, since all timer events under DRAMA are signalled by the sending of messages.
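
The idea that a timer event is just another message can be sketched with the Python standard library. This is not how IMP is implemented; `start_timer`, the message layout and the `MOVE` action name are assumptions made for illustration, and a `queue.Queue` stands in for the task's incoming message queue.

```python
import queue
import threading


def start_timer(inbox, delay_seconds, action_name):
    """Arrange for a TIMEOUT message to be put on the task's queue."""
    message = {"type": "TIMEOUT", "action": action_name}
    timer = threading.Timer(delay_seconds, inbox.put, args=(message,))
    timer.daemon = True     # do not keep the process alive for the timer
    timer.start()
    return timer


inbox = queue.Queue()
inbox.put({"type": "KICK", "action": "MOVE"})   # an ordinary message
start_timer(inbox, 0.05, "MOVE")                # expiry arrives the same way

# The task reads both from the same queue, with no separate timer handling.
first = inbox.get(timeout=2.0)
second = inbox.get(timeout=2.0)
```

The reader of the queue needs only one code path, since the timeout is delivered exactly like any other message.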

VII. SDS

The Self-defining Data System (SDS) is a system which allows the creation of self-defining hierarchical data structures in a form which allows the data to be moved between different machine architectures. Because the structures are self-defining they can be used for communication between independent modules in a distributed system. The data structures are dynamic, allowing components to be added or deleted, or the size of arrays to be changed. The structures include the names of their components and details of how any data values are stored (IEEE or VAX floating point, byte order, etc.). This allows SDS to locate items by name and provide transparent data conversion of their values to the local machine format.
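
The principle of carrying the storage description along with the data can be sketched in Python. The layout below is a made-up example and not the SDS format: a byte-order flag, a type code, a length-prefixed name, then the value. The functions `encode` and `decode` and the item names are likewise hypothetical.

```python
import struct

# A hypothetical wire layout, not the SDS format: a byte-order flag,
# a type code, a length-prefixed name, then the value itself.
FORMATS = {b"i": "i", b"d": "d"}   # 32-bit integer, 64-bit IEEE float


def encode(name, code, value, order):
    """Pack one named item using byte order '<' (little) or '>' (big)."""
    raw_name = name.encode("ascii")
    header = order.encode("ascii") + code + bytes([len(raw_name)]) + raw_name
    return header + struct.pack(order + FORMATS[code], value)


def decode(buffer):
    """Recover (name, value) using only the description in the buffer."""
    order = buffer[0:1].decode("ascii")
    code = buffer[1:2]
    name_len = buffer[2]
    name = buffer[3:3 + name_len].decode("ascii")
    (value,) = struct.unpack(order + FORMATS[code], buffer[3 + name_len:])
    return name, value


# The same item as written by a big-endian and a little-endian machine.
big = encode("EXPOSURE", b"d", 12.5, ">")
little = encode("EXPOSURE", b"d", 12.5, "<")
```

The two buffers differ byte for byte, yet `decode` returns the same name and value from either, because the reader converts according to the description stored with the data.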

SDS is flexible and powerful. It is described in detail by Bailey (1993).

SDS is the format used by DRAMA to encode its messages. Every DRAMA message is an SDS structure. There are a number of standard fields in this (action name, etc.) which are interpreted by the underlying DRAMA code before the action routine is called. Additional information, specific to the action, can be placed in a sub-structure of the message and the action routine can access this directly through calls to SDS routines.

This SDS sub-structure may be a complex hierarchical structure, so it can represent anything that one task wishes to pass to another, including complete data arrays (CCD images, etc.).

It is up to the actions involved to define what is in this SDS sub-structure, but we have defined a couple of standards. If the top level name of the SDS sub-structure is "ArgStructure", then it contains a list of named scalar or string items. These items may be accessed using the ARG routines supplied with SDS. The purpose of this definition is to allow user interfaces which are independent of the system they are controlling. Such user interfaces need to wrap up arguments to commands in an SDS structure. This structure is then sent to target tasks which must always be able to understand it, regardless of the type of target task. The ArgStructure definition allows this to be done.
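
A rough Python model of the ArgStructure convention follows. It does not reproduce the ARG routines: the function names are hypothetical, a dictionary stands in for the SDS structure, and the conversion from string to integer on retrieval is an assumption about what a generic user interface would find convenient.

```python
def arg_new():
    """Create an empty argument structure with the standard top-level name."""
    return {"name": "ArgStructure", "items": {}}


def arg_put(args, name, value):
    """Store one named item; only scalars and strings are allowed."""
    if not isinstance(value, (int, float, str)):
        raise TypeError("only scalar or string items are allowed")
    args["items"][name] = value


def arg_get_int(args, name):
    """Fetch an item as an integer, converting from a string if needed."""
    return int(args["items"][name])


def arg_get_string(args, name):
    """Fetch an item as a string, whatever type it was stored as."""
    return str(args["items"][name])


def is_arg_structure(structure):
    """A receiving task can recognise the convention by the top-level name."""
    return structure.get("name") == "ArgStructure"


# A generic user interface wraps what the user typed, here as strings.
args = arg_new()
arg_put(args, "POSITION", "3")
arg_put(args, "FILTER", "red")
```

The receiving action asks for `POSITION` in the form it needs, so the user interface does not have to know the argument types of every task it controls.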

The second standard, currently under development, is the "ImageStructure" standard. This allows a user interface to respond to images sent from subsidiary tasks. For example, an X-Windows based user interface may display the image, while a command line version may write it to a file for later display.

The situations where non-standard SDS structures are required do not normally involve general user interfaces. Therefore task documentation must define the structures sent. Even in this case, since SDS is self defining, a user interface can make a good attempt at doing something with the structure.

VIII. Other Facilities

Special techniques are available to allow the writing of user interfaces. These techniques are required as under DRAMA all actions should normally have a continuing parent action which initiated them, to which completion and other messages are sent. User interfaces generally disobey this rule. However, by having a special task context for such cases, user interfaces can be built quite easily. The basic user interface is a simple program which sends a single action request to a specified task and waits for the responses. A Motif X-Windows version is also provided.

An ADAM to DRAMA interface program is provided. It allows an ADAM task to send commands to a DRAMA task. By using this program, we automatically get some fancy user interfaces such as ADAM's programmable command language, ICL.

IX. Current Status

The current implementation is entirely in C and has been ported to VMS, SunOS and VxWorks systems with TCP/IP as the underlying network protocol. Most facilities are provided, the major exception being task loading. The system is well documented.

X. Acknowledgements

The design of DRAMA owes much to the work done by William Lupton (now at Keck Observatory, Kamuela, Hawaii) on the ADAM version 2 task structure. We have also benefited from discussions with and guidance from John Straede and Lewis Waller at AAO.

References

Bailey, J. A. 1993, A Self-Defining Hierarchical Data System, in Astronomical Data Analysis Software and Systems II, ed. Brissenden et al., San Francisco, Astronomical Society of the Pacific (in press).
Kelly, B. D. 1992, Starlink User Note 134, ADAM -- Guide to Writing Instrumentation Tasks.
Shortridge, K., Farrell, T. J., Bailey, J. A. 1993, The Data Acquisition System for the AAO 2-degree Field Project, in Astronomical Data Analysis Software and Systems II, ed. Brissenden et al., San Francisco, Astronomical Society of the Pacific (in press).

For more information, contact tony.farrell@mq.edu.au