Creating Well-Behaved Asynchronous Executables

Establishing a proper operational environment for an application can be complex. There are the arguments passed by the command shell (or programmatically), persisted values to be loaded from configuration files and perhaps parsing of input, provided either as a stream on standard input or as a named file. There are also issues such as asynchronous interruption by a platform signal.

This leaves considerable room for problems. These are the generic details of being a process, as distinct from the specific features and behaviours delivered by a product. There is an obvious motivation to resolve all the related issues once, and to a high standard. There is also a software maintenance motivation to provide the solution as a library, so that improvements made in one place propagate mechanically to every application based on that library.

Ansar provides the create_object() function as one response to those motivations. It exists primarily to put the proper support structure around a “main” async object, but has been generalized to suit a range of scenarios. It can be used to develop;

  • any application that requires some combination of argument processing, application persistence, sophisticated input/output, or interrupt handling (control-c),

  • asynchronous applications with the same options as above,

  • pure asynchronous applications (aka, components), i.e. processes that can be managed in groups.

Starting At The Command-Line

Consider the description of a standard *nix shell command-line below;

$ [<variable>=<value> ..] app-1 [--<setting>=<value> ..] [word ..] < input > output

There are lists of variables, settings and words. There is also an expression of where input may come from and where output might go to. In plain language the app-1 application accepts a set of arguments, a list of words and input-output information. It can also share a set of global variables with other applications started in the same environment. Except for the presence of an executable, all the elements of this command-line are optional.
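As a rough illustration of that anatomy, the sketch below splits an argument list into settings, flags and words using only the standard library. The name split_command_line is hypothetical and this is not ansar's own parser; it only mirrors the shape of the command-line described above.

```python
def split_command_line(argv):
    """Split arguments into (settings, flags, words).

    Hypothetical helper for illustration; not part of ansar.
    """
    settings, flags, words = {}, [], []
    for a in argv:
        if a.startswith('--'):
            name, sep, value = a[2:].partition('=')
            if sep:
                settings[name] = value   # --<setting>=<value>
            else:
                flags.append(name)       # bare flag, e.g. --store-settings
        else:
            words.append(a)              # anything else is a word
    return settings, flags, words


print(split_command_line(['--colour=supermarine', '--store-settings', 'left', 'right']))
```

Redirection of input and output is handled by the shell before the process starts, so it never appears in the argument list.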

Ansar directly supports this definition of how a process is started. The create_object() function optionally accepts an object as input and a set of arguments. It also manages persistent configuration and output of an application object, and it does all of this fully-typed - ansar-encode is used to move data in and out of the process.

If a create_object()-based application expects input, then ansar decodes the expected type from stdin and passes it to the “main” async object, e.g. to __init__() (or directly if the object is a function). If the application declares a settings object then ansar keeps a copy tucked away and recovers it on every start of the application. Arguments are decoded into the members of the recovered object. Lastly, the return value from the async object is encoded onto stdout. The function looks like this;

def create_object(object_type,
    factory_settings=None, factory_input=None, factory_variables=None,
    parameter_passing=standard_passing, logs=log_to_nowhere):

The presence of a parameter indicates the requirements of the application;

factory_settings
    persisted values, recovered by each invocation

factory_input
    per-invocation values

factory_variables
    values shared across multiple processes

upgrade
    version handling for settings and input

parameter_passing
    custom command-line processing

logs
    custom log handling

For the 3 factory_xxx parameters, the general strategy is to pass an instance of a registered class (i.e. registered with ar.bind()) and the type information associated with that class is used to curate the flow of the associated data. The values inside the instance are assumed to be the factory defaults.

Where there is any form of version mismatch the upgrade function is called, e.g. where an updated version of an executable is installed on a system that still has the settings for the previous version. The offending object is passed to the function and an improved version is expected in return. Version handling occurs in the ansar-encode library. Full details on the proper expression of version changes, and on how an upgrade function should be written, are in the associated documentation.

At least in the early days of a versioned type, upgrade functions are simple. An example is provided at the end of this document.

Persistent Settings

Consider the following code;

import ansar.create as ar

class AppSettings(object):
    def __init__(self, colour=None, number=0):
        self.colour = colour or str()
        self.number = number

ar.bind(AppSettings)

def app_1(self, settings):
    print(settings.colour)
    return 0

ar.bind(app_1)

if __name__ == '__main__':
    ar.create_object(app_1, factory_settings=AppSettings(colour='chartreuse'))

This application has persistent configuration, as declared by the presence of factory_settings. It remembers two values from one invocation to the next. Those values are known by the names colour and number and they each have their own type (i.e. determined during ar.bind(AppSettings));

$ python3 app-1.py
chartreuse
$ python3 app-1.py --colour=supermarine
supermarine
$ python3 app-1.py
chartreuse
$ python3 app-1.py --colour=supermarine --store-settings
{
    "value": [
        "ansar.create.lifecycle.Ack",
        {},
        []
    ]
}
$ python3 app-1.py
supermarine
$ python3 app-1.py --number=1.25
{
    "value": [
        "ansar.create.lifecycle.Faulted",
        {
            "condition": "cannot assign values to settings",
            "explanation": "transformation (no transformation for data/specification float/Integer8)"
        },
        []
    ]
}
$ python3 app-1.py --reset-to-factory-settings
{
    "value": [
        "ansar.create.lifecycle.Ack",
        {},
        []
    ]
}
$ python3 app-1.py
chartreuse

The first command prints the default colour from factory_settings. The next 2 commands involving a new colour show that command-line settings are transient (per-invocation) unless the --store-settings flag is added. Saving of the value is confirmed by the output of an Ack. Note that for operations like saving, the application object is never instantiated, so the current colour is not printed. An attempt to assign a floating point value to the integer number fails and produces a description of why. Lastly, there is the ability to revert to the settings compiled into the executable. The long argument name is deliberately inconvenient.

The full sophistication of ansar.encode is available. Members of AppSettings can involve complex types such as lists, maps and nested objects (instances of registered classes) and (with one constraint) these can all be entered at the command-line, using --setting=encoding where setting is a name from the registered class and encoding is the JSON representation of the associated type. However, in this context members cannot involve graph objects.
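The idea of --setting=encoding can be sketched with the standard json module standing in for ansar-encode. Everything here is an assumption made for illustration: the assign_setting helper, the extra tags member, and the rule that string members take the text as typed while other members are decoded as JSON.

```python
import json


class AppSettings(object):
    def __init__(self, colour='', number=0, tags=None):
        self.colour = colour
        self.number = number
        self.tags = tags or []   # hypothetical list member, for illustration


def assign_setting(settings, name, encoding):
    """Decode the text of --name=encoding and assign it to a member.

    Hypothetical helper; the factory value of each member stands in
    for the type information that ansar derives from ar.bind().
    """
    if not hasattr(settings, name):
        raise ValueError('unknown setting "%s"' % (name,))
    current = getattr(settings, name)
    if isinstance(current, str):
        value = encoding               # assumption: strings are taken as typed
    else:
        value = json.loads(encoding)   # JSON representation of the member
    if type(value) is not type(current):
        raise ValueError('no transformation for %s/%s' % (
            type(value).__name__, type(current).__name__))
    setattr(settings, name, value)


s = AppSettings(colour='chartreuse')
assign_setting(s, 'tags', '["north", "south"]')
print(s.tags)
```

A float offered for the integer member is rejected here with a ValueError, which loosely mirrors the Faulted output in the transcript above.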

This is a good arrangement for the following reasons;

  • it leverages the expressiveness of ansar-encode in areas that are often neglected, e.g. environment variables.

  • the representation of data is consistent across all the data input/output mechanisms, e.g. input pipe, command-line arguments, and persistent configuration.

  • usage is consistent across all create_object()-based executables.

That last item on the list is crucial to the wider goals of ansar. Without consistent behaviour across processes there can be no reliable basis for process orchestration and no development automation.

Shared Environment Variables

Similar handling is available for environment variables and sophisticated input objects, i.e. factory_variables and factory_input respectively. If values are being injected into the host environment, accessing them looks like this;

import ansar.create as ar

class AppVariables(object):
    def __init__(self, colour=None, number=0):
        self.colour = colour or ''
        self.number = number

ar.bind(AppVariables)

def app_2(self):
    variables = ar.command_variables()
    print(variables.colour)
    return 0

ar.bind(app_2)

if __name__ == '__main__':
    ar.create_object(app_2, factory_variables=AppVariables(colour='chartreuse'))

This application expects values in the environment, as declared by the presence of factory_variables. It uses the same member declarations as the previous example.

$ python3 app-2.py
chartreuse
$ AR_V_COLOUR=supermarine python3 app-2.py
supermarine
$ python3 app-2.py
chartreuse
$ AR_V_NUMBER=1.25 python3 app-2.py
{
    "value": [
        "ansar.create.lifecycle.Faulted",
        {
            "condition": "cannot process environment variables",
            "explanation": "cannot decode value for \"number\", transformation (no transformation for data/specification float/Integer8)"
        },
        []
    ]
}

Prefixes are stripped (i.e. AR_V_), dashes are replaced with underscores and all text is folded to uppercase before matching is performed. Environment variables cannot be changed by the library, i.e. there is no equivalent to --store-settings. Values of variables must be modified at the point of injection into the environment, such as in an underlying shell script. Environment variables are not passed to the application object at creation time. This reduces the complexity of the object definition and acknowledges the static, global nature of this dataset. Given that static nature, there is no inherent danger in multiple threads within the application accessing a single copy of the variables.
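The matching rule can be sketched as follows. The match_variables helper is hypothetical, and applying the dash and uppercase folding to both the member names and the environment names is an assumption; the text above does not say which side is folded.

```python
def match_variables(environ, names, prefix='AR_V_'):
    """Return {member_name: text} for environment entries that match.

    Hypothetical helper for illustration; not part of ansar.
    """
    # Fold member names: dashes to underscores, then uppercase.
    folded = {n.replace('-', '_').upper(): n for n in names}
    matched = {}
    for key, text in environ.items():
        if not key.startswith(prefix):
            continue                       # not addressed to the application
        candidate = key[len(prefix):].replace('-', '_').upper()
        if candidate in folded:
            matched[folded[candidate]] = text
    return matched


env = {'AR_V_COLOUR': 'supermarine', 'HOME': '/home/user'}
print(match_variables(env, ['colour', 'number']))
```

The matched text would still need decoding against the type of each member, which is where a value such as 1.25 for number is rejected.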

Passing Complex Input Data

import ansar.create as ar

class AppInput(object):
    def __init__(self, colour=None, number=0):
        self.colour = colour or ''
        self.number = number

ar.bind(AppInput)

def app_3(self, settings, input):
    print(input.colour)
    return 0

ar.bind(app_3)

if __name__ == '__main__':
    ar.create_object(app_3, factory_input=AppInput(colour='chartreuse'))

This application expects a representation of an AppInput object on stdin, as declared by the presence of factory_input. It uses the same member declarations as the previous example.

$ python3 app-3.py --dump-input
{
    "value": {
        "colour": "chartreuse",
        "number": 0
    }
}
$ echo '{"value": {"colour": "chartreuse","number": 0}}' | python3 app-3.py
chartreuse
$ echo '{"value": {"colour": "supermarine","number": 0}}' | python3 app-3.py
supermarine
$ echo '{"value": {"colour": "supermarine","number": 1.25}}' | python3 app-3.py
{
    "value": [
        "ansar.create.lifecycle.Faulted",
        {
            "condition": "cannot decode input",
            "explanation": "transformation, near \"number\" (no transformation for data/specification float/Integer8)"
        },
        []
    ]
}
$ python3 app-3.py --dump-input > app-3-input
$ python3 app-3.py --input-file=app-3-input
chartreuse

A representation of the factory_input can be generated with the --dump-input flag. This provides a useful template for creating valid examples of input. There is no per-member control over application input. An entire object is decoded from stdin or an --input-file.
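The round trip from a dumped template back to a decoded object can be sketched with the standard json module standing in for ansar-encode. The dump_input and load_input helpers are hypothetical, and only the {"value": ...} wrapper is taken from the transcript above.

```python
import copy
import json


class AppInput(object):
    def __init__(self, colour='', number=0):
        self.colour = colour
        self.number = number


def dump_input(factory_input):
    """Produce a template in the {"value": {...}} form shown above."""
    return json.dumps({'value': vars(factory_input)}, indent=4)


def load_input(text, factory_input):
    """Decode a whole object, checking each member against the factory type.

    Hypothetical helper; members missing from the text keep factory values.
    """
    value = json.loads(text)['value']
    decoded = copy.copy(factory_input)
    for name, default in vars(factory_input).items():
        if name not in value:
            continue
        if type(value[name]) is not type(default):
            raise ValueError('cannot decode input, near "%s"' % (name,))
        setattr(decoded, name, value[name])
    return decoded


factory = AppInput(colour='chartreuse')
template = dump_input(factory)
print(load_input(template, factory).colour)
```

The whole object is decoded in one step, which matches the absence of per-member control noted above.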

Application Object Signature

The presence of factory_input alters the object-creation signature. The app_3() function is defined with both settings and input parameters. The framework tries to use the presence/absence of its own optional parameters to infer the correct signature to use. This approach avoids the requirement that all application objects adopt the same, full signature. In the case of an application that has factory_input but no factory_settings, the settings parameter is None.
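One way to picture that inference is a function that builds the arguments following self. This is only a sketch of the behaviour described in this document; creation_arguments and its parameter names are hypothetical and say nothing about how ansar implements the decision.

```python
def creation_arguments(factory_settings=None, factory_input=None,
                       settings=None, input_value=None):
    """Return the positional arguments that follow self.

    Hypothetical helper mirroring the three example applications:
    app_1(self, settings), app_2(self) and app_3(self, settings, input).
    """
    if factory_input is not None:
        # Input implies the full signature; settings may be None.
        if factory_settings is None:
            settings = None
        return (settings, input_value)
    if factory_settings is not None:
        return (settings,)
    return ()


print(creation_arguments(factory_input={'colour': 'chartreuse'},
                         input_value={'colour': 'supermarine'}))
```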

Storage Of Persistent Values

Recall this sequence of commands;

$ python3 app-1.py
chartreuse
$ python3 app-1.py --colour=supermarine --store-settings
{
    "value": [
        "ansar.create.lifecycle.Ack",
        {},
        []
    ]
}
$ python3 app-1.py
supermarine

During the create_object() function there is an evaluation of the command-line and environment variables to determine where settings should be stored. In this case the location is defined as;

  • the folder specified in the $ANSAR_TOOL variable,

  • the folder specified in the $HOME variable,

  • or the current working folder.

A folder called .ansar-tool at the resolved location is created as necessary. Further folders and files are created under the hidden tool folder on a per-executable basis, including a settings file for app-1 that contains a representation of an AppSettings instance.
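The resolution can be sketched as below. The resolve_tool_folder helper is hypothetical, and treating the list above as an order of precedence (first variable that is set wins) is an assumption. The sketch only computes the path; it does not create the folder.

```python
import os


def resolve_tool_folder(environ, cwd):
    """Return the path of the hidden tool folder.

    Hypothetical helper; assumes $ANSAR_TOOL is preferred over $HOME,
    with the current working folder as the fallback.
    """
    for name in ('ANSAR_TOOL', 'HOME'):
        folder = environ.get(name)
        if folder:
            break
    else:
        folder = cwd   # neither variable is set
    return os.path.join(folder, '.ansar-tool')


print(resolve_tool_folder(dict(os.environ), os.getcwd()))
```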

By default the create_object() function loads the currently stored settings. Command-line assignments are applied on top of those current settings and finally, if the --store-settings flag is present, a store operation occurs. On the very first run of a library-based executable the settings storage is initialized with the contents of the factory_settings object.
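That sequence (initialize on first run, load, apply assignments, optionally store) can be sketched with a plain JSON file. The start_with_settings helper and the file name are hypothetical, and plain dicts stand in for registered settings objects.

```python
import json
import os
import tempfile


def start_with_settings(path, factory, assignments, store=False):
    """Return the settings that one invocation would run with.

    Hypothetical helper for illustration; not part of ansar.
    """
    if not os.path.exists(path):
        # First run: storage is initialized from the factory settings.
        with open(path, 'w') as f:
            json.dump(factory, f)
    with open(path) as f:
        current = json.load(f)
    current.update(assignments)      # command-line values sit on top
    if store:
        with open(path, 'w') as f:   # only when the store flag is present
            json.dump(current, f)
    return current


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'app-1.json')
    factory = {'colour': 'chartreuse', 'number': 0}
    start_with_settings(path, factory, {'colour': 'supermarine'}, store=True)
    print(start_with_settings(path, factory, {})['colour'])
```

Unlike the behaviour shown in the transcript, this sketch returns the settings even when storing; the transcript shows that a real store operation ends with an Ack instead of running the application.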

Calling A Process And Expecting A Response

Processes are supported as valid asynchronous objects. The Process class is an async machine that accepts the name of an executable and an optional input object. It starts a new platform process and - if present - passes the encoded input object over an input pipe. Output is decoded and passed back to the parent of the async object.

import ansar.create as ar

def main(self):
    a = self.create(ar.Process, 'child')
    m = self.select(ar.Completed, ar.Stop)
    if isinstance(m, ar.Stop):
        self.send(m, a)
        self.select(ar.Completed)
        return ar.Aborted()
    return m.value

ar.bind(main)

if __name__ == '__main__':
    ar.create_object(main)

This application starts the child executable and waits for its completion. If a Stop message is received before that completion, it is forwarded to the underlying process and the application again waits for completion. Eventually it returns an Aborted object which then becomes the output of the application. Otherwise the application returns the results of the child process.

Async applications can create Process objects freely. Complex hierarchies involving 500+ processes are created regularly in testing, with full interrupt capability retained at all times. For this to work, the named executable must itself be an application that starts with a call to create_object(). The concept of “processes as async objects” is implemented by that pair of software artefacts: the Process machine in the parent and create_object() in the child.

A similar machine is provided for integration of those executables that do not start with create_object(). The Utility machine provides convenient transformation of application data into flags and treats input and output pipes as plain streams of characters. This covers most other usable executables.
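The typed exchange over pipes (encoded input on stdin, decoded output from stdout) can be sketched with the standard subprocess module. This is a blocking stand-in with no interrupt handling, so it is not how the Process machine works; call_process and the one-line CHILD program are hypothetical, and json stands in for ansar-encode.

```python
import json
import subprocess
import sys


def call_process(command, input_value=None, timeout=30):
    """Run a child, pass encoded input on stdin and decode its stdout.

    Hypothetical helper for illustration; not part of ansar.
    """
    if input_value is None:
        kwargs = {'stdin': subprocess.DEVNULL}   # no input object to pass
    else:
        kwargs = {'input': json.dumps({'value': input_value})}
    done = subprocess.run(command, capture_output=True, text=True,
                          timeout=timeout, **kwargs)
    return done.returncode, json.loads(done.stdout)['value']


# Stand-in child: decodes its input and encodes a result onto stdout.
CHILD = (
    "import json, sys;"
    "v = json.load(sys.stdin)['value'];"
    "json.dump({'value': v['colour'].upper()}, sys.stdout)"
)

print(call_process([sys.executable, '-c', CHILD], {'colour': 'chartreuse'}))
```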

Combining Processes Into Groups

Delivery of multi-process solutions is the primary goal of this library. The create_object() function is a necessary foundation. Further information on what it takes to develop and maintain arbitrary compositions of processes can be found in the following links;

  • a user guide to multi-processing development with ansar

  • and the ansar command reference.

A Minimal Implementation Of Upgrading

Upgrade functions may be focused on a single type or multiple, and will typically look like this;

def upgrade(m, v):
    if isinstance(m, SnoozeSettings):
        if v == '':
            return SnoozeSettings(seconds=m.seconds, note='(none)', bonus=0.0)
        elif v == '0.0':
            return SnoozeSettings(seconds=m.seconds, note=m.note, bonus=0.0)
    ar.cannot_upgrade(m, v)

If no version mismatch is detected the function is not called. The empty string is a special version tag generated when the type is moving from no versioning to the first version, e.g. 0.0. The work carried out by the function will match a history being maintained for SnoozeSettings;

class SnoozeSettings(object):
    def __init__(self, seconds=1.0, note=None, bonus=None):
        self.seconds = seconds
        self.note = note
        self.bonus = bonus

SNOOZE_SCHEMA = {
    "seconds": float,
    "note": str,
    "bonus": float,
}

SNOOZE_HISTORY = (
    ('0.0', ar.Added('note'), 'Added note for support hint'),
    ('0.1', ar.Added('bonus'), 'Added bonus as an incentive'),
)

ar.bind(SnoozeSettings, object_schema=SNOOZE_SCHEMA, version_history=SNOOZE_HISTORY)
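The relationship between a version history and the branches of an upgrade function can be sketched as a lookup of the entries that are newer than the stored version. This is one reading of the example above, not a description of ansar-encode; pending_changes is hypothetical and plain strings replace ar.Added() so that the sketch runs without the library.

```python
# Same shape as SNOOZE_HISTORY, with plain strings in place of ar.Added().
HISTORY = (
    ('0.0', 'note', 'Added note for support hint'),
    ('0.1', 'bonus', 'Added bonus as an incentive'),
)


def pending_changes(history, stored_version):
    """List the history entries newer than the stored version.

    Hypothetical helper; the empty string marks data saved before
    versioning began, so every entry is pending.
    """
    if stored_version == '':
        return list(history)
    versions = [h[0] for h in history]
    i = versions.index(stored_version)   # ValueError for an unknown version
    return list(history[i + 1:])


for version, member, reason in pending_changes(HISTORY, '0.0'):
    print(version, member, reason)
```

Under this reading, data stored without a version needs both note and bonus supplied, while data stored at 0.0 needs only bonus, which is what the two branches of upgrade() provide.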