JSON interface tutorial

The following sections show how to run platform tasks through the JSON interface and how to combine its features into analysis workflows and pipelines.

To execute the tutorial's JSON configurations, change to the docs/tutorial subdirectory of the genexplain-api Git repository. Let us start with the Hello world example.

Hello world example

The hello_world.json file contained in that folder and its output are shown below. The single task specified for the tasks property executes docs/tutorial/script.sh, which simply echoes its command-line input.

{
    "withoutConnect": true,
    "tasks": {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Hello world"]
    }
}

Executing the file produces the following output:

gene@xplain:genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec hello_world.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Hello world
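
As a rough mental model, an external task can be pictured as starting the program named by bin with params as its arguments and logging what that program prints. The Python sketch below illustrates only this idea. run_external is a hypothetical helper, not part of genexplain-api, and the Python interpreter stands in for sh and script.sh so that the sketch runs without the tutorial files.

```python
import subprocess
import sys


def run_external(task):
    """Run one 'external' task and return what the program printed (hypothetical helper)."""
    # Assumption: 'bin' names the program to start and 'params' holds its arguments.
    command = [task["bin"], *task["params"]]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    output = result.stdout.strip()
    if task.get("showOutput"):
        print("External output:", output)
    return output


# Stand-in for {"bin": "sh", "params": ["script.sh", "Hello world"]}:
# the Python interpreter echoes its argument, as script.sh is described to do.
hello_task = {
    "do": "external",
    "showOutput": True,
    "bin": sys.executable,
    "params": ["-c", "import sys; print(sys.argv[1])", "Hello world"],
}

if __name__ == "__main__":
    run_external(hello_task)
```

The sketch says nothing about how the Java application itself starts processes or handles errors; it only mirrors the roles of the bin, params and showOutput properties as they appear in the example.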

Lists of tasks

Building pipelines and workflows starts with lists of tasks that are executed sequentially. To specify such a list, make the value of tasks a JSON array.

{
    "withoutConnect": true,
    "tasks": [
        {
            "do": "external",
            "showOutput": true,
            "bin": "sh",
            "params": ["script.sh", "Executing task 1"]
        },
        {
            "do": "external",
            "showOutput": true,
            "bin": "sh",
            "params": ["script.sh", "Executing task 2"]
        },
        {
            "do": "external",
            "showOutput": true,
            "bin": "sh",
            "params": ["script.sh", "Executing task 3"]
        }
    ]
}

Execute the tutorial file task_list.json to see this at work.

gene@xplain:genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec task_list.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 1
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 2
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 3

Loading tasks from file

A pipeline with many tasks may be more convenient to manage in multiple files. It is possible to load tasks from a file.

We can simply write the three tasks of our previous example into a separate file, here loadable_task_list.json, and load the task array in our main object.

[
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 1"]
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 2"]
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 3"]
    }
]

In our main object we specify the file using the fromFile property as shown below. More details are explained in the documentation.

{
    "withoutConnect": true,
    "tasks": {
        "fromFile": {
            "file": "loadable_task_list.json"
        }
    }
}

Running it, we get:

genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec loading_task_list.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 1
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 2
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 3

Deep nesting and infinite looping

So now we can write bigger task lists into separate files, which is already one step towards maintaining more complex workflows. Sometimes, however, a workflow is more complex than can be conveniently handled with one file for all the tasks. The JSON executor is not limited to what we have shown so far, but allows for arbitrary levels of nesting. In the backend, a recursive method calls itself with JSON arguments until it finds a task that is processed by the specified executor.

Deep nesting

Let us create an extra level of nesting for task 1 in our example by specifying a list of tasks 1a - 1c in a separate file. We call this file nested_task_1.json.

[
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 1a"]
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 1b"]
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 1c"]
    }
]

Next, we adapt our loadable_task_list.json by replacing task 1 with a fromFile instruction that causes the new file to be loaded. We call this new file loadable_nested_task_list.json.

[
    {
        "fromFile": {
            "file": "nested_task_1.json"
        }
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 2"]
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 3"]
    }
]

Our main JSON object only requires a change of file name in the fromFile property. We save it under the name loading_nested_task_list.json.

{
    "withoutConnect": true,
    "tasks": {
        "fromFile": {
            "file": "loadable_nested_task_list.json"
        }
    }
}

Executing it, we see the expected output.

genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec loading_nested_task_list.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 1a
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 1b
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 1c
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 2
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 3

As the output confirms, tasks 1a - 1c were loaded from the file specified in the outer task list. We can add more and more levels, which allows us to break down workflows and pipelines into conveniently small and reusable pieces of JSON configuration. On top of that, there are more ways of loading executable tasks as well as of specifying task parameters through placeholders, which make the whole even easier to use and more flexible.
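
The recursive descent described in this section can be pictured with a small Python sketch. It is an illustration under assumptions, not the application's code: the FILES dictionary is an in-memory stand-in for the tutorial files, execute is a hypothetical function name, and "running" a task is reduced to recording its message.

```python
# In-memory stand-ins for the tutorial files (assumption: no real file access here).
FILES = {
    "loadable_nested_task_list.json": [
        {"fromFile": {"file": "nested_task_1.json"}},
        {"do": "external", "params": ["script.sh", "Executing task 2"]},
        {"do": "external", "params": ["script.sh", "Executing task 3"]},
    ],
    "nested_task_1.json": [
        {"do": "external", "params": ["script.sh", "Executing task 1a"]},
        {"do": "external", "params": ["script.sh", "Executing task 1b"]},
        {"do": "external", "params": ["script.sh", "Executing task 1c"]},
    ],
}


def execute(node, log):
    """Descend through arrays and fromFile references until a task with 'do' is found."""
    if isinstance(node, list):
        for item in node:  # arrays are processed in order
            execute(item, log)
    elif "fromFile" in node:
        # Load the referenced content, then recurse into it.
        execute(FILES[node["fromFile"]["file"]], log)
    elif "do" in node:
        # Stand-in for handing the task to its executor.
        log.append(node["params"][-1])
    else:
        raise ValueError(f"Unrecognized task definition: {node!r}")
    return log


main_config = {"tasks": {"fromFile": {"file": "loadable_nested_task_list.json"}}}
messages = execute(main_config["tasks"], [])
for message in messages:
    print("External output:", message)
```

Because every level is handled by the same function, adding a further level of nesting needs no new logic in the sketch, only another entry that points to another file.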

Infinite flow

Since the core executor function continues to call itself until it encounters an executable task, it is possible to build an infinitely running analysis pipeline. A possible use case is a process that continues to check for new data and analyzes them as soon as they are available. Here we focus on showing a way to build a self-perpetuating pipeline. Another important component for the described use case is following alternative analysis branches selected by some condition, which is shown in a later part of the tutorial.

As the previous section suggests, one can create a self-perpetuating process by specifying an array of tasks in a file that loads itself again at the end. Let us call such a task file infinite_loop.json and configure the tasks as shown below. At the end, a fromFile instruction causes the same file to be loaded again, starting the pipeline over.

[
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Starting infinite loop period"]
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Middle of infinite loop period"]
    },
    {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "End of infinite loop period"]
    },
    {
        "fromFile": {
            "file": "infinite_loop.json"
        }
    }
]

Our main config only requires a change of file name to load infinite_loop.json, and we save it as loading_infinite_loop.json.

{
    "withoutConnect": true,
    "tasks": {
        "fromFile": {
            "file": "infinite_loop.json"
        }
    }
}

Executing loading_infinite_loop.json produces the expected output and runs until it is forcefully interrupted.

genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec loading_infinite_loop.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Starting infinite loop period
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Middle of infinite loop period
INFO  c.genexplain.api.core.GxJsonExecutor - External output: End of infinite loop period
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Starting infinite loop period
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Middle of infinite loop period
INFO  c.genexplain.api.core.GxJsonExecutor - External output: End of infinite loop period
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Starting infinite loop period
[...]
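
To see why a self-referencing file yields an endless stream of tasks, the following Python sketch walks the structure lazily and stops on its own after seven tasks. It is only an illustration: FILES is an in-memory stand-in for infinite_loop.json, walk is a hypothetical function name, and nothing here is meant to describe how the Java application is implemented.

```python
from itertools import islice

# In-memory stand-in for infinite_loop.json (assumption: no real file access here).
FILES = {
    "infinite_loop.json": [
        {"do": "external", "params": ["script.sh", "Starting infinite loop period"]},
        {"do": "external", "params": ["script.sh", "Middle of infinite loop period"]},
        {"do": "external", "params": ["script.sh", "End of infinite loop period"]},
        {"fromFile": {"file": "infinite_loop.json"}},
    ],
}


def walk(node):
    """Yield executable tasks one at a time, following fromFile references as they come up."""
    if isinstance(node, list):
        for item in node:
            yield from walk(item)
    elif "fromFile" in node:
        yield from walk(FILES[node["fromFile"]["file"]])
    else:
        yield node


# The stream never ends by itself, so the sketch takes only the first seven tasks.
entry_point = {"fromFile": {"file": "infinite_loop.json"}}
first_seven = [task["params"][-1] for task in islice(walk(entry_point), 7)]
for message in first_seven:
    print("External output:", message)
```

In this sketch every pass through the file adds one more level of nested generators, so it is suited to demonstrating the principle rather than to running for a long time.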

Loading from a task library

Until now, we have worked with lists of task objects. Having taken the time to carefully write down the JSON config for some analysis tool, we would like to reuse it in different places. In the simple list format we have covered so far, the tasks are unnamed items and therefore not so easily accessible. Let us extend the main JSON config by one property that instructs the Java application to load task definitions from files. This property is called loadTasks and its value is an array of file names. A loadable file contains a JSON object whose key/value pairs consist of a task name and the corresponding task definition. Here we create such a file named task_lib.json for the three example tasks.

{
    "task1": {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 1"]
    },
    "task2": {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 2"]
    },
    "task3": {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "Executing task 3"]
    }
}

This file is similar to the array of tasks in loadable_task_list.json, except that the outer structure is an object and the tasks have names. Next we adapt the main JSON config to load this small library of tasks.

{
    "withoutConnect": true,
    "loadTasks": [
        "task_lib.json"
    ],
    "tasks": [
        { "fromLib": "task1" },
        { "fromLib": "task2" },
        { "fromLib": "task3" }
    ]
}

Executing the main JSON config shows the expected output. Note that, although the library is a JSON object rather than an array, the tasks are executed in the order given in the tasks array:

genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec loading_task_lib.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 1
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 2
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Executing task 3

Since the loadTasks value is an array, one can specify many files to load task definitions from.
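
A library lookup of this kind can be sketched in a few lines of Python. The sketch rests on assumptions: LIBRARY_FILES is an in-memory stand-in for files on disk, more_tasks.json is a made-up second file used only to show several libraries being combined, and load_tasks and resolve are hypothetical names. How the real application treats a task name that occurs in two files is not covered by this tutorial; in the sketch the file listed later simply wins.

```python
# In-memory stand-ins for library files ("more_tasks.json" is a made-up example file).
LIBRARY_FILES = {
    "task_lib.json": {
        "task1": {"do": "external", "params": ["script.sh", "Executing task 1"]},
        "task2": {"do": "external", "params": ["script.sh", "Executing task 2"]},
    },
    "more_tasks.json": {
        "task3": {"do": "external", "params": ["script.sh", "Executing task 3"]},
    },
}


def load_tasks(file_names):
    """Merge the name -> definition objects of all listed files into one library."""
    library = {}
    for name in file_names:
        library.update(LIBRARY_FILES[name])
    return library


def resolve(tasks, library):
    """Replace each {"fromLib": name} entry by the named definition, keeping list order."""
    return [library[t["fromLib"]] if "fromLib" in t else t for t in tasks]


main_config = {
    "loadTasks": ["task_lib.json", "more_tasks.json"],
    "tasks": [{"fromLib": "task3"}, {"fromLib": "task1"}, {"fromLib": "task1"}],
}
library = load_tasks(main_config["loadTasks"])
resolved = resolve(main_config["tasks"], library)
for task in resolved:
    print("External output:", task["params"][-1])
```

The usage part deliberately requests task3 first and task1 twice: the order and number of executions come from the tasks array, while the library only supplies the definitions.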

Parameter placeholders

To make task definitions more reusable, one can insert placeholders for parameters, which are replaced by the desired values specified in the main config. Let us modify the task_lib.json file, substitute the output strings with placeholders as shown below, and save the result as placeholder_task_lib.json. While we often use uppercase strings delimited by $-symbols, any readable string can be used as a placeholder.

{
    "task1": {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "$FIRST_MESSAGE$"]
    },
    "task2": {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "SECOND_MESSAGE"]
    },
    "task3": {
        "do": "external",
        "showOutput": true,
        "bin": "sh",
        "params": ["script.sh", "third message"]
    }
}

In the main config we need to add a replaceStrings property whose value is an array of arrays. We use arrays here because we want to maintain the order of the specified placeholders. For our example we add one two-element array per placeholder, with the placeholder as first element and its replacement as second element.

{
    "withoutConnect": true,
    "replaceStrings": [
        ["$FIRST_MESSAGE$", "First task"],
        ["SECOND_MESSAGE", "Second task"],
        ["third message", "Third task"]
    ],
    "loadTasks": [
        "placeholder_task_lib.json"
    ],
    "tasks": [
        { "fromLib": "task1" },
        { "fromLib": "task2" },
        { "fromLib": "task3" }
    ]
}

In the output of our example workflow the placeholders are replaced by the specified strings.

genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec loading_task_lib_with_placeholders.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: First task
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Second task
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Third task

Note that any JSON value can be used as a replacement.

Cascaded replacement

To build pipelines from reusable task definitions and smaller workflows, it is important to know that the replaceStrings value is intentionally an array, so that one can take advantage of the sequence of substitutions. The program iterates over the placeholder strings in the given order and substitutes all occurrences in a task definition with the current replacement. This means that one can replace a placeholder with another placeholder whose value is defined at a later position in the replaceStrings array. Here is an example. We change the placeholders in the main config as shown below and save the new main config as loading_task_lib_multiple_replacements.json.

{
    "withoutConnect": true,
    "replaceStrings": [
        ["$FIRST_MESSAGE$", "SECOND_MESSAGE"],
        ["SECOND_MESSAGE", "third message"],
        ["third message", "FINALLY"],
        ["FINALLY", "Finally, they are all equal."]
    ],
    "loadTasks": [
        "placeholder_task_lib.json"
    ],
    "tasks": [
        { "fromLib": "task1" },
        { "fromLib": "task2" },
        { "fromLib": "task3" }
    ]
}

And the output shows the result indicated by the final replacement value.

genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec loading_task_lib_multiple_replacements.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Finally, they are all equal.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Finally, they are all equal.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Finally, they are all equal.

This behavior of the placeholder handling makes it possible to create reusable task definitions with general placeholders, which can be customized as needed using a cascade of replacements defined in the main config.
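
The cascade can be reproduced with a short Python sketch. It is a simplified model under stated assumptions: apply_replacements is a hypothetical function, only string replacements are handled (the tutorial notes that any JSON value may serve as a replacement), and the task definitions are reduced to their params.

```python
def apply_replacements(value, replace_strings):
    """Apply each [placeholder, replacement] pair, in order, to every string in a task."""
    if isinstance(value, str):
        for placeholder, replacement in replace_strings:
            value = value.replace(placeholder, replacement)
        return value
    if isinstance(value, list):
        return [apply_replacements(v, replace_strings) for v in value]
    if isinstance(value, dict):
        return {k: apply_replacements(v, replace_strings) for k, v in value.items()}
    return value  # numbers, booleans and null are left unchanged


replace_strings = [
    ["$FIRST_MESSAGE$", "SECOND_MESSAGE"],
    ["SECOND_MESSAGE", "third message"],
    ["third message", "FINALLY"],
    ["FINALLY", "Finally, they are all equal."],
]
library = {
    "task1": {"do": "external", "params": ["script.sh", "$FIRST_MESSAGE$"]},
    "task2": {"do": "external", "params": ["script.sh", "SECOND_MESSAGE"]},
    "task3": {"do": "external", "params": ["script.sh", "third message"]},
}

cascaded = {name: apply_replacements(task, replace_strings) for name, task in library.items()}
for name, task in cascaded.items():
    print(name, "->", task["params"][-1])

# With the pairs in reverse order the cascade no longer chains through.
reversed_order = apply_replacements(library["task1"], replace_strings[::-1])
print("reversed order, task1 ->", reversed_order["params"][-1])
```

With the pairs in the order shown, all three tasks end up with the final message. With the same pairs reversed, task1 stops at SECOND_MESSAGE, because the pair that would replace it has already been processed. This is the reason the order of the array matters.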

Local parameter settings

It is often necessary to change parameters locally, at a certain point within a pipeline. For this purpose there is an executor named setParameters that can alter the replaceStrings items. In the following example we show how to modify existing items. In addition, setParameters can add items at the beginning or at the end of the replaceStrings array as well as remove placeholders. The latter features are explained in the corresponding documentation section.

{
    "withoutConnect": true,
    "replaceStrings": [
        ["$FIRST_MESSAGE$", "SECOND_MESSAGE"],
        ["SECOND_MESSAGE", "third message"],
        ["third message", "FINALLY"],
        ["FINALLY", "Finally, they are all equal."]
    ],
    "loadTasks": [
        "placeholder_task_lib.json"
    ],
    "tasks": [
        { "fromLib": "task1" },
        { "fromLib": "task2" },
        { "fromLib": "task3" },
        { 
            "do": "setParameters",
            "set": {
                "$FIRST_MESSAGE$": "Not anymore!",
                "SECOND_MESSAGE": "What happened?",
                "FINALLY": "This can be changed again."
            }
        },
        { "fromLib": "task1" },
        { "fromLib": "task2" },
        { "fromLib": "task3" },
        { 
            "do": "setParameters",
            "set": {
                "$FIRST_MESSAGE$": "SECOND_MESSAGE",
                "SECOND_MESSAGE": "FINALLY",
                "FINALLY": "All equal again."
            }
        },
        { "fromLib": "task1" },
        { "fromLib": "task2" },
        { "fromLib": "task3" }
    ]
}

And the output of this is:

genexplain-api/docs/tutorial$ java -jar ../../build/libs/genexplain-api-1.0.jar exec loading_task_lib_local_replacements.json 
INFO  com.genexplain.api.app.APIRunner - Running command exec
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Finally, they are all equal.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Finally, they are all equal.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Finally, they are all equal.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: Not anymore!
INFO  c.genexplain.api.core.GxJsonExecutor - External output: What happened?
INFO  c.genexplain.api.core.GxJsonExecutor - External output: This can be changed again.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: All equal again.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: All equal again.
INFO  c.genexplain.api.core.GxJsonExecutor - External output: All equal again.
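
The output above can be traced with a small Python model of the replacement list. The model is an assumption-laden sketch, not the executor's implementation: set_parameters and substitute are hypothetical helpers, and set_parameters only covers the case shown here, namely overwriting the replacement of placeholders that already exist while keeping their positions.

```python
def set_parameters(replace_strings, updates):
    """Overwrite the replacement of existing placeholders, keeping their positions."""
    return [[placeholder, updates.get(placeholder, replacement)]
            for placeholder, replacement in replace_strings]


def substitute(text, replace_strings):
    """Apply the [placeholder, replacement] pairs to one string, in order."""
    for placeholder, replacement in replace_strings:
        text = text.replace(placeholder, replacement)
    return text


# The messages of task1, task2 and task3 in placeholder_task_lib.json.
task_messages = ["$FIRST_MESSAGE$", "SECOND_MESSAGE", "third message"]

params = [
    ["$FIRST_MESSAGE$", "SECOND_MESSAGE"],
    ["SECOND_MESSAGE", "third message"],
    ["third message", "FINALLY"],
    ["FINALLY", "Finally, they are all equal."],
]
round_1 = [substitute(m, params) for m in task_messages]

params = set_parameters(params, {
    "$FIRST_MESSAGE$": "Not anymore!",
    "SECOND_MESSAGE": "What happened?",
    "FINALLY": "This can be changed again.",
})
round_2 = [substitute(m, params) for m in task_messages]

params = set_parameters(params, {
    "$FIRST_MESSAGE$": "SECOND_MESSAGE",
    "SECOND_MESSAGE": "FINALLY",
    "FINALLY": "All equal again.",
})
round_3 = [substitute(m, params) for m in task_messages]

for line in round_1 + round_2 + round_3:
    print("External output:", line)
```

In the second round, task3 still reaches the new FINALLY value because the untouched pair for "third message" continues to map to FINALLY. The cascade and the local changes therefore work together.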