Dynamic Expressions in Tool Descriptions
Create arguments that vary depending on features of the job, inputs, or outputs
This page contains expressions written in sbg:draft-2 version of the Common Workflow Language. For CWL v1.0 expressions, please see this page.
The tool editor allows you to describe the features and behavior of your command line tool. The resulting description is used to create an interface between your tool and other tools that can be run on the CGC.
Overview
If your tool has a certain behavior, controlled by a command line option argument, that you never want to vary between executions, you can 'hard code' this into the tool description. On the other hand, if you want to choose the argument every time you run the tool, you can manually input it with each execution. But, in many cases, a middle way is more appropriate: although an argument should vary between executions, it is required to do so in a deterministic way, dependent on other features of the tool execution.
To achieve this behavior, you can enter the argument in the tool description using a dynamic expression, rather than a literal. The dynamic expression can be given in terms of the inputs, outputs, or other aspects of a tool execution; the object that it refers to will then be determined at runtime on the basis of the values of these other objects.
Using dynamic expressions
Dynamic expressions are given in JavaScript. You can enter a dynamic expression anywhere in the tool editor where you find the symbol </>. For example, you could enter the JavaScript expression `1+2`, or a function body such as `{return 1+2}`.
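To make the two forms concrete, here is a minimal sketch of how a bare expression and a function body could each be evaluated to a value. The helper `evaluateExpression` is hypothetical and written only for illustration; it reflects an assumption about the semantics and does not describe how the tool editor evaluates expressions internally.

```javascript
// Hypothetical helper for illustration only: evaluates a dynamic expression
// given either as a bare expression (1+2) or as a function body ({return 1+2}).
function evaluateExpression(source, $job, $self) {
  const trimmed = source.trim();
  const isFunctionBody = trimmed.startsWith('{') && trimmed.endsWith('}');
  const body = isFunctionBody
    ? trimmed.slice(1, -1)           // strip the surrounding braces
    : 'return (' + trimmed + ');';   // wrap a bare expression in a return
  // $job and $self are passed in so that expressions can refer to them.
  return new Function('$job', '$self', body)($job, $self);
}

console.log(evaluateExpression('1+2'));          // bare expression
console.log(evaluateExpression('{return 1+2}')); // function body
```

Both calls produce the same value; the function-body form is mainly useful when an expression needs intermediate statements before its `return`.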
Places where you might want to use dynamic expressions include:
- Giving your tool's base command.
- Specifying the command options for your tool. Options related to tool inputs are set on the Inputs tab, and others are set in the Arguments field on the General Information tab.
- Specifying stdin and stdout.
- Setting the resources allocated to your tool (memory and CPU number).
- Setting up the glob pattern to catch outputs, on the Outputs tab.
Predefined JavaScript objects:
Some JavaScript objects have been defined in the scope of the tool execution, and are available for you to use in dynamic expressions to make them more expressive. The objects `$job` and `$self` denote, respectively, properties of the ongoing tool execution (the job), and the tool's inputs or outputs in a given execution. They are described below.
The $job object
The object `$job` refers to an ongoing tool execution. It has the properties `inputs` and `allocatedResources`, which, in turn, have further properties that describe the execution in question.
- `inputs`
  This is an object that denotes the inputs for the tool. Recall that tool inputs include files and parameters. The `inputs` object has one property for each input port: the property name is the `<input_id>` and its value is the `<input_value>`. `<input_id>` is the ID that you have labeled each input port with in the tool editor. `<input_value>` refers to the object that is entered into the port named `<input_id>`. `<input_value>` is set according to the following rule, illustrated in the table below: If the object entered into the port named `<input_id>` is any data type other than a file, then `<input_value>` is the object itself. If the object entered into the port `<input_id>` is a file, then `<input_value>` consists of four objects that describe, and uniquely determine, the file.
Data type input to the port named input_id | <input_value> |
---|---|
File | The following four objects make up the input_value: `class`, which is always 'File'; `secondaryFiles`, a list of secondary files (index files) attached to the file, which may be empty; `path`, the file path; `size`, the file size. |
String | The string entered to the input port. |
Enum | The enumerator of the enumerated type that is entered to the input port labeled by <input_id> |
Int | The integer entered to this input port. |
Float | The float entered to this input port. |
Boolean | The boolean value entered to this input port. |
Array | The array entered to this input port. |
- `allocatedResources`
  This is the second property of the object `$job`. It is an object with the properties `cpu` and `mem`: `cpu` represents the number of CPU cores that have been allocated to the tool for the execution in question; `mem` represents the memory, in megabytes, that has been allocated to the tool for the execution in question.
Required resources and `allocatedResources`
Note that the required resources, entered under 'Resources' on the General Information tab, differ from the `allocatedResources` object in `$job`: for optimization, more resources may be allocated to the tool than it actually requires to function. The resources actually allocated depend on the particular execution environment, whereas the required resources are a hard requirement of the tool. So, the value entered as a required resource will be less than or equal to the corresponding value in the `allocatedResources` object in `$job`.
Inspecting the $job object
The objects in `<input_value>` are defined at runtime, depending on which strings, ints, files, and so on are inputted to the tool. But we can use the Test tab to set dummy inputs, and then inspect the `$job` object.
To see the `$job` object for your tool description and test values:
- Click the cog icon for Settings in the top right hand corner of the Tool Editor
- Select the option </> Job JSON.
You will see a pop-up window displaying `$job`, expressed in JavaScript Object Notation (JSON), for the test values entered.
To get a better understanding of what `input_id` and `input_value` denote, let's create a tool description in the editor that has input ports for different selected data types, and then view the `$job`.
Examples of the $job object
$job example 1: The $job for a tool with ports for multiple data types
We'll create a tool that has input ports that take a range of different data types as input, and see how this affects `$job.inputs`.
For example, create the following ports on the Inputs tab:
- Port ID: `my_file`, Input type: file
- Port ID: `my_string`, Input type: string
- Port ID: `my_enum`, Input type: enum; enumerations: red, yellow, green, blue
- Port ID: `my_int`, Input type: int
- Port ID: `my_boolean`, Input type: boolean
On the Test tab you can set the values of these ports.
- Set `my_string` to 'here is a test string';
- Set `my_int` to 6;
- Set `my_boolean` to True (indicated by checking the box);
- Set `my_file` to test_file.etx;
- Set `my_enum` to yellow (selected from the enumerations we defined when creating the port).
These settings are shown in the screenshot of the Test tab, below:
Now that we have set the values for the objects inputted to the tool ports, let's look at the `$job` object. Click </> Job JSON under the settings cog icon in the top right corner.
The resulting $job object in JSON looks like this:
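As a rough sketch of the shape of that object (not actual editor output), the test values above could appear as follows. The file path, the file size, and the `cpu` and `mem` values are placeholders assumed for illustration.

```javascript
// Illustrative $job object for the test values set above.
// Assumed placeholders: the file path, the file size, and the cpu/mem values.
const $job = {
  inputs: {
    my_file: {
      class: 'File',                   // always 'File' for file inputs
      path: '/path/to/test_file.etx',  // hypothetical location
      size: 0,                         // hypothetical size
      secondaryFiles: []               // no index files attached
    },
    my_string: 'here is a test string',
    my_enum: 'yellow',
    my_int: 6,
    my_boolean: true
  },
  allocatedResources: { cpu: 1, mem: 1000 }
};

console.log(JSON.stringify($job, null, 2));
```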
You can see that each input port is associated with the value of the object that is inputted to it, as set on the Test tab.
The input port for files, `my_file`, is associated with the four objects that describe the file inputted to that port: `path`, `class`, `size`, and `secondaryFiles`. These associated objects are the `<input_value>` for the input ports.
The `allocatedResources` object in the `$job` also contains the same values as set (by default) on the Test tab.
If we were to execute this tool, the values of `<input_value>` would change to the actual strings, ints, files, and so on inputted to the tool.
$job example 2: The $job for a tool whose ports are reads and trim_size
Here is a more concrete example. Suppose that you have a tool whose input port IDs are `reads` and `trim_size`. In this case, `reads` takes files as input type, and `trim_size` takes integers. If we enter '10' for `trim_size`, and 'reads_file.ext' for `reads` on the Test tab, then the job object would look like this:
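A sketch of such an object, under stated assumptions, is given here. The directory in the file path, the file size, and the allocated resources are hypothetical values chosen for illustration; only the port IDs, the file name, and the trim size come from the example.

```javascript
// Illustrative $job for a tool with the ports reads and trim_size.
// Assumed placeholders: the directory in path, size, and cpu/mem.
const $job = {
  inputs: {
    reads: {
      class: 'File',
      path: '/path/to/reads_file.ext',
      size: 0,
      secondaryFiles: []
    },
    trim_size: 10
  },
  allocatedResources: { cpu: 1, mem: 1000 }
};

// A non-file input is the value itself; a file input is described by its properties.
console.log($job.inputs.trim_size);
console.log($job.inputs.reads.path);
```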
Examples of $job usage
Earlier, we listed fields in which you may want to use the `$job` object in dynamic expressions. Here are a few examples:
$job usage example 1: Extracting the file size
You may need to set the tool's required memory (on the General Information tab) to be a function of the size of the inputted file. In this case, you can use the object `$job.inputs.<input_id>.size` to pick out the size of the file inputted to the port `<input_id>`.
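As a sketch of what such a function of the file size might look like, the snippet below derives a memory value from the size of the file on a port named `reads`. The port ID, the sizing rule (twice the file size plus a 1000 MB baseline), and the assumption that `size` is reported in bytes are all hypothetical; adjust them to your tool.

```javascript
// Hypothetical sizing rule for illustration: twice the input size plus a
// 1000 MB baseline. Assumes size is in bytes, which the text does not state.
function requiredMemoryMb($job) {
  const BYTES_PER_MB = 1024 * 1024;
  const sizeMb = Math.ceil($job.inputs.reads.size / BYTES_PER_MB);
  return 2 * sizeMb + 1000;
}

// Dummy job with a 5 MB input file on the (assumed) port "reads".
const $job = {
  inputs: {
    reads: {
      class: 'File',
      path: '/path/to/reads_file.ext',
      size: 5 * 1024 * 1024,
      secondaryFiles: []
    }
  },
  allocatedResources: { cpu: 1, mem: 1000 }
};

console.log(requiredMemoryMb($job));
```

In the tool editor, the body of `requiredMemoryMb` would be entered as a function body in the memory field rather than as a named function.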
$job usage example 2: Extracting the number of threads
You may need to pass the number of CPUs allocated to the tool as an argument, for instance, as the value of the thread number argument, `num-threads`. In this case, in the Arguments field on the General Information tab, enter the following expression to pick out the CPUs allocated to the job at runtime:
$job.allocatedResources.cpu
This usage is illustrated below:
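The effect of this expression can also be sketched in code. The `--num-threads` spelling of the flag and the allocated values are assumptions made for this sketch; the editor itself combines the argument prefix and value for you.

```javascript
// Dummy job with assumed allocated resources.
const $job = {
  inputs: {},
  allocatedResources: { cpu: 4, mem: 1000 }
};

// The dynamic expression picks out the CPUs allocated at runtime.
const threads = $job.allocatedResources.cpu;

// Assumed flag spelling, shown only to illustrate the resulting argument.
const argument = '--num-threads ' + threads;

console.log(argument);
```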
$job usage example 3: Extracting file names and paths
You can use `$job` to perform string manipulation on file paths to extract file names. If, for example, you are copying files to the current working directory of the job, you will need to specify: (a) the file name and (b) the file to be copied. (These are specified in the field marked [Create Files on the General Information tab](general-tool-information#create files); see the section on Attaching Index Files for information on this use case.)
(a) The file name for the file input to the tool is obtained from the path property of the <input_id> object for the input port that takes files. The path of the input file is:
$job.inputs.<input_id>.path
To get the file name, we perform string manipulation on the file path: break the path up at the forward slashes using JavaScript's `split` method, and then use `slice` to select just the final piece of the path. This string manipulation can be performed using the following expression:
$job.inputs.<input_id>.path.split('/').slice(-1)[0]
(b) We can refer to the file itself using the `<input_id>` of the input port that takes files. The full expression to refer to the file is:
$job.inputs.<input_id>
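The two expressions, (a) and (b), can be tried out on a dummy job. In this sketch the port ID `reads` and the file path are assumed values used only for illustration.

```javascript
// Dummy job with a file on the (assumed) port "reads".
const $job = {
  inputs: {
    reads: {
      class: 'File',
      path: '/path/to/reads_file.ext',
      size: 0,
      secondaryFiles: []
    }
  }
};

// (a) The file name: split the path at '/', then take the last piece.
const fileName = $job.inputs.reads.path.split('/').slice(-1)[0];

// (b) The file itself: the whole object describing the input file.
const file = $job.inputs.reads;

console.log(fileName);
console.log(file.class);
```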
Below is a screenshot showing how these expressions would be entered in the Create Files fields:
Finally, the `$job` object is used to indicate errors in an imported tool. If a tool referred to in a tool description contains errors, such as a cyclical loop, these will be shown in the online editor and added to the JSON object displayed in the `$job` JSON, with the property `sbg:errors`.
The $self object
Like `$job`, the `$self` object is another hard-coded JavaScript object you can use to create dynamic expressions in tool descriptions. `$self` is used to modify the value of the input or output that gets passed to the command line. What it refers to differs depending on whether it occurs in the command line bindings for a tool input or for a tool output.
In the command line bindings for the tool inputs, `$self` is set to the value of the input. In other words, in this context `$self` is just `$job.inputs.<input_id>`.
In the command line bindings for the tool outputs, `$self` is an object with the properties `path` and `size` that match the globbed file (see here for more information on globbing).
Examples of $self usage
Here are a few examples of how you can use the `$self` object:
$self in the Inputs tab
- Use `$self` for string manipulation. Suppose that your tool takes input files that are entered on the command line, but expects these to be given as filenames, not as file paths. Since `$self` refers to the tool input, in the case that files are the input type `$self` will refer to a full file path. By performing string manipulation on this file path, as in the example above, you can extract the filename. Then, enter this as the input value to the tool.
- Another use case for `$self` is to manipulate an integer input. Suppose that your tool expects integer inputs in exponent notation, but inputs are given as integers. In this case, `$self` will refer to the integer inputted, and you can use an expression in terms of `$self` in the Inputs value field to give the integer in its exponent representation.
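Both use cases can be sketched as small functions of `$self`. The function names are hypothetical, and the file case assumes that `$self` is the file object with a `path` property, following the statement that `$self` equals `$job.inputs.<input_id>` for inputs. The integer case uses JavaScript's built-in `Number.prototype.toExponential`; whether its output format suits your tool is something to check.

```javascript
// File input: assumes $self is the file object, so the path is $self.path.
function toFileName($self) {
  return $self.path.split('/').slice(-1)[0];
}

// Integer input: render the integer in exponent notation.
function toExponentNotation($self) {
  return $self.toExponential();
}

console.log(toFileName({ class: 'File', path: '/path/to/reads_file.ext', size: 0, secondaryFiles: [] }));
console.log(toExponentNotation(1500));
```

In the editor, only the body of each function would be entered in the value field of the corresponding input.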
$self in the Outputs tab
- In addition to using `$self` to refer to a tool's inputs, you can use `$self.path` on the Outputs tab in the metadata field to refer to the path of the file matching the glob pattern. You might want to annotate output files with the metadata field 'file type', whose value depends on the particular extension of the file being output by the tool. In this case, set the metadata key to `file_type` and set the value to a JavaScript expression that picks out the part of the file path following the final dot, i.e., the file extension. The following expression will do:
$self.path.split('.').slice(-1)[0]
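To see what this expression yields, the sketch below applies it to a dummy output object. The function name `fileType` and the example path are hypothetical, chosen only for illustration.

```javascript
// On the Outputs tab, $self describes the file matched by the glob pattern.
function fileType($self) {
  // Split the path at every dot and keep the last piece: the extension.
  return $self.path.split('.').slice(-1)[0];
}

// Dummy globbed file with an assumed path.
const globbed = { path: '/path/to/sample.sorted.bam', size: 0 };

console.log(fileType(globbed));
```

Note that for a path containing no dot at all, this expression returns the whole path, so it is best suited to outputs that always carry an extension.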
Annotating output files with metadata
Metadata fields and their values are entered as key-value pairs on the Outputs tab; for details, see the section Annotating output files with metadata.
The example just described is shown on the screenshot below: