Create a tool

Creating a tool

  1. Navigate to your project.
  2. Click the Apps tab.
  3. Click Create app in the top-right corner. The app creation dialog opens.
  4. Select Command Line Tool.
  5. In the Name field, enter a name for your tool.
  6. Select the CWL Version to use. The latest and recommended CWL version is selected by default. Read more about CWL formats and versions.
  7. Click Create. You are now taken to the tool editor, where you can start entering the tool information in the sections described below.

Note: For apps described using the sbg:draft-2 version of CWL, you can switch back to the legacy editor by clicking your username in the top-right corner of the editor and selecting Switch back to the legacy editor.

Tool editor reference

Docker image

This is the location of the Docker image containing the tool.

If the image is in the CGC image registry, the format is cgc-images.sbgenomics.com/<repository>/<imagename>:<tag>, where:

  • <repository> is your CGC username, all in lowercase, with every non-alphanumeric character replaced with an underscore (“_”).
  • <imagename> is the unique name of the image.
  • <tag>, by convention, is the version number of the image.
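
The naming rule above can be sketched in Python. This is only an illustration of the described format, not platform code: the function name and the sample username, image name and tag are hypothetical, and the sketch assumes an ASCII username.

```python
import re

REGISTRY = "cgc-images.sbgenomics.com"  # registry host quoted in the text above


def cgc_image_reference(username: str, imagename: str, tag: str) -> str:
    """Build <registry>/<repository>/<imagename>:<tag> for a CGC-hosted image."""
    # Lowercase the username, then replace every character that is not an
    # ASCII letter or digit with an underscore, as described above.
    repository = re.sub(r"[^a-z0-9]", "_", username.lower())
    return f"{REGISTRY}/{repository}/{imagename}:{tag}"


# Hypothetical username, image name and tag.
print(cgc_image_reference("Rosalind.Franklin", "samtools", "1.9"))
```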

If the image is in Docker Hub, the format is <docker-repository>:<tag>, where:

  • <docker-repository> is the name of the Docker repository containing the image.
  • <tag>, by convention, is the version number of the image.

Base command

The base command is the part of the command before any items you want to specify as inputs to be resolved at runtime. This will include the command itself, and usually the sub-command, if present. It may also include command options and parameters that you want to set to hard-coded values, provided these can be positioned at the start of the command line. Alternatively, hardcoded command options and parameters can be defined as arguments, which can have defined positions in the command line.

You can specify everything in the base command as a single section, or, if you are using the sbg:draft-2 version of CWL, you can build this up in multiple sections. If you want to include dynamic expressions in the base command (CWL sbg:draft-2 only), you will have to use separate sections for each fixed part and each dynamic expression as you can’t mix fixed items and dynamic items in the same section.

Note: If you are using CWL v1.x, you cannot include expressions in the base command, so any dynamic expressions need to be specified as an argument, not as part of the base command.

Arguments

Arguments are command options and parameters that you want to set to predefined values that are applied every time the tool executes. Arguments can also include dynamic expressions. For example, you could use an expression that generates the output file name based on the name of an input file.
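
As an illustration of the file name example, the logic such an expression might encode is sketched below in Python. In the tool editor the expression itself would be written in the platform's expression language, so this only shows the idea. The function name and the ".sorted.bam" suffix are hypothetical.

```python
import posixpath


def output_name_for(input_path: str, suffix: str = ".sorted.bam") -> str:
    """Derive an output file name from an input file path (hypothetical rule)."""
    basename = posixpath.basename(input_path)   # drop any directory part
    root, _ext = posixpath.splitext(basename)   # drop the last extension
    return root + suffix


print(output_name_for("/data/sample1.bam"))
```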

If the argument could be positioned immediately after the base command in the command line, you could set it as part of the base command instead. But note that CWL v1.x does not support dynamic expressions in base commands, so if you need a dynamic expression you will have to use an argument.

Set the values in the object inspector as follows:

  • Use command line binding: whether to use command line binding. Set this option if you need to specify a prefix or a position (or both) for the argument, and clear it if the argument is a simple value or dynamic expression.
    • Prefix (shown if Use command line binding is selected): any prefix required by the argument. If the prefix is separated from the value by “=”, include “=” in the prefix string. Leave blank if not required.
    • Expression (shown if Use command line binding is selected): the value to use for the argument. This can be a dynamic expression.
    • Separate value and prefix (shown if Use command line binding is selected): whether there is a space between any prefix and the value.
    • Position (shown if Use command line binding is selected): the position in the command line to place this argument, if applicable. Position ordering includes both arguments and inputs in a single numbering sequence. Position values are relative, not absolute: for example, if you had two positional arguments and one positional input, you could specify positions -1, 3 and 99. Defaults to 0 if not set.
  • Value (shown if Use command line binding is cleared): the value to use for this argument. This can be a dynamic expression.
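
The interplay of prefix, separation and position can be sketched as follows. This is a simplified model under stated assumptions, not the platform's implementation: the Binding class, the tool name "mytool" and its options are hypothetical, and bindings that share a position simply keep their declaration order here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Binding:
    """One argument or input with a command line binding (illustrative model)."""
    value: str
    prefix: Optional[str] = None
    separate: bool = True   # "Separate value and prefix"
    position: int = 0       # defaults to 0 when not set


def render(binding: Binding) -> List[str]:
    """Turn one binding into command line tokens."""
    if binding.prefix is None:
        return [binding.value]
    if binding.separate:
        return [binding.prefix, binding.value]
    return [binding.prefix + binding.value]


def build_command(base_command: List[str], bindings: List[Binding]) -> List[str]:
    """Place the base command first, then the bindings ordered by position."""
    # sorted() is stable, so equal positions keep declaration order;
    # that tie-break is a simplification made for this sketch.
    tokens = list(base_command)
    for binding in sorted(bindings, key=lambda b: b.position):
        tokens.extend(render(binding))
    return tokens


# Hypothetical tool and options; only the relative order of positions matters.
command = build_command(
    ["mytool", "run"],
    [
        Binding("out.bam", prefix="-o", position=3),
        Binding("input.sam", position=99),
        Binding("4", prefix="--threads=", separate=False, position=-1),
    ],
)
print(" ".join(command))
```

Because positions are only compared with each other, -1, 3 and 99 produce the same order as 1, 2 and 3 would.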

Input ports

Input ports correspond to the variable parameters that can be set each time the tool is executed.

The following properties can be set in the object inspector for an input port:

  • Required: whether this input is mandatory.
  • ID: identifier of the input. This is shown against the input when the tool is placed in a workflow.
  • Type: the type of the input. Valid values are array, enum, record, File, string, int, long, float, double, boolean, map (a HashMap with string keys and string values, CWL sbg:draft-2 only), Directory (CWL v1.x only) and stdin.
  • Items type (only shown if Type is array): The type of the items in the array. Valid values are enum, record, File, string, int, long, float, double, boolean, and Directory (CWL v1.x only).
  • Allow array as well as single item (not shown if Type is array): whether the input is capable of accepting multiple items. If you set Type to a non-array type, and set Allow array as well as single item, the input can either be a single item or an array, whereas if you set Type to Array, and set Items type to an element type, execution will fail if a single item is supplied instead of an array.
  • Symbols (only shown if Type is enum, or Type is array and Item Type is enum): a list of the valid values for the input item.
  • Include in command line (not shown for map): whether to include this input item in the command line. Usually this will be set, as most inputs to the tool will be included in the command line, but occasionally you may want to supply an input to the tool which can be used in dynamic expressions to define other inputs, but doesn’t appear in the command line itself.
  • Value Transform (not shown if Type is record, or Type is array and Item Type is record): the value to assign to this input. This is optional (the value for an input port will commonly be set when the tool is executed). If supplied, this must be a dynamic expression, which will probably be constructed to transform the value supplied for that input. In CWL v1.x apps, if no input data is set during execution, expressions entered in this field will not be evaluated. To still have a value in the command line when there is no input data, please use Arguments.
  • Prefix: any prefix required to identify the input. Leave blank if not required.
  • Array Item Prefix (only shown if Type is array): the prefix to add before each item in the array. Available for CWL 1.x apps only. For the matching functionality in sbg:draft-2 apps, see Item Separator below.
  • Separate value and prefix: whether there is a space between any prefix and the value.
  • Position: the position in the command line to place this input, if applicable. Position ordering includes both arguments and inputs in a single numbering sequence. Position values are relative, not absolute: for example, if you had two positional arguments and one positional input, you could specify positions -1, 3 and 99. Defaults to 0 if not set.
  • Item separator (only shown if Type is array): the separator to use between the elements in the array. One of equal, comma, semicolon, space or repeat (this sets the separator to null, and repeats the prefix in front of every item in the array. Available only for sbg:draft-2 apps).
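
The difference between joining array items with a separator and repeating the prefix can be sketched as follows. This is a minimal illustration, assuming that the prefix is separated from the value by a space and that the separator names map to the characters shown. The function name and sample values are hypothetical.

```python
from typing import List

# Assumed mapping from the separator names listed above to characters.
# "repeat" is handled separately because it repeats the prefix instead
# of joining the items.
SEPARATORS = {"equal": "=", "comma": ",", "semicolon": ";", "space": " "}


def render_array(values: List[str], prefix: str, item_separator: str) -> str:
    """Render an array input as a command line fragment."""
    if item_separator == "repeat":
        # The prefix is written in front of every item.
        return " ".join(f"{prefix} {value}" for value in values)
    # Otherwise the prefix is written once, followed by the joined items.
    return f"{prefix} {SEPARATORS[item_separator].join(values)}"


files = ["a.bam", "b.bam"]
print(render_array(files, "-i", "comma"))
print(render_array(files, "-i", "repeat"))
```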

Stage input

  • Stage input (CWL sbg:draft-2 only, and only shown if Type is record or File, or if Type is array and Item Type is record or File): whether to make any file data available in the tool’s working directory. One of None (the file is available as an input to the tool, but isn’t copied into the tool’s working directory), Copy (the file is copied into the tool’s working directory), or Link (a symlink to the file is created in the tool’s working directory). You can read more about using staged inputs here.
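
The three staging options can be pictured with the sketch below. It only mimics the described behavior on a local file system. The function name is hypothetical, and the real staging is performed by the platform.

```python
import os
import shutil


def stage_input(source: str, workdir: str, mode: str = "None") -> str:
    """Return the path the tool would read, staging the file if requested."""
    if mode == "None":
        return source  # the file stays where it is
    target = os.path.join(workdir, os.path.basename(source))
    if mode == "Copy":
        shutil.copy(source, target)  # independent copy in the working directory
    elif mode == "Link":
        os.symlink(os.path.abspath(source), target)  # symlink to the original
    else:
        raise ValueError(f"unknown staging mode: {mode!r}")
    return target
```

With Copy, the tool can modify its copy without touching the original. With Link, no data is duplicated.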

Learn more about staging inputs in CWL v1.x apps.


Secondary files

  • Secondary files (only shown if Type is File, or Type is array and Item Type is File, and, for CWL sbg:draft-2, if Include in command line is set): an optional file extension for a secondary file related to this file, if the tool supports it, for example an index file associated with a BAM file. Defined secondary file settings are also kept when the tool is placed in a workflow. You can read more about using secondary files here.
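
How a secondary file name is derived from the primary file can be sketched as follows. The sketch follows the pattern convention of the CWL specification, where a plain extension is appended and each leading caret (“^”) first removes one extension from the primary name. Whether the editor field accepts the caret form is an assumption to verify. The function name is hypothetical and only plain file names (no directories) are handled.

```python
def secondary_file_name(primary: str, pattern: str) -> str:
    """Apply a CWL-style secondary file pattern to a primary file name."""
    name = primary
    # Each leading caret strips the last extension from the primary name.
    while pattern.startswith("^"):
        pattern = pattern[1:]
        if "." in name:
            name = name[: name.rfind(".")]
    # The rest of the pattern is appended as-is.
    return name + pattern


print(secondary_file_name("sample.bam", ".bai"))
print(secondary_file_name("sample.bam", "^.bai"))
```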

Description

  • Description: more information about the functionality of the input port. This is provided to help the user of the tool when the tool is placed in a workflow, and does not affect operation of the tool.
    • Label: If provided, this value is shown against the input port in the workflow editor.
    • Description: More extensive text description of the input port.
    • Alternative Prefix: If a single input has two possible command line prefixes, e.g. -b and --bam, the second prefix can be entered here for information purposes.
    • Category: Descriptive text field that can be used to organize inputs - multiple inputs can be grouped in one category.
    • File type(s) (if the input port Type is File): specifies the valid file types that can be connected to this port (for example, TXT, BAM).

Test value

  • Any value entered in this field is used for two purposes:
    • To evaluate an expression, if used to generate the value of the given input field.
    • To be displayed in the command line preview.

Output ports

Output ports correspond to the items that are produced when the tool is executed.

The following properties can be set in the object inspector for an output port. For more information, see the details on output ports.

  • Required: whether this output is mandatory.
  • ID: identifier of the output. This is shown against the output when the tool is placed in a workflow.
  • Type: the type of the output. Valid values are array, enum, record, File, string, int, long, float, double, boolean, and Directory (CWL v1.x).
  • Items type (only shown if Type is array): The type of the items in the array. Valid values are enum, record, File, string, int, long, float, double, boolean, and Directory (CWL v1.x).
  • Allow array as well as single item (not shown if Type is array): whether the output can be either a single item or an array of items.
  • Symbols (only shown if Type is enum, or Type is array and Item Type is enum): a list of the valid values for the output item.
  • Glob: A wildcard pattern for the names of the files in the tool’s working directory that will be associated with this output port. For example, setting a value of *.bam will associate all BAM files with this output port. Note that subfolders are not searched recursively, although you can specify a subfolder as part of the glob syntax if required. You can read more about using globs here.
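
The non-recursive matching described for Glob can be illustrated with Python's pathlib, used here as a stand-in for the platform's matcher (the exact pattern syntax supported by the platform may differ). The function name and the file names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List


def match_outputs(workdir: str, pattern: str) -> List[str]:
    """List files under workdir matching a glob, relative to workdir."""
    root = Path(workdir)
    return sorted(
        p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file()
    )


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical working directory contents.
    for name in ("a.bam", "notes.txt", "sub/b.bam"):
        path = Path(workdir, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    top_level = match_outputs(workdir, "*.bam")         # subfolders not searched
    in_subfolder = match_outputs(workdir, "sub/*.bam")  # subfolder named explicitly

print(top_level)
print(in_subfolder)
```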

Metadata

  • Metadata (only shown if Type is File, or Type is array and Item Type is File): the metadata to annotate the file with. The Inherit field allows you to specify which input file to inherit the data from, and you can also add additional metadata as Key, Value pairs. You can read more about metadata on the CGC here, including best practices and predefined metadata fields.

Output eval

  • Output eval: a JavaScript expression that will be used to access and manipulate the first 6 KB of the output file or files. Load content specifies whether the content is actually loaded, and hence can be manipulated, or just read. You can see an example here.

For CWL 1.x tools, the Output Eval field will be populated with the expression that will be used for inheriting metadata. If there is a conflict, you will see a warning under the ID field and will be able to solve the issue by editing the expression in the Output Eval field.

Secondary files

  • Secondary files (only shown if Type is File, or Type is array and Item Type is File): an optional file extension for a secondary file related to this file, if the tool supports it, for example an index file associated with a BAM file. You can read more about using secondary files here.

Description

Contains text fields that help describe the functionality of the output port in more detail. For specific information about each of the fields, please see Description in the Input ports section above.

Computational Resources

Computational resources specify the minimum resource requirements for the tool. The execution will fail if the resources specified here cannot be allocated.

  • Memory [MiB] (min): the minimum amount of memory required to execute the tool. This can include a dynamic expression.
  • CPU (min): the minimum number of CPUs required to execute the tool. You can select between single-thread and multi-thread options, or set a custom number of CPUs. The Custom option can include a dynamic expression.

Hints

Hints specify execution requirements and suggestions, for example, the AWS instance type to use.

File Requirements

These are any additional files the tool needs to run that aren’t already included in the Docker container. For example, you could specify a configuration file here instead of installing it in the Docker container. Any files specified here will be created in the working directory when the tool executes. File requirements can be entered using one of the two options, File or Expression.

If you select File, the following options are available:

  • Writable: When staging input files, if set to Yes, a copy of the file will be created. If set to No, a link to the file will be created.
  • File name: the name of the file to be created when the tool executes. This can be a dynamic expression.
  • File content: the contents of the file. This can include a dynamic expression.
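
The effect of a File requirement can be pictured as writing a named file into the working directory before the tool starts. The sketch below is an illustration only. The function name, the file name "tool.cfg" and its content are hypothetical.

```python
import os
import tempfile


def create_required_file(workdir: str, file_name: str, file_content: str) -> str:
    """Write a file requirement into the working directory and return its path."""
    path = os.path.join(workdir, file_name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(file_content)
    return path


with tempfile.TemporaryDirectory() as workdir:
    threads = 4  # stands in for a value that a dynamic expression would supply
    config_path = create_required_file(workdir, "tool.cfg", f"threads={threads}\n")
    with open(config_path, encoding="utf-8") as handle:
        written = handle.read()

print(written, end="")
```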

If you select Expression, the expression editor will open and allow you to enter a dynamic expression that returns a file.

Other

The stream files are optional. They allow you to specify files to use for standard input or standard output.

  • Stdin redirect: If you want to pipe data into standard input when the tool runs, enter the name of the file containing the data here. The filename can include a dynamic expression.
  • Stdout redirect: If you want to save the standard output from the tool, enter the name of the file you want to use here. The filename can include a dynamic expression.
  • Success Codes: Define which app exit code(s) will be treated as success codes.
  • Temporary Fail Codes: Define which app exit code(s) will be treated as temporary fail codes.
  • Permanent Fail Codes: Define which app exit code(s) will be treated as indicators of permanent failure.
  • Tool Time Limit: The upper time limit of tool execution, in seconds. If set to 0, it means that there is no upper limit. Negative values will result in an error.
  • Work Reuse: If set to No, the tool will never be able to use memoization (work reuse). Default: Yes.
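
The exit code and time limit settings can be sketched as follows. This is a minimal model, assuming the CWL convention that an exit code not listed anywhere counts as success when it is 0 and as a permanent failure otherwise. The function names and the sample codes are hypothetical.

```python
from typing import Iterable, Optional


def classify_exit_code(
    code: int,
    success_codes: Iterable[int] = (),
    temporary_fail_codes: Iterable[int] = (),
    permanent_fail_codes: Iterable[int] = (),
) -> str:
    """Map a tool exit code to an outcome using the configured code lists."""
    if code in success_codes:
        return "success"
    if code in temporary_fail_codes:
        return "temporary failure"
    if code in permanent_fail_codes:
        return "permanent failure"
    # Assumption: unlisted codes fall back to the usual convention.
    return "success" if code == 0 else "permanent failure"


def effective_time_limit(seconds: int) -> Optional[int]:
    """Interpret Tool Time Limit: 0 means no limit, negative values are errors."""
    if seconds < 0:
        raise ValueError("Tool Time Limit must not be negative")
    return None if seconds == 0 else seconds


print(classify_exit_code(3, success_codes=[0, 3]))
print(classify_exit_code(75, temporary_fail_codes=[75]))
print(effective_time_limit(0))
```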

App Info tab

The App Info tab contains information about the tool. This information is shown when the tool is placed in a workflow.

Hover over a field to see the Edit option.