Make files available in your tool's working directory

This page introduces a situation in which you may want to override the default way the CGC stores and exposes files to a tool.

Default handling of input and output files

Each tool in a workflow can only write its output files to its own working directory. The tool has write access to other directories, but no file written to any other directory can be reported as an output file. When the outputs of one tool are used as inputs for a second, 'downstream' tool, the downstream tool has read-only access to the first tool's working directory.

This default handling of files has several implications, which it is often convenient to override:

  1. A tool cannot, in general, write to its input files since they are not in the tool's working directory. If you need your tool to write to its input files, you can copy them to the tool's working directory.
  2. A tool cannot, in general, report one of its input files as an output file. If you need your tool to pass through an input file as an output (without modifying the file), you can create a symbolic link to the input file in the tool's working directory.
  3. The file paths of a tool's input files must be given relative to the tool's working directory. If you need to include the file paths of a tool's input files as arguments, it is often easier to create symbolic links to the input files in the tool's working directory, so that each file can be referred to by its name alone.

Stage Input is a setting in the Tool Editor that enables you to copy an input file to a tool's working directory, or to create a symbolic link to the file in the tool's working directory. See the documentation on Stage Input for more details.
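
To make the difference between copying and linking concrete, the sketch below reproduces both locally with Node.js. It only illustrates the general idea of a copy versus a symbolic link and is not the platform's implementation; the directory names, the file name, and the stageInput helper are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: make inputPath available in workDir under its base name.
// 'copy' creates an independent copy that the tool could modify;
// 'link' creates a symbolic link that points back at the original file.
function stageInput(inputPath, workDir, mode) {
  const staged = path.join(workDir, path.basename(inputPath));
  if (mode === 'copy') {
    fs.copyFileSync(inputPath, staged);
  } else if (mode === 'link') {
    fs.symlinkSync(inputPath, staged);
  } else {
    throw new Error('Unknown staging mode: ' + mode);
  }
  return staged;
}

// Stand-ins for an upstream tool's working directory and two downstream ones.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'stage-'));
const upstreamDir = path.join(root, 'upstream');
const copyDir = path.join(root, 'copy_workdir');
const linkDir = path.join(root, 'link_workdir');
[upstreamDir, copyDir, linkDir].forEach((dir) => fs.mkdirSync(dir));

// Placeholder input file; a real run would have an actual TAR archive here.
const input = path.join(upstreamDir, 'reference.tar');
fs.writeFileSync(input, 'placeholder contents');

const copied = stageInput(input, copyDir, 'copy');
const linked = stageInput(input, linkDir, 'link');

console.log(fs.lstatSync(copied).isSymbolicLink()); // false: a regular file
console.log(fs.lstatSync(linked).isSymbolicLink()); // true: a symbolic link
console.log(fs.readlinkSync(linked) === input);     // true: it points at the input
```

In both cases the staged file can be addressed by its name alone from inside the working directory, which is the property used in the rest of this page.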

This page considers implication 3. We will use Stage Input to avoid using the full file path to a tool's input file that is located in an upstream tool's working directory. In particular, consider an index file produced by an aligner indexer that is used as a tool's input. Aligner indexers are a set of indexing tools that index reference files and output a TAR archive containing the reference and the index file(s). To use the index files from an aligner indexer as inputs for your tool, you first need to make the TAR archive available in your tool's current working directory, and then unpack the archive to get the individual index files. In this case, Stage Input makes the command line significantly simpler, as the unpack command can use the file name only, rather than the full path to the file.

The procedures below explain how to use Stage Input to make the TAR archive containing the index files available in your tool's working directory. To use index files as inputs, you need to set up a single-file staged input port for your tool, which creates a symbolic link in your tool's working directory that points to the TAR archive. This setting is made in the Tool Editor, which you can access by navigating to the Apps tab in the project that contains your tool, and then clicking the pencil icon next to the tool.

Make the indexing tool output available in your tool's current working directory

  1. On the Inputs tab of the Tool Editor, click the + button to add an input port.
  2. Set the ID of the input, e.g. to reference. You can use another value as the ID (the field allows only alphanumeric characters and underscores), but note that you will need to modify the JavaScript expression below to match the ID you have entered.
  3. Set the value in the Type field to File.
  4. In the Label field, enter the label that will be displayed in the visual interface.
  5. Set the value in the File Types field to .TAR.
  6. Under Stage Input select Link.
  7. Click Save.
    This will create a symbolic link in your tool's working directory to the archive containing the index files.
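
The ID rule from step 2 can be checked before typing a value into the Tool Editor. The sketch below is a minimal illustration: isValidInputId is a hypothetical helper, and it assumes that "alphanumeric" means ASCII letters and digits.

```javascript
// Hypothetical check mirroring the rule in step 2: letters, digits, underscore.
// Assumption: "alphanumeric" means ASCII letters and digits only.
function isValidInputId(id) {
  return /^[A-Za-z0-9_]+$/.test(id);
}

console.log(isValidInputId('reference'));      // true
console.log(isValidInputId('index_bundle_2')); // true
console.log(isValidInputId('index-bundle'));   // false (hyphen)
console.log(isValidInputId('index bundle'));   // false (space)
```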

📘

The procedure above can be adapted to create a symbolic link from other input files into a tool's working directory. To adapt the procedure, make sure to replace .TAR in step 5 with the extension of the input file(s) you are using.

Once you have made the archive file available in your tool's working directory, configure your tool to unpack it.

Configure the tool to unpack the TAR archive

The following procedure explains how to configure your tool to unpack the input TAR archive.

  1. Navigate to the Apps tab in your project.
  2. Click the pencil icon next to the tool you want to configure.
  3. Navigate to the General tab in the Tool Editor.
  4. Click + in the Base Command section.
    If the field(s) in the Base Command section are already populated, move the content of each field down by one position, starting from the last field, until the very first field in the section is blank.
  5. Click </> next to the first field.
  6. Paste the following code:
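{
  // Name of the archive: the last component of the input file's path
  var index_files_bundle = $job.inputs.reference.path.split('/').slice(-1)[0]
  // Unpack the archive; the trailing ' ; ' separates this from the rest of the command
  return 'tar -xf ' + index_files_bundle + ' ; '
}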

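The first statement of the expression retrieves the name of the archive file from the $job object by taking the last component of the input file's path. The return statement appends the retrieved file name to the tar command that unpacks the archive; the trailing semicolon separates the unpack command from the rest of the tool's command line.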

🚧

This JavaScript expression assumes that the ID of the input port that takes the TAR archive is reference. If your tool's input port for the archive file has a different ID, replace reference in the above code with that ID.

  7. Click Save.
  8. Click Save in the top-right corner of the Tool Editor.
    Your tool is now configured to unpack a TAR archive it receives as its input.
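
To check the expression from step 6 before pasting it into the Tool Editor, you can run the same logic locally against a stand-in for the $job object. In the sketch below, mockJob, its path value, and the unpackCommand helper are hypothetical; only the inputs.<ID>.path lookup mirrors the expression above. The input ID is passed as a parameter to show which part changes if your port ID is not reference.

```javascript
// Hypothetical stand-in for the $job object that the platform supplies at run time.
// Only the field used by the expression is mocked; the path is made up.
const mockJob = {
  inputs: {
    reference: { path: '/path/to/upstream_workdir/reference_index.tar' }
  }
};

// Same logic as the Base Command expression, with the input port ID as a parameter.
function unpackCommand(job, inputId) {
  // Keep only the file name: the last component of the input file's path.
  const indexFilesBundle = job.inputs[inputId].path.split('/').slice(-1)[0];
  return 'tar -xf ' + indexFilesBundle + ' ; ';
}

console.log(unpackCommand(mockJob, 'reference'));
// prints: tar -xf reference_index.tar ;
```

Because the archive is linked into the working directory, the file name alone is enough for tar to find it; without Stage Input the command would need the full upstream path.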