
# About task execution


Upon execution, your task is broken down into modular units called [jobs](doc:about-task-execution#section-jobs). Each job is executed when its inputs become available. Jobs are carried out on compute [instances](doc:about-task-execution#section-instances), a [process](doc:about-task-execution#section-scheduling) orchestrated by the Scheduling Algorithm. You can [control task execution](doc:about-task-execution#section-controlling-execution) by setting the parameters that the Scheduling Algorithm bases its decisions on.

## Jobs

*Jobs* are the individual steps of your task's execution and include tool executions, file uploads/downloads, and retrieval of Docker images.

A job is created for every tool execution. Most of the time a tool executes only once per workflow, giving a single job per task.

In some cases, however, a tool may be run multiple times to process individual parts of a large input more efficiently, which yields multiple jobs per tool. A tool may also be executed multiple times if one of its inputs is a list whose elements are to be processed in parallel.

## Instances

When you run an analysis, it is executed on a computational instance from the cloud infrastructure provider (Amazon Web Services or Google Cloud Platform).

These computational instances appear as remote computers capable of executing generic software and are referred to as *instances*.

As with any computer, an instance has a certain number of CPU cores, an amount of memory, hard disk space, and network resources.

The CGC provides the infrastructure that controls instances throughout the execution of your analysis.

## Queueing

There are several cases in which a task can be temporarily queued:
1. The task has just been submitted and is awaiting execution.
2. The maximum number of parallel instances for your account has been reached.
3. Some of the cloud infrastructure resources required for task execution are not available.
In cases 2 and 3 above, the task status will change from **QUEUED** back to **RUNNING** when the required parallel instances or cloud infrastructure resources become available. This change of task status can happen several times during execution.

When a task is queued because the maximum allowed number of parallel instances per user account has been reached, the time it takes for the task to change its state from **QUEUED** back to **RUNNING** can depend on several factors, such as:
* the size of the input files,
* the time it takes for the tool or workflow to execute,
* the availability of instances, e.g. whether the required instance type is available immediately.

To ensure that all users can run their tasks on the CGC, each individual user has a limit of 80 parallel instances. This limit is in place because the total number of parallel instances used by the CGC is capped by Amazon Web Services (AWS), the CGC’s underlying cloud service provider. Even though tasks requiring more than 80 instances might therefore take longer to complete, the limit ensures that instances are available for all CGC users to run their tasks.

The limit is applied as the cumulative maximum number of parallel instances per user, across all tasks in all projects created by that user. To understand how the limit works, consider the following example:
1. User **rfranklin** has two projects on the CGC, named **WGS** and **WES**.
2. In **WES**, **rfranklin** is currently running a batch task that is using 56 parallel instances.
3. In **WGS**, **rfranklin** starts another batch task that requires 42 parallel instances. As the limit of 80 parallel instances is applied per user, the task in **WGS** will initially be able to use only 24 instances (80 minus the 56 used by the task in **WES**), while the remaining instances are allocated as either of the two running tasks releases them.

Users who are added to a project also run their tasks within the project creator’s parallel instance limit.
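The per-user accounting in the example above can be sketched as a small calculation. This is a minimal illustration, not CGC platform code; only the 80-instance limit and the figures from the example are taken from the text.

```python
# Minimal sketch of the per-user parallel-instance accounting described
# above. Illustration only -- not CGC platform code.
PER_USER_LIMIT = 80  # cumulative limit across all of a user's tasks


def instances_available(running_by_task: dict) -> int:
    """Instances a new task owned by the same user can use right now,
    given the instance counts of that user's currently running tasks."""
    in_use = sum(running_by_task.values())
    return max(0, PER_USER_LIMIT - in_use)


# rfranklin's WES batch task is already using 56 parallel instances,
# so a new 42-instance batch task in WGS can start with only 24 of them.
print(instances_available({"WES-batch": 56}))  # 24
```

The same function also covers the **jsmith** case: since the limit is per project creator, a collaborator's task sees 0 available instances while the creator's tasks hold all 80.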
In the example above, if **rfranklin** then adds user **jsmith** to one of the projects, **WES** or **WGS**, and **jsmith** tries to run a task, that task will be queued because **rfranklin** is already using the maximum allowed number of parallel instances.

For more general information about different task states, please refer to the [list of task statuses](doc:list-of-task-statuses).

## Scheduling

The process of assigning and launching instances that fit the tool executions in a task is called *scheduling*.

The orchestration of job execution, as well as instance provisioning, is carried out by our bespoke Scheduling Algorithm.

When a tool is about to be executed, the Scheduling Algorithm picks the best instance based on the [tool's resource requirements](doc:about-tool-resource-requirements). The Scheduling Algorithm may launch that job on an already active instance, side by side with an earlier job from the same task, or it may decide to launch a new instance to host the new job.

## Controlling execution

You can control the execution of your task by customizing the parameters the Scheduling Algorithm works with.

The parameters you can tune are the type(s) of instance used for your analysis, as well as how many instances of each type can be used in parallel.

These parameters, called [execution hints](doc:list-of-execution-hints), can be attached to tools and workflows, and the Scheduling Algorithm respects them when deciding which tool runs where.

The CGC understands and implements tools, workflows, jobs, and execution as prescribed by the Common Workflow Language (CWL).
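The resource-based instance selection described in the Scheduling section above can be sketched roughly as follows. This is a hypothetical illustration: the instance catalogue, the prices, and the "cheapest instance that fits" policy are assumptions for the sketch, not the CGC's actual Scheduling Algorithm.

```python
# Hypothetical sketch of resource-based instance selection, in the spirit
# of the Scheduling section above. The catalogue, prices, and selection
# policy are illustrative assumptions, not the CGC's actual algorithm.
from dataclasses import dataclass


@dataclass
class InstanceType:
    name: str
    cpus: int
    mem_gb: int
    hourly_price: float


# Example catalogue (names resemble AWS types; prices are made up).
CATALOGUE = [
    InstanceType("c4.2xlarge", cpus=8, mem_gb=15, hourly_price=0.40),
    InstanceType("c4.8xlarge", cpus=36, mem_gb=60, hourly_price=1.59),
    InstanceType("r4.4xlarge", cpus=16, mem_gb=122, hourly_price=1.06),
]


def pick_instance(required_cpus: int, required_mem_gb: int) -> InstanceType:
    """Pick the cheapest instance type that satisfies a tool's
    CPU and memory requirements."""
    fitting = [i for i in CATALOGUE
               if i.cpus >= required_cpus and i.mem_gb >= required_mem_gb]
    if not fitting:
        raise ValueError("no instance type fits the requirements")
    return min(fitting, key=lambda i: i.hourly_price)


print(pick_instance(4, 8).name)    # c4.2xlarge
print(pick_instance(8, 100).name)  # r4.4xlarge (only type with enough memory)
```

In practice you would steer these decisions through execution hints on a tool or workflow rather than writing code like this; the sketch only shows why a memory-hungry tool can end up on a larger, more expensive instance.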