API rate limit
Introduction
The API rate limit caps the number of calls you can send to the CGC public API within a predefined time frame: 1000 requests within 5 minutes. Once this limit is reached, the API server accepts no further calls until the 5-minute interval ends.
All rate limit information is returned to the user in the following HTTP headers:
- The header X-RateLimit-Limit represents the rate limit. Currently this is 1000 requests per five minutes.
- The header X-RateLimit-Remaining represents the number of calls you have left before hitting the limit.
- The header X-RateLimit-Reset represents the time, as a Unix timestamp, when the limit will be reset.
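As a minimal sketch of how the three headers above can be used, the function below decides how long a client should pause before its next request. The function name and the plain-dictionary input (with exactly these header spellings) are assumptions made for illustration; they are not part of the CGC API or its Python client.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before sending the next request.

    `headers` is assumed to be a plain dict of HTTP response headers.
    Returns 0 while calls remain; otherwise the time left until the
    Unix timestamp given in X-RateLimit-Reset.
    """
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp
    if remaining > 0:
        return 0
    now = time.time() if now is None else now
    return max(0, reset_at - now)
```

A caller could pass the result to time.sleep() before retrying, instead of sending requests that the server will reject.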
To learn how you can write optimized code and make the most of the CGC public API regardless of the rate limit, please consult the examples below.
Each example first illustrates what unoptimized code can look like and why it is less than ideal given the rate limit, followed by a concrete recommendation on how to optimize it.
- Submitting tasks for execution
- Copying files between projects
- Importing files from a volume
- Updating file metadata
- Deleting multiple files
- Exporting files to a volume
- Setting maximum pagination limit in queries
- Finding project by name
Note that these examples assume that you are using the Python client for the CGC public API.
Submitting tasks for execution
There are two methods for starting multiple tasks on CGC:
- Submit tasks one by one inside a loop.
- Submit a batch task. A batch task can consist of many child tasks and can be created with a single API call, whereas submitting tasks inside a loop requires one API call for each task. Using batch tasks is therefore the recommended approach.
Not optimized for rate limit
In the example below, we iterate over samples and submit tasks one by one (in this case we run Salmon for RNA-seq analysis). This example assumes that we already grouped our input FASTQ files by samples in a dictionary.
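A minimal sketch of the loop described above. It assumes that api is an initialized Python client object exposing tasks.create(), that samples is a dictionary mapping each sample ID to its list of FASTQ files, and that the app accepts the files under an input named "reads"; the input ID and the task name pattern are hypothetical.

```python
def submit_tasks_one_by_one(api, project, app, samples):
    """Submit one task per sample, which costs one API call per sample."""
    tasks = []
    for sample_id, fastq_files in samples.items():
        task = api.tasks.create(
            name="Salmon - {}".format(sample_id),  # hypothetical naming scheme
            project=project,
            app=app,
            inputs={"reads": fastq_files},  # "reads" is a hypothetical input ID
            run=True,
        )
        tasks.append(task)
    return tasks
```

With several hundred samples, this loop alone uses up a large share of the 1000 requests available in a 5-minute window.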
Optimized for rate limit
In this example, which is optimized for the rate limit, we run Salmon as a batch task for all input files at once, using sample_id as the batching criterion. This example assumes that the sample_id file metadata is already set for all input files.
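The batch submission can be sketched as follows. This assumes that api is an initialized Python client object whose tasks.create() accepts batch_input and batch_by arguments, and that the app takes the FASTQ files under an input named "reads"; the input ID and the task name are hypothetical.

```python
def submit_batch_task(api, project, app, fastq_files):
    """Create and run a single batch task covering all samples."""
    return api.tasks.create(
        name="Salmon - batch",  # hypothetical task name
        project=project,
        app=app,
        inputs={"reads": fastq_files},  # "reads" is a hypothetical input ID
        batch_input="reads",  # the input whose files are split into child tasks
        batch_by={"type": "CRITERIA", "criteria": ["metadata.sample_id"]},
        run=True,
    )
```

Regardless of how many samples the input files span, task creation here costs one API call.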
Copying files between projects
Instead of copying files individually, which requires separate API calls for every file, we recommend using a bulk API call. This way you can copy up to 100 files with a single API call.
Not optimized for rate limit
Copying individual files requires two API calls for each file: one to find the file by name, and another one to copy it. We recommend using the bulk API call instead.
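A minimal sketch of the two-calls-per-file pattern. It assumes that api is an initialized Python client object, that files.query() accepts project and names arguments, and that file objects offer a copy() method; the function name is made up for illustration.

```python
def copy_files_one_by_one(api, file_names, source_project, destination_project):
    """Copy files individually, spending two API calls per file name."""
    copies = []
    for name in file_names:
        # Call 1: find the file by name in the source project.
        matches = api.files.query(project=source_project, names=[name])
        # Call 2: copy that single file to the destination project.
        copies.append(matches[0].copy(project=destination_project))
    return copies
```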
Optimized for rate limit
Using a bulk API call, you can copy up to 100 files per call.
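One way to sketch the bulk copy is to split the file list into chunks of at most 100 and issue one call per chunk. The sketch assumes that api is an initialized Python client object with an actions.bulk_copy_files(files, destination_project) call that returns a mapping of per-file results; the helper name and the chunk_size parameter are illustrative.

```python
def copy_files_in_bulk(api, files, destination_project, chunk_size=100):
    """Copy files in chunks of at most `chunk_size`, one API call per chunk."""
    responses = {}
    for start in range(0, len(files), chunk_size):
        chunk = files[start:start + chunk_size]
        # One bulk call copies the whole chunk.
        responses.update(api.actions.bulk_copy_files(chunk, destination_project))
    return responses
```

Copying 250 files this way takes 3 API calls rather than hundreds.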
Importing files from a volume
The CGC API allows you to import files from a volume in bulk rather than one by one. Using the bulk API feature, you can import up to 100 files per call.
Not optimized for rate limit
Importing individual files requires two API calls for each file: one to find the file by name, and another one to import it. We recommend using the bulk API call instead.
Optimized for rate limit
Using the bulk API feature, you first query all files that need to be imported and then use one API call to import up to 100 files.
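The bulk import can be sketched as below. It assumes that api is an initialized Python client object with an imports.bulk_submit() call that takes a list of dictionaries with the keys shown and returns a list of results, and that each file keeps its volume location as its name in the project; the helper name and its parameters are illustrative.

```python
def import_files_in_bulk(api, volume, project, file_names, overwrite=False, chunk_size=100):
    """Submit imports in chunks of at most `chunk_size`, one API call per chunk."""
    submitted = []
    for start in range(0, len(file_names), chunk_size):
        imports = [
            {
                "volume": volume,
                "location": name,  # assumed path of the file on the volume
                "project": project,
                "name": name,
                "overwrite": overwrite,
            }
            for name in file_names[start:start + chunk_size]
        ]
        submitted.extend(api.imports.bulk_submit(imports))
    return submitted
```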
Updating file metadata
Metadata for multiple files can be set using a bulk API call instead of one call per file. Setting metadata for the files is typically required before they can be provided as input to a CWL workflow.
In the examples below, we assume that there is a list of FASTQ files for a specific sample and that we want to set both the sample_id and paired_end metadata fields for all of them.
Not optimized for rate limit
An approach that is not optimized for the rate limit iterates over all FASTQ files and sets metadata for each file individually.
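A minimal sketch of the per-file approach. It assumes that each file object has a name, a dictionary-like metadata attribute, and a save() method that sends the change to the API; paired_end_by_name is a hypothetical dictionary mapping file names to their paired_end value.

```python
def set_metadata_one_by_one(fastqs, sample_id, paired_end_by_name):
    """Set metadata file by file, spending one API call per file."""
    for fastq in fastqs:
        fastq.metadata["sample_id"] = sample_id
        fastq.metadata["paired_end"] = paired_end_by_name[fastq.name]
        fastq.save()  # one API call for every file
    return len(fastqs)
```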
Optimized for rate limit
An optimal way to update metadata for multiple files is to use a bulk API call and update metadata for up to 100 files per call.
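The bulk variant can be sketched as follows: metadata is first changed locally on every file object, and the changes are then sent in chunks of at most 100. The sketch assumes that api is an initialized Python client object with a files.bulk_edit() call that accepts a list of file objects; paired_end_by_name is a hypothetical dictionary mapping file names to their paired_end value.

```python
def set_metadata_in_bulk(api, fastqs, sample_id, paired_end_by_name, chunk_size=100):
    """Set metadata locally, then send it in chunks; returns the API call count."""
    for fastq in fastqs:
        # Local changes only; nothing is sent to the API yet.
        fastq.metadata["sample_id"] = sample_id
        fastq.metadata["paired_end"] = paired_end_by_name[fastq.name]
    calls = 0
    for start in range(0, len(fastqs), chunk_size):
        api.files.bulk_edit(fastqs[start:start + chunk_size])
        calls += 1
    return calls
```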
Deleting multiple files
This example shows how to delete multiple files with the API rate limit in mind. The optimal way to delete multiple files is via a bulk API call, which can delete up to 100 files per call.
Not optimized for rate limit
Fetch and delete files one by one using a loop.
Optimized for rate limit
Fetch all files at once and then use a bulk API call to delete them in batches of 100 files or fewer.
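A minimal sketch of this pattern, assuming that api is an initialized Python client object, that files.query() returns a result set with an all() iterator over every page, and that files.bulk_delete() accepts a list of file objects. The helper name is illustrative.

```python
def delete_files_in_bulk(api, project, chunk_size=100):
    """Delete every file in `project` in chunks; returns the number deleted."""
    # Fetch all files first, using the maximum page size to save calls.
    files = list(api.files.query(project=project, limit=100).all())
    for start in range(0, len(files), chunk_size):
        # One bulk call deletes the whole chunk.
        api.files.bulk_delete(files[start:start + chunk_size])
    return len(files)
```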
Exporting files to a volume
When exporting a file from the CGC to an attached volume, export is possible only to a volume that is in the same location (cloud provider and region) as the project from which the file is being exported.
The goal here is to export files from a CGC project to a volume (cloud bucket). Please note that export to a volume is available only via the API (including API client libraries), and through the Seven Bridges CLI.
Again, CGC bulk API calls should be used to reduce the overall number of API calls. Note that the examples below make use of the copy_only export feature, which requires advance_access to be activated when initializing the API.
Not optimized for rate limit
In this example, files are fetched and exported in a loop, one by one.
Optimized for rate limit
Fetch and export files in bulk.
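The bulk export can be sketched as below. It assumes that api is a Python client object initialized with advance_access activated, that exports.bulk_submit() takes a list of dictionaries with the keys shown plus a copy_only argument, and that each file is exported under its own name; the helper name and its parameters are illustrative.

```python
def export_files_in_bulk(api, files, volume, overwrite=False, chunk_size=100):
    """Submit exports in chunks of at most `chunk_size`, one API call per chunk."""
    submitted = []
    for start in range(0, len(files), chunk_size):
        exports = [
            {
                "file": f,
                "volume": volume,
                "location": f.name,  # assumed target path on the volume
                "overwrite": overwrite,
            }
            for f in files[start:start + chunk_size]
        ]
        submitted.extend(api.exports.bulk_submit(exports, copy_only=True))
    return submitted
```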
Setting maximum pagination limit in queries
Several API calls allow setting a pagination limit to the number of results that are returned. Changing the default pagination limit (50) to its allowed maximum value (100) cuts the number of required API calls in half when iterating over the entire result set of a query.
Pagination limits can be set for various API calls, but we recommend that you set it for the following queries as they tend to return the largest result sets:
- api.files.query()
- api.projects.query()
- api.tasks.query()
- task.get_batch_children()
Not optimized for rate limit
Here is an example for a project query that uses the default pagination limit of 50.
Optimized for rate limit
In the example below, the limit is set to its allowed maximum value of 100.
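A minimal sketch of the effect of the page size, assuming that api is an initialized Python client object and that projects.query() accepts a limit argument and returns a result set with an all() iterator. The calls_needed helper is illustrative arithmetic, not part of the client: reading 1000 results takes 20 calls at the default limit of 50 and 10 calls at the maximum of 100.

```python
import math

DEFAULT_LIMIT = 50
MAX_LIMIT = 100


def calls_needed(total_items, limit):
    """Number of paginated API calls needed to read `total_items` results."""
    return math.ceil(total_items / limit)


def list_all_projects(api, limit=MAX_LIMIT):
    """Fetch every project, requesting `limit` results per page."""
    return list(api.projects.query(limit=limit).all())
```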
Finding project by name
Not optimized for rate limit
Iterate over all projects and compare names.
Optimized for rate limit
Use the 'name' query parameter in the search to restrict results. The query parameter performs a partial match, so a name comparison is still required to ensure an exact match.
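This lookup can be sketched as follows, assuming that api is an initialized Python client object whose projects.query() accepts name and limit arguments and returns a result set with an all() iterator. The helper name is illustrative.

```python
def find_project_by_name(api, name):
    """Return the project whose name equals `name` exactly, or None."""
    # The server-side filter narrows the results but matches partially.
    for project in api.projects.query(name=name, limit=100).all():
        if project.name == name:  # exact comparison is still required
            return project
    return None
```

Because the server filters first, the client usually reads a single page of results instead of paging through every project.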