Browse datasets via the Datasets API
Advance Access
This feature is in our advance access program. This means that, while it is fully operational, it is subject to change.
Seven Bridges is committed to providing Cavatica users with up-to-date versions of the datasets that are available from the NCI Genomic Data Commons (GDC). The currently available version of this dataset corresponds to GDC Data Release 31.
More information about the data in this release can be found in the GDC Data Release Notes.
Learn more about our policies regarding updates to the GDC datasets.
Overview
Browse datasets via the Datasets API by issuing successive GET requests. Use these browsing requests individually or in conjunction with querying. For instance, entities located through browsing are resources that can be the subject of a query.
On this page, learn about requests to return:
- datasets you can access
- all the entities within a dataset
- all instances of a single entity
- a single entity's metadata schema
Return accessible datasets
Make the following GET request. Be sure to replace the authentication token with your own.
This returns a list of accessible datasets, as shown below.
The href element lists the path for each dataset, such as https://cgc-datasets-api.sbgenomics.com/datasets/ccle/v0 for CCLE.
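As a rough illustration, the request could be issued from Python along the following lines. This is a minimal sketch, not the documented request: the root path in DATASETS_URL is inferred from the dataset href above, the X-SBG-Auth-Token header name is assumed to be the one used for Seven Bridges API tokens, and the helper names are hypothetical. Because the exact response layout is not reproduced here, collect_hrefs simply gathers every href value it can find.

```python
import json
import urllib.request

# Assumptions: root path inferred from the dataset href shown above;
# header name assumed to match the Seven Bridges API token header.
DATASETS_URL = "https://cgc-datasets-api.sbgenomics.com/datasets"
AUTH_HEADER = "X-SBG-Auth-Token"


def build_get_request(url, token):
    """Build (but do not send) an authenticated GET request."""
    return urllib.request.Request(url, headers={AUTH_HEADER: token}, method="GET")


def fetch_json(url, token):
    """Send the GET request and decode the JSON body."""
    with urllib.request.urlopen(build_get_request(url, token)) as response:
        return json.load(response)


def collect_hrefs(node):
    """Recursively gather every string stored under an "href" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "href" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_hrefs(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_hrefs(item))
    return found
```

With a valid token, collect_hrefs(fetch_json(DATASETS_URL, token)) would list the paths found in the response, including one per accessible dataset.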
Return all entities within a dataset
Make a GET request to the href of a dataset to return all of its entities. Learn more about each dataset's entities from its metadata page.
The response contains a list of entities for the dataset you specified.
Note that two items in the returned list, query and self, are not dataset entities:
- self contains the property href, which is set to the same path used in making the request.
- query contains the property href, which is set to a path that can be used to query the returned entities by making a POST request.
Learn more about querying with the Datasets API.
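The distinction between entity links and the self and query links can be sketched in Python as below. This assumes the decoded response exposes a mapping from each name to an object with an href property; the function name, the sample names, and the example.org paths are placeholders, not real API output.

```python
# Names the text above identifies as links rather than dataset entities.
NON_ENTITY_KEYS = {"self", "query"}


def split_entity_links(links):
    """Separate entity links from the self and query links.

    `links` is assumed to map each name to an object with an "href" property.
    Returns two dicts of name -> href: (entities, special).
    """
    entities = {}
    special = {}
    for name, link in links.items():
        target = special if name in NON_ENTITY_KEYS else entities
        target[name] = link["href"]
    return entities, special


# Hypothetical response fragment; names and paths are placeholders.
sample_links = {
    "self": {"href": "https://example.org/datasets/demo/v0"},
    "query": {"href": "https://example.org/datasets/demo/v0/query"},
    "files": {"href": "https://example.org/datasets/demo/v0/files"},
    "cases": {"href": "https://example.org/datasets/demo/v0/cases"},
}

entities, special = split_entity_links(sample_links)
print(sorted(entities))   # entity names only
print(special["query"])   # path that would accept a POST query
```

Keeping the query path separate makes it easy to browse entities with GET requests while reserving that one path for POST queries.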
Return all instances of an entity
Make a GET request to the href of an entity to return all instances of that entity within the specified dataset. For example, to see a list of all TCGA files, make the following request.
This returns the following response. The response contains 100 results (count) per page. You can page through the results using the paths under _links. The resulting files are listed under the _embedded section.
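Paging through the results might look like the following Python sketch. It assumes that _links carries a next entry with an href property (a common HAL-style convention, not confirmed here) and that _embedded maps an entity key to a list of instances; iter_pages, iter_instances, and the in-memory pages are hypothetical and stand in for real requests.

```python
def iter_pages(first_url, fetch):
    """Yield each decoded page, following _links.next.href until it is absent.

    `fetch` is any callable that takes a URL and returns the decoded JSON.
    The "next" link name is an assumption; adjust it to the actual response.
    """
    url = first_url
    seen = set()  # guards against a link that loops back to an earlier page
    while url and url not in seen:
        seen.add(url)
        page = fetch(url)
        yield page
        url = page.get("_links", {}).get("next", {}).get("href")


def iter_instances(first_url, fetch, entity_key):
    """Yield every instance listed under _embedded[entity_key] on each page."""
    for page in iter_pages(first_url, fetch):
        yield from page.get("_embedded", {}).get(entity_key, [])


# In-memory stand-in for the API so the sketch runs without a network call.
fake_pages = {
    "page-1": {
        "_links": {"next": {"href": "page-2"}},
        "_embedded": {"files": [{"label": "a"}, {"label": "b"}]},
    },
    "page-2": {"_links": {}, "_embedded": {"files": [{"label": "c"}]}},
}

for instance in iter_instances("page-1", fake_pages.__getitem__, "files"):
    print(instance["label"])
```

Passing the fetch step in as a callable keeps the paging logic independent of how requests are authenticated and sent.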
Return an entity's metadata schema
Make a GET request to an entity's href to obtain its schema. Each entity has its own metadata schema: a list of the metadata fields used to describe the entity and the permissible datatypes (strings, integers, etc.) that each field takes.
Each metadata field, such as hasDataType in the example below, is followed by an object that indicates the type of value the field takes, such as integer, string, or enum. If the type is given as enum, the object also contains a list of all possible values for that metadata field.
Additionally, note that under _links, there is a list of connections to other entities in the dataset. In the example below, these connections include "hasAliquot", "hasCase", "hasSample", "hasPortion", and "hasAnalyte".
For example, to see the metadata schema for files, send the request:
This returns the following response, which lists all the properties of the file entity:
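Separately from the response itself, a decoded schema could be summarized with a Python sketch like the one below. The layout is assumed rather than documented here: fields are taken to sit at the top level of the schema, and the type and values key names are hypothetical defaults that can be overridden. The sample schema and its enum values are placeholders.

```python
def summarize_schema(schema, type_key="type", values_key="values"):
    """Return (fields, connections) for a decoded schema.

    fields maps each field name to (type, allowed_values), where
    allowed_values is a list for enum fields and None otherwise.
    connections is the sorted list of names found under _links.
    The key names are assumptions; pass the real ones if they differ.
    """
    fields = {}
    for name, descriptor in schema.items():
        if name.startswith("_") or not isinstance(descriptor, dict):
            continue  # skip sections such as _links
        kind = descriptor.get(type_key)
        if kind == "enum":
            fields[name] = ("enum", list(descriptor.get(values_key, [])))
        else:
            fields[name] = (kind, None)
    connections = sorted(schema.get("_links", {}))
    return fields, connections


# Hypothetical schema fragment; the enum values are placeholders.
sample_schema = {
    "hasDataType": {"type": "enum", "values": ["value-1", "value-2"]},
    "label": {"type": "string"},
    "_links": {"hasCase": {}, "hasAliquot": {}},
}

fields, connections = summarize_schema(sample_schema)
print(fields["hasDataType"])
print(connections)
```

A summary of this kind makes it easier to see which fields accept a fixed set of values before building a query.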
Next step
Browsing requests can be used in conjunction with querying. For instance, entities located through browsing are resources which can be the subject of a query. Learn more about querying via the Datasets API.