TCIA Metadata

Metadata is data that describes other data. This page details the metadata available for viewing and filtering The Cancer Imaging Archive (TCIA) data in the Data Browser on the CGC. The TCIA data available on the CGC are Open Access radiological imaging data generated from The Cancer Genome Atlas (TCGA).

TCIA metadata on the CGC consist of entities and their properties.

Entities are particular resources with UUIDs, such as files, cases, samples, and cell lines.

Properties can either describe an entity or relate that entity to another entity. For instance, properties include an entity's vital status, gender, data format, or experimental strategy.
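
The entity and property model described above can be pictured as a small data structure. The following is a minimal Python sketch, not the CGC's actual schema: the class name `Entity`, its field names, and all sample values are hypothetical. It only illustrates the difference between a property that describes an entity (a file's data format) and one that relates it to another entity (the case a file belongs to).

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Entity:
    """Hypothetical stand-in for a metadata entity; not the CGC schema."""
    kind: str  # e.g. "case" or "file"
    uuid: str = field(default_factory=lambda: str(uuid4()))
    properties: dict = field(default_factory=dict)  # properties that describe the entity
    relations: dict = field(default_factory=dict)   # property name -> UUID of a related entity


# Made-up sample values, for illustration only.
case = Entity(kind="case", properties={"gender": "female", "age_at_diagnosis": 61})
image_file = Entity(
    kind="file",
    properties={"data_format": "DICOM", "modality": "MR"},
    relations={"case": case.uuid},  # relates the file to its case
)

print(image_file.properties["data_format"], "->", image_file.relations["case"])
```

Keeping descriptive properties and relations in separate dictionaries is just one way to lay this out; a real metadata store may represent both uniformly.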

Entities for TCIA include:

  • investigation
  • case
  • file
  • study

Below, each of these entities is followed by a table of its related properties.

Investigation

The investigation entity represents the project or study that generated the data. Members of the investigation entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the investigation entity below.

Property | Description
Disease type | The type of the disease or condition studied. See NCI Thesaurus Code: C2991.

Case

The case entity represents a patient. Members of the case entity are subjects who have taken part in an investigation or program and can be identified by a Universally Unique Identifier (UUID). See the table below for the clinical properties and descriptions of the case entity.

Property | Description
Submitter ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID.
Gender | The collection of behaviors and attitudes that distinguish people on the basis of the societal roles expected for the two sexes. See NCI Thesaurus Code: C17357.
Age at diagnosis | The age in years of the case at the initial pathological diagnosis of disease or cancer. See NCI Thesaurus Code: C15220.

File

The file entity refers to the files in TCIA. Members of the file entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the file entity below.

Property | Description
Submitter ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID.
File type | The type of file that stores the data.
Data format | The format of the data.
Access level | A Boolean value indicating Controlled Data or Open Data. Controlled Data is data from public datasets that has limitations on use and requires approval by dbGaP. Open Data is data from public datasets that doesn't have limitations on its usage within the CGC.
Data category | The classification of data used in the analysis, based on its form and content. In the TCIA dataset, its value is always ‘Imaging’.
Series date | The date the Series started.
Series number | A number that identifies this Series.
Series instance UUID | Unique identifier of a Series that is part of this Study.
Series description | User-provided description of the Series.
Image count | Computed number of images in this Series.
Software version | Manufacturer's designation of the software version of the equipment that produced the composite instances.
Manufacturer | Name of the manufacturer of the equipment that produced the composite images.
Modality | Type of equipment that originally acquired the data.
Protocol name | User-defined description of the conditions under which the Series was performed.
Manufacturer model name | Manufacturer's model name of the equipment that produced the composite images.
Body part examined | Text description of the part of the body examined.
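
To make the idea of filtering by file properties concrete, here is a minimal Python sketch that works on in-memory records. It does not use the Data Browser or any CGC API: the function `filter_files`, the dictionary keys (modelled on the property names in the table above), and all sample values are assumptions made for illustration.

```python
# Made-up file records; keys mirror the property names in the table above.
files = [
    {"submitter_id": "file-001", "modality": "MR", "body_part_examined": "BRAIN", "image_count": 176},
    {"submitter_id": "file-002", "modality": "CT", "body_part_examined": "CHEST", "image_count": 60},
    {"submitter_id": "file-003", "modality": "MR", "body_part_examined": "BREAST", "image_count": 24},
]


def filter_files(records, **criteria):
    """Return the records whose properties equal every given criterion."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


# Keep only MR series of the brain.
brain_mr = filter_files(files, modality="MR", body_part_examined="BRAIN")
print([record["submitter_id"] for record in brain_mr])
```

Each keyword argument acts as one filter, and a record must satisfy all of them to be kept.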

Study

The study entity represents a set of images collected for a specific trial or other reason. Members of the study entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the study entity below.

Property | Description
ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID.
Study date | The date the Study started.
Study description | The institution-generated description or classification of the Study (component) performed.
Series count | The computed number of series.
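
As an illustration of how a computed value such as Series count could be derived, the sketch below counts the distinct series per study from in-memory file records. This is one plausible reading, not a description of how the CGC computes the property. The `study_id` key is a hypothetical link between a file and its study (it is not listed in the tables on this page), and all sample values are made up.

```python
from collections import defaultdict

# Made-up records; "study_id" is a hypothetical reference from a file to its study.
files = [
    {"study_id": "study-A", "series_instance_uuid": "series-1", "image_count": 24},
    {"study_id": "study-A", "series_instance_uuid": "series-2", "image_count": 60},
    {"study_id": "study-B", "series_instance_uuid": "series-3", "image_count": 176},
]


def series_count_by_study(records):
    """Count the distinct series instance UUIDs seen for each study."""
    series_by_study = defaultdict(set)
    for record in records:
        series_by_study[record["study_id"]].add(record["series_instance_uuid"])
    return {study: len(series) for study, series in series_by_study.items()}


counts = series_count_by_study(files)
print(counts)
```

Collecting the series identifiers in a set means a series that appears in more than one record is still counted once.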