TCIA data
Overview
The Cancer Imaging Archive (TCIA) contains radiological imaging data for subjects from The Cancer Genome Atlas (TCGA). It is part of an effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to TCGA subjects. TCIA includes radiological images representing 21 of the cancer types detailed in TCGA. All images are publicly accessible, de-identified so that they contain no protected health information (PHI), and stored in the standard DICOM format.
Distribution of the data
The table below lists the number of subjects and the available image modalities (such as MRI or CT) for each cancer type (“Collection”). See a full list of cancer type abbreviations and a full list of DICOM image modality abbreviations.
Collection | Subjects | Modalities |
---|---|---|
TCGA-KIRC | 267 | CT, MR, CR |
TCGA-GBM | 262 | MR, CT, DX |
TCGA-LGG | 199 | MR, CT |
TCGA-HNSC | 192 | CT, MR, PT, RTSTRUCT, RTPLAN, RTDOSE |
TCGA-OV | 143 | CT, MR |
TCGA-BRCA | 139 | MR, MG |
TCGA-BLCA | 97 | CT, CR, MR, PT |
TCGA-LIHC | 97 | MR, CT, PT |
TCGA-LUAD | 69 | CT, PT, NM |
TCGA-UCEC | 58 | CT, CR, MR, PT |
TCGA-CESC | 54 | MR |
TCGA-STAD | 46 | CT |
TCGA-LUSC | 37 | CT, NM, PT |
TCGA-KIRP | 33 | CT, MR, PT |
TCGA-COAD | 25 | CT |
TCGA-ESCA | 16 | CT |
TCGA-KICH | 15 | CT, MR |
TCGA-PRAD | 14 | CT, PT, MR |
TCGA-THCA | 6 | CT, PT |
TCGA-SARC | 5 | CT, MR |
TCGA-READ | 3 | CT, MR |
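To work with these counts programmatically, the table can be transcribed and summarized. The following is a minimal sketch in Python, not part of TCIA or the CGC: the rows are copied from the table above, and the helper names (`parse_rows`, `total_subjects`, `collections_with`) are hypothetical names chosen for this illustration.

```python
# Rows transcribed from the table above: collection | subjects | modalities.
TABLE = """\
TCGA-KIRC | 267 | CT, MR, CR
TCGA-GBM | 262 | MR, CT, DX
TCGA-LGG | 199 | MR, CT
TCGA-HNSC | 192 | CT, MR, PT, RTSTRUCT, RTPLAN, RTDOSE
TCGA-OV | 143 | CT, MR
TCGA-BRCA | 139 | MR, MG
TCGA-BLCA | 97 | CT, CR, MR, PT
TCGA-LIHC | 97 | MR, CT, PT
TCGA-LUAD | 69 | CT, PT, NM
TCGA-UCEC | 58 | CT, CR, MR, PT
TCGA-CESC | 54 | MR
TCGA-STAD | 46 | CT
TCGA-LUSC | 37 | CT, NM, PT
TCGA-KIRP | 33 | CT, MR, PT
TCGA-COAD | 25 | CT
TCGA-ESCA | 16 | CT
TCGA-KICH | 15 | CT, MR
TCGA-PRAD | 14 | CT, PT, MR
TCGA-THCA | 6 | CT, PT
TCGA-SARC | 5 | CT, MR
TCGA-READ | 3 | CT, MR
"""


def parse_rows(text):
    """Split each 'collection | subjects | modalities' line into a tuple."""
    rows = []
    for line in text.strip().splitlines():
        collection, subjects, modalities = [cell.strip() for cell in line.split("|")]
        rows.append((collection, int(subjects), [m.strip() for m in modalities.split(",")]))
    return rows


def total_subjects(rows):
    """Sum the subject counts over all collections."""
    return sum(subjects for _, subjects, _ in rows)


def collections_with(rows, modality):
    """Return the names of collections that list the given modality."""
    return [name for name, _, modalities in rows if modality in modalities]


ROWS = parse_rows(TABLE)
print(total_subjects(ROWS))
print(collections_with(ROWS, "PT"))
```

Calling `collections_with(ROWS, "MG")`, for example, returns only the collections whose modality list includes mammography.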
TCIA Metadata
Each TCIA file on the CGC is a compressed file containing a set of images acquired during the same scanning mode. The following metadata are also set for each file when available:
Property | Description |
---|---|
Case UUID | A Universally Unique Identifier (UUID) for the sample or files of a case. |
Case ID | A human-readable identifier, such as a number or a string, that may contain metadata information. This identifier is often referred to as the submitter ID. |
Ethnicity | A socially defined category of people based on common ancestral, cultural, biological, and social factors. See NCI Thesaurus Code: C29933. |
Gender | The collection of behaviors and attitudes that distinguish people on the basis of the societal roles expected for the two sexes. See NCI Thesaurus Code: C17357. |
Race | A classification of humans characterized by certain heritable traits, common history, nationality, or geographic distribution. See NCI Thesaurus Code: C17049. |
Investigation | A value denoting the project or study that generated the data. See NCI Thesaurus Code: C41198. |
Age at diagnosis | The age in years of the case at the initial pathological diagnosis of disease or cancer. See NCI Thesaurus Code: C15220. |
Primary site | The anatomical site where the primary tumor is located in the organism. See NCI Thesaurus Code: C43761. |
Disease type | The type of the disease or condition studied. See NCI Thesaurus Code: C2991. |
Vital status | The state of being living or deceased for cases that are part of the investigation. See NCI Thesaurus Code: C25717. |
Days to death | The number of days from the date of the initial pathological diagnosis to the date of death for the case in the investigation. |
Series date | Date the Series was acquired. |
Manufacturer | Manufacturer's name of the equipment that produced the composite instances. |
Body part examined | Text description of the part of the body examined. |
Modality | Type of equipment that originally acquired the data. |
Protocol name | User-defined description of the conditions under which the Series was performed. |
Manufacturer model name | Manufacturer's model name of the equipment that produced the composite instances. |
Series description | User-provided description of the Series. |
Software versions | Manufacturer's designation of software version of the equipment that produced the composite instances. |
Image count | Number of images in this series. |
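These properties lend themselves to filtering files by their metadata. The sketch below shows the idea in plain Python. It does not use the CGC API: it assumes, purely for illustration, that each file's metadata has already been exported as a dictionary keyed by the property names in the table above. The records, file names, values, and the `select` helper are all hypothetical.

```python
# Hypothetical metadata records. Keys reuse property names from the table
# above; file names and values are invented for illustration only.
records = [
    {"name": "example_series_1", "Modality": "MR",
     "Body part examined": "BRAIN", "Image count": 24, "Vital status": "Alive"},
    {"name": "example_series_2", "Modality": "CT",
     "Body part examined": "KIDNEY", "Image count": 120},
    {"name": "example_series_3", "Modality": "MR",
     "Body part examined": "BRAIN", "Image count": 176},
]


def select(records, criteria):
    """Return the records that match every key/value pair in criteria.

    Metadata are set only when available, so a record that lacks one of
    the requested properties is treated as a non-match.
    """
    return [
        record for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


# Select brain MR series and count the images they contain.
brain_mr = select(records, {"Modality": "MR", "Body part examined": "BRAIN"})
total_images = sum(record["Image count"] for record in brain_mr)
print([record["name"] for record in brain_mr], total_images)
```

Using `record.get(key)` rather than `record[key]` is a deliberate choice here: because not every file carries every property, a missing property excludes the file instead of raising an error.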
Access TCIA data
Access a repository of TCIA files via the TCIA public project or the Data Browser.