ICGC metadata
Overview
Metadata is data that describes other data. This page details the ICGC metadata available for viewing and filtering ICGC data in the Data Browser on the CGC. ICGC metadata on the CGC consists of properties, which describe the entities of the ICGC dataset, and their values.
Entities are particular resources with UUIDs, such as files, cases, samples, and cell lines.
Properties can either describe an entity or relate that entity to another entity. For instance, properties include an entity's vital status, gender, data format, or experimental strategy.
The ICGC PCAWG Study dataset includes data from 20 different research projects conducted at participating centers around the world, and differences exist in the ontologies used across centers. Note that all metadata values assigned by ICGC research projects are provided via the CGC without modification. When identifying patient cohorts for further study, researchers are encouraged to investigate the full set of available metadata values to ensure that queries return all relevant Cases, Samples, or similar.
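
As a rough illustration of inspecting the full set of values before defining a cohort, the sketch below counts every raw value of a property and groups values that differ only by case or surrounding whitespace. It assumes the metadata has already been exported into a list of Python dictionaries; the `primary_site` key, the helper names, and the sample records are all hypothetical and are not part of the CGC or ICGC APIs.

```python
from collections import Counter, defaultdict


def distinct_values(records, prop):
    """Count every raw value of `prop` across records, exactly as submitted."""
    return Counter(r[prop] for r in records if r.get(prop) is not None)


def group_variants(values):
    """Group raw values that differ only by case or surrounding whitespace."""
    groups = defaultdict(set)
    for value in values:
        groups[value.strip().lower()].add(value)
    return dict(groups)


# Invented records for illustration only; these are not real ICGC values.
records = [
    {"primary_site": "Breast"},
    {"primary_site": "breast"},
    {"primary_site": "Liver"},
    {"primary_site": None},
]

counts = distinct_values(records, "primary_site")
variants = group_variants(counts)

for normalized, raw_values in sorted(variants.items()):
    print(normalized, sorted(raw_values))
```

Grouping like this only surfaces spelling variants. Genuinely different terms for the same concept still need manual review, since values are provided as assigned by each research project.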
Entities for ICGC
The following are entities for ICGC. They represent clinical data, biospecimen data, and data about ICGC files. Learn more about ICGC Data.
- donor
- exposure
- family
- file
- project
- sample
- specimen
- surgery
- therapy
Below, each of these entities is followed by a table of its related properties.
Donor
The ICGC donor entity represents the subject who has taken part in the investigation/program. Members of the donor entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the donor entity below. Note that once you copy an ICGC file into a project on the CGC, metadata information pertaining to the donor entity will display under the case label on the file's page.
Property | Description |
---|---|
Age at diagnosis | Age at primary diagnosis in years. |
Age at diagnosis group | Age at primary diagnosis group, range given in years. |
Age at enrollment | Age (in years) at which first specimen was collected. |
Age at last follow up | Age (in years) at last followup. |
Cancer type prior malignancy | ICD-10 diagnostic code for type of cancer in a prior malignancy. |
Disease status at last followup | Donor's last known disease status. |
Donor analysis type | The type of analysis performed on the donor's sample. |
ICD-10 diagnostic code | ICD-10 diagnostic code for donor. |
Gender | Donor's biological sex. 'Other' has been removed from the controlled vocabulary due to identifiability concerns. |
History of first degree relative | Indicates if the patient has a first degree relative with cancer. |
Interval of last follow up | Interval from the primary diagnosis date to the last followup date, in days. ICGC requests that patients be followed up every 6 months while alive. |
Primary Site | The anatomical site where the primary tumour is located in the organism. |
Prior Malignancy | Prior malignancy affecting patient. |
Relapse interval | If the donor was clinically disease free following primary therapy, and then relapse or progression (for liquid tumours) occurred afterwards, then donor_relapse_interval is the length of disease free interval, in days. |
Relapse type | Type of relapse or progression (for liquid tumours), if applicable. |
Submitted donor ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. |
Survival time | How long the donor has survived since primary diagnosis, in days. |
Tumour stage at diagnosis | This is the pathological tumour stage classification made after the tumour has been surgically removed, and is based on the pathological results of the tumour and other tissues removed during surgery or biopsy. This information is not expected to be the same as donor's tumour stage at diagnosis since the pathological tumour staging information is the combination of the clinical staging information and additional information obtained during surgery. For this field, please indicate pathological tumour stage value using indicated staging system. |
Tumour stage supplemental | Optional additional staging at the time of diagnosis. |
Tumour staging system at diagnosis | Clinical staging system used at time of diagnosis, if determined. This is supplementary to specimen’s pathological staging. |
Vital status | Donor's last known vital status. |
State | Indicates the state of the donor. |
Study | The study the donor is involved in. |
Exposure
The exposure entity represents details about a donor's antecedent environmental exposures, such as smoking history. See the table below for the clinical properties and descriptions of the exposure entity.
Property | Description |
---|---|
Alcohol history | A response to the question that asks whether the participant has consumed at least 12 drinks of any kind of alcoholic beverage in their lifetime. See CDE (Common Data Element) Public ID: 2201918. Also: A description of an individual's current and past experience with alcoholic beverage consumption. See NCI Thesaurus Code: C81229. |
Alcohol history intensity | A category to describe the patient's current level of alcohol use as self-reported by the patient. See CDE (Common Data Element) Public ID: 3457767. |
Exposure intensity | Extent of the exposure. Use this field to specify intensity of exposure submitted in 'Exposure type' field. |
Exposure type | Type of exposure. This field can be used if the donor was exposed to something other than tobacco or alcohol. |
Tobacco smoking history indicator | Donor's smoking history. |
Tobacco smoking intensity | Smoking intensity in pack years, defined as the number of cigarettes smoked per day multiplied by the number of years smoked, divided by 20. |
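
The pack-years definition given for Tobacco smoking intensity can be sketched as a small function. This is a minimal illustration of the arithmetic only; the function and parameter names are assumptions and do not correspond to any CGC or ICGC field name or API.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Pack years = cigarettes per day x years smoked / 20.

    The divisor 20 comes from the property description above.
    """
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day * years_smoked / 20


# 20 cigarettes a day for 10 years
print(pack_years(20, 10))  # 10.0
# 10 cigarettes a day for 30 years
print(pack_years(10, 30))  # 15.0
```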
Family
The family entity represents details of the family history of the donor. Find the properties of the family entity below.
Property | Description |
---|---|
Relationship age | Age of the donor's relative at primary diagnosis (in years). |
Relationship disease | Name of the donor's relative's disease. |
Relationship disease ICD-10 | ICD-10 code of disease affecting family member specified in the 'relationship type' field. |
Relationship sex | Biological sex of the donor's relative. |
Relationship type | Relationship to the donor, which can be parent, sibling, grandparent, uncle/aunt, cousin, other or unknown. |
Relationship type other | Relationship to the donor, if the relationship type is ‘other’. |
Relative with cancer history | Indicates whether the donor has a relative with a history of cancer. |
File
The file entity represents the data files generated as part of this study. Members of the file entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the file entity below.
Property | Description |
---|---|
File analysis type | The type of analysis applied to the sample from the donor. |
Experimental strategy | The method or protocol used to perform the laboratory analysis. See NCI Thesaurus Code: C43622. |
Genome build | The reference genome or assembly (such as HG19/GRCh37 or GRCh38) to which the nucleotide sequence of a case/subject/sample can be aligned. |
File size | The size of a file measured in bytes (B), kilobytes (KB), megabytes (MB), gigabytes (GB), terabytes (TB), and larger values. |
Study | The study the donor is involved in. |
Access level | A Boolean value indicating Controlled Data or Open Data. Controlled Data is data from public datasets that has limitations on use and requires approval. Open Data is data from public datasets that doesn't have limitations on its use. |
File name | File name. |
External file ID | An identifier pointing to an external file. |
External object ID | An identifier pointing to an external object. |
Project
The project entity represents the project that generated the data. Members of the project entity can be identified by a Project Identifier, which is generated from the project name (e.g. Breast Triple Negative/Lobular Cancer - UK BRCA-UK).
Find the properties of the project entity below. Note that once you copy an ICGC file into a project on the CGC, metadata information pertaining to the project entity will display under the investigation label on the file's page.
Property | Description |
---|---|
Partner country | Partner country of the cancer project. |
Primary country | Lead country of the cancer project. |
Primary site | The anatomical site where the primary tumour is located in the organism. |
Project name | Name of the project which generated the data. |
Pubmed ID | ID of the publication at www.ncbi.nlm.nih.gov/pubmed/. |
State | Indicates the state. |
Tumour type | The type of the cancer studied. |
Tumour subtype | Information about tumour type. |
Sample
The sample entity represents samples or specimen material taken from a biological entity for testing, diagnosis, propagation, treatment, or research purposes. For instance, samples include tissues, body fluids, cells, organs, embryos, and body excretory products. Members of the sample entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the sample entity below.
Property | Description |
---|---|
Submitted sample ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID. Note that once you copy an ICGC file into a project on the CGC, metadata information pertaining to the Sample ID property will display under the Aliquot Sample ID and Portion Sample ID labels on the file's page. |
Analyzed sample interval | Interval from specimen acquisition to sample use in an analytic procedure (e.g. DNA extraction), in days. |
Study | Study donor is involved in. |
Level of cellularity | The proportion of tumour nuclei to total number of nuclei in a given specimen/sample. If exact percentage cellularity cannot be determined, the submitter has the option to use this field to specify a level that defines a range of percentage. |
Percentage of cellularity | The ratio of tumour nuclei to total number of nuclei in a given specimen/sample. |
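
As a worked example of the Percentage of cellularity definition, the sketch below expresses the ratio of tumour nuclei to total nuclei as a percentage. The function and parameter names are hypothetical, and the nuclei counts are invented for illustration.

```python
def percentage_cellularity(tumour_nuclei: int, total_nuclei: int) -> float:
    """Return tumour nuclei as a percentage of all nuclei in the sample."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= tumour_nuclei <= total_nuclei:
        raise ValueError("tumour_nuclei must be between 0 and total_nuclei")
    return 100 * tumour_nuclei / total_nuclei


# Invented counts: 150 tumour nuclei out of 200 nuclei in total
print(percentage_cellularity(150, 200))  # 75.0
```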
Specimen
The specimen entity represents information about a specimen that was obtained from a donor. There may be several specimens per donor that were obtained concurrently or at different times. Find the properties of the specimen entity below.
Property | Description |
---|---|
Digital image of stained section | Linkout(s) to digital image of a stained section, demonstrating a representative section of tumour. |
Level of cellularity | The proportion of tumour nuclei to total number of nuclei in a given specimen/sample. If exact percentage cellularity cannot be determined, the submitter has the option to use this field to specify a level that defines a range of percentage. |
Percentage of cellularity | The ratio of tumour nuclei to total number of nuclei in a given specimen/sample. |
Submitted specimen ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID. Note that once you copy an ICGC file into a project on the CGC, metadata information pertaining to the Submitted specimen ID property will display under the Sample Submitter ID label on the file's page. |
Specimen available | Whether additional tissue is available for followup studies. |
Specimen biobank | If the specimen was obtained from a biobank, provide the biobank name here. |
Specimen biobank ID | If the specimen was obtained from a biobank, provide the biobank accession number here. |
Specimen processing | Description of technique used to process specimen. |
Specimen processing other | If other technique specified for specimen processing, may indicate technique here. |
Specimen interval | Interval (in days) between specimen acquisition both for those that were obtained concurrently and those obtained at different times. |
Specimen storage | Description of how the specimen was stored. |
Specimen storage other | If other types of storage are specified for specimen storage, may indicate technique here. |
Specimen type | Controlled vocabulary description of specimen type. |
Specimen type other | Free text description of the specimen type. |
Treatment type | Type of treatment the donor received prior to specimen acquisition. |
Treatment type other | Free text description of the treatment type. |
Tumour confirmed | Whether tumour was confirmed in the specimen as malignant by histological examination. |
Tumour grade | Tumour grade using indicated grading system. |
Tumour grading system | Name of the tumour grading system. |
Tumour stage supplemental | Optional additional staging. For donor, it should be at the time of diagnosis. |
Tumour histological type | WHO International Histological Classification of Tumours code. |
Tumour stage | This is the pathological tumour stage classification made after the tumour has been surgically removed, and is based on the pathological results of the tumour and other tissues removed during surgery or biopsy. This information is not expected to be the same as the donor's tumour stage at diagnosis since the pathological tumour staging information is the combination of the clinical staging information and additional information obtained during surgery. For this field, please indicate pathological tumour stage value using the indicated staging system. |
Tumour stage supplemental | Optional additional staging. |
Tumour staging system | Name of the tumour staging system used. |
Surgery
The surgery entity represents details about surgical procedures undergone by the donor. Find the properties of the surgery entity below.
Property | Description |
---|---|
Procedure interval | Interval between primary diagnosis and procedure, in days. |
Procedure site | Anatomical site of the procedure. This must use a standard controlled vocabulary which should be reported in advance to the DCC. |
Procedure type | Controlled vocabulary description of the procedure type. Vocabulary can be extended by disease-specific projects. Prefix extensions with 3-digit center code, e.g. 008.1 Beijing Cancer Hospital, fine needle aspiration of primary. |
Resection status | One of three possible categories that describes the presence or absence of residual tumour following surgical resection. |
Therapy
The therapy entity represents details about the type and duration of the therapy the donor received. Find the properties of the therapy entity below.
Property | Description |
---|---|
First therapy duration | Duration of first postresection therapy, in days. |
First therapy response | The clinical effect of the first postresection therapy. |
First therapy start interval | Interval between primary diagnosis and initiation of the first postresection therapy, in days. |
First therapy therapeutic intent | The therapeutic intent of the first postresection therapy. |
First therapy type | Type of first postresection therapy (i.e. therapy given to the patient after the sample was removed from the patient). |
Other therapy | Other postresection therapy. |
Other therapy response | The clinical effect of the other postresection therapy. |
Second therapy duration | Duration of second postresection therapy, in days. |
Second therapy response | The clinical effect of the second postresection therapy. |
Second therapy start interval | Interval between primary diagnosis and initiation of the second postresection therapy, in days. |
Second therapy therapeutic intent | The therapeutic intent of the second postresection therapy. |
Second therapy type | Type of second postresection therapy (i.e. therapy given to the patient after the sample was removed from the patient). |