CPTAC metadata
Overview
Metadata is data that describes other data. This page details the CPTAC metadata available for viewing and filtering CPTAC data in the Data Browser and the Datasets API. CPTAC metadata on the CGC consists of properties that describe the entities of the CPTAC dataset.
Entities are particular resources with UUIDs, such as files, cases, samples, and cell lines.
Properties can either describe an entity or relate that entity to another entity. For instance, properties include an entity's vital status, gender, data format, or experimental strategy.
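The distinction between entities and the two kinds of properties can be sketched as a small data structure. This is a minimal illustration only: the `Entity` class, its field names, and the example identifiers are hypothetical and are not part of the Data Browser or the Datasets API.

```python
from dataclasses import dataclass, field
from typing import Dict
from uuid import UUID, uuid4


@dataclass
class Entity:
    """Hypothetical in-memory model of a metadata entity (not a CGC type)."""
    entity_type: str                                  # e.g. "case", "sample", "file"
    uuid: UUID = field(default_factory=uuid4)         # every entity has a UUID
    # Properties that describe the entity itself.
    properties: Dict[str, object] = field(default_factory=dict)
    # Properties that relate this entity to another entity, stored by UUID.
    relations: Dict[str, UUID] = field(default_factory=dict)


# Made-up identifiers, for illustration only.
case = Entity("case", properties={"Submitter ID": "EXAMPLE-CASE-01"})
sample = Entity(
    "sample",
    properties={"Submitter ID": "EXAMPLE-SAMPLE-01"},
    relations={"case": case.uuid},  # the sample points back to its case
)
```

Here `properties` holds descriptive values, while `relations` holds the links between entities.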
Entities for CPTAC
The following are entities for CPTAC. Learn more about CPTAC data.
- investigation
- case
- demographic
- diagnosis
- sample
- portion
- file
- protocol
Below, each of these entities is followed by a table of its related properties.
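As a quick reference, the entity-to-property layout documented on this page can be written as a plain lookup table. The dictionary name and the helper function are hypothetical; the property names are copied from the tables on this page, and the protocol entity has no property table here, so its list is left empty.

```python
# Property names per entity, as listed in the tables on this page.
ENTITY_PROPERTIES = {
    "investigation": ["Disease type", "Primary site"],
    "case": ["Submitter ID"],
    "demographic": ["Ethnicity", "Race", "Gender"],
    "diagnosis": ["Age at diagnosis", "Days to death", "Vital status"],
    "sample": ["Submitter ID"],
    "portion": ["Submitter ID"],
    "file": ["Submitter ID", "File type", "Data format", "Access level"],
    "protocol": [],  # no property table is given on this page
}


def entities_with(property_name):
    """Return the entities whose property table includes property_name."""
    return [
        entity
        for entity, props in ENTITY_PROPERTIES.items()
        if property_name in props
    ]


print(entities_with("Submitter ID"))  # ['case', 'sample', 'portion', 'file']
```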
Investigation
The investigation entity represents the project or study that generated the data. Members of the investigation entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the investigation entity below.
Property | Description |
---|---|
Disease type | The type of the disease or condition studied. See NCI Thesaurus Code: C2991. |
Primary site | The anatomical site where the primary tumor is located in the organism. See NCI Thesaurus Code: C43761. |
Case
The case entity represents CPTAC cases. Members of the case entity are subjects who have taken part in an investigation or program and can be identified by a Universally Unique Identifier (UUID). See the table below for the clinical properties and descriptions of the case entity.
Property | Description |
---|---|
Submitter ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID. |
Demographic
The demographic entity represents the statistical characterization of human populations or segments of human populations (e.g., characterization by age, sex, race, or income) and can be identified by a Universally Unique Identifier (UUID). Find the properties of the demographic entity below.
Property | Description |
---|---|
Ethnicity | A socially-defined category of people based on common ancestral, cultural, biological, and social factors. See NCI Thesaurus Code: C29933. |
Race | A classification of humans characterized by certain heritable traits, common history, nationality, or geographic distribution. See NCI Thesaurus Code: C17049. |
Gender | The collection of behaviors and attitudes that distinguish people on the basis of the societal roles expected for the two sexes. See NCI Thesaurus Code: C17357. |
Diagnosis
The diagnosis entity represents the investigation, analysis, or recognition of the presence and nature of a disease, condition, or injury from expressed signs and symptoms. A diagnosis can be identified by a Universally Unique Identifier (UUID). Find the properties of the diagnosis entity below.
Property | Description |
---|---|
Age at diagnosis | The age in years of the Case at the initial pathological diagnosis of the disease or cancer. See NCI Thesaurus Code: C15220. |
Days to death | The number of days between the date of initial pathologic diagnosis and the person's date of death. See CDE (Common Data Element) Public ID: 3165475. |
Vital status | The state of being living or deceased for Cases that are part of the investigation. See NCI Thesaurus Code: C25717. |
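To make the Days to death interval concrete, here is a minimal worked example. It assumes the value is counted forward from the diagnosis date to the date of death, which is one reading of the definition above; the function name and the dates are made up for illustration.

```python
from datetime import date


def days_to_death(diagnosis_date: date, death_date: date) -> int:
    """Days elapsed from initial pathologic diagnosis to death.

    Assumption: the interval is counted forward from diagnosis, so a death
    that occurs after diagnosis yields a positive number.
    """
    return (death_date - diagnosis_date).days


# Hypothetical dates: diagnosis on 1 January 2020, death on 31 January 2020.
print(days_to_death(date(2020, 1, 1), date(2020, 1, 31)))  # 30
```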
Sample
The sample entity represents samples or specimen material taken from a biological entity for testing, diagnosis, propagation, treatment, or research purposes. For instance, samples include tissues, body fluids, cells, organs, embryos, and body excretory products. Members of the sample entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the sample entity below.
Property | Description |
---|---|
Submitter ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID. |
Portion
The portion entity represents the sequential 100-120 mg sections derived from samples. Members of the portion entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the portion entity below.
Property | Description |
---|---|
Submitter ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID. |
File
The file entity refers to the files in CPTAC produced by aliquot analyses. Members of the file entity can be identified by a Universally Unique Identifier (UUID). Find the properties of the file entity below.
Property | Description |
---|---|
Submitter ID | Usually a human-readable identifier, such as a number or a string that may contain metadata information. In some instances, this can also be a UUID. |
File type | The type of file which stores the data. |
Data format | The type of format that determines data content. |
Access level | A Boolean value indicating whether a file is Controlled Data or Open Data. Controlled Data is data from public datasets that has limitations on its use and requires approval by dbGaP. Open Data is data from public datasets that has no limitations on its use. |
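Because Access level is a two-valued property, it works well as a filter. The sketch below runs over plain Python dictionaries rather than the Datasets API. The record layout, the `controlled` key, and the example values are all assumptions made for illustration, since this page does not specify how the Boolean is encoded.

```python
# Hypothetical file records; "controlled" stands in for the Access level flag.
files = [
    {"submitter_id": "example-file-1", "data_format": "FORMAT-A", "controlled": False},
    {"submitter_id": "example-file-2", "data_format": "FORMAT-B", "controlled": True},
    {"submitter_id": "example-file-3", "data_format": "FORMAT-A", "controlled": False},
]


def open_data(records):
    """Keep only records that are not marked as Controlled Data."""
    return [record for record in records if not record["controlled"]]


for record in open_data(files):
    print(record["submitter_id"])
```

Filtering on the flag before download is one way to limit a working set to files that do not require dbGaP approval.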
Protocol
See below for links to the publications that describe the experimental protocols used to generate each subcollection of data.