TCGA GRCh38 data

[block:callout]
{
  "type": "info",
  "body": "Seven Bridges is committed to providing CGC users with the most up-to-date version of the TCGA GRCh38 dataset that is available from the [NCI Genomic Data Commons](https://gdc.cancer.gov/) (GDC). In keeping with this commitment, the CGC transitioned from hosting version 7.0 of this dataset to version 11.0 on July 10, 2018. See the [GDC Release Notes](https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/) for details of the changes. As of this date, all files accessible via the Data Browser and the API correspond to Data Release 11.0. Files that were added to individual projects before this date and are no longer represented in Data Release 11.0 are no longer accessible via those projects, but may be obtainable from the GDC archive by contacting the [GDC Help Desk](https://gdc.cancer.gov/support). Similarly, files that are no longer represented in Data Release 11.0 are no longer accessible through saved Data Browser queries, and affected queries will return a result of '0'. In addition, due to a change in the way some files within this dataset are hosted, a small number of saved Data Browser queries for which the files are still available will also return a '0' result. Such queries can be recreated using the Data Browser query-building canvas and will then return the same results as before. Please contact the CGC Team at [cgc@sbgenomics.com](mailto:cgc@sbgenomics.com) if you have any questions. The CGC Team looks forward to continuing to collaborate with the GDC in the months ahead to ensure that new data releases for this dataset become available through the CGC in a timely manner."
}
[/block]
The Cancer Genome Atlas (TCGA) is one of the richest and most complete genomics datasets and was compiled to understand the molecular basis of cancers. Data collection for TCGA began in 2006 as a joint effort by the <a href="https://www.cancer.gov/" target="blank">National Cancer Institute (NCI)</a>, the <a href="https://www.genome.gov/" target="blank">National Human Genome Research Institute (NHGRI)</a>, the <a href="https://www.nih.gov/" target="blank">National Institutes of Health (NIH)</a>, and the <a href="http://www.hhs.gov/" target="blank">U.S. Department of Health and Human Services</a>.

Over the past decade, TCGA has grown to contain data on 33 different tumor types and over 11,000 cases (patients). Between 50 and 1,500 cases have been sampled for each tumor type. For each case, multiple samples were analyzed, using microarray technology for genome characterization and next-generation sequencing technology for sequencing. TCGA data currently represents more than 2.5 petabytes of information and is expected to grow as new samples are processed.

This page details data within TCGA GRCh38. Nomenclature for TCGA GRCh38 follows that of the GDC. For instance, the category called Data type in legacy TCGA data is renamed Data category in harmonized TCGA GRCh38 data. Similarly, Data subtype in legacy TCGA data is called Data type in harmonized TCGA GRCh38 data. For a full list of TCGA GRCh38 data available on the CGC, see the table below. The table lists each data category and data type, along with the data format and data access tier of each data type.

[block:parameters]
{
  "data": {
    "h-0": "Data category",
    "h-1": "Data type",
    "h-2": "Data format",
    "h-3": "Data access tier",
    "0-0": "Biospecimen",
    "0-1": "Biospecimen supplement",
    "0-2": "BCR XML",
    "0-3": "Open data",
    "1-0": "Clinical",
    "1-1": "Clinical supplement",
    "1-2": "BCR XML",
    "1-3": "Open data",
    "2-0": "Copy Number Variation",
    "2-1": "Copy Number Segment",
    "2-2": "TXT",
    "2-3": "Open data",
    "3-0": "Copy Number Variation",
    "3-1": "Masked Copy Number Segment",
    "3-2": "TXT",
    "3-3": "Open data",
    "4-0": "DNA Methylation",
    "4-1": "Methylation Beta Value",
    "4-2": "TXT",
    "4-3": "Open data",
    "5-0": "Raw Sequencing Data",
    "5-1": "Aligned reads",
    "5-2": "BAM",
    "5-3": "Controlled data",
    "6-0": "Simple Nucleotide Variation",
    "6-1": "Aggregated Somatic Mutation",
    "6-2": "MAF",
    "6-3": "Controlled data",
    "7-0": "Simple Nucleotide Variation",
    "7-1": "Annotated Somatic Mutation",
    "7-2": "VCF",
    "7-3": "Controlled data",
    "8-0": "Simple Nucleotide Variation",
    "8-1": "Masked Somatic Mutation",
    "8-2": "MAF",
    "8-3": "Controlled data",
    "9-0": "Simple Nucleotide Variation",
    "9-1": "Raw Simple Somatic Mutation",
    "9-2": "VCF",
    "9-3": "Controlled data",
    "10-0": "Transcriptome profiling",
    "10-1": "Gene Expression Quantification",
    "10-2": "TXT",
    "10-3": "Open data",
    "11-0": "Transcriptome profiling",
    "11-1": "Isoform Expression Quantification",
    "11-2": "TSV",
    "11-3": "Open data",
    "12-0": "Transcriptome profiling",
    "12-1": "miRNA Expression Quantification",
    "12-2": "TSV",
    "12-3": "Open data",
    "13-0": "Biospecimen",
    "13-1": "Slide Image",
    "13-2": "SVS",
    "13-3": "Open data"
  },
  "cols": 4,
  "rows": 14
}
[/block]
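
For readers who want to work with this table programmatically, the following is a minimal Python sketch that transcribes the rows above into a list and looks up data types by access tier and data formats by category. It also encodes the legacy-to-harmonized field renaming described earlier on this page. All names in it (`Row`, `ROWS`, `LEGACY_TO_HARMONIZED`, `data_types_for`, `formats_by_category`, `harmonized_name`) are hypothetical and chosen for illustration only; they are not part of the CGC API or the GDC, and the sketch does not query either service.

```python
from typing import Dict, List, NamedTuple, Set


class Row(NamedTuple):
    """One row of the TCGA GRCh38 table on this page (illustrative only)."""
    category: str
    data_type: str
    data_format: str
    access_tier: str


# Rows transcribed by hand from the table above; not fetched from the CGC or GDC.
ROWS: List[Row] = [
    Row("Biospecimen", "Biospecimen supplement", "BCR XML", "Open data"),
    Row("Clinical", "Clinical supplement", "BCR XML", "Open data"),
    Row("Copy Number Variation", "Copy Number Segment", "TXT", "Open data"),
    Row("Copy Number Variation", "Masked Copy Number Segment", "TXT", "Open data"),
    Row("DNA Methylation", "Methylation Beta Value", "TXT", "Open data"),
    Row("Raw Sequencing Data", "Aligned reads", "BAM", "Controlled data"),
    Row("Simple Nucleotide Variation", "Aggregated Somatic Mutation", "MAF", "Controlled data"),
    Row("Simple Nucleotide Variation", "Annotated Somatic Mutation", "VCF", "Controlled data"),
    Row("Simple Nucleotide Variation", "Masked Somatic Mutation", "MAF", "Controlled data"),
    Row("Simple Nucleotide Variation", "Raw Simple Somatic Mutation", "VCF", "Controlled data"),
    Row("Transcriptome profiling", "Gene Expression Quantification", "TXT", "Open data"),
    Row("Transcriptome profiling", "Isoform Expression Quantification", "TSV", "Open data"),
    Row("Transcriptome profiling", "miRNA Expression Quantification", "TSV", "Open data"),
    Row("Biospecimen", "Slide Image", "SVS", "Open data"),
]

# Field renaming between legacy TCGA and harmonized TCGA GRCh38, as stated on this page.
LEGACY_TO_HARMONIZED: Dict[str, str] = {
    "Data type": "Data category",
    "Data subtype": "Data type",
}


def harmonized_name(legacy_field: str) -> str:
    """Return the harmonized field name, or the input unchanged if it was not renamed."""
    return LEGACY_TO_HARMONIZED.get(legacy_field, legacy_field)


def data_types_for(access_tier: str) -> List[str]:
    """List the data types in the given access tier, in table order."""
    return [row.data_type for row in ROWS if row.access_tier == access_tier]


def formats_by_category() -> Dict[str, Set[str]]:
    """Group the data formats that appear under each data category."""
    grouped: Dict[str, Set[str]] = {}
    for row in ROWS:
        grouped.setdefault(row.category, set()).add(row.data_format)
    return grouped


if __name__ == "__main__":
    print("Controlled data types:", data_types_for("Controlled data"))
    print("Legacy 'Data subtype' is now:", harmonized_name("Data subtype"))
    for category, formats in sorted(formats_by_category().items()):
        print(category, "->", sorted(formats))
```

Keeping the rows as plain tuples mirrors the four columns of the table, so the list can be updated by hand if a later data release changes the available data types.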