TCGA data

The Cancer Genome Atlas (TCGA) is one of the richest and most complete genomics datasets and was compiled to understand the molecular basis of cancers. Data collection for TCGA began in 2006 as a joint effort by the <a href="https://www.cancer.gov/" target="blank">National Cancer Institute (NCI)</a>, the <a href="https://www.genome.gov/" target="blank">National Human Genome Research Institute (NHGRI)</a>, the <a href="https://www.nih.gov/" target="blank">National Institutes of Health (NIH)</a>, and the <a href="http://www.hhs.gov/" target="blank">U.S. Department of Health and Human Services</a>.

Over the past decade, TCGA has grown to contain data on 33 different tumor types and over 11,000 cases (patients). Between 50 and 1,500 cases have been sampled for each tumor type. For each case, multiple samples were analyzed, using microarray technology for genome characterization and next-generation sequencing technology for sequencing. TCGA data currently represents more than 2.5 petabytes of information and is expected to grow as new samples are processed.

For a full list of TCGA data available on the CGC, see the table below. For each data type, the table lists its data subtypes, the file format of each subtype, and the access tier (open or controlled) of each subtype.

[block:parameters]
{
  "data": {
    "h-0": "Data type", "h-1": "Data subtype", "h-2": "Data format", "h-3": "Data Access Tier",
    "0-0": "Clinical", "0-1": "Clinical Data", "0-2": "XML", "0-3": "Open Data",
    "1-0": "Clinical", "1-1": "Biospecimen Data", "1-2": "XML", "1-3": "Open Data",
    "2-0": "Raw Sequencing Data", "2-1": "Aligned Reads", "2-2": "BAM", "2-3": "Controlled Data",
    "3-0": "Raw Sequencing Data", "3-1": "Unaligned Reads", "3-2": "TARGZ, TAR", "3-3": "Controlled Data",
    "4-0": "Raw Sequencing Data", "4-1": "Sequencing Tag", "4-2": "DGE-Tag", "4-3": "Open Data",
    "5-0": "Raw Sequencing Data", "5-1": "Sequencing Tag Counts", "5-2": "TXT", "5-3": "Open Data",
    "6-0": "Raw Microarray Data", "6-1": "Raw Intensities", "6-2": "Idat, CEL, TXT, TIF", "6-3": "Open and Controlled Data",
    "7-0": "Raw Microarray Data", "7-1": "Intensities Log2Ratio", "7-2": "TXT", "7-3": "Open Data",
    "8-0": "Raw Microarray Data", "8-1": "Intensities", "8-2": "TXT", "8-3": "Open Data",
    "9-0": "Raw Microarray Data", "9-1": "Normalized Intensities", "9-2": "TXT, Dat", "9-3": "Open and Controlled Data",
    "10-0": "Simple Nucleotide Variation", "10-1": "Genotypes", "10-2": "TXT, Dat", "10-3": "Controlled Data",
    "11-0": "Simple Nucleotide Variation", "11-1": "Simple Somatic Mutation", "11-2": "MAF", "11-3": "Open and Controlled Data",
    "12-0": "Simple Nucleotide Variation", "12-1": "Simple Nucleotide Variation", "12-2": "VCF", "12-3": "Controlled Data",
    "13-0": "Gene Expression", "13-1": "Gene Expression Quantification", "13-2": "TXT", "13-3": "Open Data",
    "14-0": "Gene Expression", "14-1": "miRNA Quantification", "14-2": "TXT", "14-3": "Open Data",
    "15-0": "Gene Expression", "15-1": "Isoform Expression Quantification", "15-2": "TXT", "15-3": "Open Data",
    "16-0": "Gene Expression", "16-1": "Exon Junction Quantification", "16-2": "TXT", "16-3": "Open Data",
    "17-0": "Gene Expression", "17-1": "Exon Quantification", "17-2": "TXT", "17-3": "Open Data",
    "18-0": "Structural Rearrangement", "18-1": "Structural Variation", "18-2": "VCF, FA", "18-3": "Controlled Data",
    "19-0": "DNA Methylation", "19-1": "Bisulfite Sequence Alignment", "19-2": "VCF", "19-3": "Controlled Data",
    "20-0": "DNA Methylation", "20-1": "Methylation Beta Value", "20-2": "TXT", "20-3": "Open Data",
    "21-0": "DNA Methylation", "21-1": "Methylation Percentage", "21-2": "BED", "21-3": "Open Data",
    "22-0": "Copy Number Variation", "22-1": "Copy Number Segmentation", "22-2": "TXT, Dat", "22-3": "Open Data",
    "23-0": "Copy Number Variation", "23-1": "Copy Number Estimate", "23-2": "TXT", "23-3": "Controlled Data",
    "24-0": "Copy Number Variation", "24-1": "LOH", "24-2": "TXT", "24-3": "Open Data",
    "25-0": "Copy Number Variation", "25-1": "Copy Number Variation", "25-2": "VCF", "25-3": "Controlled Data",
    "26-0": "Copy Number Variation", "26-1": "Normalized Copy Numbers", "26-2": "TXT", "26-3": "Controlled Data",
    "27-0": "Protein Expression", "27-1": "Protein Expression Quantification", "27-2": "TXT", "27-3": "Open Data",
    "28-0": "Other", "28-1": "Microsatellite Instability", "28-2": "FSA, TXT", "28-3": "Controlled Data"
  },
  "cols": 4,
  "rows": 29
}
[/block]
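
The table block above stores each cell under a `row-col` key (with `h-col` for header cells), so it can be read back into ordinary rows and filtered by access tier. The following is a minimal sketch in Python, assuming the block's JSON has already been loaded into a dict. The helper name `parameters_to_rows` is hypothetical, and the sample is a three-row excerpt of the table, renumbered from 0 for brevity.

```python
def parameters_to_rows(block: dict) -> tuple[list[str], list[list[str]]]:
    """Turn a {"data": {"row-col": value}, "cols": n, "rows": m} mapping
    into (header, rows). Missing cells become empty strings."""
    data = block["data"]
    cols, rows = block["cols"], block["rows"]
    header = [data.get(f"h-{c}", "") for c in range(cols)]
    table = [[data.get(f"{r}-{c}", "") for c in range(cols)] for r in range(rows)]
    return header, table


# Three rows excerpted from the table above; row indices are renumbered
# here, so they do not match the indices in the full 29-row block.
sample = {
    "data": {
        "h-0": "Data type", "h-1": "Data subtype",
        "h-2": "Data format", "h-3": "Data Access Tier",
        "0-0": "Clinical", "0-1": "Clinical Data",
        "0-2": "XML", "0-3": "Open Data",
        "1-0": "Raw Sequencing Data", "1-1": "Aligned Reads",
        "1-2": "BAM", "1-3": "Controlled Data",
        "2-0": "Simple Nucleotide Variation", "2-1": "Simple Somatic Mutation",
        "2-2": "MAF", "2-3": "Open and Controlled Data",
    },
    "cols": 4,
    "rows": 3,
}

header, table = parameters_to_rows(sample)

# Subtypes with at least some open-access files: the tier string contains
# "Open" for both "Open Data" and "Open and Controlled Data".
open_subtypes = [row[1] for row in table if "Open" in row[3]]
print(open_subtypes)  # ['Clinical Data', 'Simple Somatic Mutation']
```

Matching on the substring "Open" is a deliberate simplification: it treats mixed-tier subtypes as partly open, which may or may not suit a given use.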