{"__v":1,"_id":"58458c4c29c0970f00e844a8","category":{"project":"55faf11ba62ba1170021a9a7","version":"55faf11ba62ba1170021a9aa","_id":"58458b4fba4f1c0f009692bb","__v":0,"sync":{"url":"","isSync":false},"reference":false,"createdAt":"2016-12-05T15:44:15.650Z","from_sync":false,"order":6,"slug":"datasets-hub","title":"DATASETS HUB"},"parentDoc":null,"project":"55faf11ba62ba1170021a9a7","user":"5613e4f8fdd08f2b00437620","version":{"__v":35,"_id":"55faf11ba62ba1170021a9aa","project":"55faf11ba62ba1170021a9a7","createdAt":"2015-09-17T16:58:03.490Z","releaseDate":"2015-09-17T16:58:03.490Z","categories":["55faf11ca62ba1170021a9ab","55faf8f4d0e22017005b8272","55faf91aa62ba1170021a9b5","55faf929a8a7770d00c2c0bd","55faf932a8a7770d00c2c0bf","55faf94b17b9d00d00969f47","55faf958d0e22017005b8274","55faf95fa8a7770d00c2c0c0","55faf96917b9d00d00969f48","55faf970a8a7770d00c2c0c1","55faf98c825d5f19001fa3a6","55faf99aa62ba1170021a9b8","55faf99fa62ba1170021a9b9","55faf9aa17b9d00d00969f49","55faf9b6a8a7770d00c2c0c3","55faf9bda62ba1170021a9ba","5604570090ee490d00440551","5637e8b2fbe1c50d008cb078","5649bb624fa1460d00780add","5671974d1b6b730d008b4823","5671979d60c8e70d006c9760","568e8eef70ca1f0d0035808e","56d0a2081ecc471500f1795e","56d4a0adde40c70b00823ea3","56d96b03dd90610b00270849","56fbb83d8f21c817002af880","573c811bee2b3b2200422be1","576bc92afb62dd20001cda85","5771811e27a5c20e00030dcd","5785191af3a10c0e009b75b0","57bdf84d5d48411900cd8dc0","57ff5c5dc135231700aed806","5804caf792398f0f00e77521","58458b4fba4f1c0f009692bb","586d3c287c6b5b2300c05055"],"is_deprecated":false,"is_hidden":false,"is_beta":true,"is_stable":true,"codename":"","version_clean":"1.0.0","version":"1.0"},"updates":["5888bf6752d5b70f004e33fb"],"next":{"pages":[],"description":""},"createdAt":"2016-12-05T15:48:28.014Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":0,"body":"[
block:callout]\n{\n  \"type\": \"warning\",\n  \"title\": \"On this page:\",\n  \"body\": \"* [Overview](#overview)\\n* [The Cancer Genome Atlas (TCGA)](#tcga)\\n * [TCGA Resources](#tcga-resources)\\n* [Cancer Cell Line Encyclopedia (CCLE)](#ccle)\\n * [CCLE Resources](#ccle-resources)\\n* [Get started](#get-started)\"\n}\n[/block]\n<a name=\"overview\"></a>\n##Overview\n\nThe CGC hosts large genomics datasets along with the tools to query, filter, and browse them. The resulting data can be added to your projects and analyzed with your private data to address your research questions. Below, learn more about datasets on the CGC and access resources describing each dataset's data and metadata.\n\n<a name=\"tcga\"></a>\n##The Cancer Genome Atlas (TCGA)\n\nTCGA is one of the world’s largest cancer genomics data collections, including more than eleven thousand patients, representing 33 cancers, and over half a million total files. Data collection for TCGA began in 2006 as a joint effort by the <a href=\"https://www.cancer.gov/\" target=\"blank\">National Cancer Institute (NCI)</a>, <a href=\"https://www.genome.gov/\" target=\"blank\">National Human Genome Research Institute (NHGRI)</a>, <a href=\"https://www.nih.gov/\" target=\"blank\">the National Institutes of Health (NIH)</a>, and the <a href=\"http://www.hhs.gov/\" target=\"blank\">U.S. Department of Health and Human Services</a>. The CGC provides powerful methods to query and reproducibly analyze TCGA data by itself or in conjunction with your own data.\n\nTCGA on the CGC includes both <a href=\"https://wiki.nci.nih.gov/display/TCGA/Open+Access+and+Controlled+Access+Data\" target=\"blank\">Open and Controlled Data</a>. While all data in TCGA is stripped of direct identifiers, DNA information is inherently unique to an individual. 
Two data access tiers (Open Data and Controlled Data) balance the goal of making TCGA data as widely available as possible against the need to protect the rights of study participants. You can access TCGA Open Data on the CGC as soon as you sign up and agree to data use policies. In addition, you can obtain access to Controlled Data through the NIH via the <a href=\"https://www.ncbi.nlm.nih.gov/gap\" target=\"blank\">Database of Genotypes and Phenotypes (dbGaP) site</a>.\n\n<a name=\"tcga-resources\"></a>\n###TCGA Resources\n* [Required permissions to access TCGA data](tcga-data-access)\n* [TCGA data](doc:tcga-data) \n* [TCGA metadata schema](doc:tcga-metadata) \n\n<a name=\"ccle\"></a>\n##Cancer Cell Line Encyclopedia (CCLE)\n\nThe Cancer Cell Line Encyclopedia (CCLE) is a project performing detailed genetic and pharmacologic characterization of a large number of human cancer cell lines. Cell lines are permanently established cell cultures derived from patients that will proliferate indefinitely given appropriate fresh medium and space. The CCLE is the result of a collaboration between the <a href=\"https://www.broadinstitute.org/\" target=\"blank\">Broad Institute</a>, the <a href=\"https://www.nibr.com/\" target=\"blank\">Novartis Institutes for Biomedical Research</a>, and the <a href=\"https://www.gnf.nibr.com/\" target=\"blank\">Genomics Institute of the Novartis Research Foundation</a>.\n\nCCLE contains Open Access sequencing data (in the form of reads aligned to the hg19 reference genome) for nearly 1000 cancer cell line samples. The CGC hosts the CCLE dataset in the form of a read-only public project that contains cell line samples as available from CGHub on May 11, 2016. 
You have automatic access to all CCLE data on the CGC.\n\n<a name=\"ccle-resources\"></a>\n###CCLE Resources\n* [CCLE data](doc:ccle-data) \n* [CCLE public project](doc:ccle) \n* [CCLE metadata schema](doc:ccle-metadata) \n\n<a name=\"get-started\"></a>\n##Get started\n\n1. [Start from a broad overview](browse-datasets) of any dataset available on the CGC via the visual interface.\n2. [Refine your results with a query](query-datasets) issued on the visual interface or programmatically.\n3. [Access data](access-data-from-datasets) for further analysis in your CGC project.","excerpt":"","slug":"about-datasets","type":"basic","title":"ABOUT DATASETS"}
[block:callout]
{
  "type": "warning",
  "title": "On this page:",
  "body": "* [Overview](#overview)\n* [The Cancer Genome Atlas (TCGA)](#tcga)\n * [TCGA Resources](#tcga-resources)\n* [Cancer Cell Line Encyclopedia (CCLE)](#ccle)\n * [CCLE Resources](#ccle-resources)\n* [Get started](#get-started)"
}
[/block]
<a name="overview"></a>
##Overview

The CGC hosts large genomics datasets along with the tools to query, filter, and browse them. The resulting data can be added to your projects and analyzed with your private data to address your research questions. Below, learn more about datasets on the CGC and access resources describing each dataset's data and metadata.

<a name="tcga"></a>
##The Cancer Genome Atlas (TCGA)

TCGA is one of the world’s largest cancer genomics data collections, including more than eleven thousand patients, representing 33 cancers, and over half a million total files. Data collection for TCGA began in 2006 as a joint effort by the <a href="https://www.cancer.gov/" target="blank">National Cancer Institute (NCI)</a>, <a href="https://www.genome.gov/" target="blank">National Human Genome Research Institute (NHGRI)</a>, <a href="https://www.nih.gov/" target="blank">the National Institutes of Health (NIH)</a>, and the <a href="http://www.hhs.gov/" target="blank">U.S. Department of Health and Human Services</a>. The CGC provides powerful methods to query and reproducibly analyze TCGA data by itself or in conjunction with your own data.

TCGA on the CGC includes both <a href="https://wiki.nci.nih.gov/display/TCGA/Open+Access+and+Controlled+Access+Data" target="blank">Open and Controlled Data</a>. While all data in TCGA is stripped of direct identifiers, DNA information is inherently unique to an individual. Two data access tiers (Open Data and Controlled Data) balance the goal of making TCGA data as widely available as possible against the need to protect the rights of study participants. You can access TCGA Open Data on the CGC as soon as you sign up and agree to data use policies. In addition, you can obtain access to Controlled Data through the NIH via the <a href="https://www.ncbi.nlm.nih.gov/gap" target="blank">Database of Genotypes and Phenotypes (dbGaP) site</a>.

<a name="tcga-resources"></a>
###TCGA Resources
* [Required permissions to access TCGA data](tcga-data-access)
* [TCGA data](doc:tcga-data)
* [TCGA metadata schema](doc:tcga-metadata)

<a name="ccle"></a>
##Cancer Cell Line Encyclopedia (CCLE)

The Cancer Cell Line Encyclopedia (CCLE) is a project performing detailed genetic and pharmacologic characterization of a large number of human cancer cell lines. Cell lines are permanently established cell cultures derived from patients that will proliferate indefinitely given appropriate fresh medium and space. The CCLE is the result of a collaboration between the <a href="https://www.broadinstitute.org/" target="blank">Broad Institute</a>, the <a href="https://www.nibr.com/" target="blank">Novartis Institutes for Biomedical Research</a>, and the <a href="https://www.gnf.nibr.com/" target="blank">Genomics Institute of the Novartis Research Foundation</a>.

CCLE contains Open Access sequencing data (in the form of reads aligned to the hg19 reference genome) for nearly 1000 cancer cell line samples. The CGC hosts the CCLE dataset in the form of a read-only public project that contains cell line samples as available from CGHub on May 11, 2016. You have automatic access to all CCLE data on the CGC.

<a name="ccle-resources"></a>
###CCLE Resources
* [CCLE data](doc:ccle-data)
* [CCLE public project](doc:ccle)
* [CCLE metadata schema](doc:ccle-metadata)

<a name="get-started"></a>
##Get started

1. [Start from a broad overview](browse-datasets) of any dataset available on the CGC via the visual interface.
2. [Refine your results with a query](query-datasets) issued on the visual interface or programmatically.
3. [Access data](access-data-from-datasets) for further analysis in your CGC project.