Data Browser query: start from scratch

[block:callout]
{
  "type": "warning",
  "title": "On this page:",
  "body": "* [Overview](#section-overview)\n* [Objective](#section-objective)\n* [Procedure](#section-procedure)"
}
[/block]
##Overview

Build your own queries from scratch using the metadata associated with each dataset. Metadata consists of properties, which describe each dataset’s entities, and their values. Entities are particular resources with UUIDs, such as files, cases, samples, and cell lines. Learn more [about metadata for datasets](about-metadata-for-datasets).

Learn about the [parts of a query](doc:data-browser-query-structure). Then, walk through the example below, which builds a query from scratch.

<div align="right"><a href="#top">top</a></div>

##Objective
This query selects Cases that:
  * are diagnosed with Lung Adenocarcinoma
  * are males with donated samples of primary tumor tissue
  * have been analyzed with RNA-Seq
  * have FASTQ files produced by this experimental strategy

<div align="right"><a href="#top">top</a></div>

##Procedure
1. Select the TCGA dataset, as shown below. This dataset is marked as "legacy" in accordance with the GDC because it contains non-harmonized data.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/447f569-image.png",
        "image.png",
        500,
        320,
        "#e3e9eb"
      ]
    }
  ]
}
[/block]
2. Select the **Case** entity, as shown below.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/117201c-c7a0f0a-Screen_Shot_2016-08-17_at_11.17.49_AM.jpeg",
        "c7a0f0a-Screen_Shot_2016-08-17_at_11.17.49_AM.jpeg",
        1974,
        1326,
        "#3e756e"
      ]
    }
  ]
}
[/block]
3. Click **Has Diagnosis** and select **Has Disease Type** from the properties below **Case**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/bc80565-3b6f53a-Screen_Shot_2016-09-02_at_3.07.10_PM.jpeg",
        "3b6f53a-Screen_Shot_2016-09-02_at_3.07.10_PM.jpeg",
        918,
        648,
        "#406e75"
      ]
    }
  ]
}
[/block]
4. Add a filter for the **Disease Type**. Use the **Text search bar** to look up and select **Lung Adenocarcinoma**. Then, click **Add property** to apply your selection.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/e587cac-eae53bd-Screen_Shot_2016-08-17_at_11.19.55_AM.jpeg",
        "eae53bd-Screen_Shot_2016-08-17_at_11.19.55_AM.jpeg",
        934,
        748,
        "#41757a"
      ]
    }
  ]
}
[/block]
5. Select **Has Demographic** on the **Case** node, and click **Gender**.
6. Select **Male** and click **Add property**.
7. To filter for **Samples** with a **Sample Type** of primary tumor, click **Sample** in the list of entities next to the **Case** node.
8. Click **+Add property** and select **Sample Type**.
9. Look up and select **Primary Tumor**. Click **Add property** to add your selection.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/bcf1156-524574b-Screen_Shot_2016-08-17_at_11.21.50_AM.jpeg",
        "524574b-Screen_Shot_2016-08-17_at_11.21.50_AM.jpeg",
        1696,
        986,
        "#574da4"
      ]
    }
  ]
}
[/block]
10. Next to **Sample**, select the **File** entity.
11. On the **File** entity, click **+Add property** and choose **Experimental Strategy**. Search for and select **RNA - Seq**. Add the filter by clicking **Add property**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/4b494c9-e2e1250-Screen_Shot_2016-08-17_at_11.22.47_AM.jpeg",
        "e2e1250-Screen_Shot_2016-08-17_at_11.22.47_AM.jpeg",
        2206,
        1104,
        "#efeef0"
      ]
    }
  ]
}
[/block]
12. Click **+Add property** below the **File** entity once more and select **Data Format** with a filter of **TARGZ**. This filters for FASTQ files stored in the TARGZ format. Finish your selection by clicking **Add property**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/71d75db-dc5e725-Screen_Shot_2016-08-17_at_11.24.41_AM.jpeg",
        "dc5e725-Screen_Shot_2016-08-17_at_11.24.41_AM.jpeg",
        2238,
        1080,
        "#4775bd"
      ]
    }
  ]
}
[/block]
Now that you have filtered for your desired data, you can import these files into your project for further analysis. Read more about [accessing data from the Data Browser](doc:access-data-from-the-data-browser).

To save this query, click **Save** in the **Queries** drop-down menu at the top of the canvas.

That's it! You've successfully built a query from scratch and found FASTQ files produced by RNA-Seq from primary tumor tissue donated by males with Lung Adenocarcinoma.

<div align="right"><a href="#top">top</a></div>
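
To make the logic of the finished query easier to see, here is a minimal Python sketch of the same filters applied to a small in-memory collection. It is an illustration only and does not use the Data Browser or any Seven Bridges API. The `Case`, `Sample`, and `File` classes, their field names, the `matches` function, and the sample records are all hypothetical and were invented for this example. The sketch also assumes that both File filters must hold for the same file, and that the file must belong to a primary tumor sample of the case.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical in-memory stand-ins for the entities used in the Procedure.
@dataclass
class File:
    experimental_strategy: str
    data_format: str


@dataclass
class Sample:
    sample_type: str
    files: List[File] = field(default_factory=list)


@dataclass
class Case:
    case_id: str
    disease_type: str
    gender: str
    samples: List[Sample] = field(default_factory=list)


def matches(case: Case) -> bool:
    """Return True if the case passes every filter added in the Procedure."""
    # Filters on the Case node (steps 3 to 6).
    if case.disease_type != "Lung Adenocarcinoma" or case.gender != "Male":
        return False
    # Filters on the Sample node (steps 7 to 9) and File node (steps 10 to 12).
    return any(
        sample.sample_type == "Primary Tumor"
        and any(
            f.experimental_strategy == "RNA-Seq" and f.data_format == "TARGZ"
            for f in sample.files
        )
        for sample in case.samples
    )


# Made-up records, for illustration only.
cases = [
    Case("case-1", "Lung Adenocarcinoma", "Male",
         [Sample("Primary Tumor", [File("RNA-Seq", "TARGZ")])]),
    Case("case-2", "Lung Adenocarcinoma", "Female",
         [Sample("Primary Tumor", [File("RNA-Seq", "TARGZ")])]),
    Case("case-3", "Lung Adenocarcinoma", "Male",
         [Sample("Primary Tumor", [File("WXS", "BAM")])]),
]

selected = [c.case_id for c in cases if matches(c)]
print(selected)
```

Each node in the query canvas narrows the result further: a case is kept only if it satisfies the Case filters and is linked to at least one sample and file that satisfy theirs.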