{"metadata":{"image":[],"title":"","description":""},"api":{"url":"","auth":"required","settings":"","results":{"codes":[]},"params":[]},"next":{"description":"","pages":[]},"title":"Import CDS data","type":"basic","slug":"import-cds-data","excerpt":"","body":"[block:callout]\n{\n  \"type\": \"info\",\n  \"title\": \"\",\n  \"body\": \"The currently available version of CDS data brings three CDS datasets to the CGC: [GECCO](https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001315.v1.p1), [PPTC](https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001437.v1.p1) and [LCCC-1108](https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001713.v1.p1). See the [details and history of CDS data updates](page:cds-data) on the CGC.\"\n}\n[/block]\n## About the CDS\n\nThe [Cancer Data Service (CDS)](https://datacommons.cancer.gov/repository/cancer-data-service) is a data repository under the NCI's Cancer Research Data Commons (CRDC) infrastructure for storing cancer research data generated by NCI-funded programs. Its data is stored in the [database of Genotypes and Phenotypes (dbGaP)](https://www.ncbi.nlm.nih.gov/gap/), provided by the National Center for Biotechnology Information (NCBI). 
CDS hosts datasets that contain controlled-access data, with access permissions managed by dbGaP.\n\nThe process of importing CDS data to the Cancer Genomics Cloud (CGC) consists of the following two stages:\n\n* Searching for data and downloading a manifest file from [dbGaP](https://www.ncbi.nlm.nih.gov/Traces/study/).\n* Importing files to the CGC based on the downloaded manifest file.\n\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"title\": \"\",\n  \"body\": \"To import CDS data to the CGC, you need to meet the following prerequisites:\\n* Be logged in to the CGC using an [eRA Commons account](https://docs.cancergenomicscloud.org/docs/sign-up-for-the-cgc#section-register-via-an-external-account) with access to controlled data.\\n* Perform the CDS data import procedure in a [controlled project](https://docs.cancergenomicscloud.org/docs/projects-on-the-cgc#section-controlled-data-projects) on the CGC.\"\n}\n[/block]\n\nThe Seven Bridges team strives to keep the data available for import on the CGC aligned with CDS data updates, but CDS updates are not instantly available for import on the CGC. Find out the [currently available release](page:cds-data#section-currently-available-release-on-the-cgc) of CDS data on the CGC and see the complete [history of updates](page:cds-data#section-update-history).\n\n## Download CDS manifest files from dbGaP\n\nManifest files contain information about the data you want to import in the second stage of this process.\n\nTo download a manifest file:\n1. Open the [dbGaP website](https://www.ncbi.nlm.nih.gov/Traces/study/).\n2. In the **Accession** field, enter the accession of your choice and click **Search**. The list of search results opens.\n3. (Optional) In the **Filters List** section on the left, select the criteria to narrow down the result list.\n\nYou can now do one of the following:\n\n* Download the manifest file _for the entire set of data_ returned for the accession:\n\n1. 
In the **Select** section, click **Metadata** in the **Total** table row. This downloads the manifest file for _all_ data for the accession.\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/e1a9412-cds-integration-1.png\",\n        \"cds-integration-1.png\",\n        1187,\n        700,\n        \"#333\"\n      ]\n    }\n  ]\n}\n[/block]\n\n* Select specific items from the list and download the manifest file _for the selected items only_:\n\n1. Scroll down to see the list of items that match the search and filtering criteria.\n2. Check the boxes next to the items you want to select.\n3. In the **Select** section, click **Metadata** in the **Selected** table row. This downloads the manifest file _for the selected data only_.\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/bb4fd5b-cds-integration-2.png\",\n        \"cds-integration-2.png\",\n        1190,\n        700,\n        \"#333\"\n      ]\n    }\n  ]\n}\n[/block]\n## Import CDS data to the CGC\n\nAfter you have downloaded a manifest file from [dbGaP](https://www.ncbi.nlm.nih.gov/Traces/study/), follow the steps below to import the data to the CGC:\n1. Navigate to a [controlled project](doc:projects-on-the-cgc#section-controlled-data-projects) on the CGC or [create](doc:create-a-project) one.\n2. Once in the project, click the **Files** tab.\n3. Click **+ Add files**.\n4. Select the **Import from a manifest file** tab on the right.\n5. In the **Import files from** dropdown, select **Cancer Data Service (CDS)**.\n6. Click the **Select file** button and select the downloaded manifest file.\n7. (Optional) In the **Add tags** field, add the keywords (tags) that describe the imported items.\n8. **Resolve naming conflicts** - Select the action to be taken if a naming conflict occurs. Available actions are **Skip** (default option) and **Auto-rename**. 
Read more about [naming conflicts resolution](doc:upload-from-an-ftp-server#section-resolving-naming-conflicts).\n9. Click **Import**. The file import process starts and you are taken to the **Files** tab.\n\n[block:callout]\n{\n  \"type\": \"info\",\n  \"title\": \"\",\n  \"body\": \"The manifest file does not get imported along with the data files. To upload a manifest file generated for files on [dbGaP](https://www.ncbi.nlm.nih.gov/Traces/study/), please use one of our [standard upload methods](doc:upload-to-the-cgc).\"\n}\n[/block]","updates":[],"order":999,"isReference":false,"hidden":false,"sync_unique":"","link_url":"","link_external":false,"_id":"5fdcdd1ea0f9e2003f6a1cb4","createdAt":"2020-12-18T16:47:26.179Z","user":"5767bc73bb15f40e00a28777","category":{"sync":{"isSync":false,"url":""},"pages":["56268a69b1c2630d00b112b0","56268a85c2781f0d00364bbc","56268a92c2781f0d00364bbe","5637e0a0cfaa870d00cdeb6a","5637e0c3fbe1c50d008cb06a","5637e164f7e3990d007b2c41"],"title":"BRING DATA TO THE 
CGC","slug":"bring-your-private-data","order":8,"from_sync":false,"reference":false,"_id":"55faf932a8a7770d00c2c0bf","version":"55faf11ba62ba1170021a9aa","__v":6,"createdAt":"2015-09-17T17:32:34.286Z","project":"55faf11ba62ba1170021a9a7"},"version":{"version":"1.0","version_clean":"1.0.0","codename":"","is_stable":true,"is_beta":true,"is_hidden":false,"is_deprecated":false,"categories":["55faf11ca62ba1170021a9ab","55faf8f4d0e22017005b8272","55faf91aa62ba1170021a9b5","55faf929a8a7770d00c2c0bd","55faf932a8a7770d00c2c0bf","55faf94b17b9d00d00969f47","55faf958d0e22017005b8274","55faf95fa8a7770d00c2c0c0","55faf96917b9d00d00969f48","55faf970a8a7770d00c2c0c1","55faf98c825d5f19001fa3a6","55faf99aa62ba1170021a9b8","55faf99fa62ba1170021a9b9","55faf9aa17b9d00d00969f49","55faf9b6a8a7770d00c2c0c3","55faf9bda62ba1170021a9ba","5604570090ee490d00440551","5637e8b2fbe1c50d008cb078","5649bb624fa1460d00780add","5671974d1b6b730d008b4823","5671979d60c8e70d006c9760","568e8eef70ca1f0d0035808e","56d0a2081ecc471500f1795e","56d4a0adde40c70b00823ea3","56d96b03dd90610b00270849","56fbb83d8f21c817002af880","573c811bee2b3b2200422be1","576bc92afb62dd20001cda85","5771811e27a5c20e00030dcd","5785191af3a10c0e009b75b0","57bdf84d5d48411900cd8dc0","57ff5c5dc135231700aed806","5804caf792398f0f00e77521","58458b4fba4f1c0f009692bb","586d3c287c6b5b2300c05055","58ef66d88646742f009a0216","58f5d52d7891630f00fe4e77","59a555bccdbd85001bfb1442","5a2a81f688574d001e9934f5","5b080c8d7833b20003ddbb6f","5c222bed4bc358002f21459a","5c22412594a2a5005cc9e919","5c41ae1c33592700190a291e","5c8a525e2ba7b2003f9b153c","5cbf14d58c79c700ef2b502e","5db6f03a6e187c006f667fa4","5f894c7d3b0894006477ca01"],"_id":"55faf11ba62ba1170021a9aa","releaseDate":"2015-09-17T16:58:03.490Z","createdAt":"2015-09-17T16:58:03.490Z","project":"55faf11ba62ba1170021a9a7","__v":47},"project":"55faf11ba62ba1170021a9a7","__v":0}
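Before importing, it can help to check what a downloaded manifest actually lists. The sketch below is only an illustration, not part of the CGC or dbGaP tooling. It assumes the manifest is a comma-separated text file with a header row, which this page does not confirm, and the column names in the sample (`Run`, `Assay Type`, `Bytes`) are hypothetical placeholders. The helper `summarize_manifest` is likewise a made-up name.

```python
import csv
import io
from collections import Counter


def summarize_manifest(handle, group_by=None):
    """Return the row count, column names, and optional value counts.

    `handle` is any open text file object. `group_by` is the name of a
    column whose values should be tallied (hypothetical; use a column
    that exists in your own manifest).
    """
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    counts = Counter()
    total = 0
    for row in reader:
        total += 1
        if group_by is not None:
            # Rows with an empty or absent value are tallied separately.
            counts[row.get(group_by) or "(missing)"] += 1
    return {"rows": total, "columns": columns, "counts": dict(counts)}


# Hypothetical two-row manifest; real column names and values may differ.
SAMPLE = "Run,Assay Type,Bytes\nRUN0001,WGS,1024\nRUN0002,RNA-Seq,2048\n"

if __name__ == "__main__":
    summary = summarize_manifest(io.StringIO(SAMPLE), group_by="Assay Type")
    print(summary["rows"], "rows")
    print("columns:", ", ".join(summary["columns"]))
    for value, n in sorted(summary["counts"].items()):
        print(f"{value}: {n}")
```

To run the same check on a real download, open the file with `open(path, newline="")` and pass the handle to `summarize_manifest`. If the row count does not match the number of items you selected on dbGaP, it is worth downloading the manifest again before starting the import.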