
Import data from the PDC


[block:callout]
{
  "type": "info",
  "body": "The version of PDC data currently available on the CGC corresponds to the **PDC Data Release of 12-20-2019**. Get more information about [updates of PDC data on the CGC](page:pdc-data)."
}
[/block]
## About the PDC

The NCI Cancer Research Data Commons (CRDC) aims to create a scalable infrastructure that provides secure access to many different data types across scientific domains, allowing users to analyze, share, and store results while leveraging the storage and elastic compute of the cloud. As a node in this CRDC ecosystem, the Proteomic Data Commons (PDC) is a pilot project to democratize access to cancer-related proteomic datasets and to provide sustainable computational support to the cancer research community.<sup>[1](https://pdc.cancer.gov/pdc/about)</sup>

Importing files from the PDC to the Cancer Genomics Cloud (CGC) consists of two stages:

* Downloading a manifest file from the [PDC website](https://pdc.cancer.gov/pdc/browse).
* Importing files to the CGC based on the downloaded manifest file.

## Downloading manifest files from the PDC

A manifest file downloaded from the PDC contains information about the data you want to import in the second stage of this process.

To download a manifest file from the PDC:
1. Open the [PDC website](https://pdc.cancer.gov/pdc/browse).
2. Select the **Files** tab below the chart. A list of all files is displayed below.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/422ce91-pdc-integration-1.png",
        "pdc-integration-1.png",
        1425,
        676,
        "#eaedee"
      ]
    }
  ]
}
[/block]
3. (Optional) In the **Filters** pane, use the available filtering options to narrow down the search results.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/5c75195-pdc-integration-2.png",
        "pdc-integration-2.png",
        1425,
        676,
        "#eaedee"
      ]
    }
  ]
}
[/block]
4. Check the boxes next to the files you want to download.
5. Click **Export File Manifest** in the top-right corner above the table. A manifest file in CSV format is downloaded to your computer. Keep this file, as you will need it in the next stage of the import process.

## Import files from the PDC to the CGC
1. [Navigate to a project on the CGC](doc:view-a-project).
2. Click the **Files** tab.
3. Click **+ Add files**.
4. Select the **Import from a manifest file** tab on the right.
5. In the **Import files from** dropdown, select **Proteomic Data Commons (PDC)**.
6. Click the **Select file** button and select the downloaded manifest file.
7. (Optional) In the **Add tags** field, add keywords (tags) that describe the imported items.
8. **Resolve naming conflicts** - Select the action to be taken if a naming conflict occurs. Available actions are **Skip** (default option) and **Auto-rename**. Read more about [naming conflict resolution](doc:upload-from-an-ftp-server#section-resolving-naming-conflicts).
9. Click **Import**. The file import process starts and you are taken to the **Files** tab.
[block:callout]
{
  "type": "info",
  "title": "",
  "body": "The manifest file itself is not imported along with the data files. To upload a manifest file generated for files on the PDC Data Portal, or any other type of manifest file such as a study, biospecimens, clinical, or genes manifest, please use one of our [standard upload methods](doc:upload-to-the-cgc)."
}
[/block]
### Metadata
You can use the PDC API to retrieve metadata for imported PDC data. For an example of how to fetch metadata, please see this [tutorial](doc:fetch-metadata-from-the-pdc-api).
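
### Checking the manifest file (optional)
Because the import is driven entirely by the manifest exported from the PDC, it can help to take a quick look at that CSV file before selecting it on the CGC. The following is a minimal sketch, not part of the CGC or PDC tooling. The function name `summarize_manifest` is hypothetical, and the code assumes that the manifest is UTF-8 encoded, that its first line is a header row, and that each subsequent row describes one file. No column names are assumed; they are read from the file itself.

```python
import csv
from pathlib import Path


def summarize_manifest(path):
    """Return (column_names, row_count) for a CSV manifest file.

    Hypothetical helper for illustration only. Assumes a header row
    followed by one row per selected file.
    """
    # "utf-8-sig" reads plain UTF-8 and also tolerates a leading byte-order mark.
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        # Column names are taken from the file's own header row.
        columns = list(reader.fieldnames or [])
        # Count the remaining data rows without loading them all into memory.
        row_count = sum(1 for _ in reader)
    return columns, row_count
```

Calling `summarize_manifest` with the path to the downloaded manifest returns the list of column names and the number of data rows. If the row count does not match the number of files you checked on the PDC website, it may be worth exporting the manifest again before starting the import.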