Import data via CDA

Overview

The Cancer Data Aggregator (CDA) aggregates diverse data types generated by NCI-funded programs into a single source that can be used to discover, query, retrieve, and aggregate data according to a variety of search parameters, such as participant, sample, tissue, disease, or study. The integration of CDA with the CGC lets you query GDC and PDC datasets through CDA and obtain the exact lists of files to import to the CGC for use in further analyses.

The process of importing data obtained through CDA consists of the following two stages:

  • Querying the CDA to get the needed data through an interactive analysis on the CGC
  • Importing the resulting data to the CGC

Querying the CDA to get data

Querying for CDA data on the CGC is done programmatically, through a ready-made interactive analysis. To access the analysis, follow the steps below:

  1. On the main menu bar, click Public projects > Cancer Data Aggregator.
  2. In the Analyses pane on the right, open the Data Cruncher tab.
  3. Click the Cancer Data Aggregator Import analysis. Analysis details are loaded.
  4. In the top-right corner, click Copy.
  5. Enter a name for the copied analysis and select the target project.
  6. Click Copy. The analysis is now copied.
  7. Open the project where the copied analysis is located.
  8. In the project, click the Interactive Analysis tab.
  9. On the Data Cruncher card click Open. The list of available analyses opens.
  10. Locate the copied Cancer Data Aggregator analysis and click the icon next to it on the right. Your analysis will start loading and you can follow the loading progress.
  11. Once the analysis has loaded, click the icon that opens the analysis editor (JupyterLab). Follow the instructions in the editor to browse and import CDA data.
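
The instructions inside the analysis are the authoritative guide to querying CDA. As a rough illustration only, the sketch below shows the kind of post-processing you might do on query results before importing: filtering rows by data source and collecting a de-duplicated list of file import links. It uses plain Python and no CDA or CGC library calls. The sample rows, the field names (file_id, drs_uri, data_source) and the helper collect_import_links are hypothetical, not part of any CDA or CGC API.

```python
# Hypothetical rows standing in for the result of a CDA file query.
# The field names and values below are assumptions made for illustration.
rows = [
    {"file_id": "f1", "drs_uri": "drs://example.org/aaa", "data_source": "GDC"},
    {"file_id": "f2", "drs_uri": "drs://example.org/bbb", "data_source": "PDC"},
    {"file_id": "f1", "drs_uri": "drs://example.org/aaa", "data_source": "GDC"},  # duplicate
    {"file_id": "f3", "drs_uri": None, "data_source": "GDC"},  # no link available
]


def collect_import_links(rows, sources=("GDC", "PDC")):
    """Return unique, non-empty links from rows, in first-seen order.

    Rows whose data_source is not in `sources` are skipped.
    """
    seen = set()
    links = []
    for row in rows:
        uri = row.get("drs_uri")
        if not uri or row.get("data_source") not in sources:
            continue
        if uri not in seen:
            seen.add(uri)
            links.append(uri)
    return links


links = collect_import_links(rows)
print(len(links), "file link(s) to import")
for uri in links:
    print(uri)
```

Keeping the first-seen order and dropping duplicates means the resulting list can be reviewed against the query output before anything is imported; passing, for example, sources=("PDC",) would restrict the list to a single data source.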

Importing the resulting data to the CGC

Once you have obtained the file import links resulting from your queries, the files are imported to the CGC in one of the following ways: