Import data from ICDC

About ICDC

The Integrated Canine Data Commons (ICDC) is a cloud-based repository of canine cancer data, established to further research on human cancers by enabling comparative analysis with canine cancer. The data in the ICDC is sourced from multiple programs and projects, all focused on canine subjects. The data is harmonized into an integrated data model and then made available to the research community.

Data that is available for import on the CGC is now instantly aligned with ICDC data updates. Thanks to global standards for data sharing, all data that ICDC provides access to can be imported and used on the CGC platform.

Importing files from ICDC to the Cancer Genomics Cloud (CGC) is simple and can be done at the push of a button on the ICDC portal.

Users who are already familiar with the old process can still import files in the following two stages:

  • Downloading a manifest file from the ICDC website.
  • Importing files to the CGC based on the downloaded manifest file.

Import files from ICDC to the CGC

To import ICDC files, first find the data of interest on the ICDC portal, then export it to the CGC:

  1. Open the ICDC website.
  2. Click Explore in the main navigation bar at the top of the page.
  3. In the filters section on the left, select criteria to narrow down the list of cases.
  4. Scroll down to see the list of cases that match the criteria.
  5. Click Study Files.
  6. Select the files and click Add Selected Files.
  7. Click MY FILES in the top right corner of the page. The number of selected files is also shown there.
  8. Select the files that you want to import to the CGC.
  9. Click Available Export Options and choose "Export to Cancer Genomics Cloud". You are then taken to the CGC, where you continue with the import.
  10. Choose the destination project and set the rule for resolving naming conflicts.
  11. (Optional) Specify tags for these files.
  12. Tick the box to comply with the rules for using these files.
  13. Click Import data.

The files are now imported to your project.

Import files using a manifest file

An alternative way of importing files from the ICDC portal is by using the manifest file. Follow these steps:

  1. Select the files you want to import and add them to the My Files section (the procedure is the same as described above), then click Download File Manifest.
  2. In the popup that is displayed next, click Continue to Download File Manifest.
  3. Save the manifest to your computer. The rest of the steps are done on the CGC.
  4. Access the project you want to import the files to.
  5. Click the Files tab.
  6. Click Add files and choose "GA4GH Data Repository Service (DRS)".
  7. Click From a manifest file.
  8. Click Browse manifest and locate the manifest file you previously downloaded from the ICDC portal.
  9. Click Submit.

The files are now imported to your project.
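Before uploading the manifest in the steps above, it can be useful to confirm that the downloaded file is readable and lists the expected number of files. The following is a minimal sketch that assumes the manifest is a comma-separated text file with a header row. The sample content and its column names (file_name, drs_uri) are hypothetical and used only for illustration; the actual ICDC manifest columns may differ.

```python
import csv
import io


def summarize_manifest(text):
    """Return (column_names, row_count) for a CSV manifest given as a string."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)  # one dict per data row, header excluded
    return list(reader.fieldnames or []), len(rows)


# Hypothetical manifest content; real ICDC column names may differ.
sample = (
    "file_name,drs_uri\n"
    "sample_1.bam,drs://example.org/abc\n"
    "sample_2.bam,drs://example.org/def\n"
)

columns, count = summarize_manifest(sample)
print("Columns:", columns)
print("Files listed:", count)
```

To check a real manifest, read the saved file into a string (for example with open(path, newline="").read()) and pass it to summarize_manifest.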

Metadata

Metadata assigned to each ICDC file is now managed by the ICDC portal and is set during the import process. If you need more metadata from ICDC, you can use the ICDC API, which provides easy and open access to ICDC metadata. One way to get ICDC metadata on the CGC is to query the ICDC API from Data Cruncher.
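As an illustration of querying the API from a Data Cruncher notebook, the sketch below assumes that the ICDC API is exposed as a GraphQL endpoint over HTTPS. The endpoint URL is an assumption and should be verified against the ICDC API documentation. The helper names build_request and run_query are hypothetical, and the example query uses only the generic GraphQL __typename field, so it does not rely on any ICDC-specific schema.

```python
import json
import urllib.request

# Assumption: verify this URL against the ICDC API documentation before use.
ICDC_GRAPHQL_URL = "https://caninecommons.cancer.gov/v1/graphql/"


def build_request(query, variables=None, url=ICDC_GRAPHQL_URL):
    """Build a POST request carrying a GraphQL query as a JSON body."""
    payload = json.dumps({"query": query, "variables": variables or {}})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, variables=None, url=ICDC_GRAPHQL_URL):
    """Send the query and return the decoded JSON response (needs network access)."""
    request = build_request(query, variables, url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Generic GraphQL query; replace it with fields from the ICDC schema.
EXAMPLE_QUERY = "{ __typename }"

# Usage (not executed here because it requires network access):
# result = run_query(EXAMPLE_QUERY)
# print(result)
```

Replace EXAMPLE_QUERY with a query built from the fields documented in the ICDC schema to retrieve case or file metadata.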

Reference files

The Public Reference Files repository on the CGC contains two ICDC-related files that are ready to be used, for example, in RNA-seq alignment workflows that take ICDC data as input. These files are:

  • Canis_familiaris.CanFam3.1.dna.toplevel.fa (reference genome)
  • Canis_familiaris.CanFam3.1.98.gtf (gene annotation file)