Bulk RNA-Seq Transcription Profiling of HSV-1 Infected Cells

Overview

This Public Project serves as an example and provides scaffolding for Bulk RNA-Seq Processing and downstream Differential Expression analysis.

These publicly available data come from experiments with human fibroblast cells that were either infected with herpes simplex virus 1 (HSV-1) or mock-infected.

Two groups of three samples each (infected and mock-infected) are analyzed with the Bulk RNA-Seq processing pipeline Tool, starting from raw FASTQ reads and producing a citable report with summary statistics and key graphs.
Intermediate and final analysis outputs are retained for further exploration and post-hoc testing. The results show that the two groups have different transcriptome profiles for ~25% of genes.

This page contains detailed, step-by-step instructions for carrying out the analysis in the Bulk RNA-Seq Transcription Profiling of HSV-1 Infected Cells project.

To learn more about this project, or how to adapt it to your own research, please reach out to [email protected].

Set up the project and run the workflow

To perform the Bulk RNA-Seq analysis, you will first copy the public project and then run the workflow it contains. Public projects contain all the files, instructions, and tools needed to perform the specified analyses.
Because the Public Project serves as a shared repository for all users, it is not editable, so you need to make your own copy before running the analysis.

Copy the public project

  1. Click Public Projects on the top navigation bar.
  2. Locate the "Bulk RNA-Seq Transcription Profiling of HSV-1 Infected Hepatocellular Carcinoma Cells" public project.
  3. Click Copy project in the lower right corner.
  4. Under "Billing group", select your pilot funds group.
  5. Click Copy project.

The public project is now copied to your projects.

Run the Bulk RNA-Seq Workflow

Overview

The Cancer Genomics Cloud offers a repository of publicly available apps suitable for many different types of data processing and analysis.

Apps include both tools (individual bioinformatics utilities) and workflows (chains or pipelines of connected tools). Publicly available apps are maintained by the Seven Bridges bioinformatics team and represent the latest tool versions.

The ‘Bulk RNA-Seq sequencing pipeline’ Workflow must be parameterized before it can produce Differential Expression (DE) results for the transcriptome.

You will supply the data included in your copy of the project (human fibroblast cells infected or mock-infected with HSV-1) and run the pipeline to produce significant DE results.

Procedure

  1. Click the Projects tab on the top navigation bar and click your new copy of Bulk RNA-Seq for HSV-1. The Project Dashboard will appear.
  2. Click the Apps tab.

  3. Click ▶︎Run to the right of the "Differential Expression - Salmon + DESeq2" workflow. A new window will appear.
  4. Select the input files for the task:
    • for "FASTQ read files": click Select file(s), select all available FASTQ files and click Save selection.
    • for "GTF annotation file": click Select file(s), select "GRCh38ERCC.ensembl95.gtf" and click Save selection.
    • for "Transcript FASTA or Salmon Index": click Select file(s), select "GRCh38ERCC.ensembl95.transcriptome.gentrome.salmon-1.2.0-index-archive.tar" and click Save selection.

  5. Under "App Settings":
    • for "Covariate of Interest" parameter enter "Genotype"
    • for "Factor level - reference" parameter enter "WT"
    • for "Factor level - test" parameter enter "KD"
  6. Click Run. The Workflow will start processing the data and running analyses.
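
Before clicking Run, it can help to confirm that the factor levels entered under "App Settings" match the labels carried by the samples. The following is a minimal Python sketch of such a check, not part of the workflow itself. It assumes each sample has a single label for the covariate; the sample names and the assignment of "WT" and "KD" to samples are hypothetical placeholders, and `check_design` is an illustrative helper, not a platform function.

```python
from collections import Counter

# Hypothetical labels: sample names are placeholders, and which samples are
# WT or KD is assumed here purely for illustration.
SAMPLE_LEVELS = {
    "sample_1": "WT",
    "sample_2": "WT",
    "sample_3": "WT",
    "sample_4": "KD",
    "sample_5": "KD",
    "sample_6": "KD",
}


def check_design(sample_levels, reference, test, min_replicates=2):
    """Count samples per factor level and list any problems found."""
    counts = Counter(sample_levels.values())
    problems = []
    # Both the reference and the test level need enough replicates.
    for level in (reference, test):
        if counts[level] < min_replicates:
            problems.append(
                f"level {level!r} has {counts[level]} sample(s), "
                f"expected at least {min_replicates}"
            )
    # Any label other than the two configured levels is flagged.
    unexpected = sorted(set(counts) - {reference, test})
    if unexpected:
        problems.append(f"unexpected levels: {unexpected}")
    return counts, problems


counts, problems = check_design(SAMPLE_LEVELS, reference="WT", test="KD")
print(dict(counts), problems or "design looks consistent")
```

A typo in either factor level (for example "Wt" instead of "WT") would show up here as a level with zero samples, which is cheaper to catch before the task starts than after it fails.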

Interpret your results

The results will be summarized in the "SRR905.DEAnalysis.deseq2.1.26.0.summary_report.b64html" file under the "HTML report" section of the results.

Individual results and intermediate analyses that contribute to the final results are available under the other Output Settings sections.

Inspection of the results will reveal differential expression of the transcriptome between the infected and mock-infected groups. These results are presented in several ways, including Principal Component Analysis plots, cluster dendrograms, and gene tables.
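
For post-hoc exploration of a gene table, a common first step is to count how many genes pass a significance and effect-size cutoff. The sketch below is one possible way to do this in Python. It assumes the table has been exported as CSV with the standard DESeq2 column names `log2FoldChange` and `padj` plus a `gene` column; the rows shown are synthetic, and the cutoffs (`alpha=0.05`, `min_abs_lfc=1.0`) are illustrative choices, not values prescribed by this project.

```python
import csv
import io

# Synthetic rows for illustration only; these are not results from this project.
EXAMPLE_TABLE = """gene,baseMean,log2FoldChange,padj
GENE_A,1520.4,2.31,0.0004
GENE_B,88.1,-1.75,0.012
GENE_C,430.9,0.12,0.81
GENE_D,12.3,3.02,NA
"""


def parse_padj(value):
    """Return padj as a float, or None when the table reports NA."""
    return None if value in ("NA", "") else float(value)


def summarize_de(table_text, alpha=0.05, min_abs_lfc=1.0):
    """Split genes into up- and down-regulated sets using the given cutoffs."""
    reader = csv.DictReader(io.StringIO(table_text))
    total = 0
    up, down = [], []
    for row in reader:
        total += 1
        padj = parse_padj(row["padj"])
        lfc = float(row["log2FoldChange"])
        # Genes with a missing padj are treated as not significant.
        if padj is None or padj >= alpha or abs(lfc) < min_abs_lfc:
            continue
        (up if lfc > 0 else down).append(row["gene"])
    fraction = (len(up) + len(down)) / total if total else 0.0
    return {"total": total, "up": up, "down": down, "fraction_de": fraction}


summary = summarize_de(EXAMPLE_TABLE)
print(summary)
```

Applied to the real gene table, `fraction_de` gives a quick point of comparison with the differentially expressed fraction reported in the summary report, keeping in mind that the number depends on the cutoffs chosen.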