Using CGC for Workshops and Courses

Overview

There are many challenges associated with teaching cancer research concepts to students in a hands-on workshop or course format. Course instructors need mechanisms to share course materials with the students, including test data, methods, and example analyses.

Course instructors also need to provide students with up-to-date software packages for carrying out the hands-on exercises. Finally, course instructors benefit from the ability to monitor student progress and review final work. All of this becomes even more challenging when courses and workshops are held virtually.

Likewise, students need access to the course materials (including the files and code) as well as a resource with sufficient computational power. Students may require technical help from course instructors as they work through the exercises.

CGC provides a solution for the challenges described above, offering course instructors and students the necessary technical infrastructure for teaching, sharing class materials, and learning. This allows the instructors and students to focus on the science, instead of the technical infrastructure.

CGC Success Stories

The professors mentioned in the stories below have used the CGC to teach complex bioinformatics concepts. Their students have learned the coursework and can continue using their CGC accounts for future research projects.

Master’s in Health Informatics and Data Science Class at Georgetown University

The Georgetown University Master’s degree in Health Informatics and Data Science is designed to teach students to apply informatics and data science to lead the transformation of healthcare.

Our long-term collaborators Dr. Yuriy Gusev and Ms. Krithika Bhuvaneshwar lead this Master’s program. They have been using the CGC platform for the past 3 years to train the next generation of data scientists.

To make it easier for students, they decided to use Seven Bridges’ collection of tools. The Seven Bridges CGC platform presents the tools and data in a user-friendly graphical interface, which lowers the barrier for students to understand and start using pipelines that would otherwise require a lot of additional training.

Using the project infrastructure, they create a project for each student and teach students how to copy data and pipelines and how to collaborate on the CGC. The instructors can follow each student’s progress and troubleshoot in real time.

Over the past 3 years, Dr. Gusev has used the CGC to train students on how to perform RNA-seq analysis and run imaging machine learning algorithms in the cloud. Each class has an average of 15-20 students, and the CGC is used for at least two of the classes in the Master’s program.

For more details on Dr. Gusev’s experience, you can find a recorded webinar on our website.

The Seven Bridges team provided support to participants prior to and during the workshop, assisting with account setup and troubleshooting. This enabled the course instructors to focus on the content and teaching while the Seven Bridges team focused on the infrastructure and platform account logistics.

Stats 581 Undergraduate class at Purdue University

Together with our collaborator Dr. Min Zhang, Seven Bridges prepared a lecture series of four classes of about 50 minutes each to teach undergraduate students how to use the CGC platform to perform end-to-end analysis of bulk RNA-seq and single-cell RNA-seq data.

During the first lecture, the participants learned how to change their billing group, how to create a project, and how to transfer data from another project or a collaborator into their own.

During the following lectures, the students and attendees learned how to transfer open access data from two SRA datasets to their projects and how to perform RNA-seq analysis on publicly available data, starting with sequence alignment and ending with differential expression analysis.

They also learned to use popular CGC single-cell tools, such as Seurat, to identify single-cell gene expression differences.

Prior to the first lecture, the Seven Bridges team instructed all of the registered students to create platform accounts and added them to the professor’s billing group.

During the series, the Seven Bridges team provided the slides, reading materials, and expected pipelines so that participants could get started on their own. The lecture series had an average of 30 attendees, and the recorded sessions were made available on the Purdue and CGC websites.

Procedure

Setting up course materials on the platform

Course and workshop organizers can share data and code with participants by using platform projects. The organizers can choose to make the course materials available to all users on the platform or limit distribution to the registered participants.

Distributing materials to specific participants

In cases where there is a registration fee for a workshop or course, organizers may want to limit distribution of the course materials to only those individuals who registered for the course. In this case, organizers should set up a Course Materials project with all the needed data and methods.

For small courses of fewer than 20 participants, the participants can be added as members of the Course Materials project. All course participants can work together in the same project and view each other’s analyses.

It becomes difficult to organize the work of more than 20 people in one project. For that reason, we recommend giving each participant their own copy of the Course Materials project when there are more than 20 participants in a course. Instructors and support staff can be added as members of the project as needed for assistance.
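The sizing guidance above can be sketched as a small planning helper. This is only an illustration: the function, the class, and the project naming scheme are hypothetical and do not call any CGC API. The guidance does not say how to treat a course of exactly 20 participants, so the sketch assumes a shared project is still acceptable at that size.

```python
from dataclasses import dataclass

# Threshold from the guidance above. Treating exactly 20 participants as
# "small" is an assumption of this sketch, not a platform rule.
SHARED_PROJECT_LIMIT = 20


@dataclass
class ProjectPlan:
    layout: str               # "shared" or "per-participant"
    project_names: list[str]  # projects the organizers would need to set up


def plan_projects(participants: list[str],
                  course: str = "Course Materials") -> ProjectPlan:
    """Choose between one shared project and one copy per participant."""
    if len(participants) <= SHARED_PROJECT_LIMIT:
        # Small course: everyone is added as a member of a single project.
        return ProjectPlan("shared", [course])
    # Larger course: each participant gets their own copy of the project.
    names = [f"{course} - {person}" for person in participants]
    return ProjectPlan("per-participant", names)
```

For a roster of 25 names, `plan_projects(roster).layout` is `"per-participant"` and the plan lists 25 project names, one for each participant.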

Sharing materials with all platform users

In cases where the workshop or course materials could benefit researchers beyond those who registered for the event, the materials can be made available to CGC users through a “public project.” Public projects are listed in the top navigation bar of the platform. Organizers set up a private project with all of the needed data and methods for the event. Once the project is finalized, the Seven Bridges team creates a “public project” using the content provided. Public projects can only contain open access data that can be shared with all platform users.

Files

Course instructors should provide students with test files for the course. These test files should be added to the Files tab of the project. Course instructors can upload/import files to the Platform, following CGC data protection policies and these steps.

Open access hosted datasets can also be used. For example, CGC hosts files from the 1000 Genomes dataset and makes these available to all users on the platform. Learn more about the hosted datasets.

Analysis methods

Course instructors can make methods available within projects using interactive notebooks in the Data Cruncher feature, Common Workflow Language (CWL) tools and workflows, or both. If course instructors want to expose all of the code in a set of methods, they can set up RStudio, JupyterLab, or SAS notebooks.

To access the code, the participant would launch the notebook. If course instructors want to show participants how to scale up analyses (batch processing) and ensure reproducibility, they can include CWL versions of the methods in the project as well. For more information on CWL, see this blog post on working with CWL.

Determine how cloud costs will be supported

Participants will incur cloud costs while using CGC. In order to create projects and run analyses on the platform, each participant must be a member of a platform billing group. A billing group provides a mechanism for the participants’ cloud costs to be captured and paid for.

We recommend that course organizers use one platform billing group to support the cloud costs of all event participants. The Seven Bridges team creates the billing group in advance of the workshop and ensures that all participants are set up as “members” of the billing group.

Course or workshop organizers have two options for supporting the costs of the billing group:

  • Apply for CGC Cloud Credits
  • Set up a payment mechanism with Seven Bridges

Apply for CGC Cloud Credits

CGC offers a cloud credits program for cancer researchers. Every researcher who is a new user can request $300 in pilot cloud credits. In addition, researchers can apply for collaborative projects, which are funded in the form of cloud credits. We recommend applying for a collaborative project to support the cost of multiple participants in a workshop or course. For more information on how to apply for a collaborative project, see https://www.cancergenomicscloud.org/collaborative-funds.

Steps:

  1. Estimate the cloud costs for the event
    a) Set up data and code on CGC.
    b) Run through the exercises to determine the expected cloud costs for one participant.
    c) Estimate the total costs by scaling up to the expected number of participants.
    d) Refer to the tutorial on Estimating and Managing Cloud Costs.
    e) Reach out to [email protected] if you need assistance.
  2. Submit application for cloud credits
    a) Go to the CGC website.
    b) Submit an application for a “Collaborative Grant”.
  3. The Seven Bridges CGC team will be notified if your application is approved. If it is approved, Seven Bridges will create a platform billing group with the approved cloud credit amount. You will be notified about the status of your application and again once your billing group is available.
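The cost estimate in step 1 is a simple scale-up, sketched below. The function name, the dollar amounts, and the optional `margin` parameter are all hypothetical; the steps above do not prescribe a contingency margin, so it defaults to zero.

```python
def estimate_event_cost(cost_per_participant: float,
                        participants: int,
                        margin: float = 0.0) -> float:
    """Scale one participant's measured cloud cost up to the whole event.

    cost_per_participant: cost observed when running through the exercises once.
    participants: expected number of participants.
    margin: optional contingency as a fraction (0.2 means 20% extra); this is
            an assumption of the sketch, not part of the documented steps.
    """
    if cost_per_participant < 0 or participants < 0 or margin < 0:
        raise ValueError("all inputs must be non-negative")
    total = cost_per_participant * participants * (1 + margin)
    return round(total, 2)


# Hypothetical figures: $4.50 per participant, 30 participants.
base = estimate_event_cost(4.50, 30)              # 135.0
with_buffer = estimate_event_cost(4.50, 30, 0.2)  # 162.0
```

The per-participant figure should come from actually running the exercises on the platform, as described in step 1b, rather than from a guess.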

Set up a payment mechanism with Seven Bridges

Researchers can pay for their own cloud costs on the CGC by providing a purchase order or a credit card number. Course instructors can work with the Seven Bridges team to set up a billing group. Course instructors have the choice to be invoiced after the workshop completes or to pre-pay.

Steps:

  1. Estimate the cloud costs for the event
    a) Set up data and code on CGC.
    b) Run through the exercises to determine the expected cloud costs for individual participants.
    c) Estimate the total costs by scaling up to the expected number of participants.
    d) Refer to the tutorial on Estimating and Managing Cloud Costs.
    e) Reach out to [email protected] if you need assistance.
  2. Email [email protected] and indicate that you would like to host a workshop or course on the platform and need a platform billing group. The support team will inform the CGC Program Manager, who will reach out to you with next steps.

Platform accounts for participants

Course organizers manage registration for the course. The Seven Bridges team manages account setup on CGC for the course participants.

2 weeks before the event

Course organizers send the Seven Bridges Team the final list of participants and their email addresses.

The Seven Bridges Team emails the participants instructions on how to create accounts. Participants are asked to send their platform usernames to the Seven Bridges Team. This step confirms that the individual has completed account creation and enables the Seven Bridges Team to add the individual to the platform billing group that will be used to support cloud costs for the event.

1 week before the event

For participants who have not yet created platform accounts:

  • Seven Bridges Team follows up with a reminder email.

For participants who have created platform accounts:

  • Seven Bridges Team adds them to the platform billing group that will be used to support the cloud costs for the event.
  • Seven Bridges Team distributes an introductory tutorial and asks participants to work through the instructions prior to the event.
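The one-week check above amounts to splitting the registration list by whether a platform username has been received. A minimal sketch follows, assuming the roster is a list of email addresses and the received usernames are kept in a dictionary keyed by lowercase email. Both structures, the function name, and the sample addresses are hypothetical.

```python
def split_roster(registered_emails: list[str],
                 usernames_by_email: dict[str, str]):
    """Return (ready, needs_reminder) for the one-week-before check.

    ready: email -> platform username, for participants who can be added
           to the billing group and sent the introductory tutorial.
    needs_reminder: emails of participants who have not yet reported a
           username and should receive a reminder.
    """
    ready: dict[str, str] = {}
    needs_reminder: list[str] = []
    for email in registered_emails:
        key = email.strip().lower()  # normalize before looking up
        username = usernames_by_email.get(key)
        if username:
            ready[key] = username
        else:
            needs_reminder.append(key)
    return ready, needs_reminder


# Hypothetical roster and replies.
roster = ["ana@example.org", "Ben@example.org ", "chris@example.org"]
replies = {"ana@example.org": "ana_lab", "ben@example.org": "ben42"}
ready, needs_reminder = split_roster(roster, replies)
```

Here `ready` holds the two participants who reported usernames and `needs_reminder` contains only `chris@example.org`.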

Day of the event

A Seven Bridges Community Engagement Manager is available to ensure the students are successful. The Seven Bridges Support Team is also available to troubleshoot any issues that may come up. Students are also encouraged to attend office hours with Seven Bridges, held on Tuesdays at 10am and Thursdays at 2pm ET.
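The schedule described in this section can be turned into calendar dates once the event date is known. The sketch below uses only the Python standard library. The function name, milestone labels, and example date are hypothetical.

```python
from datetime import date, timedelta


def account_setup_milestones(event_day: date) -> dict[str, date]:
    """Map each account setup milestone in this section to a calendar date."""
    return {
        # Organizers send the final participant list; account instructions go out.
        "final participant list": event_day - timedelta(weeks=2),
        # Reminders, billing group additions, and the introductory tutorial.
        "reminders and billing group": event_day - timedelta(weeks=1),
        # Community Engagement Manager and Support Team are available.
        "event": event_day,
    }


# Hypothetical event date.
milestones = account_setup_milestones(date(2025, 3, 17))
```

For an event on 17 March 2025, the final participant list would be due on 3 March and the reminder round would fall on 10 March.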

For more information

If you are interested in using CGC for a workshop or course, please write to us at [email protected]. We are eager to help you get the most out of the platform.