Case Explorer in GRCh38

Hi! I used to be able to find my gene of interest (C1ORF61) using the Case Explorer, but now I can't find it any more. Were any genes renamed in this latest version? Thanks!

Get diagnosis data

Hi! I'm trying to download the diagnosis data to see if I can find some correlations with my gene of interest. In the Data browser I can see the Diagnosis Details but despite making subgroups for the data import I can't get those details in the metadata file. Is there any way to extract them? Thanks

Problems using bash script in a CGC tool

Hello! I have created a tool based on Samtools, using the image images.sbgenomics.com/marouf/samtools:1.3, which I found in the public Samtools app. My goal is a tool that extracts the sample name from .bam files, because I need it in the next tool of my workflow. As input I give Samtools an array of two files (one tumor and one normal tissue from the same patient), and I want the tool to determine whether each extracted name belongs to the tumor sample or the normal sample. So I wrote this bash script:
    for i in /sbgenomics/Projects/<myprojectpath>/*.bam; do if [ `/opt/samtools-1.3/samtools view -H $i | grep '^@RG' | sed "s/.*SM:............-\(...\)-.*/\1/g" | uniq` == "01A" ]; then /opt/samtools-1.3/samtools view -H $i | grep '^@RG' | sed "s/.*SM:\([^\t]*\).*/\1/g" | uniq; fi; done > tumor_name.txt
In other words, for every BAM in my folder (one tumor and one normal), it should extract the three-character code that identifies the sample type and compare it to "01A" (which is specific to tumor samples). If they match, it prints the entire sample name and writes it to a file. It returns the following error log:
    2017-10-13T18:10:34.886498415Z sh: 1: [: 11A: unexpected operator
    2017-10-13T18:10:34.891268970Z sh: 1: [: 01A: unexpected operator
11A and 01A are the three-character codes extracted as the first argument of the if statement (01A for the tumor sample, 11A for the normal sample), so apparently the if statement doesn't accept them as arguments, nor the opening square bracket. In the end, the tool returns the file tumor_name.txt, which is unfortunately empty (presumably because the if statement didn't work). I thought I should put #!/bin/bash before the for loop, since my script uses bash commands. However, when I use #!/bin/bash the output redirection ">" stops working, which doesn't make sense to me.
I tried a simple bash script to test it: #!/bin/bash echo 123 > file.txt and it doesn't work, while echo 123 > file.txt (without #!/bin/bash) works perfectly. Any help? Am I missing something needed to use bash scripts in CGC? Thank you very much in advance!
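The "unexpected operator" messages are what dash (the shell behind sh on many Debian-based images) prints when `[` is given `==`. That operator is a bash extension; POSIX `[` only defines `=`. Below is a minimal sketch of the tumor/normal check written for plain sh, assuming the SM tag holds a TCGA-style barcode. The function names are hypothetical, the second sample name is made up for illustration, and `cut` is swapped in for the original sed pattern to pick out the sample-type field.

```shell
#!/bin/sh
# Hypothetical helper: print the sample-type field (e.g. 01A or 11A) of a
# TCGA-style barcode such as TCGA-BR-8368-01A-11R-2343-13.
sample_type_code() {
    printf '%s\n' "$1" | cut -d- -f4
}

# Hypothetical helper: succeed only when the sample-type field is 01A,
# the tumor code used in the question.
# Uses '=' (POSIX) instead of '==' (bash extension that dash rejects) and
# quotes both operands so empty output cannot break the test.
is_tumor() {
    [ "$(sample_type_code "$1")" = "01A" ]
}

# Demo on literal names (the second one is an invented normal sample).
# In the tool, each name would instead come from the SM tag, e.g.
#   samtools view -H "$bam" | grep '^@RG' | sed 's/.*SM:\([^\t]*\).*/\1/' | uniq
for name in TCGA-BR-8368-01A-11R-2343-13 TCGA-BR-8368-11A-01R-2343-13; do
    if is_tumor "$name"; then
        printf '%s\n' "$name"
    fi
done
```

Run with sh, the demo prints only the 01A name. As for the second symptom: if the tool's command is handed to sh as a single line (an assumption about how the command line is assembled), a leading #!/bin/bash would start a comment that swallows everything after it, including the redirection.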

Local file upload fails with cgc_uploader

    Initializing upload...
    Starting upload of 1 file(s) to MY-PROJECT
The upload stays at 0.00%, then fails. I'm using the auth token from the Developer Dashboard. The command line:
    ~/cgc-uploader/bin/cgc-uploader.sh -t MY-TOKEN -p MY-USERNAME/MY-PROJECT MY-FILE
(Side note: I'm unable to use the GUI uploader, because my IT department blocks it as an unknown app.)

How to add data from a GCS bucket?

There's world-readable data in a Google Cloud Storage bucket. I can access it from curl from the https URL without authenticating, or from boto or gsutil using the gs:// URL. Is there a way to add this data to my CGC project?

Expression of long non-coding RNAs

Hi! What is the best way to get access to the expression of long non-coding RNAs? Is there any processed data (I have already checked the Case Explorer), or should I reanalyze the raw data using something like Cufflinks?

How can I download metadata on a list of files

I would like to download the metadata for a set of files. For example, I have four gene-wise read-count text files. Each file has associated metadata: platform type, reference genome, disease, investigation, etc. I would like to download the associated metadata for a set of files, as a text file, onto my local machine. Is it possible to do this? Thanks, Anjan

How are EC2 instance types assigned to a task

How does SBG assign an EC2 instance type to a task? Is this based solely on the resource (CPU, memory) requirements specified by the user, or is there any other optimization going on? Thanks, Anjan

Multiple TCGA BAM for same sample

Hi there, I've noticed that some TCGA samples have multiple RNA-Seq BAM files. What is the difference between the files? If one is preferable to the other, what criteria should be used to select it? Example (TCGA-STAD RNA-seq):
    TCGA-BR-8368-01A-11R-2343-13_rnaseq.bam 14.5GB
    _1_TCGA-BR-8368-01A-11R-2343-13_rnaseq.bam 14.7GB
Thanks, Franco

Specifying a bowtie index in command line

I wanted to post my question and solution here in case anyone searches for it in the future. I was trying to wrap a script that requires a path and prefix for the bowtie index files. I was submitting them as an input array of files, but the script was unable to find them. What I didn't realize is that the files were in the Projects directory and the script was in the working directory, so in the Stage Input section of Edit Input Property, I had to choose Link, which creates symlinks in the working directory. I didn't immediately realize this because the script could find my fastq input file without any staging.
I've since noticed that the STAR aligner tool uses a tarball of the index files, and Darko also adds:
"As an extra note, an alternative to the approach you're currently using would be to create a single .tar file containing all the output files of the bowtie-build run. For example, when invoking the Bowtie indexer, use a command like:
    bowtie-build ucsc_hg19.fasta ucsc_hg19 && tar -cf ucsc_hg19.tar *.ebwt
You would then use the resulting .tar file as a single reference bundle, replacing the file array input port with a single file input. You would have to stage it and add an extra base command (before the actual call to mapper.pl) containing a JavaScript expression that returns a string with the reference bundle file name. For example (assuming you're using 'reference' as the ID of the respective input port):
    {
      var reference_bundle = $job.inputs.reference.path.split('/').slice(-1);
      return 'tar -xf ' + reference_bundle + ' ; ';
    }
Finally, you would modify the JavaScript expression generating the value of the -p argument of mapper.pl to get the basename of the .tar bundle (which must be the same as that of the Bowtie index files)."
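The bundle approach in the quoted note comes down to two steps inside the working directory: unpack the staged .tar, then reuse its basename as the index prefix. Here is a rough shell sketch of those steps, assuming the bundle is named <prefix>.tar. The helper name index_prefix is hypothetical, and empty placeholder files stand in for real bowtie-build output so the sketch runs without Bowtie installed.

```shell
#!/bin/sh
# Hypothetical helper: derive the Bowtie index prefix from the bundle path.
# Assumes the bundle is named <prefix>.tar, as in the quoted advice.
index_prefix() {
    basename "$1" .tar
}

# Stand-in for the task's working directory.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Empty placeholder files standing in for real bowtie-build output.
touch ucsc_hg19.1.ebwt ucsc_hg19.2.ebwt
tar -cf ucsc_hg19.tar *.ebwt
rm -f *.ebwt

# Only the file name of the staged bundle matters once it is linked into
# the working directory, so strip the directory part before extracting.
bundle="$workdir/ucsc_hg19.tar"
tar -xf "$(basename "$bundle")"

# The prefix is what a mapper would receive as its index argument.
prefix=$(index_prefix "$bundle")
printf 'index prefix: %s\n' "$prefix"
```

Keeping the prefix derivation in one place means the bundle can be swapped for another genome build without touching the rest of the command line, as long as the tarball name matches the index file prefix.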