TCGA diagnostic slides - stain types

I'm working on a research project that involves the analysis of MSI/MSS status in H&E-stained diagnostic slides from the COAD and READ cohorts in the TCGA database. I've collected around 600 slides so far, but I've had difficulty finding information about the staining procedure used. While most of the slides appear to have been stained with the standard H&E stain, some seem to be stained differently, with a more purple/blue hue. Is information about the staining procedure available in the TCGA database? Are there any tools (like a Python package or something) that could identify the stain type from the diagnostic slide? Any suggestions or advice would be greatly appreciated. Thanks.
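On the last question, one rough way to triage slides before looking for a dedicated tool is to compare the average hue of the tissue pixels in each slide thumbnail. The sketch below is only a heuristic under stated assumptions: the pixel lists, the `mean_tissue_hue` helper, and the saturation cutoff are all hypothetical, and it flags a colour shift rather than identifying a staining protocol.

```python
import colorsys
import math


def mean_tissue_hue(pixels, min_saturation=0.15):
    """Circular mean hue (degrees) of RGB pixels, ignoring pale background.

    `pixels` is an iterable of (r, g, b) tuples in 0..255, for example from
    a downsampled slide thumbnail. `min_saturation` is an assumed cutoff
    for skipping near-white glass; tune it for your scanner.
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_saturation:
            continue  # background or very faint staining
        angle = 2 * math.pi * h
        sin_sum += math.sin(angle)
        cos_sum += math.cos(angle)
    if sin_sum == 0.0 and cos_sum == 0.0:
        return None  # no tissue-like pixels found
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360


# Made-up example colours, not measured from TCGA slides.
pinkish = [(230, 130, 180)] * 50 + [(255, 255, 255)] * 50
purplish = [(110, 90, 200)] * 50 + [(255, 255, 255)] * 50

print(mean_tissue_hue(pinkish))
print(mean_tissue_hue(purplish))
```

Slides whose mean hue sits far from the bulk of the cohort could then be reviewed by eye. A circular mean is used because hue wraps around at 360 degrees, so a plain average would misplace pink/red tissue.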

Different instance types for different workflow steps?

Is it possible to have different steps in a workflow run on different instance types, or is it always a single instance type? I have a pipeline that contains some single-threaded steps, like biobambam bamtofastq, and some multithreaded ones, like bowtie2 and samtools sort. It seems like a waste of resources to run a c4.2xlarge for a single thread, and it is more cumbersome to run the pipeline in separate steps and clean up manually.

Search filename in the GDC data portal

Hi, 3 years ago I downloaded TSV files of indels and SNVs for TCGA samples from the GDC data portal. Now I want to find the barcode ID of each sample, but when I search for the specific file names in the GDC there are no results. Example SNV file name: c84b676a-a409-517f-9920-b63119f1f717. Example indel file name: 067a20c4-a4da-5899-810f-5768d0a43555. Does anyone have a solution? Thanks!
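If those names are GDC file UUIDs, the GDC REST API may be a quicker route than the portal search box. The sketch below only builds request URLs and does not contact the server. It assumes the public `files` endpoint with a `fields` parameter and the `history` endpoint for files superseded by a later data release; the field names are my assumption about where the case and sample barcodes live, so check them against the GDC API documentation before relying on the output.

```python
from urllib.parse import urlencode

GDC_API = "https://api.gdc.cancer.gov"

# Assumed field names for the file name, case barcode and sample barcode.
BARCODE_FIELDS = [
    "file_name",
    "cases.submitter_id",
    "cases.samples.submitter_id",
]


def file_metadata_url(file_uuid):
    """URL asking the files endpoint for the barcodes linked to one file."""
    query = urlencode({"fields": ",".join(BARCODE_FIELDS)})
    return f"{GDC_API}/files/{file_uuid}?{query}"


def file_history_url(file_uuid):
    """URL asking for the version history of a file that may be superseded."""
    return f"{GDC_API}/history/{file_uuid}"


snv_uuid = "c84b676a-a409-517f-9920-b63119f1f717"
print(file_metadata_url(snv_uuid))
print(file_history_url(snv_uuid))
```

Opening the printed URLs in a browser (or fetching them with any HTTP client) should return JSON if the UUID is known to the current release. If the first lookup finds nothing, the history lookup may reveal a newer UUID to query instead.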

Can you transfer files from CGC to GEO?

Can you transfer files from CGC to GEO?

GRAF benchmarking

Hey! Hope you're well! We ran the GRAF pipeline for NA12878 with these inputs, took the output, and ran hap.py to compare the results with the gold standard downloaded from this link: https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/latest/GRCh38/ We obtained 56.42% precision for all indels and 84.01% precision for all SNPs. Based on the GRAF publication and what we've seen on the blog, the expected precision was near 99%, but we have not been able to reproduce these results. Do you know what could be wrong? Thanks in advance for your help! Best, Maria Teresita Laguinge

Problem of permissions executing a tool

I'm running a tool that uses Perl and Python scripts. When I run the tool, I get a permission error: 2022-08-06T17:55:11.440911533Z Can't exec "annotate_variation.pl": Permission denied at /sbgenomics/workspaces/94457834-7ccb-45da-84c3-ee44297388a4/tasks/f9019512-ba43-4ff9-9839-889e8cbf3290/intervar_cwl/table_annovar.pl line 444. Do you know why this could be happening and how to fix it?
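A common cause of `Permission denied` from `exec` is a script that lacks the executable bit, although nothing in the log above confirms that this is what happened here. The sketch below shows how one might check and set that bit from Python. It works on a temporary stand-in file rather than the real `annotate_variation.pl`, and the `is_executable` and `ensure_executable` helpers are hypothetical names.

```python
import os
import stat
import tempfile

EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def is_executable(path):
    """True if any of the user/group/other execute bits is set."""
    return bool(os.stat(path).st_mode & EXEC_BITS)


def ensure_executable(path):
    """Add execute permission for user, group and other (like chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | EXEC_BITS)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a script that was staged without execute permission.
    script = os.path.join(tmp, "annotate_variation.pl")
    with open(script, "w") as fh:
        fh.write("#!/usr/bin/env perl\nprint \"ok\\n\";\n")
    os.chmod(script, 0o644)

    before = is_executable(script)
    ensure_executable(script)
    after = is_executable(script)
    print(before, after)
```

Inside a task, the same effect would normally come from running `chmod +x` on the staged scripts before the main command, assuming the tool wrapper lets you add such a step.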

I can't view the Mouse reference genome in the Genome Browser

I have been trying to view the aligned files in the genome browser but I am not able to view the mouse mm10 reference genome. It gives this as the error: Unable to recognize genome reference from the BAM header. Using Human hg19.

problem downloading SGDP data with wget

Hi, I'm trying to download a SGDP file via wget and the download link. I get error 403. Could you please explain the nature of the error?

[[email protected] sgdp]$ wget https://sb-datasets-us-east-1.s3-fips.us-east-1.amazonaws.com/cgl-sgdd-reorg/SGDP/REMAP_hs37d5/LP6005441-DNA_A01.annotated.nh.vcf.gz?x-username=enabieva&x-env=cgc&x-requestId=f1d09f8c-754d-4a6b-b2c8-d252cd81e390&x-project=sevenbridges%2Fsimons-genome-diversity-project-sgdp&response-content-disposition=attachment%3Bfilename%3DLP6005441-DNA_A01.annotated.nh.vcf.gz&response-content-type=application%2Foctet-stream&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20220323T140118Z&X-Amz-SignedHeaders=host&X-Amz-Expires=172799&X-Amz-Credential=AKIAJQD4ZMI5SNVG2A2A%2F20220323%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=96f71ce17b3a206fce4fd77c116a7f81ce85e53de78467c016b61ebb4d6cf4c5
[2] 9275
[3] 9276
[4] 9277
[5] 9279
[6] 9280
-bash: x-env=cgc: command not found
[7] 9281
-bash: x-requestId=f1d09f8c-754d-4a6b-b2c8-d252cd81e390: command not found
[8] 9282
-bash: x-project=sevenbridges%2Fsimons-genome-diversity-project-sgdp: command not found
-bash: response-content-disposition=attachment%3Bfilename%3DLP6005441-DNA_A01.annotated.nh.vcf.gz: command not found
[9] 9283
-bash: response-content-type=application%2Foctet-stream: command not found
-bash: X-Amz-Algorithm=AWS4-HMAC-SHA256: command not found
[10] 9284
[11] 9285
-bash: X-Amz-Date=20220323T140118Z: command not found
-bash: X-Amz-SignedHeaders=host: command not found
[12] 9286
-bash: X-Amz-Expires=172799: command not found
-bash: X-Amz-Credential=AKIAJQD4ZMI5SNVG2A2A%2F20220323%2Fus-east-1%2Fs3%2Faws4_request: command not found
--2022-03-23 17:41:00-- https://sb-datasets-us-east-1.s3-fips.us-east-1.amazonaws.com/cgl-sgdd-reorg/SGDP/REMAP_hs37d5/LP6005441-DNA_A01.annotated.nh.vcf.gz?x-username=enabieva
-bash: X-Amz-Signature=96f71ce17b3a206fce4fd77c116a7f81ce85e53de78467c016b61ebb4d6cf4c5: command not found
[3] Exit 127 x-env=cgc
[4] Exit 127 x-requestId=f1d09f8c-754d-4a6b-b2c8-d252cd81e390
[5] Exit 127 x-project=sevenbridges%2Fsimons-genome-diversity-project-sgdp
[6] Exit 127 response-content-disposition=attachment%3Bfilename%3DLP6005441-DNA_A01.annotated.nh.vcf.gz
[7] Exit 127 response-content-type=application%2Foctet-stream
[8] Exit 127 X-Amz-Algorithm=AWS4-HMAC-SHA256
[9] Exit 127 X-Amz-Date=20220323T140118Z
[10] Exit 127 X-Amz-SignedHeaders=host
[11] Exit 127 X-Amz-Expires=172799
[12]- Exit 127 X-Amz-Credential=AKIAJQD4ZMI5SNVG2A2A%2F20220323%2Fus-east-1%2Fs3%2Faws4_request
[[email protected] sgdp]$ Resolving sb-datasets-us-east-1.s3-fips.us-east-1.amazonaws.com... 52.217.94.241
Connecting to sb-datasets-us-east-1.s3-fips.us-east-1.amazonaws.com|52.217.94.241|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-03-23 17:41:01 ERROR 403: Forbidden.
[2]- Exit 8 wget https://sb-datasets-us-east-1.s3-fips.us-east-1.amazonaws.com/cgl-sgdd-reorg/SGDP/REMAP_hs37d5/LP6005441-DNA_A01.annotated.nh.vcf.gz?x-username=enabieva
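The `command not found` lines in this transcript suggest the URL was passed to wget unquoted. Bash treats each `&` as "run the preceding command in the background", so wget only requested the part of the link up to `x-username=enabieva`, without `X-Amz-Signature` and the other signing parameters, which would explain the 403. The sketch below illustrates this with a shortened, made-up URL (not the real link) and uses `shlex.quote` to build a safely quoted command; the `query_keys` helper is hypothetical.

```python
import shlex
from urllib.parse import parse_qs, urlsplit

# Shortened, hypothetical stand-in for a presigned download link.
url = (
    "https://example.com/data/sample.vcf.gz"
    "?x-username=alice&X-Amz-Expires=172799&X-Amz-Signature=abc123"
)

# An unquoted '&' ends the command in bash, so wget only receives
# the text before the first '&'.
seen_by_wget = url.split("&", 1)[0]


def query_keys(u):
    """Return the sorted query parameter names present in a URL."""
    return sorted(parse_qs(urlsplit(u).query))


print(query_keys(seen_by_wget))  # signing parameters are missing
print(query_keys(url))           # full set of parameters

# Quoting the URL keeps it as a single shell argument.
safe_command = "wget " + shlex.quote(url)
print(safe_command)
```

In practice this means wrapping the whole link in single quotes on the command line, and possibly adding wget's `-O` option to choose the output file name, since the default name would otherwise include the query string.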

Problem in ICGC Data Portal

For 3 days now, when I try to open the ICGC Data Portal I get the error: "This site can’t be reached". Does anyone know what the problem is? Is there another way to access the portal? Thanks