VarScan2 workflow from BAM producing too few somatic mutation calls?

Posted in TCGA data on the CGC by Stella Lee Thu Jan 30 2020 00:58:55 GMT+0000 (Coordinated Universal Time)

Hi! I have recently been using the VarScan2 workflow from BAM to do somatic mutation calling on TCGA GRCh38 BAM files. However, the output high-confidence VCF file is only a few kB in size. For one of the patients I was looking at, TCGA-AR-A1AO, around 6000 mutations are called in the MuTect VCF, but my output has only 300. I didn't change any parameters. I wonder whether the problem is with the input files, but I was just using the tumor-normal BAMs from TCGA.

Marko Marinkovic
Jan 30, 2020

Hi Stella,

Here's a suggestion from our Bioinformatics team:

It could be that the number of variants you are seeing in the MuTect output is the count of all variants, both those that passed and those that failed the filtering step. The VarScan2 output you are using is the high-confidence output, which contains only the higher-confidence variants, reducing recall but increasing precision. If you want more relaxed filtering, you can either set the desired filtering values on VarScan2 ProcessSomatic SNPs and/or VarScan2 ProcessSomatic INDELs, or pull out the “Somatic variants” inputs from those tools, go through the same steps (VCFtools concat -> VCFtools sort -> SBG VCF Reorder), and get an output of raw somatic variants.
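One quick way to check whether a VCF mixes passed and failed calls is to tally its records by the FILTER column. The following is only a minimal sketch in plain Python, assuming a standard tab-delimited VCF (optionally gzipped); the function name `count_by_filter` and the file name in the usage comment are hypothetical and not part of any tool in this workflow.

```python
import gzip
from collections import Counter


def count_by_filter(vcf_path):
    """Count VCF records per value of the FILTER column (7th column)."""
    # Assumption: files ending in .gz are gzip/bgzip compressed.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 7:
                continue  # skip malformed or empty lines
            counts[fields[6]] += 1
    return counts


# Hypothetical usage:
# counts = count_by_filter("mutect_calls.vcf.gz")
# print(sum(counts.values()), "records in total,", counts["PASS"], "with FILTER=PASS")
```

If the total is much larger than the `PASS` count, the file is likely an unfiltered call set and should not be compared directly against a filtered one.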

Please let us know if you need anything else.

Thanks,
Marko

Stella Lee
Jan 30, 2020

Hi Marko,

Thank you for your reply!

I actually pulled out the raw somatic variant call file (not only the high-confidence ones) and found that it still has only around 600 SNVs called.

At the same time, the VarScan VCF produced by GDC has around 3000 calls, which I think is a reasonable number. I also compared the parameters with those on the GDC website and they are all the same. Now I just don't know how to troubleshoot any further...
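For a discrepancy like this, one possible troubleshooting step is to compare the two call sets site by site rather than by total count. The sketch below assumes plain or gzipped tab-delimited VCFs and matches calls on (CHROM, POS, REF, ALT); the names `load_sites` and `compare_call_sets` and the file names in the usage comment are hypothetical.

```python
import gzip


def load_sites(vcf_path, pass_only=False):
    """Return the set of (chrom, pos, ref, alt) keys found in a VCF file."""
    # Assumption: files ending in .gz are gzip/bgzip compressed.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    sites = set()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 7:
                continue  # skip malformed or empty lines
            if pass_only and fields[6] != "PASS":
                continue  # optionally keep only records that passed filtering
            chrom, pos, ref = fields[0], int(fields[1]), fields[3]
            for alt in fields[4].split(","):
                sites.add((chrom, pos, ref, alt))  # one key per ALT allele
    return sites


def compare_call_sets(path_a, path_b, pass_only=False):
    """Summarize how many sites two VCFs share and how many are unique to each."""
    a = load_sites(path_a, pass_only)
    b = load_sites(path_b, pass_only)
    return {"shared": len(a & b), "only_a": len(a - b), "only_b": len(b - a)}


# Hypothetical usage:
# print(compare_call_sets("my_varscan.vcf", "gdc_varscan.vcf.gz", pass_only=True))
```

Running the comparison with and without `pass_only=True` shows whether the gap comes from filtering or from genuinely different calls.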

Thank you!

Yiyun

Stella Lee
Jan 30, 2020

Hi Marko,

I have figured out the problem: the MuTect output is unfiltered. Thank you so much for your help!

Best,
Stella

Marko Marinkovic
Jan 30, 2020

No problem, feel free to ask if you need anything else.

Best,
Marko

  