TCGA BAM file size inconsistency with GDC?

Posted in TCGA data on the CGC by Yiyun Rao Fri Feb 07 2020 21:47:12 GMT+0000 (Coordinated Universal Time)·Viewed 18 times

Hi,

I'm doing somatic mutation calling for TCGA patient TCGA-AR-A1AO, using the BAM files from the Data Browser:

- TCGA-AR-A1AO-10A-01D-A12Q-09_IlluminaGA-DNASeq_exome_gdc_realn.bam, uuid: 30f1d9e3-e6a5-44b6-846c-1497806d301c, size: 27.03GB
- TCGA-AR-A1AO-01A-01D-A12Q-09_IlluminaGA-DNASeq_exome_gdc_realn.bam, uuid: 33eeb804-ca8b-491e-8221-a285743be692, size: 25.53GB

However, on the GDC portal these files are listed as 29.02GB and 27.41GB, respectively. Since the file sizes differ, I wonder whether the files on the CGC are really up to date. My somatic mutation calling results with VarScan2 are also missing variants compared to the GDC results, even with the same parameters and inputs.

This is confusing, and I am troubleshooting it right now. Would you please help me with this?

Thanks!

Best,
Stella
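One possibility that may be worth ruling out first is a difference in units rather than in content. The sketch below assumes (this is not confirmed anywhere in this thread) that the CGC sizes are reported in binary gigabytes (GiB, 2^30 bytes) while the GDC portal reports decimal gigabytes (GB, 10^9 bytes). The helper name `gib_to_gb` is hypothetical and only used for illustration.

```python
# Assumption: CGC shows sizes in GiB (2**30 bytes), GDC shows GB (10**9 bytes).
GIB = 2**30
GB = 10**9

def gib_to_gb(size_gib):
    """Convert a size in binary gigabytes (GiB) to decimal gigabytes (GB)."""
    return size_gib * GIB / GB

# Sizes quoted in the post: (size shown on CGC, size shown on GDC portal).
files = {
    "TCGA-AR-A1AO-10A (normal)": (27.03, 29.02),
    "TCGA-AR-A1AO-01A (tumor)": (25.53, 27.41),
}

for name, (cgc_size, gdc_size) in files.items():
    converted = round(gib_to_gb(cgc_size), 2)
    print(f"{name}: CGC {cgc_size} -> {converted} GB, GDC lists {gdc_size} GB")
```

Under that assumption both converted values match the GDC numbers to two decimals, which suggests the files may be the same size in bytes. It would not explain the missing variants, so comparing the md5 checksums listed by the GDC against the CGC copies would be a more direct check of file identity.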
  