Modify Read-only file in terminal

Posted in TCGA data on the CGC by Yiyun Rao Tue Feb 04 2020 22:55:06 GMT+0000 (Coordinated Universal Time)·1·Viewed 70 times

Hi! I am analyzing the TCGA MAF files using the terminal in Data Cruncher. Some MAF files are in .gz format and I have to unzip them, but the file system is read-only. Could you please help me find a solution? Thank you! Best, Yiyun
Feb 18, 2020

Hi Yiyun,

Here's a suggestion from our Data Cruncher engineering team:

The folder in which the project files are mounted is read-only to prevent unwanted changes to the files located on the platform, so you cannot create or modify files in that folder.

The exact steps depend on how you decompress (in the terminal or through Python), but every option lets you specify the output directory for the decompressed file, so you can write it to a writable location such as /sbgenomics/workspace. In the terminal, for example, this can be done through piping:

cat /sbgenomics/project-files/<filename>.ext.gz | gzip -d > /sbgenomics/workspace/<filename>.ext

Or in Python, using the gzip module. Copying with shutil.copyfileobj streams the data in chunks, so large files can be decompressed without running out of memory:

import gzip
import shutil
# Read the compressed file with gzip.open; write the decompressed bytes with a plain open.
with gzip.open('/sbgenomics/project-files/<filename>.ext.gz', 'rb') as f_in, open('/sbgenomics/workspace/<filename>.ext', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)
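
If you have several .gz files to unpack, the same gzip and shutil approach can be wrapped in a loop. This is only a minimal sketch: the function name `decompress_all` is hypothetical, and it assumes the compressed files sit directly in the source folder (not in subfolders) and that the destination folder is writable.

```python
import gzip
import shutil
from pathlib import Path


def decompress_all(src_dir, dst_dir):
    """Decompress every *.gz file found directly in src_dir into dst_dir.

    Hypothetical helper for illustration; returns the list of written paths.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    # Create the output folder if it does not exist yet.
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for gz_path in sorted(src_dir.glob("*.gz")):
        # Path.stem drops only the last suffix, so "x.maf.gz" becomes "x.maf".
        out_path = dst_dir / gz_path.stem
        # gzip.open reads the compressed data; the plain open writes raw bytes.
        with gzip.open(gz_path, "rb") as f_in, open(out_path, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)  # streams in chunks
        written.append(out_path)
    return written
```

Calling `decompress_all('/sbgenomics/project-files', '/sbgenomics/workspace')` should then leave the decompressed copies in the workspace folder while the read-only originals stay untouched.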

Hope this helps.

  