
Specifying a bowtie index on the command line

I wanted to post my question and solution here in case anyone searches for it in the future. I was trying to wrap a script that requires a path and prefix for the bowtie index files. I was submitting them as an input array of files, but the script was unable to find them.

What I didn't realize is that the files were in the Projects directory and the script was in the working directory, so in the Stage Input section of Edit Input Property, I had to choose Link, which creates symlinks in the working directory. I didn't immediately realize this because the script could find my fastq input file without any staging.
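For anyone unsure what the "prefix" should look like once the index files are linked into the working directory, here is a minimal sketch. It assumes the usual bowtie 1 naming (prefix.1.ebwt, prefix.rev.1.ebwt and so on); the function name bowtiePrefix and the example path are hypothetical, not something from my tool.

```javascript
// Hypothetical helper: derive the bowtie index prefix from one index file path.
// Assumes bowtie 1 naming such as <prefix>.1.ebwt or <prefix>.rev.2.ebwt
// (large indexes end in .ebwtl).
function bowtiePrefix(path) {
  // Keep only the file name; linked files sit in the working directory,
  // so the prefix passed to the script needs no directory part.
  var name = path.split('/').slice(-1)[0];
  // Drop the optional ".rev", the index number, and the extension.
  return name.replace(/(\.rev)?\.\d+\.ebwtl?$/, '');
}

// Made-up example path, for illustration only.
var prefix = bowtiePrefix('/Projects/demo/ucsc_hg19.rev.1.ebwt');
```

In an expression on the platform, the argument would presumably be the path of one element of the input file array rather than a literal string.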

I've since noticed that the STAR aligner tool uses a tarball of the index files, and Darko also adds:

"As an extra note, an alternative to the approach you're currently using would be to create a single .tar file containing all the output files of the bowtie-build run. For example, when invoking the Bowtie indexer, use a command like:

bowtie-build ucsc_hg19.fasta ucsc_hg19 && tar -cf ucsc_hg19.tar *.ebwt

You would then use the resulting .tar file as a single reference bundle, replacing the file array input port with a single file input. You would have to stage it and add an extra base command (before the actual call to mapper.pl) containing a JavaScript expression that returns a string with the reference bundle file name. For example (assuming you're using 'reference' as the ID of the respective input port):

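{
  // Take the file name (last path component) of the staged reference bundle
  var reference_bundle = $job.inputs.reference.path.split('/').slice(-1)[0];
  // Emit the extraction command that runs before mapper.pl
  return 'tar -xf ' + reference_bundle + ' ; ';
}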

Finally, you would modify the JavaScript expression generating the value of the -p argument of mapper.pl to get the basename of the .tar bundle (which must be the same as that of the Bowtie index files)."
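That last step could look something like the sketch below. This is only my guess at it, assuming the bundle is named after the index prefix (ucsc_hg19.tar for ucsc_hg19.*.ebwt) and the input port ID is 'reference' as in Darko's example; bundlePrefix is a hypothetical helper name.

```javascript
// Hypothetical helper: turn the bundle path into the index prefix for -p.
// Assumes the .tar file shares its basename with the bowtie index files.
function bundlePrefix(path) {
  // Last path component, e.g. "ucsc_hg19.tar"
  var name = path.split('/').slice(-1)[0];
  // Strip the trailing ".tar" to leave the index prefix
  return name.replace(/\.tar$/, '');
}

// Inside the tool's expression for the -p argument, the body would
// presumably be something like:
//   return bundlePrefix($job.inputs.reference.path);
```

With the command from the quote, a bundle at any path ending in ucsc_hg19.tar gives ucsc_hg19, which matches the prefix bowtie-build was given.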