
Upload via the command line


Large datasets hosted on corporate or academic clusters and workstations can be uploaded to the CGC using the command line uploader. This fast and secure upload client is optimized for transferring files to the CGC and uploads in parallel where possible.

##Prerequisites

The Command Line Uploader requires Java 1.7 or newer.

To check which version of Java you are running:
1. Issue the following command in the terminal: `$ java -version`.
2. Look for the version number in the first line of the output. It should look something like this:

`java version "1.8.0_20"`
`Java(TM) SE Runtime Environment (build 1.8.0_20-b26)`
`Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)`

If you need to install or update Java, see the [download Java](https://java.com/en/download/help/download_options.xml) help page.

##Install the Command Line Uploader

1. Download the [Command Line Uploader](https://cgc.sbgenomics.com/cgc-uploader/cgc-uploader.tgz).
2. Unpack the uploader to a directory of your choice. Your home directory is a good default location. To do this, enter:
[block:code]
{
  "codes": [
    {
      "code": "$ tar zxvf ~/Downloads/cgc-uploader.tgz -C ~",
      "language": "text",
      "name": "Unpack:"
    }
  ]
}
[/block]
3. Run the uploader with the `-h` switch to list all the available command line options (described below) and their usage:
[block:code]
{
  "codes": [
    {
      "code": "$ ~/cgc-uploader/bin/cgc-uploader.sh -h",
      "language": "text",
      "name": "Run:"
    }
  ]
}
[/block]
##Command line options
[block:code]
{
  "codes": [
    {
      "code": "cgc-uploader.sh [-h] [-l] [-p id] [-t token] [-u username] [-x url] file [--tag \"enter tag here\"]",
      "language": "text",
      "name": "Syntax"
    }
  ]
}
[/block]

[block:parameters]
{
  "data": {
    "h-0": "Option",
    "h-1": "Description",
    "0-0": "-h\n--help",
    "0-1": "This option prints a short usage summary and causes the uploader to ignore any other options. The uploader then exits with a status of 0 (see Exit statuses below).",
    "1-0": "-l\n--list-projects",
    "1-1": "This option lists the projects available as upload targets in two columns:\n\n* **The project identifier**, which is used for specifying the target project (e.g. `rfranklin/samtools`, where \"rfranklin\" is the project owner and \"samtools\" is the name of the project).\n* **The project name** (e.g. \"my new project\").",
    "2-0": "-p <id>\n--project <id>",
    "2-1": "This option selects the target project by its unique identifier, e.g. `username/projectname`.\n\nUse the `--list-projects` option to find the project identifier for your project.\n\nTo upload files to a project, you must be a member of that project and must have been granted write permission by the project administrator.\n\nThis option is mandatory.",
    "3-0": "-t <token>\n--token <token>",
    "3-1": "This option specifies an authentication token. It overrides the credentials read from the configuration file (see below).",
    "4-0": "-u <username>\n--user <username>",
    "4-1": "This option specifies a username for the CGC. If this option is omitted and `--token` is not used, you will be prompted for a username.\n*This option can be used only by users who haven't connected their CGC account with an eRA account.*",
    "5-0": "-x <url>\n--proxy <url>",
    "5-1": "This option specifies a proxy server through which the uploader should connect. The proxy parameter should be of the form `proto://[username:password@]host[:port]`, where `proto` is either `http` or `socks`. HTTP proxies must allow the CONNECT command to port 443. SOCKS proxies can be either SOCKS4 or SOCKS5.\n  * `username` and `password` are optional and are used if the proxy requires authentication.\n  * `host` is required.\n  * `port` is optional. If omitted, the uploader uses 8080 for HTTP and 1080 for SOCKS proxies.",
    "6-0": "--tag",
    "6-1": "Use this option to tag the files you upload. Each tag is a string; repeat the option to add several tags: `--tag \"first tag here\" --tag \"second tag here\" --tag \"third tag here\"`",
    "7-0": "--list-tags",
    "7-1": "Use this option to list all the tags in your destination project.",
    "8-0": "-mm\n--manifest-metadata",
    "8-1": "Use this option to upload multiple files and set their metadata using the [manifest file](doc:set-metadata-using-the-command-line-uploader#section-set-metadata-for-multiple-files-using-a-manifest-file).\n\nTo apply only individual metadata fields from the manifest, list them after the `--manifest-metadata` option, e.g. `--manifest-metadata sample paired_end`.\n\nThis option should be used in combination with the `--manifest-file` option.",
    "9-0": "-mf\n--manifest-file",
    "9-1": "Use this option to specify the name of the manifest file. This option should be used in combination with `--manifest-metadata`, e.g. `--manifest-metadata --manifest-file filename.csv`.\n\nTo upload the files without setting their metadata, use `--manifest-file filename.csv` on its own.",
    "10-0": "--dry-run",
    "10-1": "Use this option to print information in the terminal and check the settings without uploading anything.\n\nTo output information about specific metadata fields, list them after the `--dry-run` option, e.g. `--dry-run sample library`."
  },
  "cols": 2,
  "rows": 11
}
[/block]

[block:callout]
{
  "type": "info",
  "title": "Tagging files",
  "body": "You can also tag files as you upload them [via the CGC uploader](upload-via-the-cgc-uploader) or [from an FTP or HTTP(S) server](upload-from-an-ftp-server).\n\nLearn more about [tagging your files](tag-your-files) on the Platform and how tags help you organize your data."
}
[/block]

[block:callout]
{
  "type": "success",
  "body": "Only the following characters are allowed in file names:\n* alphanumeric characters (lowercase and uppercase letters of the English alphabet and numbers 0 - 9),\n* underscore (`_`),\n* dash (`-`),\n* dot (`.`)."
}
[/block]
##Authentication
[block:callout]
{
  "type": "success",
  "body": "You can obtain an authentication token for your CGC account from the Developer Dashboard.",
  "title": "Getting the authentication token"
}
[/block]
The CGC command line uploader looks for credentials in the following locations, in order:

1. If the `-u username` option is given, the uploader prompts for and reads the password from standard input.
2. If the `-t token` option is given, the uploader uses your authentication token.
3. The uploader looks for an authentication token in the configuration file (see below).
4. The uploader looks for the username and password in the configuration file (see below).
5. The uploader prompts for and reads the username and password from standard input.

##Configuration files
To avoid providing your credentials each time you use the uploader, you can store them in a configuration file.
This file is called `.cgcrc` and resides in your home directory. Its location varies across operating systems, but is typically:

* `/home/$USER/.cgcrc` on UNIX;
* `/Users/$USER/.cgcrc` on OS X;
* `C:\Documents and Settings\%USERNAME%\.cgcrc` on Windows XP, 2000 and 2003; and
* `C:\Users\%USERNAME%\.cgcrc` on Windows Vista, 7, 8 and 10.

The `.cgcrc` configuration file should contain key-value pairs of the following form:
`username = johndoe`
`password = supersecret123`
`auth-token = ec43d6dce3c54193ac18e3855f734ccf`

[block:callout]
{
  "type": "info",
  "body": "you can only <a href=\"https://gds.nih.gov/pdf/Trusted_Partner_Checklist.pdf\" target=\"blank\">specify the auth-token key-value pair</a>.\n\nIf you haven't connected your CGC account with an eRA Commons username, you can use either the auth-token key or the username and password keys in the .cgcrc file.",
  "title": "If you log in to the CGC using your eRA Commons credentials..."
}
[/block]
You can specify the username and password, the authentication token, or both. If both are given, the authentication token takes precedence. The uploader uses these values only if no other authentication options are provided on the command line.
[block:callout]
{
  "type": "info",
  "body": "Please keep in mind that the `.cgcrc` configuration file may contain only a single set of credentials. If multiple \"username\", \"password\" or \"auth-token\" lines are encountered, the uploader will disregard all values but the last."
}
[/block]
##Metadata
Files on the CGC are accompanied by metadata describing, amongst other things, their file type, origin, sample ID and the sequencing technology used to create them. This metadata is often required by tools and workflows, and must be set before a file becomes fully usable. You can [use the command line uploader](doc:set-metadata-using-the-command-line-uploader) to set some or all metadata during the upload, or [set it manually](doc:set-metadata-using-the-visual-interface) later.
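Returning briefly to the configuration file rules described under Configuration files: the following is a minimal shell sketch of how a `.cgcrc` file could be inspected before running the uploader. It is not the uploader's own parser. The function names `cgcrc_value` and `cgcrc_auth_method` are hypothetical, and the whitespace handling around `=` is an assumption based only on the `key = value` examples above. The sketch applies the two documented rules: when a key is repeated, only the last value counts, and an authentication token takes precedence over a username and password.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a .cgcrc-style file.
# Assumption: each setting is a single "key = value" line, as in the examples.

# Print the value of KEY from FILE (default: ~/.cgcrc).
# If KEY occurs more than once, only the last occurrence is printed.
# Returns non-zero if the file is unreadable or the key is absent.
cgcrc_value() {
  local key="$1" file="${2:-$HOME/.cgcrc}"
  [ -r "$file" ] || return 1
  awk -v key="$key" '
    index($0, "=") > 0 {
      k = substr($0, 1, index($0, "=") - 1)
      v = substr($0, index($0, "=") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around the key
      gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around the value
      if (k == key) { last = v; found = 1 }
    }
    END { if (!found) exit 1; print last }
  ' "$file"
}

# Report which credentials the file would supply:
#   "token"    if auth-token is present (it takes precedence),
#   "password" if both username and password are present,
#   "prompt"   if neither is available.
cgcrc_auth_method() {
  local file="${1:-$HOME/.cgcrc}"
  if cgcrc_value auth-token "$file" >/dev/null; then
    echo "token"
  elif cgcrc_value username "$file" >/dev/null &&
       cgcrc_value password "$file" >/dev/null; then
    echo "password"
  else
    echo "prompt"
  fi
}
```

For example, `cgcrc_auth_method` with no arguments inspects `~/.cgcrc` and prints `token`, `password` or `prompt`. Remember that options given on the command line still override whatever the file contains.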
##Exit statuses
[block:parameters]
{
  "data": {
    "h-0": "Code",
    "h-1": "Status",
    "0-0": "0",
    "0-1": "Normal termination. The upload has either finished successfully, or usage information was written to standard output.",
    "1-0": "100",
    "1-1": "The upload has failed in the pre-processing phase or the uploader was unable to initialize it properly.",
    "2-0": "101",
    "2-1": "Input arguments were not properly set.",
    "3-0": "102",
    "3-1": "Mandatory options were not set.",
    "4-0": "103",
    "4-1": "Authentication error; invalid user credentials were used.",
    "5-0": "104",
    "5-1": "Bad metadata file.",
    "6-0": "200",
    "6-1": "Abnormal termination; an unknown error caused the upload to fail."
  },
  "cols": 2,
  "rows": 7
}
[/block]
##Example
Suppose you want to use the command line uploader to upload the FASTQ file `sample1.fastq`, which has the associated metadata file `sample1.fastq.meta`, to a project whose ID is 1234. Then, you should enter:
[block:code]
{
  "codes": [
    {
      "code": "cgc-uploader$ bin/cgc-uploader.sh -t $AUTH_TOKEN -p 1234 sample1.fastq --tag \"fastq\" --tag \"sample 1\"",
      "language": "text",
      "name": "Example:"
    }
  ]
}
[/block]
replacing `$AUTH_TOKEN` with your own authentication token.

[block:callout]
{
  "type": "success",
  "body": "As shown in the example above, don't forget to change directory to the one containing the cgc-uploader, and to prefix the executable name with `bin/`."
}
[/block]
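When the uploader is called from a script, the exit statuses listed above can be turned into readable messages. The following is a minimal sketch under stated assumptions, not part of the uploader itself: the function names `describe_exit_status` and `upload_and_report` and the `UPLOADER` variable are hypothetical, and the default path assumes the uploader was unpacked into your home directory as described in the installation steps.

```shell
#!/usr/bin/env bash
# Map a cgc-uploader exit status to a short description
# (wording condensed from the Exit statuses table).
describe_exit_status() {
  case "$1" in
    0)   echo "Normal termination" ;;
    100) echo "Upload failed in pre-processing or could not be initialized" ;;
    101) echo "Input arguments were not properly set" ;;
    102) echo "Mandatory options were not set" ;;
    103) echo "Authentication error" ;;
    104) echo "Bad metadata file" ;;
    200) echo "Abnormal termination (unknown error)" ;;
    *)   echo "Unrecognized exit status: $1" ;;
  esac
}

# Hypothetical wrapper: run the uploader with the given arguments,
# report the outcome on standard error, and return the uploader's status.
# UPLOADER is an assumed override; the default matches the install path above.
upload_and_report() {
  local uploader="${UPLOADER:-$HOME/cgc-uploader/bin/cgc-uploader.sh}"
  local status=0
  "$uploader" "$@" || status=$?
  echo "cgc-uploader: $(describe_exit_status "$status")" >&2
  return "$status"
}
```

With these functions sourced into your shell, the example above could be run as `upload_and_report -t "$AUTH_TOKEN" -p 1234 sample1.fastq --tag "fastq"`, and a calling script can branch on the returned status, for instance to retry only when the status is 100 or 200.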