AWS Cloud storage tutorial

[block:callout]
{
  "type": "warning",
  "title": "On this page:",
  "body": "* [Overview](#overview)\n* [Procedure](#procedure)\n* [Prerequisites](#prerequisites)\n* [Step 1: Add an S3 bucket as a volume](#register)\n * [1a: Create an IAM (Identity and Access Management) user](#create)\n * [1b: Create access keys for the IAM user](#authorize)\n * [1c: Attach your volume to the CGC](#attach)\n* [Step 2: Make an object from the bucket available on the CGC](#import)\n * [2a: Launch an import job](#launch-import)\n * [2b: Check if the import job has completed](#check-import)\n* [Step 3: Move a file from the CGC to the bucket](#move-file)\n * [3a: Upload a file to a project](#upload-file)\n * [3b: Move a file from your project on the CGC to the bucket](#move-file-from-project)\n * [3c: Check if the export job has completed](#check-export)"
}
[/block]
<a name="overview"></a>
##Overview

The Volumes API contains two types of calls: one to connect and manage cloud storage, and the other to import and export data to and from a connected cloud account.

Before you can start working with your cloud storage via the CGC, you need to authorize the CGC to access and query objects on that cloud storage on your behalf. This is done by creating a "volume". A volume enables you to treat the cloud repository associated with it as external storage for the CGC. You can 'import' files from the volume to the CGC to use them as inputs for computation. Similarly, you can write files from the CGC to your cloud storage by 'exporting' them to your volume. Learn more about [working with volumes](volumes).

The CGC uses Amazon Web Services as a cloud infrastructure provider. This affects the cloud storage you can access and associate with your CGC account. For instance, you have full read-write access to your data stored in Amazon Web Services' S3 and read-only access to data stored in Google Cloud Storage.

<a name="procedure"></a>
##Procedure

This short tutorial will guide you through setting up a volume. You'll connect your Amazon S3 bucket as a volume, make an object from the bucket available on the CGC, then move a file from the CGC to the bucket.

Once a volume is created, you can issue import and export operations to make data appear on the CGC or to move your CGC files to the underlying cloud storage provider.
[block:callout]
{
  "type": "info",
  "body": "In this tutorial we assume you want to connect to an Amazon S3 bucket. The procedure will be slightly different for other cloud storage providers, such as a [Google Cloud Storage bucket](google-cloud-storage-tutorial). For more information, please refer to our list of [supported cloud storage providers](supported-cloud-storage-providers)."
}
[/block]
<a name="prerequisites"></a>

##Prerequisites

To complete this tutorial, you will need:
1. An <a href="https://aws.amazon.com/" target="blank">Amazon Web Services (AWS)</a> account
2. One or more buckets on this AWS account
3. One or more objects (files) in your target bucket
4. An authentication token for the CGC. Learn more about [getting your authentication token](get-your-authentication-token).

<a name="register"></a>
##Step 1: Add an S3 bucket as a volume

To set up a volume, first register an AWS S3 bucket as a volume. Volumes mediate access between the CGC and your <a href="http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html" target="blank">buckets</a>, which are local units of storage in AWS.

Register an AWS S3 bucket as a volume by following the steps below.
[block:callout]
{
  "type": "success",
  "body": "*(Optional)* You can also provide your [KMS ID](https://aws.amazon.com/documentation/kms/) if you opt to use KMS for your encryption."
}
[/block]
<a name="create"></a>
###1a: Create an IAM (Identity and Access Management) user

Follow AWS documentation for [directions on creating an IAM user](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_console).

<a name="authorize"></a>
###1b: Create access keys for the IAM user

1. In the list of IAM users, locate the IAM user you created above. Click the username to configure your options.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/2afd959-cgc-aws-cloud-storage-tutorial-1.png",
        "cgc-aws-cloud-storage-tutorial-1.png",
        1440,
        442,
        "#e3e1e2"
      ]
    }
  ]
}
[/block]
2. Click the **Security credentials** tab.
3. In the **Access keys** section, click **Create access key**. You get two keys: the **Access key ID** and the **Secret access key**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/f10477c-create-access-key.png",
        "create-access-key.png",
        1440,
        682,
        "#2d3442"
      ]
    }
  ]
}
[/block]
4. Copy the credentials for later use. Be sure to keep your credentials somewhere safe. You can also click **Download .csv file** to obtain them in a file named **accessKeys.csv**.

<a name="attach"></a>
### 1c: Attach your volume to the CGC

1. From the main menu bar on the CGC, select **Data** > **Volumes**.
2. Click **Attach volume**. If you already have attached volumes, click **Connect Storage** in the top-right corner.
3. Select **amazon web services**.
4. Enter the **Access key ID** and **Secret access key** you obtained in section [1b](#authorize) above.
5. Click **Next**.
6. In the **Bucket name** field, enter the name of the S3 bucket you wish to connect. **Volume name** is the display name of the volume on the CGC and will be generated automatically.
7. (Optional) Enter a volume description.
8. Set access privileges for the volume. Available options are:
    * **Read only (RO)** - You will be able to read files, but won't be able to add them to the volume.
    * **Read and Write (RW)** - You will be able to read files and also add files to the volume.
9. Click **Next**. You are now taken to the generated policy.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/fd5c7a4-policy-generator.png",
        "policy-generator.png",
        636,
        572,
        "#e1ecec"
      ]
    }
  ]
}
[/block]
10. Copy the content of the box.
11. In the list of IAM users in the AWS Management Console, locate the IAM user you created in section [1a](#create). Click the username to configure your options.
12. On the **Permissions** tab, click **Add inline policy**, as shown below.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/adb58b2-cgc-aws-cloud-storage-tutorial-2.png",
        "cgc-aws-cloud-storage-tutorial-2.png",
        1440,
        682,
        "#293548"
      ]
    }
  ]
}
[/block]
13. Select the **JSON** tab and replace the existing content by pasting the code you copied in step 10.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/c335626-cgc-aws-cloud-storage-tutorial-3a.png",
        "cgc-aws-cloud-storage-tutorial-3a.png",
        1440,
        682,
        "#272f3e"
      ]
    }
  ]
}
[/block]
14. Click **Review policy**.
15. Enter a descriptive policy name, e.g. **sb-access-policy**. Note that you can only use alphanumerics and the following characters: **+=,.@-_**.
16. Click **Create policy**.
17. Go back to the CGC and click **Next** in the wizard.
18. In the **Endpoint** field, enter **s3.amazonaws.com**. Leave the default values for other settings.
19. Click **Next**. You can now review your volume connection settings.
20. Finally, click **Connect**. Your volume should now be connected to the CGC and visible in the list of volumes.

<a name="import"></a>
##Step 2: Make an object from the bucket available on the CGC

Now that we have a volume, we can make data objects from the bucket associated with the volume available as "aliases" on the CGC. Aliases point to files stored on your cloud storage bucket and can be copied, executed, and organized like normal files on the CGC. We call this operation "importing". Learn more about working with [aliases](doc:aliases).

To import a data object from your volume as an alias on the CGC, follow the steps below.

<a name="launch-import"></a>
###2a: Launch an import job

To import a file, make the API request to [start an import job](start-an-import-job-v2) as shown below. In the body of the request, include the key-value pairs in the table below.
[block:parameters]
{
  "data": {
    "h-0": "Key",
    "h-1": "Description of value",
    "0-0": "`volume_id`\n_required_",
    "0-1": "Volume ID from which to import the file. This consists of your username followed by the volume's name, such as `rfranklin/sb-volume-demo`.",
    "1-0": "`location`\n_required_",
    "1-1": "Volume-specific location pointing to the file to import. This location should be recognizable to the underlying cloud service as a valid key or path to the file.\n\nPlease note that if this volume was configured with a `prefix` parameter when it was created, the `prefix` will be prepended to `location` before attempting to locate the file on the volume.",
    "2-0": "`destination`\n_required_",
    "2-1": "This object should describe the CGC destination for the imported file.",
    "3-0": "`project`\n_required_",
    "3-1": "The project in which to create the alias. This consists of your username followed by your project's short name, such as `rfranklin/my-project`.",
    "4-0": "`name`",
    "4-1": "The name of the alias to create. This name should be unique to the project. If the name is already in use in the project, you should use the `overwrite` parameter in this call to force any file with that name to be deleted before the alias is created.\n\nIf `name` is omitted, the alias name will default to the last segment of the complete location (including the `prefix`) on the volume. Segments are considered to be separated by forward slashes ('/').",
    "5-0": "`overwrite`",
    "5-1": "Specify as `true` to overwrite the file if a file with the same name already exists in the destination."
  },
  "cols": 2,
  "rows": 6
}
[/block]

[block:code]
{
  "codes": [
    {
      "code": "POST /v2/storage/imports HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json",
      "language": "http",
      "name": "Start an import job"
    }
  ]
}
[/block]

[block:code]
{
  "codes": [
    {
      "code": "{\n  \"source\": {\n    \"volume\": \"rfranklin/sb-volume-demo\",\n    \"location\": \"example_human_Illumina.pe_1.fastq\"\n  },\n  \"destination\": {\n    \"project\": \"rfranklin/my-project\",\n    \"name\": \"my_imported_example_human_Illumina.pe_1.fastq\"\n  },\n  \"overwrite\": true\n}",
      "language": "json",
      "name": "Start an import job request body"
    }
  ]
}
[/block]
The returned response details the status of your import, as shown below.
[block:code]
{
  "codes": [
    {
      "code": "{\n  \"href\": \"https://cgc-api.sbgenomics.com/v2/storage/imports/SbBQeWpfZ445jxgIBioYfnJ5oBl86nFN\",\n  \"id\": \"SbBQeWpfZ445jxgIBioYfnJ5oBl86nFN\",\n  \"state\": \"PENDING\",\n  \"overwrite\": true,\n  \"source\": {\n    \"volume\": \"rfranklin/sb-volume-demo\",\n    \"location\": \"example_human_Illumina.pe_1.fastq\"\n  },\n  \"destination\": {\n    \"project\": \"rfranklin/my-project\",\n    \"name\": \"my_imported_example_human_Illumina.pe_1.fastq\"\n  }\n}",
      "language": "json",
      "name": "Response body"
    }
  ]
}
[/block]
Locate the `id` property in the response and copy this value to your clipboard. This `id` is an identifier for the import job, and we will need it in the following step.

<a name="check-import"></a>
###2b: Check if the import job has completed

To check if the import job has completed, make the API request to [get details of an import job](get-details-of-an-import-job-v2), as shown below. Simply append the import job `id` obtained in the step above to the path.
[block:code]
{
  "codes": [
    {
      "code": "GET /v2/storage/imports/SbBQeWpfZ445jxgIBioYfnJ5oBl86nFN HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json",
      "language": "http",
      "name": "Get details of an import job"
    }
  ]
}
[/block]
The returned response details the `state` of your import. If the `state` is `COMPLETED`, your import has successfully finished. If the `state` is `PENDING`, wait a few seconds and repeat this step.

You should now have a freshly created alias in your project. To verify that a file has been imported, visit this project in your browser and look for a file with the same name as the key of the object in your bucket.

<a name="move-file"></a>
##Step 3: Move a file from the CGC to the bucket

You've successfully created an alias on the CGC for a file in your S3 bucket. You can also move files from the CGC into your connected S3 bucket. This operation is known as 'exporting' to the volume associated with the bucket. Please keep in mind that public files, files belonging to CGC-hosted datasets, archived files, and aliases cannot be exported. For more information, please see [working with aliases](aliases).

Follow the steps below to move a file from the CGC to an object in your bucket.

<a name="upload-file"></a>
###3a: Upload a file to a project

Before you can export a file from the CGC, you must upload a file to a project. To upload a file, follow the steps below:

1. Upload a file to your project using the [command line uploader](upload-via-the-command-line), the [CGC Uploader](upload-via-the-seven-bridges-uploader), an [FTP or HTTP(S) server](upload-from-an-ftp-server), or the [API](upload-files).
2. Locate and copy the file ID. Depending on the upload mechanism, you can find the file ID as follows:
  * **Command line uploader** - In the output of the command line uploader, note the first column in the line that corresponds to the uploaded file. This is the uploaded file's ID.
  * **CGC Uploader** - Once the file has uploaded, locate the file in the Files tab of the relevant project. Click the file's name. A new page with details about your file should open. Locate the last segment of this page's URL, following `/files/`. This is the uploaded file's ID. For example, the file ID of `https://cgc.sbgenomics.com/u/rfranklin/volumes-api-project/files/577d4c35e4b05e75806f2853/` is `577d4c35e4b05e75806f2853`.
  * **FTP or HTTP(S) server** - Locate the file's ID in the same way as for files uploaded using the CGC Uploader.
  * **API** - Issue the API request to [list all files within a project](list-files-in-a-project). The ID of each file is listed next to the key `id` in the response body.

<a name="move-file-from-project"></a>
###3b: Move a file from your project on the CGC to the bucket

When you export a file from the CGC to your volume, you are writing to your S3 bucket.

Make the API request to [start an export job](start-an-export-job-v2) to move a file from the CGC to your bucket, as shown below. In the body of your request, include the key-value pairs from the table below.
[block:parameters]
{
  "data": {
    "h-0": "Key",
    "h-1": "Value",
    "0-0": "`source`\n_required_",
    "0-1": "This object should describe the source from which the file should be exported.",
    "1-0": "`file`\n_required_",
    "1-1": "The CGC-assigned ID of the file for export.",
    "2-0": "`destination`\n_required_",
    "2-1": "This object should describe the destination to which the file will be exported.",
    "3-0": "`volume`\n_required_",
    "3-1": "The ID of the volume to which the file will be exported.",
    "4-0": "`location`\n_required_",
    "4-1": "Volume-specific location to which the file will be exported. This location should be recognizable to the underlying cloud service as a valid key or path to a new file.\n\nPlease note that if this volume has been configured with a `prefix` parameter, the value of `prefix` will be prepended to `location` before attempting to create the file on the volume.",
    "5-0": "`properties`",
    "5-1": "Service-specific properties of the export.\nThese values override the defaults from the volume.",
    "6-0": "`sse_algorithm`\n_default: AES256_",
    "6-1": "S3 server-side encryption to use when exporting to this bucket.\n\nSupported values:\n  * `AES256` (SSE-S3 encryption)\n  * `aws:kms`\n  * `null` (no server-side encryption)",
    "7-0": "`sse_aws_kms_key_id`",
    "7-1": "Provide your AWS KMS ID here if you specify `aws:kms` as your `sse_algorithm`. Learn more about [AWS KMS](https://aws.amazon.com/documentation/kms/)."
  },
  "cols": 2,
  "rows": 8
}
[/block]

[block:code]
{
  "codes": [
    {
      "code": "POST /v2/storage/exports HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json",
      "language": "http",
      "name": "Start an export job"
    }
  ]
}
[/block]

[block:code]
{
  "codes": [
    {
      "code": "{\n  \"source\": {\n    \"file\": \"576159f7f5b4e1de6ae9b5f0\"\n  },\n  \"destination\": {\n    \"volume\": \"rfranklin/sb-volume-demo\",\n    \"location\": \"output.vcf\"\n  },\n  \"properties\": {\n    \"sse_algorithm\": \"AES256\"\n  }\n}",
      "language": "json",
      "name": "Start an export job request body"
    }
  ]
}
[/block]
The returned response details the status of your export, as shown below.
[block:code]
{
  "codes": [
    {
      "code": "{\n  \"href\": \"https://cgc-api.sbgenomics.com/v2/storage/exports/2fzgXdc7zqeYFMiVvTCZdLBKgUpKdUhn\",\n  \"id\": \"2fzgXdc7zqeYFMiVvTCZdLBKgUpKdUhn\",\n  \"state\": \"PENDING\",\n  \"source\": {\n    \"file\": \"576159f7f5b4e1de6ae9b5f0\"\n  },\n  \"destination\": {\n    \"volume\": \"rfranklin/sb-volume-demo\",\n    \"location\": \"output.vcf\"\n  },\n  \"started_on\": \"2016-06-15T19:17:39Z\",\n  \"properties\": {\n    \"sse_algorithm\": \"AES256\",\n    \"aws_storage_class\": \"STANDARD\",\n    \"aws_canned_acl\": \"public-read\"\n  },\n  \"overwrite\": false\n}",
      "language": "json",
      "name": "Response body"
    }
  ]
}
[/block]
Locate the `id` property in the response and copy this value to your clipboard. This `id` is the identifier for the export job, and we will use it in the next step to verify that the job has completed.

<a name="check-export"></a>
###3c: Check if the export job has completed

To check the status of your export job, make the API request to [get details of an export job](http://docs.sevenbridges.com/docs/get-details-of-an-export-job-v2). Append the export `id` you obtained in the step above to the path.
[block:code]
{
  "codes": [
    {
      "code": "GET /v2/storage/exports/2fzgXdc7zqeYFMiVvTCZdLBKgUpKdUhn HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json",
      "language": "http",
      "name": "Get details of an export"
    }
  ]
}
[/block]
The returned response details the `state` of your export. If the `state` is `COMPLETED`, your export has successfully finished. If the `state` is `PENDING`, wait a few seconds and repeat this step.

Your bucket now contains the file you uploaded to the CGC in step 3a. To verify that the file has been exported, visit your project on the CGC and locate the file you originally uploaded. It should be marked as an alias. This means that the content of the file has been moved from storage on the CGC to your S3 bucket, and that the CGC file record has been updated accordingly.

Congratulations! You've now registered an S3 bucket as a volume, imported a file from the volume to the CGC, and exported a file from the CGC to the volume. Learn more about [connecting your cloud storage from our Knowledge Center](connecting-cloud-storage-overview).
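The import call in step 2a can also be scripted. Below is a minimal sketch using only Python's standard library; the endpoint, headers, and body shape come from the examples in this tutorial, while the helper names and the structure of the functions are our own illustrative assumptions, not an official client.

```python
import json
import urllib.request

# Base URL taken from the HTTP examples in this tutorial.
CGC_API = "https://cgc-api.sbgenomics.com/v2"

def build_import_body(volume, location, project, name=None, overwrite=False):
    """Assemble the request body for POST /v2/storage/imports (see the table in step 2a)."""
    body = {
        "source": {"volume": volume, "location": location},
        "destination": {"project": project},
        "overwrite": overwrite,
    }
    if name is not None:
        # If name is omitted, the alias name defaults to the last segment of the location.
        body["destination"]["name"] = name
    return body

def start_import(token, volume, location, project, name=None, overwrite=False):
    """Launch an import job and return the parsed JSON response (contains 'id' and 'state')."""
    payload = json.dumps(build_import_body(volume, location, project, name, overwrite)).encode()
    request = urllib.request.Request(
        CGC_API + "/storage/imports",
        data=payload,
        headers={"X-SBG-Auth-Token": token, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `build_import_body("rfranklin/sb-volume-demo", "example_human_Illumina.pe_1.fastq", "rfranklin/my-project", name="my_imported_example_human_Illumina.pe_1.fastq", overwrite=True)` reproduces the request body shown in step 2a.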
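Steps 2b and 3c both amount to polling a job's `state` until it is no longer `PENDING`. A sketch of that loop is below, again with the standard library only; the tutorial mentions the `PENDING` and `COMPLETED` states, while treating `RUNNING` as another non-final state, the function names, and the retry limits are assumptions of this sketch.

```python
import json
import time
import urllib.request

CGC_API = "https://cgc-api.sbgenomics.com/v2"

def is_finished(state):
    """A job is considered finished once it is no longer queued or running."""
    return state not in ("PENDING", "RUNNING")

def get_job(token, kind, job_id):
    """Fetch an import or export job's details; kind is 'imports' or 'exports'."""
    request = urllib.request.Request(
        f"{CGC_API}/storage/{kind}/{job_id}",
        headers={"X-SBG-Auth-Token": token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

def wait_for_job(token, kind, job_id, poll_seconds=5, max_polls=60):
    """Poll until the job reaches a final state; returns that state (e.g. 'COMPLETED')."""
    for _ in range(max_polls):
        state = get_job(token, kind, job_id)["state"]
        if is_finished(state):
            return state
        time.sleep(poll_seconds)  # "wait a few seconds and repeat this step"
    raise TimeoutError(f"{kind} job {job_id} did not finish within the polling budget")
```

Calling `wait_for_job(token, "imports", "SbBQeWpfZ445jxgIBioYfnJ5oBl86nFN")` would perform step 2b's check repeatedly instead of by hand.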

AWS Cloud storage tutorial


[block:callout] { "type": "warning", "title": "On this page:", "body": "* [Overview](#overview)\n* [Procedure](#procedure)\n* [Prerequisites](#prerequisites)\n* [Step 1: Add an S3 bucket as a volume](#register)\n * [1a: Create an IAM (Identity and Access Management) user](#create)\n * [1b: Create access keys for the IAM user](#authorize)\n * [1c: Attach your volume to the CGC](#attach)\n* [Step 2: Make an object from the bucket available on the CGC](#import)\n * [2a: Launch an import job](#launch-import)\n * [2b: Check if the import job has completed](#check-import)\n* [Step 3: Move a file from the CGC to the bucket](#move-file)\n * [3a: Upload a file to a project](#upload-file)\n * [3b: Move a file from your project on the CGC to the bucket](#move-file-from-project)\n * [3c: Check if the export job has completed](#check-export)" } [/block] <a name="overview"></a> ##Overview The Volumes API contains two types of calls: one to connect and manage cloud storage, and the other to import and export data to and from a connected cloud account. Before you can start working with your cloud storage via the CGC, you need to authorize the CGC to access and query objects on that cloud storage on your behalf. This is done by creating a "volume". A volume enables you to treat the cloud repository associated with it as external storage for the CGC. You can 'import' files from the volume to the CGC to use them as inputs for computation. Similarly, you can write files from the CGC to your cloud storage by 'exporting' them to your volume. Learn more about [working with volumes](volumes). The CGC uses Amazon Web Services as a cloud infrastructure provider. This affects the cloud storage you can access and associate with your CGC account. For instance, you have full read-write access to your data stored in Amazon Web Services' S3 and read-only access to data stored in Google Cloud Storage. 
<a name="procedure"></a> ##Procedure This short tutorial will guide you through setting up a volume. You'll connect your Amazon S3 bucket as a volume, make an object from the bucket available on the CGC, then move a file from the CGC to the bucket. Once a volume is created, you can issue import and export operations to make data appear on the CGC or to move your CGC files to the underlying cloud storage provider. [block:callout] { "type": "info", "body": "In this tutorial we assume you want to connect to an Amazon S3 bucket. The procedure will be slightly different for other cloud storage providers, such as a [Google Cloud Storage bucket](google-cloud-storage-tutorial). For more information, please refer to our list of [supported cloud storage providers](supported-cloud-storage-providers)." } [/block] <a name="prerequisites"></a> ##Prerequisites To complete this tutorial, you will need: 1. An <a href="https://aws.amazon.com/" target="blank">Amazon Web Services (AWS)</a> account 2. One or more buckets on this AWS account 3. One or more objects (files) in your target bucket 4. An authentication token for the CGC. Learn more about [getting your authentication token](get-your-authentication-token). <a name="register"></a> ##Step 1: Add an S3 bucket as a volume To set up a volume, you have to first register an AWS S3 bucket as a volume. Volumes mediate access between the CGC and your <a href="http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html" target="blank">buckets</a>, which are local units of storage in AWS. Register an AWS S3 bucket as a volume through the following steps below. [block:callout] { "type": "success", "body": "*(Optional* You can also provide your [KMS ID](https://aws.amazon.com/documentation/kms/) if you opt to use KMS for your encryption." 
} [/block] <a name="create"></a> ###1a: Create an IAM (Identity and Access Management) user Follow AWS documentation for [directions on creating an IAM user](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_console). <a name="authorize"></a> ###1b: Create access keys for the IAM user 1. In the list of IAM users, locate the IAM user you created above. Click the username to configure your options. [block:image] { "images": [ { "image": [ "https://files.readme.io/2afd959-cgc-aws-cloud-storage-tutorial-1.png", "cgc-aws-cloud-storage-tutorial-1.png", 1440, 442, "#e3e1e2" ] } ] } [/block] 2. Click the **Security credentials** tab. 3. In the **Access keys** section click **Create access key**. You get two keys, **Access key ID** and **Secret access key**. [block:image] { "images": [ { "image": [ "https://files.readme.io/f10477c-create-access-key.png", "create-access-key.png", 1440, 682, "#2d3442" ] } ] } [/block] 4. Copy the credentials for later use. Be sure to keep your credentials somewhere safe. You can also click **Download .csv file** to obtain them in a file named **accessKeys.csv**. <a name="attach"></a> ### 1c: Attach your volume to the CGC 1. From the main menu bar on the CGC, select **Data** > **Volumes**. 2. Click **Attach volume**. If you already have attached volumes, in the top-right corner click **Connect Storage**. 3. Select **amazon web services**. 4. Enter **Access key ID** and **Secret access key** you obtained in section [1b](#authorize) above. 5. Click **Next**. 6. In the **Bucket name** field enter the name of the S3 bucket you wish to connect. **Volume name** is the display name of the volume on the CGC and will be generated automatically. 7. (Optional) Enter volume description. 8. Set access privileges for the volume. Available options are: * **Read only (RO)** - You will be able to read files, but won't be able to add them to the volume. 
* **Read and Write (RW)** - You will be able to read files and also add files to the volume. 9. Click **Next**. You are now taken to the generated policy. [block:image] { "images": [ { "image": [ "https://files.readme.io/fd5c7a4-policy-generator.png", "policy-generator.png", 636, 572, "#e1ecec" ] } ] } [/block] 10. Copy the content of the box. 11. In the list of IAM users in AWS Management Console, locate the IAM user you created in section [1a](#create). Click the username to configure your options. 12. On the **Permissions** tab, click **Add inline policy**, as shown below. [block:image] { "images": [ { "image": [ "https://files.readme.io/adb58b2-cgc-aws-cloud-storage-tutorial-2.png", "cgc-aws-cloud-storage-tutorial-2.png", 1440, 682, "#293548" ] } ] } [/block] 13. Select the **JSON** tab and replace the existing content by pasting the code you copied in step 10. [block:image] { "images": [ { "image": [ "https://files.readme.io/c335626-cgc-aws-cloud-storage-tutorial-3a.png", "cgc-aws-cloud-storage-tutorial-3a.png", 1440, 682, "#272f3e" ] } ] } [/block] 14. Click **Review policy**. 15. Enter a descriptive policy name, e.g. **sb-access-policy**. Note that you can only use alphanumerics and the following characters: **+=,.@-_ **. 16. Click **Create policy**. 17. Then, go back to the CGC and click **Next** in the wizard. 18. In the **Endpoint** field enter **s3.amazonaws.com**. Leave default values for other settings. 19. Click **Next**. You can now review your volume connection settings. 20. Finally, click **Connect**. Your volume should now be connected to the CGC and visible in the list of volumes. <a name="import"></a> ##Step 2: Make an object from the bucket available on the CGC Now that we have a volume, we can make data objects from the bucket associated with the volume available as "aliases" on the CGC. Aliases point to files stored on your cloud storage bucket and can be copied, executed, and organized like normal files on the CGC. 
We call this operation "importing". Learn more about working with [aliases](doc:aliases). To import a data object from your volume as an alias on the CGC, follow the steps below. <a name="launch-import"></a> ###2a: Launch an import job To import a file, make the API request to [start an import job](start-an-import-job-v2) as shown below. In the body of the request, include the key-value pairs in the table below. [block:parameters] { "data": { "h-0": "Key", "h-1": "Description of value", "0-0": "`volume_id`\n_required_", "0-1": "Volume ID from which to import the file. This consists of your username followed by the volume's name, such as `rfranklin/sb-volume-demo`.", "1-0": "`location`\n_required_", "2-0": "`destination`\n_required_", "3-0": "`project`\n_required_", "4-0": "`name`", "5-0": "`overwrite`", "5-1": "Specify as `true` to overwrite the file if the file with the same name already exists in the destination.", "4-1": "The name of the alias to create. This name should be unique to the project. If the name is already in use in the project, you should use the overwrite query parameter in this call to force any file with that name to be deleted before the alias is created.\n\nIf name is omitted, the alias name will default to the last segment of the complete location (including the `prefix`) on the volume. Segments are considered to be separated with forward slashes ('/').", "3-1": "The project in which to create the alias. This consists of your username followed by your project's short name, such as `rfranklin/my-project`.", "2-1": "This object should describe the CGC destination for the imported file.", "1-1": "Volume-specific location pointing to the file to import. This location should be recognizable to the underlying cloud service as a valid key or path to the file.\n\nPlease note that if this volume was configured with a `prefix` parameter when it was created, the `prefix` will be prepended to location before attempting to locate the file on the volume." 
}, "cols": 2, "rows": 6 } [/block] [block:code] { "codes": [ { "code": "POST /v2/storage/imports HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json", "language": "http", "name": "Start an import job" } ] } [/block] [block:code] { "codes": [ { "code": "{ \n \"source\":{ \n \"volume\":\"rfranklin/sb-volume-demo\",\n \"location\":\"example_human_Illumina.pe_1.fastq\"\n },\n \"destination\":{ \n \"project\":\"rfranklin/my-project\",\n \"name\":\"my_imported_example_human_Illumina.pe_1.fastq\"\n },\n \"overwrite\": true\n}", "language": "json", "name": "Start an import job request body" } ] } [/block] The returned response details the status of your import, as shown below. [block:code] { "codes": [ { "code": "{\n \"href\": \"https://cgc-api.sbgenomics.com/v2/storage/imports/SbBQeWpfZ445jxgIBioYfnJ5oBl86nFN\",\n \"id\": \"SbBQeWpfZ445jxgIBioYfnJ5oBl86nFN\",\n \"state\": \"PENDING\",\n \"overwrite\": true,\n \"source\": {\n \"volume\": \"rfranklin/sb-volume-demo\",\n \"location\": \"example_human_Illumina.pe_1.fastq\"\n },\n \"destination\": {\n \"project\": \"rfranklin/my-project\",\n \"name\": \"my_uploaded_example_human_Illumina.pe_1.fastq\"\n }\n}", "language": "json", "name": "Response body" } ] } [/block] Locate the `id` property in the response and copy this value to your clipboard. This `id` is an identifier for the import job, and we will need it in the following step. <a name="check-import"></a> ###2b: Check if the import job has completed To check if the import job has completed, make the API request to [get details of an import job](get-details-of-an-import-job-v2), as shown below. Simply append the import job `id` obtained in the step above to the path. 
[block:code] { "codes": [ { "code": "GET /v2/storage/imports/SbBQeWpfZ445jxgIBioYfnJ5oBl86nFN HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json", "language": "http", "name": "Get details of an import job" } ] } [/block] The returned response details the `state` of your import. If the `state` is `COMPLETED`, your import has successfully finished. If the `state` is `PENDING`, wait a few seconds and repeat this step. You should now have a freshly-created alias in your project. To verify that the file has been imported, visit the project in your browser and look for a file with the same name as the key of the object in your bucket. <a name="move-file"></a> ##Step 3: Move a file from the CGC to the bucket You've successfully created an alias on the CGC for a file in your S3 bucket. You can also move files from the CGC into your connected S3 bucket. This operation is known as 'exporting' to the volume associated with the bucket. Please keep in mind that public files, files belonging to CGC-hosted datasets, archived files, and aliases cannot be exported. For more information, please see [working with aliases](doc:aliases). Follow the steps below to move a file from the CGC to an object in your bucket. <a name="upload-file"></a> ###3a: Upload a file to a project Before you can export a file from the CGC, you must upload a file to a project. To upload a file, follow the steps below: 1. Upload a file to your project using the [command line uploader](upload-via-the-command-line), the [CGC Uploader](upload-via-the-seven-bridges-uploader), an [FTP or HTTP(S) server](upload-from-an-ftp-server), or the [API](upload-files). 2. Locate and copy the file ID. Depending on the upload mechanism, you can find the file ID as follows: * **Command line uploader** - In the output of the command line uploader, note the first column of the line that corresponds to the uploaded file. This is the uploaded file's ID. 
* **CGC Uploader** - Once the file has been uploaded, locate the file in the Files tab of the relevant project. Click on the file's name. A new page with details about your file should open. Locate the last segment of this page's URL, following `/files/`. This is the uploaded file's ID. For example, the file ID of `https://cgc.sbgenomics.com/u/rfranklin/volumes-api-project/files/577d4c35e4b05e75806f2853/` is `577d4c35e4b05e75806f2853`. * **FTP or HTTP(S) server** - Locate the file's ID in the same way as for files uploaded using the CGC Uploader. * **API** - Issue the API request to [List all files within a project](list-files-in-a-project). The ID of each file is listed next to the key `id` in the response body. <a name="move-file-from-project"></a> ###3b: Move a file from your project on the CGC to the bucket When you export a file from the CGC to your volume, you are writing to your S3 bucket. Make the API request to [start an export job](start-an-export-job-v2) to move a file from the CGC to your bucket, as shown below. In the body of your request, include the key-value pairs from the table below. [block:parameters] { "data": { "h-0": "Key", "h-1": "Description of value", "0-0": "`source`\n_required_", "0-1": "This object should describe the source from which the file should be exported.", "1-0": "`file`\n_required_", "1-1": "The CGC-assigned ID of the file for export.", "2-0": "`destination`\n_required_", "3-0": "`volume`\n_required_", "4-0": "`location`\n_required_", "5-0": "`properties`", "6-0": "`sse_algorithm`\n_default: AES256_", "2-1": "This object should describe the destination to which the file will be exported.", "3-1": "The ID of the volume to which the file will be exported.", "4-1": "Volume-specific location to which the file will be exported. 
This location should be recognizable to the underlying cloud service as a valid key or path to a new file.\n\nPlease note that if this volume has been configured with a `prefix` parameter, the value of `prefix` will be prepended to `location` before attempting to create the file on the volume.", "5-1": "Service-specific properties of the export.\nThese values override the defaults from the volume.", "6-1": "S3 server-side encryption to use when exporting to this bucket.\n\nSupported values:\n * `AES256` (SSE-S3 encryption)\n * `aws:kms`\n * `null` (no server-side encryption)", "7-1": "Provide your AWS KMS ID here if you specify `aws:kms` as your `sse_algorithm`. Learn more about [AWS KMS](https://aws.amazon.com/documentation/kms/).", "7-0": "`sse_aws_kms_key_id`" }, "cols": 2, "rows": 8 } [/block] [block:code] { "codes": [ { "code": "POST /v2/storage/exports HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json", "language": "http", "name": "Start an export job" } ] } [/block] [block:code] { "codes": [ { "code": "{\n \"source\": {\n \"file\": \"576159f7f5b4e1de6ae9b5f0\"\n },\n \"destination\": {\n \"volume\": \"rfranklin/sb-volume-demo\",\n \"location\": \"output.vcf\"\n },\n \"properties\": {\n \"sse_algorithm\": \"AES256\"\n }\n}", "language": "json", "name": "Start an export job request body" } ] } [/block] The returned response details the status of your export, as shown below. 
[block:code] { "codes": [ { "code": "{\n \"href\": \"https://cgc-api.sbgenomics.com/v2/storage/exports/2fzgXdc7zqeYFMiVvTCZdLBKgUpKdUhn\",\n \"id\": \"2fzgXdc7zqeYFMiVvTCZdLBKgUpKdUhn\",\n \"state\": \"PENDING\",\n \"source\": {\n \"file\": \"576159f7f5b4e1de6ae9b5f0\"\n },\n \"destination\": {\n \"volume\": \"rfranklin/sb-volume-demo\",\n \"location\": \"output.vcf\"\n },\n \"started_on\": \"2016-06-15T19:17:39Z\",\n \"properties\": {\n \"sse_algorithm\": \"AES256\",\n \"aws_storage_class\": \"STANDARD\",\n \"aws_canned_acl\": \"public-read\"\n },\n \"overwrite\": false\n}", "language": "json", "name": "Response body" } ] } [/block] Locate the `id` property in the response and copy this value to your clipboard. This `id` is the identifier for the export job, and we will use it in the next step to verify that the job has completed. <a name="check-export"></a> ###3c: Check if the export job has completed To check the status of your export job, make the API request to [get details of an export job](get-details-of-an-export-job-v2), as shown below. Simply append the export job `id` obtained in the step above to the path. [block:code] { "codes": [ { "code": "GET /v2/storage/exports/2fzgXdc7zqeYFMiVvTCZdLBKgUpKdUhn HTTP/1.1\nHost: cgc-api.sbgenomics.com\nX-SBG-Auth-Token: 3259c50e1ac5426ea8f1273259740f74\ncontent-type: application/json", "language": "http", "name": "Get details of an export job" } ] } [/block] The returned response details the `state` of your export. If the `state` is `COMPLETED`, your export has successfully finished. If the `state` is `PENDING`, wait a few seconds and repeat this step. Your bucket now contains the file that was uploaded to the CGC in step 3a. To verify that the file has been exported, visit your project on the CGC and locate the file you originally uploaded. It should be marked as an alias. 
This means that the content of the file has been moved from storage on the CGC to your S3 bucket, and that the CGC file record has been updated accordingly. Congratulations! You've now registered an S3 bucket as a volume, imported a file from the volume to the CGC, and exported a file from the CGC to the volume. Learn more about [connecting your cloud storage from our Knowledge Center](connecting-cloud-storage-overview).
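The start-and-poll pattern used for both import jobs (steps 2a and 2b) and export jobs (steps 3b and 3c) can be sketched in Python using only the standard library. This is a minimal sketch, not official client code: the helper names, polling interval, and retry limit are illustrative, and the token and volume values are the placeholders used throughout this page.

```python
import json
import time
import urllib.request

API = "https://cgc-api.sbgenomics.com/v2"
TOKEN = "3259c50e1ac5426ea8f1273259740f74"  # replace with your own auth token


def import_body(volume, location, project, name=None, overwrite=False):
    """Build the request body for POST /v2/storage/imports.

    `volume` and `location` identify the object on the volume; `project`
    (and optionally `name`) describe the alias to create on the CGC."""
    body = {
        "source": {"volume": volume, "location": location},
        "destination": {"project": project},
        "overwrite": overwrite,
    }
    if name is not None:
        body["destination"]["name"] = name
    return body


def start_import(volume, location, project, **kwargs):
    """POST the import job and return the parsed response, which
    includes the job `id` needed for polling."""
    req = urllib.request.Request(
        f"{API}/storage/imports",
        data=json.dumps(import_body(volume, location, project, **kwargs)).encode(),
        headers={"X-SBG-Auth-Token": TOKEN, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_job(get_state, interval=5, max_checks=60):
    """Poll a job until its state is no longer PENDING or RUNNING.

    `get_state` is any callable returning the job's current state string,
    so the polling logic is independent of the HTTP layer."""
    for _ in range(max_checks):
        state = get_state()
        if state not in ("PENDING", "RUNNING"):
            return state
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")
```

Exporting is symmetric: POST the export body to `/v2/storage/exports`, then poll `/v2/storage/exports/{id}` with the same `wait_for_job` loop until the `state` leaves `PENDING`.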