
Data Studio analysis editor


Once you [start the analysis](doc:run-an-analysis-using-data-cruncher) and the initialization process is completed, you will be automatically taken to the editor. The editor provides the native interface of the chosen analysis environment, along with additional CGC navigation and content management options.
[block:callout]
{
  "type": "info",
  "title": "",
  "body": "A separate domain (**sb-cgc-cruncher.com**) is used for serving Data Studio editors, which provides better security isolation and privacy control for your preferred third-party integrated development environment."
}
[/block]
Data Studio uses cloud infrastructure to run your analyses. Each analysis execution runs in a virtual computing environment, also known as an _instance_.

To help you navigate your working space and keep control over your data, the following directory structure is automatically set up on the instance when an analysis is started:
[block:code]
{
  "codes": [
    {
      "code": "/sbgenomics\n|-- workspace\n|-- project-files\n|-- output-files",
      "language": "text"
    }
  ]
}
[/block]
## Manage your notebooks and scripts (`workspace`)

This is the default directory where your analysis takes place. Notebooks and scripts that you create in your editor of choice are stored here automatically.
You can also:
* Upload files to the analysis workspace directly from your local machine using the native upload option in your editor of choice.
* Download files into the analysis workspace directly from a location on the Internet (using **cURL** or **wget** in the terminal, for example).
[block:callout]
{
  "type": "info",
  "title": "",
  "body": "All workspace content will be automatically available for each new analysis run."
}
[/block]
The following image shows how workspace files are displayed in JupyterLab (left) and RStudio (right):
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/fac9de9-about-data-cruncher-workspace-and-files-2.png",
        "about-data-cruncher-workspace-and-files-2.png",
        1670,
        592,
        "#f6f6f4"
      ]
    }
  ]
}
[/block]
Once the analysis is stopped, all workspace content is automatically saved and available for preview via the [analysis details page](doc:view-and-edit-data-cruncher-analysis-details).

## Find your inputs (`project-files`)

The `project-files` directory provides a convenient way to use files from the _project in which your analysis is located_. All files available under the **Files** tab when viewing a project through the visual interface are mounted inside the `/sbgenomics/project-files/` folder in a Data Studio analysis.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/188ab2e-cruncher-project-files-other.png",
        "cruncher-project-files-other.png",
        1446,
        761,
        "#333"
      ]
    }
  ]
}
[/block]
To reference a project file in your Data Studio analysis, simply use its `/sbgenomics/project-files/<file-name>` or `/sbgenomics/project-files/<folder-name>/<file-name>` path. For example, if there is a file named `hapmap_3.3.hg38.vcf` in your project and you need to reference it in your Data Studio analysis code, use the `/sbgenomics/project-files/hapmap_3.3.hg38.vcf` path. If the file is not located at the root of your project files but in a subfolder named, for example, `vcf`, the path is `/sbgenomics/project-files/vcf/hapmap_3.3.hg38.vcf`. To see the project files that are available for use in your Data Studio analysis, open a terminal window in your Data Studio analysis and run the following command:
[block:code]
{
  "codes": [
    {
      "code": "ls /sbgenomics/project-files/",
      "language": "shell"
    }
  ]
}
[/block]

This lists the files and folders at the root of the project in which you are running your analysis.
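
The path rules described above can be wrapped in a small shell helper, which may be handy when many project files are referenced from the terminal. This is only a sketch: `SB_ROOT` and `project_file_path` are hypothetical names introduced for illustration and are not part of Data Studio. The recursive listing at the end is an assumption about what you may want to see, since a plain `ls` shows only the top level of a directory.

```shell
# Sketch only: SB_ROOT and project_file_path are illustrative names,
# not Data Studio features. SB_ROOT defaults to the layout described above.
SB_ROOT="${SB_ROOT:-/sbgenomics}"

# Print the absolute path of a project file, given its path
# relative to the root of the project files.
project_file_path() {
  printf '%s/project-files/%s\n' "$SB_ROOT" "$1"
}

# A file at the root of the project files, and one inside a "vcf" subfolder.
project_file_path "hapmap_3.3.hg38.vcf"
project_file_path "vcf/hapmap_3.3.hg38.vcf"

# List subfolder contents as well; skipped when the directory is absent
# (for example, when the script is run outside Data Studio).
if [ -d "$SB_ROOT/project-files" ]; then
  ls -R "$SB_ROOT/project-files"
fi
```

The helper only builds path strings, so it does not check whether the file actually exists in the project.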
Note that all project files are read-only. They are mounted using a Seven Bridges proprietary tool that downloads the actual content only when files or file parts are needed in the analysis, rather than downloading all content at once. Because project files are mounted as read-only, their content can be accessed but not changed.

## Save analysis outputs (`output-files`)

This is the target directory for files that you want saved as outputs of your analysis. Once the analysis is stopped, all output files are uploaded directly to your project files so that you can use and analyze them in other executions. Once saved, the files are accessible at the root of your project files, directly under the **Files** tab when accessing them through the visual interface. If any naming conflicts with existing project files are encountered, newly saved files are automatically renamed to avoid overwriting existing ones. All your analysis outputs are traceable, and links to them are available via the [analysis details page](doc:view-and-edit-data-cruncher-analysis-details). To save analysis outputs, follow these steps:

* _JupyterLab:_
    1. Click **File** > **New** > **Terminal**. The terminal opens in your `workspace` directory.
    2. Use the `cp` command to copy the files you want to save to the `output-files` directory, for example: `cp my_file.ext ../output-files/`.

* _RStudio:_
    1. In RStudio, open the **Terminal** tab.
    2. Use the `cp` command to copy the files you want to save to the `output-files` directory, for example: `cp my_file.ext ../output-files/`.
[block:callout]
{
  "type": "info",
  "title": "",
  "body": "Please note that file saving takes place only while the analysis is being stopped. Clicking **Stop** triggers the saving process, and the analysis status changes to **SAVING**. Once saving has been completed, the analysis status changes to **SAVED**."
}
[/block]
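
The `cp` step above uses a relative path, so it works only when the terminal is in the `workspace` directory. As a minimal sketch, the same copy can be written with the absolute `/sbgenomics/output-files/` path from the directory structure shown earlier, so that it works from any current directory. `SB_ROOT` and `save_output` are hypothetical names introduced for illustration and are not part of Data Studio.

```shell
# Sketch only: SB_ROOT and save_output are illustrative names,
# not Data Studio features. SB_ROOT defaults to the layout described above.
SB_ROOT="${SB_ROOT:-/sbgenomics}"

# Copy one or more files into output-files using an absolute path,
# so the call does not depend on the current directory.
save_output() {
  cp -- "$@" "$SB_ROOT/output-files/"
}

# Example call with a placeholder file name; it runs only if the file exists.
if [ -f "my_file.ext" ]; then
  save_output "my_file.ext"
fi
```

Copying a file this way only places it in `output-files`. The upload to your project files still happens when the analysis is stopped.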