
Perform a batch analysis

<a name="top"></a>

The **Batch Input** feature of workflows allows you to run identical analyses on different data by entering multiple input files and grouping them by specified metadata criteria. This results in multiple tasks: one for each group of files.

Using **Batch Input**, you can process multiple datasets with a single workflow and the same parameter settings, without having to set up the workflow multiple times.
[block:callout]
{
  "type": "warning",
  "body": "* [Add a batch input node to a workflow](#section-add-a-batch-input-node-to-a-workflow)\n* [Select input files for the batch task](#section-select-input-files-for-the-batch-task)\n* [Grouping input files into batches](#section-grouping-input-files-into-batches)\n  * [Batch by File](#section-batch-by-file)\n  * [Batch by Sample, Library, Platform unit, or File segment](#section-batching-by-sample-library-platform-unit-or-file-segment)\n     * [Batch by Sample](#section-batch-by-sample)\n     * [Batch by Library](#section-batch-by-library)\n     * [Batch by Platform Unit](#section-batch-by-platform-unit)\n     * [Batch by File segment](#section-batch-by-file-segment)\n* [Use multiple files in batch groups](#use-multiple-files-in-batch-groups)",
  "title": "On this page:"
}
[/block]

##Add a batch input node to a workflow

To add a batch input node to a workflow:
1. [Navigate to the desired project](view-a-project).
2. Click the **Apps** tab.
3. Click the edit icon <img src="https://files.readme.io/qj7P3A3Rn6t10e7B2Xjs_edit%20icon.png" height="20px" width="auto" align="inline" style="margin:1px"/> next to the desired workflow (or click **Add Apps** to add a new workflow).

The workflow editor will be displayed.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/9zKx4dQDRSWfVtQ5jReS_batch-input-n1.jpg",
        "batch-input-n1.jpg",
        "1327",
        "675",
        "#386084",
        ""
      ],
      "border": true
    }
  ]
}
[/block]
5. Click the input node that you wish to convert into a batch input node.
The **PARAMS** tab will be displayed on the right.
6. Choose the criterion for grouping the input files, e.g. "Sample ID" (see below for [further information on the available criteria](#section-grouping-input-files-into-batches)). Note that the node icon changes to denote a batch input node <img src="https://files.readme.io/nfseOWgLSzCk7p1qCE84_batch-node-icon.jpg" height="35px" width="auto" align="inline" style="margin:1px"/>.
7. Click **Save**.
8. Click **Run** to create a **DRAFT** task. **DRAFT** task pages allow you to set up a workflow, select input files, and change app parameters before running a task.
[block:callout]
{
  "type": "info",
  "body": "Only one **Batch Input** node can be added to any workflow."
}
[/block]
<div align="right"><a href="#top">top</a></div>

##Select input files for the batch task

Once you have added a batch input node to your workflow, you can select the input files for the task.

1. [Select the reference and annotation files needed for the task, or accept the suggested files](doc:select-reference-files).
2. Keep the selected criterion or choose a different one from the **Batch by** drop-down menu.
3. Click **Pick files**.

[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/HBlEKkTmRzVgBqgOSVJR_batch-input-n4.jpg",
        "batch-input-n4.jpg",
        "616",
        "291",
        "#dd9638",
        ""
      ],
      "border": true
    }
  ]
}
[/block]
4. Choose the desired files and click **Select**.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/ZDLlHKDRc2AVLmIBz2aB_batch-input-n3.jpg",
        "batch-input-n3.jpg",
        "880",
        "601",
        "#dc9534",
        ""
      ],
      "border": true
    }
  ]
}
[/block]
The added files will be automatically organized into groups according to the selected criterion (e.g. sample ID). The number of groups determines the number of sub-tasks (child tasks) that will be executed within this batch task (the parent task).
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/MnxTzFTUmzVb1Tt3nILw_batch-input-n5.jpg",
        "batch-input-n5.jpg",
        "672",
        "692",
        "#e49431",
        ""
      ],
      "border": true
    }
  ]
}
[/block]

[block:callout]
{
  "type": "info",
  "body": "Click the **x** icon next to a group to remove that group of files."
}
[/block]
5. Click **Run** to execute the workflow and then confirm.
    The batch task (parent task) will start creating all of the sub-tasks (child tasks).
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/135b755-cgc_batch_tasks_creating.png",
        "cgc_batch_tasks_creating.png",
        1440,
        612,
        "#f5f5f5"
      ],
      "border": true
    }
  ]
}
[/block]
Once all of the child tasks are created, the batch task will start running them. The batch task **Progress** message will change to **Running**, as shown below.
[block:callout]
{
  "type": "info",
  "body": "If any of the child tasks are not created properly, the parent task will revert to **DRAFT** status and the related error message will be displayed."
}
[/block]

[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/7d6e282-cgc_batch_tasks_list.png",
        "cgc_batch_tasks_list.png",
        1424,
        657,
        "#f0f2f3"
      ],
      "border": true
    }
  ]
}
[/block]
All of the child tasks are individual tasks which you can manage just like any other task on the Platform. You can re-run or abort any of the child tasks.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/bc6a450-cgc_batch_tasks_statuses.png",
        "cgc_batch_tasks_statuses.png",
        1423,
        696,
        "#eff1f3"
      ],
      "border": true
    }
  ]
}
[/block]
The status of the parent task will show the combined status for all of the child tasks.
In the screenshot above, the parent task status shows:
  * 3 child tasks in progress
  * 1 child task aborted by the user
  * 1 completed child task

Here you can also:

  * **Abort** - click <img src="https://files.readme.io/njhttFawQyqJ5Ga2uwcQ_abort-parent.jpg" height="30px" width="auto" align="inline" style="margin:1px"/> to abort the parent task (which will also abort all child tasks).
  * **Edit and rerun** - click <img src="https://files.readme.io/3qkSHKvsSJaMJ67gdycU_edit-and-run-parent.jpg" height="30px" width="auto" align="inline" style="margin:1px"/> to run the parent task again.

##Grouping input files into batches

You can group input files individually (File) or by their metadata (Case, Sample, Library, Platform unit, or File segment). The optimal grouping will depend on your experimental design.

For example, suppose you want to run the public Whole Genome Analysis workflow, and you have FASTQ files from many samples (paired-end reads, resulting in two files per sample). In this case, you might want to analyze the files from each sample as a separate batch.

As shown in the image below, click **Pick files** to enter the files for all the samples you want to analyze at once. Then, select **Sample** from the **Batch by** drop-down menu. This will result in one task per sample, each processing that sample's pair of FASTQ files.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/b9777cc-Screen_Shot_2017-05-11_at_1.53.51_PM.png",
        "Screen Shot 2017-05-11 at 1.53.51 PM.png",
        634,
        140,
        "#f2f2f2"
      ],
      "border": true
    }
  ]
}
[/block]
###Batching by File

To batch by file, select **File** from the **Batch by** drop-down menu. Batching by **File** runs the workflow for each individual file, initiating a new child task for each input file.

<div align="right"><a href="#top">top</a></div>

###Batch by Case

To batch by Case ID, select **Case** from the **Batch by** drop-down menu. Batching by **Case** sorts the files by Case ID.
It runs the workflow for each group of files, initiating a new child task for each group.

Batching by **Case** groups files by their Case ID, a human-readable identifier (such as a number or a string that may contain metadata information) for a subject who has taken part in the investigation or study.

<div align="right"><a href="#top">top</a></div>

###Batching by Sample, Library, Platform unit, or File segment

When batching by **Sample**, **Library**, **Platform unit**, or **File segment**, files are grouped by their value for a metadata field. Files are grouped following the hierarchy listed below:

1. Sample ID
2. Library ID
3. Platform unit ID
4. File segment number
5. Paired end (for paired-end reads)

For example, files batched by **Library** will first be sorted by Sample ID with a secondary sort by Library ID. Consider the files described in the table below:
[block:parameters]
{
  "data": {
    "h-0": "File name",
    "h-1": "Sample ID",
    "h-2": "Library ID",
    "0-0": "File 1",
    "0-1": "A",
    "0-2": "1",
    "1-0": "File 2",
    "1-1": "A",
    "1-2": "1",
    "2-0": "File 3",
    "2-1": "B",
    "2-2": "2",
    "3-0": "File 4",
    "3-1": "B",
    "3-2": "1",
    "4-0": "File 5",
    "4-1": "B",
    "4-2": "2"
  },
  "cols": 3,
  "rows": 5
}
[/block]
These five files will be sorted into 3 groups:

  * Sample ID "A" with Library ID "1" (File 1 and File 2)
  * Sample ID "B" with Library ID "1" (File 4)
  * Sample ID "B" with Library ID "2" (File 3 and File 5)

By the same token, files batched by **File segment** will first be sorted by Sample ID, then Library ID, Platform unit ID, and File segment number. A separate child task will be created for each distinct group.

<div align="right"><a href="#top">top</a></div>

<a name="section-batch-by-sample"></a>
**Batch by Sample**

To batch by Sample ID, select **Sample** from the **Batch by** drop-down menu. Batching by **Sample** sorts the files by Sample ID. It runs the workflow for each group of files, initiating a new child task for each group.

Batching by **Sample** groups files by their Sample ID, a human-readable identifier for a sample or specimen, which could contain some metadata information.

<div align="right"><a href="#top">top</a></div>

<a name="section-batch-by-library"></a>
**Batch by Library**

To batch by Library ID, select **Library** from the **Batch by** drop-down menu. Batching by **Library** sorts the files by Sample ID, then Library ID. It runs the workflow for each group of files, initiating a new child task for each group.

Batching by **Library** ultimately groups files by their Library ID, an identifier for the sequencing library preparation.

<div align="right"><a href="#top">top</a></div>

<a name="section-batch-by-platform-unit"></a>
**Batch by Platform unit**

To batch by Platform unit ID, select **Platform unit** from the **Batch by** drop-down menu.

Batching by **Platform unit** sorts the files by Sample ID, then Library ID, and Platform unit ID. It runs the workflow for each group of files, initiating a new child task for each group.

Batching by **Platform unit** ultimately groups files by their Platform unit ID, an identifier for lanes (Illumina) or slides (SOLiD), used when a library was split and run across multiple lanes on the flow cell or across multiple slides. The Platform unit ID refers to the lane ID or the slide ID.

<div align="right"><a href="#top">top</a></div>

<a name="section-batch-by-file-segment"></a>
**Batch by File segment**

To batch by File segment number, select **File segment** from the **Batch by** drop-down menu. Batching by **File segment** sorts the files by Sample ID, then Library ID, Platform unit ID, and File segment number. It runs the workflow for each group of files, initiating a new child task for each group.

Batching by **File segment** ultimately groups files by their File segment number, which enumerates files that contain reads from the same library, sequenced on the same lane or slide, but whose content is written over several files.

The files are enumerated using file segment numbers. Grouping inputs by their file segment number means that all of the first file segments are processed together, then all of the second file segments, and so on. This method of grouping allows you to align large files along a reference genome.

<a name="use-multiple-files-in-batch-groups"></a>
##Use multiple files in batch groups

To allow batch groups to have multiple files instead of one, follow this procedure:

1. Click the ellipsis menu above the input node that you want to batch on.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/006e2d3-qs-cgc.png",
        "qs-cgc.png",
        785,
        543,
        "#1f538d"
      ],
      "border": true
    }
  ]
}
[/block]
The following window will be displayed.
[block:image]
{
  "images": [
    {
      "image": [
        "https://files.readme.io/4a0edb5-qs-cgc2.png",
        "qs-cgc2.png",
        591,
        636,
        "#242425"
      ],
      "border": true
    }
  ]
}
[/block]
2. Choose **Array** from the **Change Schema** menu (**Schema** tab).
3. Click **Save**.

<div align="right"><a href="#top">top</a></div>
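
The grouping hierarchy described under "Batching by Sample, Library, Platform unit, or File segment" can be illustrated with a short Python sketch. This is not the Platform's implementation; it is only a minimal model of the described behavior, reusing the five files from the table in that section. The function name `group_files` and the field names (`sample_id`, `library_id`, `platform_unit_id`, `file_segment_number`) are assumptions made for this example.

```python
from collections import defaultdict

# Grouping hierarchy as described on this page: batching by a criterion
# also takes into account every criterion listed above it.
# The field names here are illustrative assumptions.
HIERARCHY = ["sample_id", "library_id", "platform_unit_id", "file_segment_number"]


def group_files(files, batch_by):
    """Group file records by all hierarchy fields up to and including batch_by."""
    keys = HIERARCHY[: HIERARCHY.index(batch_by) + 1]
    groups = defaultdict(list)
    for record in files:
        groups[tuple(record[k] for k in keys)].append(record["name"])
    # Order the groups by Sample ID first, then Library ID, and so on.
    return dict(sorted(groups.items()))


# The five files from the table in the section above.
files = [
    {"name": "File 1", "sample_id": "A", "library_id": "1"},
    {"name": "File 2", "sample_id": "A", "library_id": "1"},
    {"name": "File 3", "sample_id": "B", "library_id": "2"},
    {"name": "File 4", "sample_id": "B", "library_id": "1"},
    {"name": "File 5", "sample_id": "B", "library_id": "2"},
]

by_library = group_files(files, "library_id")
by_sample = group_files(files, "sample_id")

# Each group would correspond to one child task.
for key, names in by_library.items():
    print(key, names)
```

In this model, batching by Library yields three groups (and so three child tasks), matching the example above, while batching by Sample yields two groups, one per Sample ID.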