BDI Code Clinic, 28 October 2020 11am Srinivasa Rao – Snakemake
To book, please visit:
A link to the Microsoft Teams meeting will be sent during the week of 26 October.
Snakemake is a workflow management tool for running a number of related tasks (aka “rules” in Snakemake lingo) in an efficient, reproducible and readable way. It uses a simple vocabulary to define the expected input, output, parameters, script and resources for each rule. In this tutorial, we will go through the following:
Snakemake vocabulary and syntax
Constructing a basic workflow with 1 rule
Various kinds of tasks (shell, R/python scripts, conda environments)
Parameters, threads, resources
Passing additional variables in a configuration file
By the end of the tutorial, we will have learned how to build a slightly more complicated workflow with multiple rules, using some of the concepts above.
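As a rough preview of the concepts listed above, here is a minimal sketch of a two-rule workflow. It is not the material that will be used in the session: the rule names (all, count_lines), file paths, config keys (samples, label) and the helper write_demo_workflow are all hypothetical, chosen only for illustration. The Python script simply writes a Snakefile, a config.yaml and two tiny input files into a temporary directory so that the example can be tried without any other setup.

```python
from pathlib import Path
import tempfile

# Hypothetical Snakefile: rule names, paths and config keys are illustrative only.
SNAKEFILE = '''\
configfile: "config.yaml"

# Default target: request one result file per sample listed in the config.
rule all:
    input:
        expand("results/{sample}.count", sample=config["samples"])

# A rule describes one task via its input, output, params, threads and resources.
rule count_lines:
    input:
        "data/{sample}.txt"
    output:
        "results/{sample}.count"
    params:
        label=config["label"]
    threads: 1
    resources:
        mem_mb=100
    shell:
        "echo {params.label} $(wc -l < {input}) > {output}"
'''

# Hypothetical configuration file holding the additional variables.
CONFIG = '''\
samples: [a, b]
label: lines
'''


def write_demo_workflow(workdir: Path) -> Path:
    """Write the Snakefile, the config and two small input files into workdir."""
    (workdir / "data").mkdir(parents=True, exist_ok=True)
    (workdir / "Snakefile").write_text(SNAKEFILE)
    (workdir / "config.yaml").write_text(CONFIG)
    for sample in ("a", "b"):
        (workdir / "data" / f"{sample}.txt").write_text("line 1\nline 2\n")
    return workdir


workdir = write_demo_workflow(Path(tempfile.mkdtemp()))
print(f"Workflow written to {workdir}")
```

With Snakemake installed, running `snakemake --cores 1` inside the printed directory should build one results/*.count file per sample named in config.yaml. The shell rule assumes a Unix-like environment where `wc` is available.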
Some familiarity with Python 3 and basic Conda commands is essential (e.g. setting up and activating a Conda environment, loading Python modules, Python data types, and working with lists, dicts and tuples). There are plenty of resources on the web for Python 3; for an intro to Conda, go here: docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html
At a minimum, you should have Conda (Miniconda is fine) and Snakemake installed. See here for how to install these: snakemake.readthedocs.io/en/v3.10.2/getting_started/installation.html
Rao will use bash commands or R scripts in the tutorial, so Linux or macOS is ideal. For the purposes of this tutorial alone, Windows Subsystem for Linux on Windows 10 should work, but may not be reliable.
This tutorial will be followed in a few weeks by “Advanced workflow management with Snakemake”, which will cover advanced topics such as passing parameters to scripts, custom functions, checkpoints, workflow visualisation, reports and more.