Cluster config

Francisco Zorrilla edited this page Mar 25, 2021 · 4 revisions

Overview

The cluster_config.json file is designed to interact with the SLURM workload manager. It contains parameters relevant to job submission, such as the number of cores per job (n), the amount of RAM per job (mem), and your cluster account name (account).

The cluster_config.json file looks something like this:

{
"__default__" : {
        "account" : "your_account_name_on_the_cluster",
        "time" : "0-06:00:00",
        "n" : 48,
        "tasks" : 1,
        "mem" : 180G,
        "name"      : "DL.{rule}",
        "output"    : "logs/{wildcards}.%N.{rule}.out.log",
},
}

Configuration

Most importantly, replace your_account_name_on_the_cluster with your account name on your cluster. The maximum runtime (time), number of cores (n), and memory (mem) are written to this config file when you use the metaGEM.sh script.
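As a rough illustration of how command-line values could map onto the time and mem fields, here is a minimal sketch. It assumes the runtime is given as a whole number of hours and the memory as a whole number of GB. The helper names hours_to_slurm_time and gb_to_slurm_mem are hypothetical and are not taken from metaGEM.sh, which may do this differently.

```shell
#!/bin/sh
# Hypothetical helpers (not part of metaGEM.sh): convert plain numbers
# into the string formats used in cluster_config.json.

# Convert whole hours to SLURM's days-hours:minutes:seconds format.
hours_to_slurm_time() {
    hours=$1
    days=$((hours / 24))
    rem=$((hours % 24))
    printf '%d-%02d:00:00\n' "$days" "$rem"
}

# Convert whole gigabytes to the "<number>G" form used for mem.
gb_to_slurm_mem() {
    printf '%sG\n' "$1"
}

hours_to_slurm_time 2   # value for the "time" field
gb_to_slurm_mem 8       # value for the "mem" field
```

Splitting the hours into days and remaining hours keeps the value valid for runtimes longer than a day.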

For example, if you run the following command:

metaGEM.sh -t qfilter -j 10 -c 2 -m 8 -h 2

Then the metaGEM.sh parser will submit 10 jobs with the cluster_config.json file configured like this:

{
"__default__" : {
        "account" : "satoshi",
        "time" : "0-02:00:00",
        "n" : 2,
        "tasks" : 1,
        "mem" : 8G,
        "name"      : "DL.{rule}",
        "output"    : "logs/{wildcards}.%N.{rule}.out.log",
},
}

With this configuration, each submitted job requests 2 cores and 8 GB of RAM, with a maximum runtime of 2 hours. The Snakemake log is stored in a file called nohup.out, and the logs for the individual jobs are stored in the logs subfolder. Make sure that the logs folder exists before submitting jobs. This can be done automatically by running:

metaGEM.sh -t createFolders

Note that this also creates subfolders for all entries under folder in the config.yaml file, which includes logs. If you just want to make sure that the logs folder exists, simply run:

mkdir -p logs
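Because the output pattern in cluster_config.json ends in .{rule}.out.log, the per-job logs for one rule can be picked out by file name (the %N part is filled in by sbatch with the short hostname of the node). The following is a minimal sketch of such a lookup. The helper name logs_for_rule is hypothetical, and the sketch assumes the default logs folder unless another directory is passed as the second argument.

```shell
#!/bin/sh
# Hypothetical helper (not part of metaGEM): list the job logs that
# belong to a given rule, based on the ".{rule}.out.log" suffix.
logs_for_rule() (
    rule=$1
    log_dir=${2:-logs}
    for f in "$log_dir"/*."$rule".out.log; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
)

# Example: print all logs written by jobs of the qfilter rule.
logs_for_rule qfilter
```

The leading dot in the pattern keeps a rule name from matching longer rule names that merely end in the same characters.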

Manual job submission

If you are developing or testing new rules, you may want to submit jobs without going through the metaGEM.sh parser. In this case, you can run the following command to submit jobs manually:

nohup snakemake all -j 200 -k --cluster-config cluster_config.json --cluster "sbatch -A {cluster.account} -p {cluster.part} --mem {cluster.mem} -t {cluster.time} --ntasks {cluster.tasks} --cpus-per-task {cluster.n} --output {cluster.output}" &

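The above command submits up to 200 jobs at a time to generate the targets listed under rule all in the Snakefile. The -k flag tells Snakemake to keep running independent jobs if one of them fails. Note that the command references {cluster.part} (the SLURM partition), so either add a part entry to cluster_config.json or remove the -p option.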
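To check what an sbatch template expands to before submitting anything, the {cluster.key} placeholders can be filled in by hand. The following is a minimal sketch of that substitution, not Snakemake's own implementation. The helper name render_cluster_template is hypothetical, and the values passed in are taken from the example config above. Values containing |, & or a backslash would need escaping and are not handled here.

```shell
#!/bin/sh
# Hypothetical helper (not part of Snakemake or metaGEM): replace each
# {cluster.key} placeholder in a template with the given key=value pair.
render_cluster_template() (
    template=$1
    shift
    for pair in "$@"; do
        key=${pair%%=*}     # text before the first "="
        value=${pair#*=}    # text after the first "="
        template=$(printf '%s\n' "$template" |
            sed "s|{cluster\.$key}|$value|g")
    done
    printf '%s\n' "$template"
)

# Example: any placeholder without a matching pair is left as-is,
# which makes missing config entries easy to spot.
render_cluster_template \
    'sbatch -A {cluster.account} -p {cluster.part} --mem {cluster.mem} -t {cluster.time}' \
    account=satoshi mem=8G time=0-02:00:00
```

In this example the {cluster.part} placeholder stays unresolved in the printed command, because no part value was supplied.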