Update slurm.conf
Uncommented AccountingStorageEnforce
Added UnkillableStepTimeout to accommodate older water-cooled machines that take longer to start and kill jobs.
Raised KillWait from 30 to 90 for the same reason.
stephandooper authored Aug 9, 2023
1 parent d3271c5 commit 8b4536b
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions roles/slurm/templates/etc/slurm/slurm.conf
@@ -88,8 +88,9 @@ SlurmctldTimeout=120
 SlurmdTimeout=300
 InactiveLimit=0
 MinJobAge=300
-KillWait=30
+KillWait=90
 Waittime=0
+UnkillableStepTimeout=180

 # SCHEDULING
 SchedulerType=sched/backfill
Expand Down Expand Up @@ -123,7 +124,7 @@ AccountingStorageTRES=gres/gpu
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost={{ groups["slurm-master"][0] }}
#AccountingStorageLoc=
#AccountingStorageEnforce=associations,limits,qos
AccountingStorageEnforce=associations,limits,qos
AccountingStorageUser={{ slurm_db_username }}
AccountingStoragePass=/var/run/munge/munge.socket.2
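
Reviewer note: to check that the rendered file carries the values this commit intends, something like the sketch below could be run against it. It is not part of the commit. The helper name `parse_slurm_conf` and the `SAMPLE` text are hypothetical, and the parser is deliberately simplified: it treats each line as a single `Key=Value` pair, strips `#` comments, and lowercases keys, since slurm.conf parameter names are case-insensitive.

```python
def parse_slurm_conf(text):
    """Return a dict of Key=Value settings, skipping comments and blank lines.

    Simplified on purpose: lines that carry several Key=Value pairs
    (for example NodeName=... definitions) are stored under their first key only.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing or full-line comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


# Hypothetical excerpt mirroring the lines touched by this commit.
SAMPLE = """\
KillWait=90
Waittime=0
UnkillableStepTimeout=180
#AccountingStorageLoc=
AccountingStorageEnforce=associations,limits,qos
"""

conf = parse_slurm_conf(SAMPLE)
kill_wait = int(conf["killwait"])
unkillable = int(conf["unkillablesteptimeout"])
enforce = conf["accountingstorageenforce"].split(",")

# Report the three settings the commit message mentions.
print(f"KillWait={kill_wait}s, UnkillableStepTimeout={unkillable}s")
print("Enforced:", ", ".join(enforce))
```

On a live cluster the effective values can also be read with `scontrol show config`, which reports what slurmctld actually loaded rather than what the template rendered.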
