Command-line tools

Gstat

Summarize the status of the jobs (wrapper around “squeue”) using (some of) the following fields:

Header          Description
“JobID”         Job-id
“User”          Username
“Account”       Account name
“Name”          Job name
“Tstart”        Time at which the job will start / has started
“Tleft”         Maximum duration left
“#node”         Number of nodes claimed
“#CPU”          Number of CPUs claimed
“MEM”           Memory claimed
“ST”            Status
“Partition”     Partition
“Host”          Hostname
“Dependency”    Dependency / dependencies
“WorkDir”       Working directory

Usage:

Gstat [options]
Gstat [options] <JobId>…

Options:
-U

Limit output to the current user.

-u, --user=<NAME>

Limit output to user(s). Option may be repeated. Search by regex.

-j, --jobid=<NAME>

Limit output to job-id(s). Option may be repeated. Search by regex.

--root=<NAME>

Filter jobs whose workdir has this root.

-C, --cwd

Same as --root . (the current working directory).

-D, --max-depth=<N>

Filter jobs whose workdir has maximally this depth compared to --root.

--host=<NAME>

Limit output to host(s). Option may be repeated. Search by regex.

-a, --account=<NAME>

Limit output to account(s). Option may be repeated. Search by regex.

-n, --name=<NAME>

Limit output to job-name(s). Option may be repeated. Search by regex.

-w, --workdir=<NAME>

Limit output to working directory(ies). Option may be repeated. Search by regex.

--status=<NAME>

Limit output to status. Option may be repeated. Search by regex.

-p, --partition=<NAME>

Limit output to partition(s). Option may be repeated. Search by regex.

-s, --sort=<NAME>

Sort by field. Option may be repeated. See description for header names.

-r, --reverse

Reverse sort.

-o, --output=<NAME>

Select output columns. Option may be repeated. See description for header names.

-e, --extra=<NAME>

Add columns. Option may be repeated. See description for header names.

--full-name

Show full user names.

-S, --summary

Print only summary.

--no-header

Suppress header.

--no-truncate

Print full columns, do not truncate based on terminal width.

--width=<N>

Set line-width (otherwise taken as terminal width).

--colors=<NAME>

Select color scheme from: “none”, “dark”. [default: “dark”]

-l, --list

Print selected column as list.

-J, --joblist

Print selected job-id(s) as list. Short for Gstat -o jobid -l.

--abspath

Print directories as absolute directories (default: automatic, based on distance).

--relpath

Print directories as relative directories (default: automatic, based on distance).

--sep=<NAME>

Set column separator. [default: " "]

--long

Print full information (each column is printed as a line).

--debug=<FILE>

Debug: read squeue -o "%all" from file.

-d, --print-dependency

Print the selected jobs as -d <jobid> -d <jobid> .... Useful for passing dependencies to Gsub, for example: Gsub *slurm `Gstat -d -U --partition "serial"`.

-h, --help

Show help.

--version

Show version.
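
Several of the filters above (--user, --name, --partition, ...) are described as repeatable and regex-based. The Python sketch below shows one plausible reading of that behaviour: a row is kept when its field matches any of the supplied patterns, and different filters are combined with a logical "and". The function name filter_rows, the sample rows, the use of re.match, and the any-of / all-of semantics are assumptions made for illustration; they are not taken from the GooseSLURM source.

```python
import re


def filter_rows(rows, field, patterns):
    """Keep rows whose `field` matches at least one regex in `patterns`.

    Assumption: a pattern has to match from the start of the value
    (re.match); the real tool may anchor its search differently.
    """
    compiled = [re.compile(p) for p in patterns]
    return [row for row in rows if any(c.match(str(row[field])) for c in compiled)]


# Hypothetical rows, shaped like a few of the Gstat columns.
jobs = [
    {"JobID": "101", "User": "alice", "Partition": "serial"},
    {"JobID": "102", "User": "bob", "Partition": "debug"},
    {"JobID": "203", "User": "alina", "Partition": "serial"},
]

# Roughly the selection described for: Gstat -u "ali.*" -p "serial"
selected = filter_rows(filter_rows(jobs, "User", ["ali.*"]), "Partition", ["serial"])
print([job["JobID"] for job in selected])
```

Repeating an option corresponds to passing more than one pattern for the same field, e.g. filter_rows(jobs, "User", ["bob", "alice"]).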

(c - MIT) T.W.J. de Geus | tom@geus.me | www.geus.me | github.com/tdegeus/GooseSLURM

Ginfo

Summarize the status of the compute nodes (wrapper around “sinfo”).

The following scores are computed for each node:

  • CPU% : the CPU load of the node relative to the number of jobs (cpu_load / cpu_used). Should always be ~1; anything else usually signals misuse.

  • Mem% : the amount of used memory relative to the average memory available per job ((mem_used / cpu_used) / (mem_tot / cpu_tot)). Can be > 1 for (several) memory-heavy jobs, but in principle any value is possible.
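
The two scores above are plain ratios, which the following Python sketch evaluates directly. The function name node_scores, the sample numbers, and the handling of an idle node (returning None) are assumptions for illustration only; how Ginfo treats that case is not stated here.

```python
def node_scores(cpu_load, cpu_used, cpu_tot, mem_used, mem_tot):
    """Return (CPU%, Mem%) as ratios, following the two formulas above.

    Assumption: a node with no CPUs in use gets None for both scores,
    because both ratios divide by cpu_used.
    """
    if cpu_used == 0:
        return None, None
    cpu_score = cpu_load / cpu_used
    mem_score = (mem_used / cpu_used) / (mem_tot / cpu_tot)
    return cpu_score, mem_score


# Hypothetical node: 28 CPUs and 128 GB in total;
# 14 CPUs in use with a load of 14.0, and 96 GB of memory in use.
cpu_score, mem_score = node_scores(
    cpu_load=14.0, cpu_used=14, cpu_tot=28, mem_used=96, mem_tot=128
)
print(cpu_score, mem_score)
```

In this example the CPU score is 1 (each claimed CPU is busy) and the memory score is about 1.5: the jobs use half of the CPUs but three quarters of the memory.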

Usage:

Ginfo [options]

Options:
-U

Limit output to the current user.

-u, --user=<NAME>

Limit output to user(s). Option may be repeated. Search by regex.

-j, --jobid=<NAME>

Limit output to job-id(s). Option may be repeated. Search by regex.

--host=<NAME>

Limit output to host(s). Option may be repeated. Search by regex.

-f, --cfree=<NAME>

Limit output to free CPU(s). Option may be repeated. Search by regex.

-p, --partition=<NAME>

Limit output to partition(s). Option may be repeated. Search by regex.

-s, --sort=<NAME>

Sort by field. Option may be repeated. Use header names.

-r, --reverse

Reverse sort.

-o, --output=<NAME>

Select output columns. Option may be repeated. Use header names.

-S, --summary

Print only summary.

--no-header

Suppress header.

--no-truncate

Print full columns, do not truncate based on terminal width.

--width=<N>

Set line-width (otherwise taken as terminal width).

--colors=<NAME>

Select color scheme from: “none”, “dark”. [default: “dark”]

-l, --list

Print selected column as list.

--sep=<NAME>

Set column separator. [default: " "]

--long

Print full information (each column is printed as a line).

--debug=<FILE> <FILE>

Debug: read sinfo -o "%all" and squeue -o "%all" from file.

-h, --help

Show help.

--version

Show version.

Gps

List memory usage per process.

Header       Description
“PID”        Process-id
“USER”       Username
“MEM”        Memory used
“%CPU”       Fraction of CPU capacity used
“TIME”       Duration of the command
“COMMAND”    Command

Tip

A convenient use is to kill processes filtered by command name:

Gps --kill -c ".*mycommand.*"

Verify the selected PID(s) first, for example by running the same filter without --kill.

Usage:

Gps [options]

Options:
-U

Limit processes to the current user.

-u, --user=<NAME>

Limit processes to user(s). Option may be repeated. Search by regex.

-p, --pid=<NAME>

Limit processes to process-id. Option may be repeated. Search by regex.

-c, --command=<NAME>

Limit processes to command. Option may be repeated. Search by regex.

--include-me

Include the current process.

-s, --sort=<NAME>

Sort by field (selected by the header name).

-r, --reverse

Reverse sort.

-o, --output=<NAME>

Select output columns. Option may be repeated. See description for header names.

-9

Output the list of PIDs separated by -9, so that you can kill them all at once with kill -9 $(Gps -9 ...).

--kill

Kill selected processes.

--no-header

Suppress header.

--no-truncate

Print full columns, do not truncate based on terminal width.

--width=<N>

Set line-width (otherwise taken as terminal width).

--colors=<NAME>

Select color scheme from: “none”, “dark”. [default: “dark”]

-l, --list

Print selected column as list.

--sep=<NAME>

Set column separator. [default: " "]

--long

Print full information (each column is printed as a line).

--debug=<FILE>

Debug: read squeue -o "%all" from file.

-h, --help

Show help.

--version

Show version.

Gsub

Submit job-scripts and add the "--chdir" option to run the scripts from the directory in which the sbatch-file is stored. See https://slurm.schedmd.com/sbatch.html

Usage:

Gsub [options] <files>…

Arguments:

Job-scripts.

Options:
--dry-run

Print commands to screen, without executing.

--verbose

Print all commands and their output.

-Q, --quiet

Do not show progress-bar.

--log=FILENAME

Log the JobIDs to a YAML-file (updated after each submit). An existing log file is appended to.

--delay=FLOAT

Seconds to wait between submitting jobs. [default: 0.1]

-r, --repeat=INT

Submit using dependencies such that the job will be repeated ‘n’ times. [default: 1]

--serial

Submit using dependencies such that jobs are run after each other.

-A, --account=ARG (sbatch option)

Account name.

-b, --begin=ARG (sbatch option)

Defer allocation until a later time, e.g. --begin=now+1hour.

--comment=ARG (sbatch option)

Arbitrary comment.

-c, --constraint=ARG (sbatch option)

Constrain the job to nodes with the given features (features are assigned to nodes by the Slurm administrator).

-d, --dependency=ARG (sbatch option)

Defer the start of this job until the specified dependencies have been satisfied.

-X, --exclude=ARG (sbatch option)

Exclude nodes.

--export=ARG (sbatch option)

Export environment variables.

--mem=ARG (sbatch option)

Memory allocation.

-w, --wait (sbatch option)

Do not exit until the submitted job terminates.

-h, --help

Show help.

--version

Show version.
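
The --chdir behaviour and the --serial option described above can be pictured as a chain of sbatch calls in which each job depends on the previous one. The Python sketch below only builds the commands and takes the actual submission as a callable, so it can be tried without a cluster. The helper names sbatch_command, submit_serial and fake_submit, the script paths, and the choice of the "afterany" dependency type are assumptions for illustration; the dependency type that Gsub itself uses is not stated here.

```python
import os


def sbatch_command(script, dependency=None):
    """Build an sbatch command that runs `script` from its own directory."""
    script = os.path.abspath(script)
    cmd = ["sbatch", "--parsable", f"--chdir={os.path.dirname(script)}"]
    if dependency is not None:
        # Assumption: "afterany" (start once the previous job has ended,
        # whatever its exit status); the real tool may use another type.
        cmd.append(f"--dependency=afterany:{dependency}")
    cmd.append(script)
    return cmd


def submit_serial(scripts, submit):
    """Submit `scripts` so that each one waits for the previous one.

    `submit` takes a command (list of str) and returns the new job-id
    as a string.
    """
    job_ids = []
    previous = None
    for script in scripts:
        previous = submit(sbatch_command(script, dependency=previous))
        job_ids.append(previous)
    return job_ids


# Dry run with a fake submitter that hands out increasing job-ids.
issued = []


def fake_submit(cmd):
    issued.append(cmd)
    return str(1000 + len(issued))


job_ids = submit_serial(["run/a.slurm", "run/b.slurm"], fake_submit)
print(job_ids)
for cmd in issued:
    print(" ".join(cmd))
```

On a real cluster the fake submitter could be replaced by a thin wrapper around subprocess.check_output; with --parsable, sbatch prints the job-id, optionally followed by ";" and the cluster name.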

Gdel

Stop running jobs.

Usage:

Gdel [options]
Gdel [options] <JobId>…

Arguments:

ID-number(s) of the job(s) to delete. (default: all user’s jobs)

Options:

Any option for Gstat.

-h, --help

Show help.

--version

Show version.

Gacct

Display available information from sacct -j jobid... or sacct [OPTION].

  • Time fields accept rich (relative) values, e.g. -S="-1h" for jobs that started at most one hour ago, in addition to the default sacct format.

  • As state use: running / r, completed / cd, failed / f, timeout / to, resizing / rs, deadline / dl, node_fail / nf.

  • The output can be returned in JSON format (--json).

  • Extra columns can be added (--extra), see sacct --helpformat. A commonly used one is --extra="WorkDir".

  • The following column abbreviations are used (depending on the output):

      - State: ST
      - ExitCode: exit
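
As an illustration of the relative time values and state abbreviations listed above, the Python sketch below converts a value such as "-1h" into an absolute timestamp and expands a state abbreviation. The state table is copied from the list above. The function names, the set of accepted units (d, h, m, s), and the output format are assumptions; the syntax that Gacct actually accepts may be richer.

```python
import re
from datetime import datetime, timedelta

# State abbreviations as listed above.
STATES = {
    "r": "running",
    "cd": "completed",
    "f": "failed",
    "to": "timeout",
    "rs": "resizing",
    "dl": "deadline",
    "nf": "node_fail",
}

# Assumption: single-letter units only; the real tool may accept more.
UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def expand_state(state):
    """Map an abbreviation such as "cd" to its full name; pass other values through."""
    return STATES.get(state.lower(), state.lower())


def parse_time(text, now=None):
    """Turn a relative value such as "-1h" into an absolute timestamp string.

    Anything that does not look relative is returned unchanged, so values
    in the default sacct format keep working.
    """
    match = re.fullmatch(r"-(\d+)([dhms])", text)
    if match is None:
        return text
    now = now or datetime.now()
    delta = timedelta(**{UNITS[match.group(2)]: int(match.group(1))})
    return (now - delta).strftime("%Y-%m-%dT%H:%M:%S")


# A fixed reference time keeps the example reproducible.
reference = datetime(2024, 1, 15, 12, 0, 0)
print(parse_time("-1h", now=reference))
print(parse_time("2024-01-01", now=reference))
print(expand_state("to"))
```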

Tip

Use seff jobid to get more information, such as memory usage.

usage: Gacct [-h] [-X] [-j] [--sep SEP] [--no-truncate] [--sort SORT]
             [--reverse] [--no-header] [--width WIDTH] [-o OUTPUT]
             [--infer INFER] [-e EXTRA] [--abspath] [--relpath] [--root ROOT]
             [-C] [-L] [-a] [-T] [-S STARTTIME] [-E ENDTIME] [-i NNODES]
             [-I NCPUS] [-k TIMELIMIT_MIN] [-K TIMELIMIT_MAX] [-r PARTITION]
             [-s STATE] [-N NODELIST] [-M CLUSTERS] [-A ACCOUNT] [-u USER]
             [--uid UID] [-U] [-g GROUP] [--gid GID] [--name NAME] [-q QOS]
             [-v]
             [jobid ...]

Positional Arguments

jobid

JobID(s) to read.

Options

-X, --allocations

Include only main job.

Default: False

-j, --json

Print in JSON format.

Default: False

--sep

Column separator.

Default: " "

--no-truncate

Print without fitting screen.

Default: False

--sort

Sort based on column.

Default: []

--reverse

Reverse order.

Default: False

--no-header

Do not print header.

Default: False

--width

Print width (default: read from terminal).

-o, --output

Output columns.

--infer

Read extra data from JOBID.infer.

-e, --extra

Extra columns.

Default: []

--abspath

Print directories as absolute (default: automatic).

Default: False

--relpath

Print directories as relative (default: automatic).

Default: False

--root

Filter jobs whose WorkDir has this root (WorkDir printed relative to this root).

-C, --cwd

Same as --root . (the current working directory).

Default: False

-L, --allclusters

Display jobs that ran on all clusters.

Default: False

-a, --allusers

All users (default: only current user).

Default: False

-T, --truncate

Truncate time.

Default: False

-S, --starttime

Job started after time.

-E, --endtime

Job ended before time.

-i, --nnodes

Jobs which ran on this many nodes.

-I, --ncpus

Jobs which ran on this many cpus.

-k, --timelimit-min

Only show jobs with this timelimit. When combined with --timelimit-max, this is the lower bound of the timelimit range.

-K, --timelimit-max

Upper bound of the timelimit range; used together with --timelimit-min.

-r, --partition

(Comma separated list of) partitions.

Default: []

-s, --state

Select state(s).

Default: []

-N, --nodelist

Select nodelist(s).

Default: []

-M, --clusters

Select cluster(s).

Default: []

-A, --account

Select account(s).

Default: []

-u, --user

Select username(s).

Default: []

--uid

Select user-id(s).

Default: []

-U

Select current user.

Default: False

-g, --group

Select group(s).

Default: []

--gid

Select group-id(s).

Default: []

--name

Select job-name(s).

Default: []

-q, --qos

Select qos(s).

Default: []

-v, --version

Show program’s version number and exit.