Command-line tools#

Gstat#

Gstat

Summarize the status of the jobs (wrapper around “squeue”) using (some of) the following fields:

Header	Description
“JobID”	Job-id
“User”	Username
“Account”	Account name
“Name”	Job name
“Tstart”	Time at which the job will start / has started
“Tleft”	Maximum duration left
“#node”	Number of nodes claimed
“#CPU”	Number of CPUs claimed
“MEM”	Memory claimed
“ST”	Status
“Partition”	Partition
“Host”	Hostname
“Dependency”	Dependency / dependencies
“WorkDir”	Working directory

Usage:

Gstat [options]
Gstat [options] <JobId>…

Options:

-U: Limit output to the current user.
-u, --user=<NAME>: Limit output to user(s). Option may be repeated. Search by regex.
-j, --jobid=<NAME>: Limit output to job-id(s). Option may be repeated. Search by regex.
--root=<NAME>: Filter jobs whose workdir has this root.
-C, --cwd: Same as "--root ." (use the current working directory as root).
-D, --max-depth=<N>: Filter jobs whose workdir has maximally this depth compared to --root.
--host=<NAME>: Limit output to host(s). Option may be repeated. Search by regex.
-a, --account=<NAME>: Limit output to account(s). Option may be repeated. Search by regex.
-n, --name=<NAME>: Limit output to job-name(s). Option may be repeated. Search by regex.
-w, --workdir=<NAME>: Limit output to working directory/directories. Option may be repeated. Search by regex.
--status=<NAME>: Limit output to status. Option may be repeated. Search by regex.
-p, --partition=<NAME>: Limit output to partition(s). Option may be repeated. Search by regex.
-s, --sort=<NAME>: Sort by field. Option may be repeated. See description for header names.
-r, --reverse: Reverse sort.
-o, --output=<NAME>: Select output columns. Option may be repeated. See description for header names.
-e, --extra=<NAME>: Add columns. Option may be repeated. See description for header names.
--full-name: Show full user names.
-S, --summary: Print only summary.
--no-header: Suppress header.
--no-truncate: Print full columns, do not truncate based on terminal width.
--width=<N>: Set line-width (otherwise taken as terminal width).
--colors=<NAME>: Select color scheme from: “none”, “dark”. [default: “dark”]
-l, --list: Print selected column as list.
-J, --joblist: Print selected job-id(s) as list. Short for Gstat -o jobid -l.
--abspath: Print directories as absolute directories (default: automatic, based on distance).
--relpath: Print directories as relative directories (default: automatic, based on distance).
--sep=<NAME>: Set column separator. [default: " "]
--long: Print full information (each column is printed as a line).
--debug=<FILE>: Debug: read squeue -o "%all" from file.
-d, --print-dependency: Print the selected jobs as -d <jobid> -d <jobid> .... Useful, for example, in Gsub *slurm `Gstat -d -U --partition "serial"`.
-h, --help: Show help.
--version: Show version.
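
Several of the options above take a regex and may be repeated. The following is a minimal sketch of one plausible reading of that behavior: a row is kept when its field matches at least one of the supplied patterns. The row layout, the helper name `filter_rows`, and the use of `re.match` (anchored at the start of the field) are assumptions made for illustration; they are not taken from the Gstat source.

```python
import re


def filter_rows(rows, field, patterns):
    """Keep rows whose `field` matches at least one regex in `patterns`."""
    if not patterns:
        return list(rows)  # no filter given: keep everything
    compiled = [re.compile(p) for p in patterns]
    return [row for row in rows if any(c.match(row[field]) for c in compiled)]


# Hypothetical job table, using header names from the description above.
rows = [
    {"JobID": "101", "User": "alice", "Partition": "serial"},
    {"JobID": "102", "User": "bob", "Partition": "debug"},
    {"JobID": "103", "User": "alina", "Partition": "serial"},
]

# Comparable in spirit to repeating a filter option with two patterns.
selected = filter_rows(rows, "User", ["ali.*", "carol"])
job_ids = [row["JobID"] for row in selected]
```

Here `job_ids` contains the job-ids of the users whose name starts with "ali"; a pattern that matches nothing (such as "carol") simply contributes no rows.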

(c - MIT) T.W.J. de Geus | tom@geus.me | www.geus.me | github.com/tdegeus/GooseSLURM

Ginfo#

Ginfo

Summarize the status of the compute nodes (wrapper around “sinfo”).

The following scores are computed for each node:

CPU% : the CPU load of the node relative to the number of CPUs in use (cpu_load / cpu_used). Should always be ~1; anything else usually signals misuse.
Mem% : the memory used per CPU in use relative to the average memory available per CPU ((mem_used / cpu_used) / (mem_tot / cpu_tot)). Can be > 1 for (several) memory-heavy jobs, but in principle any value is possible.
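
As a worked example of the two formulas, the sketch below evaluates them for a single hypothetical node. The function name `node_scores` and the handling of an idle node (no CPU in use) are assumptions; how Ginfo itself treats that case is not stated here.

```python
def node_scores(cpu_load, cpu_used, cpu_tot, mem_used, mem_tot):
    """Return (CPU%, Mem%) as fractions, or (None, None) if no CPU is in use."""
    if cpu_used == 0:
        return None, None  # assumed convention for an idle node
    cpu_score = cpu_load / cpu_used
    mem_score = (mem_used / cpu_used) / (mem_tot / cpu_tot)
    return cpu_score, mem_score


# Hypothetical node: 16 CPUs and 64 GB in total, 8 CPUs and 48 GB in use.
cpu_score, mem_score = node_scores(
    cpu_load=8.0, cpu_used=8, cpu_tot=16, mem_used=48.0, mem_tot=64.0
)
```

For these numbers the CPU score is 8 / 8 = 1, and the memory score is (48 / 8) / (64 / 16) = 6 / 4 = 1.5: the jobs on this node use 1.5 times the memory that an even split per CPU would give them.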

Usage:

Ginfo [options]

Options:

-U: Limit output to the current user.
-u, --user=<NAME>: Limit output to user(s). Option may be repeated. Search by regex.
-j, --jobid=<NAME>: Limit output to job-id(s). Option may be repeated. Search by regex.
--host=<NAME>: Limit output to host(s). Option may be repeated. Search by regex.
-f, --cfree=<NAME>: Limit output to free CPU(s). Option may be repeated. Search by regex.
-p, --partition=<NAME>: Limit output to partition(s). Option may be repeated. Search by regex.
-s, --sort=<NAME>: Sort by field. Option may be repeated. Use header names.
-r, --reverse: Reverse sort.
-o, --output=<NAME>: Select output columns. Option may be repeated. Use header names.
-S, --summary: Print only summary.
--no-header: Suppress header.
--no-truncate: Print full columns, do not truncate based on terminal width.
--width=<N>: Set line-width (otherwise taken as terminal width).
--colors=<NAME>: Select color scheme from: “none”, “dark”. [default: “dark”]
-l, --list: Print selected column as list.
--sep=<NAME>: Set column separator. [default: " "]
--long: Print full information (each column is printed as a line).

--debug=<FILE> <FILE>: Debug: read sinfo -o "%all" and squeue -o "%all" from file.

-h, --help: Show help.
--version: Show version.

Gps#

Gps

List memory usage per process.

Header	Description
“PID”	Process-id
“USER”	Username
“MEM”	Memory used
“%CPU”	Fraction of CPU capacity used
“TIME”	Duration of the command
“COMMAND”	Command

Tip

A nice use is to kill a command filtered on its name:

Gps --kill -c ".*mycommand.*"

Verify the selected PID(s) before killing them, for example by first running the same command without --kill.
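
The check-then-kill workflow from the tip can be sketched as a pair of command lines built from the same filter. This is only an illustration: the helper `gps_commands` is hypothetical, and nothing is executed here; the commands are printed so they can be inspected and run by hand.

```python
import shlex


def gps_commands(pattern):
    """Return (listing command, kill command) for processes matching `pattern`."""
    list_cmd = ["Gps", "-c", pattern]
    kill_cmd = ["Gps", "--kill", "-c", pattern]
    return list_cmd, kill_cmd


list_cmd, kill_cmd = gps_commands(".*mycommand.*")

print(shlex.join(list_cmd))  # step 1: inspect which processes are selected
print(shlex.join(kill_cmd))  # step 2: run only after checking the listed PIDs
```

Keeping both commands derived from one pattern avoids killing a different selection than the one that was inspected.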

Usage:

Gps [options]

Options:

-U: Limit processes to the current user.
-u, --user=<NAME>: Limit processes to user(s). Option may be repeated. Search by regex.
-p, --pid=<NAME>: Limit processes to process-id. Option may be repeated. Search by regex.
-c, --command=<NAME>: Limit processes to command. Option may be repeated. Search by regex.
--include-me: Include the current process.
-s, --sort=<NAME>: Sort by field (selected by the header name).
-r, --reverse: Reverse sort.
-o, --output=<NAME>: Select output columns. Option may be repeated. See description for header names.
-9: Output the list of PIDs separated by -9, so that you can kill them all at once with kill -9 $(Gps -9 ...).
--kill: Kill selected processes.
--no-header: Suppress header.
--no-truncate: Print full columns, do not truncate based on terminal width.
--width=<N>: Set line-width (otherwise taken as terminal width).
--colors=<NAME>: Select color scheme from: “none”, “dark”. [default: “dark”]
-l, --list: Print selected column as list.
--sep=<NAME>: Set column separator. [default: " "]
--long: Print full information (each column is printed as a line).
--debug=<FILE>: Debug: read squeue -o "%all" from file.
-h, --help: Show help.
--version: Show version.

Gsub#

Gsub

Submit job-scripts and add the “--chdir” option to run each script from the directory in which the sbatch-file is stored. See https://slurm.schedmd.com/sbatch.html
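
To illustrate what adding “--chdir” amounts to, here is a minimal sketch that builds an sbatch command line for a job-script. The helper name `sbatch_command` and the example path are hypothetical, and the way Gsub assembles its own command may differ; the sketch only builds the argument list and does not submit anything.

```python
from pathlib import Path


def sbatch_command(script, extra_options=()):
    """Build an sbatch call that runs `script` from the directory it is stored in."""
    path = Path(script).resolve()
    # --chdir sets the working directory of the batch script before it runs.
    return ["sbatch", f"--chdir={path.parent}", *extra_options, str(path)]


# Hypothetical job-script in a sub-directory of the current directory.
cmd = sbatch_command("sim/run_01/job.slurm", extra_options=["--mem=4G"])
```

With this layout, relative paths used inside `job.slurm` are interpreted relative to `sim/run_01`, regardless of where the submission was started.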

Usage:

Gsub [options] <files>…

Arguments:

Job-scripts.

Options:

--dry-run: Print commands to screen, without executing.
--verbose: Print all commands and their output.
-Q, --quiet: Do not show the progress bar.
--log=FILENAME: Log the JobIDs to a YAML-file (updated after each submit). Existing log files are appended to.
--delay=FLOAT: Seconds to wait between submitting jobs. [default: 0.1]
-r, --repeat=INT: Submit using dependencies such that the job will be repeated ‘n’ times. [default: 1]
--serial: Submit using dependencies such that jobs are run one after another.
-A, --account=ARG (sbatch option): Account name.
-b, --begin=ARG (sbatch option): Defer the allocation until a later time. E.g. --begin=now+1hour.
--comment=ARG (sbatch option): Arbitrary comment.
-c, --constraint=ARG (sbatch option): Select nodes by feature. Nodes can have features assigned to them by the Slurm administrator.
-d, --dependency=ARG (sbatch option): Defer the start of this job until the specified dependencies have been satisfied.
-X, --exclude=ARG (sbatch option): Exclude nodes.
--export=ARG (sbatch option): Export environment variables.
--mem=ARG (sbatch option): Memory allocation.
-w, --wait (sbatch option): Do not exit until the submitted job terminates.

-h, --help: Show help.
--version: Show version.
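
The idea behind --serial (and, similarly, --repeat) is to chain submissions through dependencies. The sketch below shows one way such a chain could be built. The dependency type `afterany`, the helper `submit_serial`, and the stand-in `fake_submit` are all assumptions for illustration; which dependency type Gsub actually uses is not stated here.

```python
def submit_serial(scripts, submit):
    """Submit `scripts` in order, making each job wait for the previous one.

    `submit(script, options)` performs the submission and returns the job-id.
    """
    job_ids = []
    for script in scripts:
        options = []
        if job_ids:
            # Assumed dependency type; Gsub may use a different one.
            options.append(f"--dependency=afterany:{job_ids[-1]}")
        job_ids.append(submit(script, options))
    return job_ids


# Stand-in for a real submission: records each call and hands out fake job-ids.
calls = []


def fake_submit(script, options):
    calls.append((script, options))
    return 1000 + len(calls)


ids = submit_serial(["a.slurm", "b.slurm", "c.slurm"], fake_submit)
```

The first job is submitted without a dependency; every later job carries a dependency on the job-id returned for its predecessor, so the scheduler starts them one after another.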

Gdel#

Gdel

Stop running jobs.

Usage:

Gdel [options]
Gdel [options] <JobId>…

Arguments:

ID-number(s) of the job(s) to delete. (default: all user’s jobs)

Options:

…: Any option for Gstat.
-h, --help: Show help.
--version: Show version.

Gacct#

Display available information from sacct -j jobid... or sacct [OPTION].

The time fields accept relative values (e.g. -S="-1h" for jobs that started at most one hour ago) in addition to the default format of sacct.
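
To make the relative time notation concrete, the sketch below converts an offset such as "-1h" into an absolute timestamp in the YYYY-MM-DDTHH:MM:SS form that sacct accepts. The helper `resolve_time` and the supported units (s, m, h, d) are assumptions; the parsing that Gacct actually performs may accept more or different forms.

```python
import re
from datetime import datetime, timedelta

# Assumed unit letters, mapped to timedelta keyword names.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def resolve_time(value, now=None):
    """Turn a relative offset like "-1h" into a sacct-style timestamp."""
    match = re.fullmatch(r"-(\d+)([smhd])", value)
    if match is None:
        return value  # not a relative offset: pass through unchanged
    now = now or datetime.now()
    delta = timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})
    return (now - delta).strftime("%Y-%m-%dT%H:%M:%S")


# Fixed reference time so the result is reproducible.
start = resolve_time("-1h", now=datetime(2024, 1, 1, 12, 0, 0))
```

With the fixed reference time above, `start` is "2024-01-01T11:00:00"; a value already in sacct's own format is returned untouched.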

For the state, use: running / r, completed / cd, failed / f, timeout / to, resizing / rs, deadline / dl, node_fail / nf.
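
The state names and their abbreviations listed above can be captured in a small lookup table. This is an illustrative sketch only: the helper `expand_state` and the case-insensitive matching are assumptions, not a description of how Gacct parses the option.

```python
# Abbreviation -> full state name, as listed above.
STATE_ABBREVIATIONS = {
    "r": "running",
    "cd": "completed",
    "f": "failed",
    "to": "timeout",
    "rs": "resizing",
    "dl": "deadline",
    "nf": "node_fail",
}


def expand_state(state):
    """Return the full state name for an abbreviation or a full name."""
    key = state.lower()  # assumed: matching ignores case
    if key in STATE_ABBREVIATIONS:
        return STATE_ABBREVIATIONS[key]
    if key in STATE_ABBREVIATIONS.values():
        return key
    raise ValueError(f"unknown state: {state!r}")


states = [expand_state(s) for s in ["cd", "to", "Running"]]
```

Accepting both forms means that "cd" and "completed" select the same jobs, while an unrecognized value fails loudly instead of silently selecting nothing.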

The output can be returned in JSON format (--json).

Extra columns can be added (--extra), see sacct --helpformat. Commonly used are --extra="WorkDir".

The following column abbreviations are used (depending on the output):

- State: ST
- ExitCode: exit

Tip

Use seff jobid to get more information, such as memory usage.

usage: Gacct [-h] [-X] [-j] [--sep SEP] [--no-truncate] [--sort SORT]
             [--reverse] [--no-header] [--width WIDTH] [-o OUTPUT]
             [--infer INFER] [-e EXTRA] [--abspath] [--relpath] [--root ROOT]
             [-C] [-L] [-a] [-T] [-S STARTTIME] [-E ENDTIME] [-i NNODES]
             [-I NCPUS] [-k TIMELIMIT_MIN] [-K TIMELIMIT_MAX] [-r PARTITION]
             [-s STATE] [-N NODELIST] [-M CLUSTERS] [-A ACCOUNT] [-u USER]
             [--uid UID] [-U] [-g GROUP] [--gid GID] [--name NAME] [-q QOS]
             [-v]
             [jobid ...]

Positional Arguments#

jobid: JobID(s) to read.

options#

-X, --allocations

Include only main job.

Default: False

-j, --json

Print in JSON format.

Default: False

--sep

Column separator.

Default: " "

--no-truncate

Print without fitting screen.

Default: False

--sort

Sort based on column.

Default: []

--reverse

Reverse order.

Default: False

--no-header

Do not print header.

Default: False

--width

Print width (default: read from terminal).

-o, --output

Output columns.

--infer

Read extra data from JOBID.infer.

-e, --extra

Extra columns.

Default: []

--abspath

Print directories as absolute (default: automatic).

Default: False

--relpath

Print directories as relative (default: automatic).

Default: False

--root

Filter jobs whose WorkDir has this root (WorkDir printed relative to this root).

-C, --cwd

Same as "--root ." (use the current working directory as root).

Default: False

-L, --allclusters

Display jobs run on all clusters.

Default: False

-a, --allusers

All users (default: only current user).

Default: False

-T, --truncate

Truncate time.

Default: False

-S, --starttime

Job started after time.

-E, --endtime

Job ended before time.

-i, --nnodes

Jobs which ran on this many nodes.

-I, --ncpus

Jobs which ran on this many CPUs.

-k, --timelimit-min

Only send data about jobs with this timelimit.

-K, --timelimit-max

Only send data about jobs with this timelimit.

-r, --partition

(Comma separated list of) partitions.

Default: []

-s, --state

Select state(s).

Default: []

-N, --nodelist

Select nodelist(s).

Default: []

-M, --clusters

Select cluster(s).

Default: []

-A, --account

Select account(s).

Default: []

-u, --user

Select username(s).

Default: []

--uid

Select user-id(s).

Default: []

-U

Select current user.

Default: False

-g, --group

Select group(s).

Default: []

--gid

Select group-id(s).

Default: []

--name

Select job-name(s).

Default: []

-q, --qos

Select qos(s).

Default: []

-v, --version

show program’s version number and exit