Linux#
Most clusters run on a Linux operating system (OS). Other examples of operating systems are Windows and macOS. Linux operating systems are also known as “distributions” because they are essentially collections of software packages that are “distributed” to the users. Popular Linux distributions are Ubuntu, Fedora, Red Hat, CentOS and openSUSE, but there are many more (see for example DistroWatch for an overview of all distributions).
The (mis)use of the name#
Within this software collection, almost always a package called Linux is found, which is the “kernel”. This particular piece of software is very important, because it handles the connection between the hardware and all the other pieces of software. However, when people speak about Linux, they typically mean a distribution containing Linux. It is like calling an airplane by the name of its engine, which is a bit awkward, but this is just how it is.
Most users never interact with the “kernel” directly; they experience the pieces of software that provide the user interface (UI). UIs come in two flavours: the Graphical User Interface (GUI) and the Command Line Interface (CLI). When you install a Linux distribution on your own computer, it typically comes with a GUI or desktop environment, e.g. GNOME or KDE. Clusters typically only offer a CLI, which is basically a terminal (“window”) that presents a prompt at which you can type a command.
Linux file structure#
Directories (folders) are delimited with a / (instead of a \ in Windows).
The topmost directory (or root) is called /.
Hard drives, other media, or even remote file systems can be mounted anywhere in the file tree. For example, a USB drive is commonly mounted at /media/mystick. This is in contrast to Windows, where each drive has its own name in the file tree (e.g. C:\).
Almost all characters can be used in directory and file names (only / and the null character are not allowed), but it is best not to use exotic characters (e.g. *, ", ').
File (and directory) names starting with a . are hidden, and are not visible by default.
Files (and directories) have owners and permissions, preventing misuse or accidental removal.
Each user has his or her own home folder, which is typically located at /home/myusername.
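The conventions above can be tried out in a throwaway directory. The following is a minimal sketch, assuming a Bash shell with the standard mktemp and ls utilities; the file names visible.txt and .hidden.txt are made up for illustration.

```shell
# Work in a temporary directory so that nothing in the home folder is touched.
demo=$(mktemp -d)
cd "$demo"

# Create one regular and one hidden file (hypothetical names).
touch visible.txt .hidden.txt

# A plain listing skips names that start with a dot ...
plain=$(ls)
# ... while -A also lists hidden entries (except . and ..).
all=$(ls -A)

echo "ls    : $plain"
echo "ls -A : $all"

# The home folder is available as $HOME (and as ~).
echo "home  : $HOME"
```

The hidden file only shows up in the second listing, which is why configuration files in a home folder usually go unnoticed.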
The BASH-shell#
To interact with the Linux operating system a Shell is used. In this command line environment, commands given by the user are interpreted by the Shell. Several Shells exist, each with its own syntax and built-in commands. One of the most popular is the Bash-shell.
Before introducing several features of the Bash shell it is useful to discuss the basic controls. In principle the only form of control is through the keyboard. The exception is copying and pasting (parts of) commands, which is done using the mouse, because Ctrl+c, Ctrl+v, etc. have a different meaning in the terminal (see below). Specifically, highlighted text is automatically copied to the clipboard, and it is pasted using the middle mouse button. Alternatively, the copy and paste commands can be reached through the right mouse button. Several other basic controls are listed below.
Each command follows a prompt that is displayed by the terminal. For example:
[username@hostname ~]$
A command is followed by return to execute the command.
By pressing TAB, Bash will try to auto-complete your typed command; pressing TAB twice will print auto-complete suggestions.
To stop a command Ctrl+c is used. It is advised not to use Ctrl+z or Ctrl+s, which, respectively, suspend the command (put it to sleep) or freeze the terminal output.
To move the cursor (by definition only possible inside the current command) the keys \(\leftarrow\), \(\rightarrow\), Home, and End can be used.
To move through the history the keys \(\uparrow\) and \(\downarrow\) are used. Alternatively, the history can be searched using Ctrl+r followed by keywords. To progress through the selection, press Ctrl+r again. Note that when pressing return the selected command is directly executed. Use \(\rightarrow\) to edit the selected command instead.
To log out Ctrl+d is used; this is equivalent to typing exit.
Commands#
Overview#
In general a command consists of three parts: the command, options, and input arguments. Without going into detail, we consider an example. The command
[username@hostname ~]$ tar -czp -f outputname.tar.gz foldername
creates a compressed archive. This command can be divided as follows
prompt $ command <options> arguments
# prompt: [username@hostname ~]$
# command: tar
# options: -czp -f outputname.tar.gz
# argument: foldername
From this, we observe that different parts of the command are separated by spaces. Also, we observe that options begin with a “-”. Furthermore some options require an argument. As is observed for the -f option, the argument directly follows the option. Finally, it is remarked that options are commonly combined. In the command above the options -c, -z and -p are grouped to -czp.
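The anatomy described above can be checked on a small example. This is only a sketch, assuming a tar implementation with gzip support (such as GNU tar); the file data.txt and the second archive name separate.tar.gz are hypothetical, and the -t option (list an archive) is not used elsewhere in this text.

```shell
# Build a small folder to archive in a temporary location.
demo=$(mktemp -d)
cd "$demo"
mkdir foldername
echo "some data" > foldername/data.txt

# command: tar | options: -czp and -f <archive> | argument: foldername
tar -czp -f outputname.tar.gz foldername

# The grouped options can also be given one by one.
tar -c -z -p -f separate.tar.gz foldername

# -t lists the contents of an archive instead of creating one.
listing=$(tar -tz -f outputname.tar.gz)
echo "$listing"
```

Both invocations produce an archive with the same contents; grouping options is purely a matter of convenience.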
Most commands have a manual page. This page is found using
[username@hostname ~]$ man commandname
This opens a simple text viewer. Using \(\downarrow\) / \(\uparrow\), PageUp / PageDown, and the scroll wheel of the mouse one can scroll through the manual page. To search the manual use / followed by your query, and n to progress through the search results. To close the viewer type q. The manual viewer accepts the same commands as the less viewer.
Alternatively (or sometimes exclusively), a (short) manual page can often be printed to the screen. This is provided by the command itself, i.e.
[username@hostname ~]$ commandname -h
[username@hostname ~]$ commandname --help
Several useful commands are listed below; the most important ones are elaborated in the following sections.
Command | Description |
---|---|
pwd | print the current working directory |
ls | list directory contents |
du | report disk usage of files |
find | search and find files |
cd | change directory |
mkdir | make a directory |
cp | copy files (and directories with the -r option) |
mv | move (rename) files and directories |
rm | remove files (and directories with the -r option) |
cat | concatenate files and print on the standard output |
head | print the first few lines of a file |
tail | print the last few lines of a file |
grep | globally search a regular expression and print; use this for simple output filtering |
less | a text-file viewer |
vi | a text-file editor |
top | display Linux tasks |
ps | report a process status list |
which | show the full path of (shell) commands |
chmod | change a file’s permissions |
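A few of the commands from the table can be combined into a short tour. This is a minimal sketch, assuming the usual Linux command line tools are available; the directory results and the file numbers.txt are hypothetical, and seq (which prints a sequence of numbers) is not part of the table.

```shell
demo=$(mktemp -d)
cd "$demo"

# mkdir / pwd: make a directory and show where we are.
mkdir results
pwd

# Write a ten-line file to look at (seq prints the numbers 1 to 10).
seq 1 10 > results/numbers.txt

# head / tail: the first and last lines of a file.
first=$(head -n 2 results/numbers.txt)
last=$(tail -n 1 results/numbers.txt)
echo "first two lines: $first"
echo "last line: $last"

# find: search the tree below the current directory by name.
found=$(find . -name "*.txt")
echo "found: $found"

# du: disk usage of a directory in human-readable units.
du -sh results
```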
cd – Change directory#
The change directory (cd) command can be used to navigate through the file tree by changing the current directory. Let us use an example of a file tree such as displayed above. Typically the terminal will start in the user’s home folder:
[username@hostname ~]$
where the current directory is indicated between brackets: [ ... ]. Notice that [ ~ ] is the abbreviation of [ /home/username ]. We can now change directory by typing
[username@hostname ~]$ cd ~/sim/sub1
[username@hostname sub1]$
where the change of directory is specified in an absolute sense. Alternatively, we can use a relative file path to do the same. In a relative file path definition use
./ to denote the current directory
../ to denote one directory up
../../ to denote two directories up
The previous command could therefore also be specified as follows
[username@hostname ~]$ cd ./sim/sub1
[username@hostname sub1]$
where ./ is not strictly necessary, i.e.
[username@hostname ~]$ cd sim/sub1
[username@hostname sub1]$
is equivalent. If we would now like to change the directory to ~/sim/sub2 we could use a relative path definition:
[username@hostname sub1]$ cd ../sub2
[username@hostname sub2]$
Notice that it is convenient to use relative paths inside code, as they do not depend on where the file tree is located. For example, if ../sub2/ had been included in a code, the code would not be influenced by renaming sim to test. In contrast, if we had used an absolute path, the code would fail. This is particularly important when running the same code or script on different machines (running on different platforms), such as a desktop computer and a cluster.
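The robustness of relative paths can be verified directly. The sketch below recreates the example tree in a temporary directory (the variable base is a stand-in for the home folder, an assumption made so that no real files are touched) and then renames sim to test.

```shell
# Recreate the example tree in a temporary location.
base=$(mktemp -d)
mkdir -p "$base/sim/sub1" "$base/sim/sub2"

# Absolute path: works from anywhere.
cd "$base/sim/sub1"
pwd

# Relative path: one directory up, then into sub2.
cd ../sub2
after_relative=$(basename "$PWD")
echo "now in: $after_relative"

# Rename sim to test; the relative step still works,
# whereas the old absolute path $base/sim/sub2 no longer exists.
cd "$base"
mv sim test
cd test/sub1
cd ../sub2
after_rename=$(basename "$PWD")
echo "after rename: $after_rename"
```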
ls – List directory contents#
The contents (files and directories) of the current directory are listed in “matrix” format using
[username@hostname ~]$ ls
Depending on the shell and the terminal that are used, executable files, files, and folders are highlighted differently. By specifying (optional) input arguments, the contents of directories other than the current directory are listed. For the example above
[username@hostname ~]$ ls ~/sim/sub1
would list one file, output.log.
More detailed file information can be obtained using the -lh option. For example
[username@hostname ~]$ ls -lh ~/sim/sub1
would output for example
-rw-rw-r-- 1 exuser exgroup 26K Sep 18 11:57 output.log
whereby the columns indicate:
1. permissions
2. count
3. owner and group
4. size
5. time/date modified
6. name
Or more specifically:
1. In Linux each file/directory/link has permissions. In the output of ls -l these permissions break down as follows:
a. - : type (-/d/l)
b. rw- : user
c. rw- : group
d. r-- : other
Herein, the first item specifies if the item is a file (-), a directory (d), or link (l). The next three groups specify the permissions of the file’s owner, its group (both specified in 3.), and other users. Herein r corresponds to read permission, w to write permission, and x to execute permission. In this case the user exuser is allowed to read and write the file. The same permission resides with users in the group exgroup, while other users may only read the file.
From this it follows that an executable in Linux is nothing more than a file (e.g. plain text) with the right permissions. The extension is in principle meaningless. The file can be made executable using the command chmod, e.g.
[myname@hostname ~]$ chmod u+x output.log
More information is found online.
Note
The permissions can be directly specified (instead of added or removed) using a numerical notation:
4 = r (read)
2 = w (write)
1 = x (execute)
The desired permissions are set by adding the numerical values of those permissions you would like to allow, with one digit each for the user, the group, and other users. For example:
[username@hostname ~]$ chmod 764 output.log
[username@hostname ~]$ ls -lh output.log
-rwxrw-r-- 1 exuser exgroup 26K Sep 18 11:57 output.log
2. The number of (hard) links to the item. For a file this counter is usually equal to one; for a directory it grows with the number of subdirectories.
3. The user and group name to which the file belongs.
4. The size of the file. Because we have used the -h option, this is in human readable format (i.e. kilo-, mega-, giga-, or terabytes).
5. The time and date of the last modification to the file.
6. The file name.
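The two ways of setting permissions can be compared side by side. This is a minimal sketch, assuming standard chmod, ls and cut utilities; cut -c1-10 is only used here to keep the first ten characters (the type and permission string) of the ls -l output.

```shell
demo=$(mktemp -d)
cd "$demo"
touch output.log

# Numerical notation: 6 = 4+2 (rw-) for the user, 4 (r--) for group and others.
chmod 644 output.log
before=$(ls -l output.log | cut -c1-10)

# Symbolic notation: add (+) execute permission (x) for the user (u).
chmod u+x output.log
after=$(ls -l output.log | cut -c1-10)

# 7 = 4+2+1 (rwx), 6 = 4+2 (rw-), 4 = r--.
chmod 764 output.log
numeric=$(ls -l output.log | cut -c1-10)

echo "$before"
echo "$after"
echo "$numeric"
```

The symbolic form changes one permission relative to the current state, whereas the numerical form sets all nine permission bits at once.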
cp, rm, mv – File operations#
The copy (cp), remove (rm), and move (mv) commands are used to do file operations. Directories are created using mkdir.
Copy#
To copy a file:
[myname@hostname ~]$ cp source destination
For example to make a backup of the output.log file, used as an example in the previous section, in the same folder:
[myname@hostname ~]$ cp ~/sim/sub1/output.log ~/sim/sub1/output.bak
If this command is issued from the ~/sim/sub1 directory, the relative command
[myname@hostname sub1]$ cp output.log output.bak
is sufficient.
If a directory is copied, the -r (recursive) option should be specified to also copy all the content of the directory. For example:
[myname@hostname ~]$ cp -r ~/sim/sub2 ~/sim/sub3
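Both forms of copying can be tried on a scratch copy of the example tree. This is a sketch under the assumption that the tree is recreated in a temporary directory; the content of output.log is made up. Note that if the destination of cp -r already exists as a directory, the source is placed inside it instead of being copied under the new name.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p sim/sub1 sim/sub2
echo "step 1 done" > sim/sub1/output.log

# Copy a single file; the original stays in place.
cp sim/sub1/output.log sim/sub1/output.bak

# Copy a whole directory; sub3 does not exist yet, so it becomes
# a copy of sub1 including everything inside it.
cp -r sim/sub1 sim/sub3

ls sim/sub3
```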
Remove#
Analogous to the copy command, a file is removed using
[myname@hostname ~]$ rm filename
To remove a directory use
[myname@hostname ~]$ rm -r directoryname
Notice that, in principle, removed files cannot be recovered, i.e. there is no such thing as a recycle bin when removing files from the command line. For convenience, wild cards can be used. One example of a wild card is *. Simply said, the * replaces zero or more characters. For example to remove all .log files in the ~/sim/sub1 folder:
[myname@hostname sub1]$ rm *.log
which in this case would remove only output.log. In contrast, the command
[myname@hostname ~]$ rm -r ~/sim/sub*
would remove all the directories beginning with sub, which in this case would be both the directories sub1 and sub2, including all their content.
Danger
Be very careful with wild cards that involve a dot, such as
[myname@hostname ~]$ rm -r *.*
and in particular with patterns that start with a dot, such as .*. In many shells .* matches not only hidden files but also .. (the parent directory), so that a recursive remove can work its way up the file tree and remove all files and directories to which the user has permissions. Such files are permanently lost. Recent versions of rm refuse to remove . and .., but it is better not to rely on this. The *.* mistake is typically made by DOS users, where it means “all files”; in Bash it only matches non-hidden names that contain a dot. In a Linux environment, rm -r * is usually the intended command, i.e. empty the current directory.
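A simple precaution is to preview what a wild card expands to before handing it to rm. The sketch below assumes a Bash shell and works in a temporary directory; the names sub1, sub2, keep, notes.txt and .hidden are made up for illustration.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir sub1 sub2 keep
touch sub1/output.log notes.txt .hidden

# The shell expands the wild card before rm runs, so echo shows
# exactly which names rm would receive.
preview=$(echo sub*)
echo "would remove: $preview"

# Only after checking the preview, remove for real.
rm -r sub*

# -A also lists hidden entries, to confirm what is left.
remaining=$(ls -A)
echo "$remaining"
```

Because the expansion is done by the shell and not by rm, replacing rm -r with echo is a reliable dry run for any pattern.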
Move#
To move a file to a different location (or to rename a file) the following command is used (for files and directories)
[myname@hostname ~]$ mv source destination
For example to rename the output.log file:
[myname@hostname sub1]$ mv output.log output.txt
To instead move the original file to the ~/sim/sub2 directory (renaming it at the same time):
[myname@hostname sub1]$ mv output.log ../sub2/output.txt
Make a directory#
To create a directory, use the command
[myname@hostname sub1]$ mkdir dirname
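Creating directories and moving files are often combined. The following is a minimal sketch in a temporary directory; the -p option of mkdir (create missing parent directories) is not discussed in this text, and the file content is made up.

```shell
demo=$(mktemp -d)
cd "$demo"

# -p creates missing parent directories and does not complain
# if a directory already exists.
mkdir -p sim/sub1 sim/sub2
echo "finished" > sim/sub1/output.log

# Rename within the same directory.
mv sim/sub1/output.log sim/sub1/output.txt

# Move to another directory, keeping the name
# (the destination is an existing directory).
mv sim/sub1/output.txt sim/sub2/

ls sim/sub2
```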
Redirecting output#
Redirecting output is a powerful capability of (among others) Bash. This way the output that is printed to the standard output (i.e. the screen) can be intercepted and used differently. The output can be transferred to another command using |, or it can be stored to a file using > or appended to a file using >>.
For example to find the lines in which error messages are included in the file output.log
, we could use:
[username@hostname sub1]$ cat output.log | grep -n "error"
The cat command outputs the contents of the output.log file. The | intercepts this output and forwards it to the grep command, which prints the lines matching the pattern error (including the line numbers, because of the -n option).
These lines can be stored to a file error.log using the command
[username@hostname sub1]$ cat output.log | grep -n "error" > error.log
To get the current directory as the top line of the file, we do
[username@hostname sub1]$ pwd > error.log
which empties or creates the file error.log and writes the current working directory to it. The error lines are then appended to the file by
[username@hostname sub1]$ cat output.log | grep -n "error" >> error.log
As a final note, the Bash shell considers two outputs, the stdout and the stderr. Any program can write to these outputs, and typically both are shown in the terminal window. It is possible to redirect each output differently, but this is considered outside the scope of this document.
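The three redirection operators can be exercised on a small made-up log file. This is a sketch, assuming standard grep, wc and printf; the log lines are invented for illustration. Since grep can read a file directly, the cat step is left out here.

```shell
demo=$(mktemp -d)
cd "$demo"

# A small made-up log file to filter.
printf 'start\nerror: disk full\nrunning\nerror: timeout\n' > output.log

# > creates or empties the file, >> appends to it.
pwd > error.log
grep -n "error" output.log >> error.log

# | hands the output of one command to the next: count the matching lines.
count=$(grep "error" output.log | wc -l)

cat error.log
echo "matches: $count"
```

The resulting error.log holds the working directory on its first line, followed by the numbered error lines.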
Basic scripting#
Bash commands, some of which are introduced above, can be combined in a script. Such a script is an executable plain-text file. Below, we consider a very simple script myscript. We first make the file and give the user executable permissions, e.g. by
[username@hostname ~]$ touch myscript
[username@hostname ~]$ chmod u+x myscript
We then edit the file’s contents to
#!/bin/bash
#
# This is a very simple script
varname="Hello world"
echo "$varname"
In this script, the first line selects the interpreter that runs the script, in this case Bash. Except for this definition on the first line, anything that follows a # is a comment and is not evaluated. The last two lines are the only lines of code. In the first of these, the string "Hello world" is assigned to the variable varname. In the last line, the echo command prints the value of the variable varname, and thus Hello world, to the screen. Note that the variable name is preceded by a $ to get the value of the variable.
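The whole procedure (create the file, make it executable, run it) can be reproduced without an editor. This is a sketch that assumes Bash is installed as /bin/bash and that the temporary directory allows execution; it writes the script with a here-document, a construct that is not discussed in this text.

```shell
demo=$(mktemp -d)
cd "$demo"

# Write the script with a here-document; the quoted 'EOF' keeps
# $varname from being expanded while the file is written.
cat > myscript <<'EOF'
#!/bin/bash
# This is a very simple script
varname="Hello world"
echo "$varname"
EOF

# Give the user execute permission and run the script via an explicit path.
chmod u+x myscript
output=$(./myscript)
echo "$output"
```

The explicit ./ is needed because the current directory is normally not searched for commands.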
Environment settings#
If a script is often used, it can be useful to make it a “global” script, such that it can be used in the same way as for example cd. To this end, it is common to create a directory bin in the home folder, in which such scripts are placed:
[username@hostname ~]$ mkdir ~/bin
Next, Bash has to look for executable files in this directory. To this end, we add the new directory to the PATH variable:
[username@hostname ~]$ export PATH=$HOME/bin:$PATH
where $HOME is equivalent to ~.
Warning
Beware that copy/pasting code from this page may not transfer correctly.
To avoid having to specify this after every new login, this (and other commands) can be added to the file ~/.bashrc. This file is evaluated each time a new interactive Bash shell is started (on many systems this includes each login). It is commonly of the following format:
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi
export PATH=$HOME/bin:$PATH
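The effect of extending PATH can be checked without touching the home folder. In the sketch below a temporary directory stands in for ~/bin, and the script name hello is hypothetical; command -v is a shell built-in that reports what a command name resolves to.

```shell
# A temporary directory stands in for ~/bin.
bindir=$(mktemp -d)

# Place a tiny executable script (hypothetical name: hello) in that directory.
printf '#!/bin/bash\necho "Hello from hello"\n' > "$bindir/hello"
chmod u+x "$bindir/hello"

# Prepend the directory to PATH; entries are searched from left to right.
export PATH="$bindir:$PATH"

# The script can now be called by name, from any directory.
resolved=$(command -v hello)
output=$(hello)

echo "$resolved"
echo "$output"
```

Because the new directory is put in front, a script placed there takes precedence over a system command of the same name, so it is wise to choose names that do not clash.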