Introduction to Erwin
This page describes Erwin, the cluster of Athlon PCs purchased from
Aspen Systems, and the basics of using it.
Overview
Erwin is a cluster of PC-like machines running the Scyld-Beowulf version of Linux. We currently
have 1 two-processor master node and 9 two-processor slave nodes.
All nodes have identical computational hardware: two Athlon 1800s with 3.5 GByte
total. Inside BLAS3 routines, each processor can exceed 2 GFlop/s, approximately three times
Isaac's computational throughput.
In theory we have a lot of compute power, but erwin is not a parallel
machine in the traditional (Isaac, Carter) sense. The nodes are linked
by a slow interconnection network (100Mbit ethernet) so only loosely coupled
parallel jobs will run well. Serial, single processor, jobs will run very
well, so it is best to think of Erwin as an easily utilized cluster of workstations.
Key points
- Do not run large or long-running programs on the master node (read
on to find out how to do it right). Doing so slows down the machine for everyone.
- Users who abuse Erwin, e.g. by running on the master node, may
find their accounts suspended.
- Fast: we have seen speeds from x1 to x3 of Isaac's speed. x1.5
seems typical.
- Fragile: remember that Erwin is essentially built from PCs. The
I/O is not as fast as on Isaac. Be careful if launching many jobs at once.
Under heavy load Erwin will likely not perform well.
- Files: You have to run from a special area ($ETMP or /erwin/cat/your_user_name)
for your files to be seen by the slave nodes. This includes programs as
well as data. Read on to find out how.
Logging in
Erwin will only permit ssh access from inside NREL. There is no access
from outside of NREL due to the NREL firewall. You cannot login to the slave
nodes.
Passwords
The password is the same as for Isaac and friends, due to the "network"
(NIS) password system. To change your password you should run "yppasswd"
which will change your password on all UNIX machines.
Files
Erwin NFS mounts your usual homespace from either Isaac or u80csi3,
therefore inheriting all of your disk space limits. Files are shared over
100Mb/sec ethernet, so large file transfers will be slow.
Note however, that you cannot run from your usual
homespace - these files are not visible to the slave nodes, where
you need to run. Erwin gives you a temporary workspace, $ETMP, which is probably
/erwin/cat/your_user_name. This is a large local disk on Erwin, visible
to the slave nodes. You can store up to ~10GByte, but note that your files are not backed up!
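Copying a run directory from your home space into $ETMP might look like the sketch below. The function name stage_run and its two arguments are hypothetical, chosen for illustration; the only thing taken from this page is that $ETMP is the area visible to the slave nodes.

```bash
# stage_run SRC NAME: copy the contents of SRC into $ETMP/NAME and print
# the new run directory. Hypothetical helper, not an Erwin tool.
stage_run() {
    src=$1                       # directory in your usual home space
    name=$2                      # name of the run directory under $ETMP
    if [ -z "${ETMP:-}" ]; then
        echo "ETMP is not set" >&2
        return 1
    fi
    mkdir -p "$ETMP/$name" || return 1
    # Copy programs as well as data: the slave nodes see only $ETMP.
    cp -R "$src"/. "$ETMP/$name"/ || return 1
    echo "$ETMP/$name"
}
```

For example, `cd "$(stage_run ~/my_project my_run_directory)"` would leave you in the staged copy. Remember that nothing under $ETMP is backed up, so copy results you want to keep back to your home space.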
How busy is the system?
A special version of top is installed. For the first few
seconds it shows the programs running on the master node, and then it updates to
show everything running on the cluster.
Here Paul is running a great many escan jobs. Note that they are all
running on slave nodes: the size (SIZE) and resident size (RSS) are listed
as zero.
For a graphical view of what is happening on the system, run "beostatus".
This shows the processor, memory, disk and network usage of each node.
In the snapshot below, there are 2 CPUs free on the slaves - node 2 CPU
1 and node 8 CPU 0. Note that the master node is listed as "-1" - we can
see some network traffic and disk I/O but no CPU activity. (This is how
it should be.)
Which node is each job running on?
ps xf | bpstat -P
is a highly useful command which gives a graphical overview of the
jobs running on the system: who is the parent or sibling of whom, and
which nodes they are running on.
This is particularly useful when you want to
make sure that one of your MPI jobs has arrived at the correct
destination!
Running serial (single processor) programs
To run on the slave nodes, use the following incantation, remembering
that your programs and data files must be on $ETMP.
cd $ETMP/my_run_directory
To run on the least busy node:
mpprun -np 1 -nolocal ./my_program_name
To run on a specific node:
bpsh [node number] ./my_program_name
Note that you can background, foreground, renice, kill, ps (etc) programs
as per any normal UNIX system, even if your programs are running on
the slave nodes. You can redirect standard input and output as you would
any conventional program. This is one of the benefits of Scyld Beowulf
over conventional cluster systems.
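As a minimal sketch of the steps above, the following functions refuse to launch unless the current directory is inside $ETMP, and otherwise hand the program to mpprun with the same options shown earlier. The names under_etmp and run_serial are hypothetical helpers, not part of Scyld or of Erwin's setup.

```bash
# under_etmp DIR: succeed only if DIR is $ETMP itself or lies below it.
under_etmp() {
    [ -n "${ETMP:-}" ] || return 1
    case "$1/" in
        "$ETMP"/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# run_serial PROGRAM [ARGS...]: launch one copy on a slave node, but only
# when started from inside $ETMP (hypothetical wrapper around mpprun).
run_serial() {
    if ! under_etmp "$PWD"; then
        echo "not launching: $PWD is not under \$ETMP" >&2
        return 1
    fi
    mpprun -np 1 -nolocal "$@"
}
```

From a run directory, `run_serial ./my_program_name < input > output &` would then behave like the plain mpprun line, including redirection and backgrounding.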
Short-running programs, such as an analysis script or gnuplot, can be
run directly on the master node as on any normal UNIX system.
How do I run in parallel?
We have successfully compiled MPI (mpich version 1.2.3_Scyld from a
source rpm provided by Scyld, for the curious) for use with the Portland
Group's compilers. Linking MPI programs using mpif90 and/or mpif77 should
work.
Our current setup no longer allows linking MPI programs using gcc, g77
and the like. If this is a problem, please contact Volker.
The magic words to run MPI-enabled code are:
mpirun -np [number of desired processors] [your_job_name]
However, this will likely distribute your calculation onto CPUs that
are scattered across different nodeboards, generating much network
traffic and slow output. It should help throughput to keep your job
restricted to CPUs on as few nodeboards as possible. Use, e.g.,
mpirun -np 2 -beowulf_job_map 5:5 [your_job_name]
mpirun -np 4 -beowulf_job_map 5:5:6:6 [your_job_name]
etc.
to shepherd your run onto the two CPUs on nodeboard 5 or the
four CPUs on nodeboards 5 and 6, respectively.
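The job map follows a simple pattern: each nodeboard number is repeated once per CPU you want to use on it. The sketch below builds such a map and prints the matching mpirun line instead of running it. The helpers job_map and mpirun_cmd are hypothetical names introduced for illustration; only the mpirun options themselves come from the examples above.

```bash
# job_map CPUS_PER_NODE NODE...: print a colon-separated map that lists
# each node once per CPU.
job_map() {
    cpus=$1; shift
    map=""
    for node in "$@"; do
        i=0
        while [ "$i" -lt "$cpus" ]; do
            map="${map:+$map:}$node"   # add ":" only between entries
            i=$((i + 1))
        done
    done
    echo "$map"
}

# mpirun_cmd CPUS_PER_NODE PROGRAM NODE...: print (do not run) the
# mpirun command line for the chosen nodeboards.
mpirun_cmd() {
    cpus=$1; prog=$2; shift 2
    np=$(( $cpus * $# ))               # total number of processes
    echo "mpirun -np $np -beowulf_job_map $(job_map "$cpus" "$@") $prog"
}
```

Calling `mpirun_cmd 2 ./my_job 5 6` prints the same command as the four-CPU example above, which you can inspect before running it yourself.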
Feedback on performance with different settings
would be greatly appreciated!
Editing
vi and emacs are installed.
Printing
You can print PostScript files to our HP 5000 using "lp".
Documentation
In addition to the system man pages, the compiler documentation is
here (local copy, remote up to date copy).
We have one (1) printed copy of the Scyld documentation, in case you wish
to try something fancy. For everything else, try google.com or ask your local user expert.
Compiling
Erwin has f77 (pgf77, g77), f90 (pgf90),
c (pgcc, gcc), and c++ (pgCC, g++)
compilers installed. If everything has worked
(your shell has sourced /etc/profile) then they will be on your PATH.
If not, try
declare -x "PGI=/usr/local/pgi"
declare -x "PATH=$PGI/linux86/bin:$PATH"
declare -x "MANPATH=$MANPATH:$PGI/man"
declare -x "LM_LICENSE_FILE=$PGI/license.dat"
declare -x "LD_LIBRARY_PATH=$PGI/linux86/lib"
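To see whether the compilers are actually on your PATH, a small check like the following may help. This is only a sketch: missing_tools is a hypothetical helper, and the list of names to pass to it is whatever compilers you intend to use.

```bash
# missing_tools NAME...: print each NAME that cannot be found on PATH.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, `missing_tools pgf77 pgf90 pgcc pgCC` lists any Portland Group compiler that your shell cannot find; an empty result means the environment is set up.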
The Portland Group compilers also come with a debugger (pgdbg) and a
profiler (pgprof). Use them.
The compilers accept quite conventional compile options (-g, -c, -O2, etc.).
You might wish to try "-O2 -tp athlonxp -byteswapio".
This should give a good level of optimisation, and also enables
(byteswapio) reading and writing of binary files in the same format as
Isaac and the SUNs. "-fast"
has been found to cause problems with some codes. There are many compile
options for optimisation - experiment and let us know what you find. Note that unnecessarily high levels of optimisation can
slow down code. Be sure to benchmark your code, and preferably profile
it.
Erwin also has two versions of gcc (the GNU Compiler Collection)
installed. The system compiler, which you will get by default, is gcc
2.96 (old). However, you may wish to try the up-to-date version, gcc
3.1.1. The compiler binaries can be found in
/usr/local/gcc-3.1.1/bin (gcc for C, g++ for C++, g77 for f77,
but not Fortran 90). You must include
/usr/local/gcc-3.1.1/lib in your library path to compile dynamically
linked binaries:
export LD_LIBRARY_PATH=/usr/local/gcc-3.1.1/lib/ (for bash syntax)
You may simply wish to compile a static binary without any extra calls,
but this can result in large executables and is not recommended.
Documentation on gcc can be found by typing "info gcc", for g77 by
typing "info g77", or at
the GNU web page (follow the links for g77).
Contact Gabriel for further
questions.
Libraries
BLAS, LAPACK and FFTW are found in /usr/local/lib. Use
"-L/usr/local/lib -llapack -lf77blas -latlas -lfftw" to use them all.
The BLAS routines are in libatlas.a and need the Fortran wrapper
libf77blas.a to be called from Fortran code.
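Putting the suggested compile options and the library flags together, a full compile-and-link line could be assembled as in the sketch below. It assumes a Portland Group Fortran compiler (the "-tp athlonxp -byteswapio" options are the ones suggested earlier on this page), and link_cmd is a hypothetical helper that only prints the command so you can review it first.

```bash
# link_cmd COMPILER SOURCE: print a compile-and-link line using the
# optimisation flags and optimised libraries named on this page.
link_cmd() {
    compiler=$1; src=$2
    exe=${src%.*}                 # strip the extension for the output name
    echo "$compiler -O2 -tp athlonxp -byteswapio $src -o $exe" \
         "-L/usr/local/lib -llapack -lf77blas -latlas -lfftw"
}
```

For instance, `link_cmd pgf90 solve.f90` prints a pgf90 line that builds an executable called solve against the libraries in /usr/local/lib.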
To call BLAS or LAPACK from C or C++ you might need to include
libg2c.a in your link line. I had to recompile BLAS and LAPACK
using g77 to be able to use them from C++ (g++). You might be able
to use the system LAPACK and BLAS if you compile with the Portland Group
C and C++ compilers (I heard they are buggy, though).
If you are having trouble with this particular issue,
contact Gabriel.
Ensure that you do not use the BLAS or LAPACK
in /usr/lib. These libraries are slow (unoptimised), while those
in /usr/local/lib are bullet-like in comparison.
Applications
Some useful applications are installed: TeX, gnuplot, xmgrace, gv etc. are
present. Please make suggestions if you would like anything else installed.
What is broken?
mail is broken. mail was not intended to be configured, but
sent mail is currently bounced back to your NREL address due to a bad send
address ("-1", the master node, not erwin.nrel.gov). Perhaps we will fix
this.
Why erwin?
You should know this.
Last update: 20th November 2002
gabriel_bester@nrel.gov