Getting Started with High Performance Computing
HPC@Mines has several different machines, and most of them have several different hardware configurations. Much differs between these machines, and much is the same. This page discusses and links to the steps common to all of the machines, then links to pages for the individual machines.
Keep in mind that the HPC@Mines resources are here for just that: High Performance Computing at Colorado School of Mines. What exactly is HPC? In this context it means massively parallel applications. Why do we exclude serial programs? Because our computers are not that kind of fast. Confused? Our computers cannot multiply 3141 * 5926 any faster than a Chromebook can, but they can multiply two massive matrices together, say where each matrix is 30 GB; good luck getting a custom-built desktop to even attempt that. So what makes our computers so special for such large tasks? The fact that we have over 15,000 computing cores. Serial jobs would only get in the way of larger jobs, and since they run just as well on a lab machine, please run them there.
Every student can get access to this machine, but you have to request it. Do not worry; for students it is an easy process.
Simply email: firstname.lastname@example.org giving:
- Your Name:
- Your Major/Department:
- Your Reason
Keep in mind that reasons like:
"...I would like to run my painfully optimized serial C program that I have been tinkering with..."
will be less likely to be accepted than:
"...I would like to run my easily parallelized Python program that I have been tinkering with..."
Getting access to BlueM is a somewhat more involved process; for it you have to fill out an allocation request.
Allocation requests are available here
Once you are given an allocation, you will have to tag your jobs with certain accounting information. Accounting is done for all runs on Aun and Mc2; that is, a parallel run must be "charged" to a particular account. You can see the accounts you are authorized to use by running the following command on either Aun or Mc2.
sacctmgr list association cluster=mc2 user=$LOGNAME format=User,Account%20
This will return a list of your accounts. You can then see the association between your account number and your project title by running the following command, replacing ACCOUNT_NUMBER with the values from the previous command.
sacctmgr list account ACCOUNT_NUMBER format=Account%20,Descr%50
You can charge to a particular account by adding the following line to your batch script:
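For example (a sketch; `ACCOUNT_NUMBER` is a placeholder for one of the account names returned by the sacctmgr command above, and `--account` is the standard Slurm option for this):

```shell
# Charge this job to a specific account; replace ACCOUNT_NUMBER with
# one of the accounts listed for you by sacctmgr.
#SBATCH --account=ACCOUNT_NUMBER
```

This line goes near the top of the batch script, alongside the other #SBATCH directives.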
Please note that the "none" account is not a valid account and cannot be used to run jobs.
You do not currently need an account number to run on Mio.
For more information, see the HPC@Mines accounting/charge page.
You cannot simply walk up to one of our supercomputers and log in; they simply are not made for that, and one of them lives in a federal facility on the outskirts of Boulder, CO. Instead, you use SSH; visit this page for information on logging in from just about any operating system.
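As a sketch, a login from a local terminal looks like the following (the username and the hostname mio.mines.edu are placeholders; use your own Mines username and the hostname of the machine you were granted access to):

```shell
# Log in to an HPC@Mines machine over SSH; replace "joeuser" with your
# Mines username and the hostname with that of your assigned machine.
ssh joeuser@mio.mines.edu
```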
All of the supercomputing resources at Mines use modules to set up your environment as quickly as possible, letting you get your code running as soon as possible. Ideally the same modules will be available on each machine, so that you do not have to worry about particular paths and compiler versions; we will already have taken care of that for you. However, because we try to get modules out the door as soon as possible, there can be a lag between when we finish a module for one machine and when it reaches the next.
Finding the right Module
You have two options for this, command line and web.
juser@mio001[~]->module avail
---------------------------- /usr/share/Modules/modulefiles ----------------------------
dot          module-cvs   module-info   modules   null   use.own   utility
------------------------------------ /opt/modulefiles ----------------------------------
Apps/MaterialsStudio/6.0     PrgEnv/libs/fftw/gcc/3.3.3     PrgEnv/pgi/default     openmpi/intel/1.6.5
.......
This will show you a list of all the modules available on that machine at that time.
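When the list is long, it can be filtered from the command line; a sketch (assuming a Python module named like the one used later on this page; note that module avail writes to stderr on many systems, hence the redirect):

```shell
# List all modules, then keep only the lines mentioning "python"
# (case-insensitive). "module avail" often prints to stderr, so
# redirect it to stdout before piping into grep.
module avail 2>&1 | grep -i python
```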
Using a Module
module load [module name]
For example to load the python module on mio001 you would type:
juser@mio001[~]->module load PrgEnv/python/Enthought/2.7.2_v7.1-2
For more information go to the HPC-Modules page
NOTE: Mio and Aun users
The default user environment does not include a default compiler, which means you will need to specify one.
To see your current compiler type:
If this returns something like
Then you are done. If it returns an error, you need to add the following lines to the end of your .bash_profile file, which is located at ~/.bash_profile.
#load the Compilers and OpenMPI version of MPI
module load Core/Devel >& /dev/null
module load utility >& /dev/null
To activate the commands you just added now type the command:
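The usual mechanism is sourcing: `source` is a standard bash builtin that re-executes a file's commands in the current shell, so the new lines take effect without logging out and back in. A self-contained illustration of the effect, using a throwaway file in place of ~/.bash_profile:

```shell
# "source" runs a file's commands in the *current* shell, so anything
# the file sets (variables, loaded modules) is visible immediately.
echo 'export DEMO_VAR=loaded' > /tmp/demo_profile
source /tmp/demo_profile
echo "$DEMO_VAR"    # prints: loaded
```

On Mio or Aun the file to source would be your actual ~/.bash_profile.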
Since HPC@Mines is a shared resource that seeks to maximize performance, a scheduler is used. The scheduler is essentially a queue that divvies up resources optimally and makes sure that two people are not fighting over the same resource. The only way to run jobs on the "compute nodes" is through the scheduler; if you run your code without calling something like "sbatch", you are probably doing something wrong and should read up on the documentation. (You are probably doing what is called running on the login (head) node, which will eventually lock up the machine and prevent users, including yourself, from logging in or using the machine, which is not ideal.)
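As a sketch of what the scheduler expects, a minimal Slurm batch script looks like the following (the job name, node counts, time limit, and program name here are illustrative placeholders, not values taken from this guide; the #SBATCH options shown are standard Slurm directives):

```shell
#!/bin/bash
#SBATCH --job-name=hello        # name shown in squeue output
#SBATCH --nodes=2               # number of compute nodes requested
#SBATCH --ntasks-per-node=8     # MPI tasks to start on each node
#SBATCH --time=00:10:00         # wall-clock limit (HH:MM:SS)

# Everything below runs on the compute nodes, not the login node.
srun ./hello                    # launch the MPI program under Slurm
```

A script like this is handed to the scheduler with `sbatch scriptname`; running the program directly on the login node is exactly what the scheduler is there to prevent.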
If you have never run on an HPC@Mines resource, we highly suggest following the steps in First Run; if you are familiar with the process, you may choose to jump down to More Information.
The files discussed here can also be obtained from: http://hpc.mines.edu/bluem/quickfiles/example.tgz
Also please note that while these examples were completed on Mio, the procedure is exactly the same on BlueM and Aun.
Note that the "makefile" and run scripts discussed here can be used as templates for other applications.
To run the quick start example create a directory for your example and go to it.
[joeuser@mio001 bins]$mkdir guide
[joeuser@mio001 bins]$cd guide
Copy the file that contains our example code to your directory and unpack it.
[joeuser@mio001 guide]$cp /opt/utility/quickstart/example.tgz .
[joeuser@mio001 guide]$tar -xzf *
If you like, do an ls to see what you have.
[joeuser@mio001 guide]$ls
aun_script  color.f90  docol.f90  example.tgz  helloc.c  index.html  makefile  mc2_old_script  mc2_script  slurm_script
Make the program
[joeuser@mio001 guide]$make
echo mio001
mio001
mpif90 -c color.f90
mpicc -DNODE_COLOR=node_color_ helloc.c color.o -lifcore -o helloc
rm -rf *.o
Run the script using sbatch
[joeuser@mio001 guide]$sbatch slurm_script
Submitted batch job 1993
The script will create a new directory, named after the job number (1993 in this case), for your program to run in.
The command squeue shows what jobs are running. If there are nodes free, this job will run quickly, and squeue may show that there are no jobs running.
[joeuser@mio001 guide]$squeue
JOBID PARTITION NAME USER ST TIME NODES MIDPLANELIST(REASON)
[joeuser@mio001 guide]$
Do an ls to see your new directory, 1993 in this case. The file slurm-1993.out contains additional information about your job. If you do not redirect output from within your script, all "terminal" output will be put in this file.
[joeuser@mio001 guide]$ls
1993  aun_script  color.f90  docol.f90  example.tgz  helloc  helloc.c  index.html  makefile  mc2_old_script  mc2_script  mio1_script  slurm-1993.out
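If you would rather capture program output yourself than let it fall into slurm-&lt;jobid&gt;.out, you can redirect inside the batch script. A sketch (SLURM_JOB_ID is a standard environment variable Slurm sets for each job; ./helloc is the program built in this example):

```shell
# Inside the batch script: send stdout and stderr to a file named
# after the job ID, instead of the default slurm-<jobid>.out.
srun ./helloc > output.$SLURM_JOB_ID 2>&1
```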
We can look at the files in our new directory. These are (1) a copy of our environment, and (2) the output from the program, which consists of a message from every MPI task starting with "everyone" and a single "Hello" message from each node.
[joeuser@mio001 guide]$cd 1993
[joeuser@mio001 1993]$ ls
env.1993  output.1993  script.1993  submit
[joeuser@mio001 1993]$grep everyone output* | head -3
everyone compute004 1 16 0
everyone compute005 13 16 8
everyone compute004 3 16 0
[joeuser@mio001 1993]$grep Hello output* | head -3
********************* Hello from compute004 0 16 0
********************* Hello from compute005 8 16 8
[joeuser@mio001 1993]$
We can look at the file slurm-1993.out also:
[joeuser@mio001 1993]$ cd ..
[joeuser@mio001 guide]$ tail slurm-1993.out
job has finished
++ pwd
+ echo 'run in' /u/pa/ru/tkaiser/guide/1993 ' produced the files:'
run in /u/pa/ru/tkaiser/guide/1993 produced the files:
+ ls -lt
total 0
-rw-rw-r-- 1 tkaiser tkaiser  544 Feb 17 11:32 output.1993
-rw-rw-r-- 1 tkaiser tkaiser 5957 Feb 17 11:32 env.1993
-rw-rw-r-- 1 tkaiser tkaiser 1587 Feb 17 11:32 script.1993
lrwxrwxrwx 1 tkaiser tkaiser   22 Feb 17 11:32 submit -> /u/pa/ru/tkaiser/guide
[joeuser@mio001 guide]$
Congratulations, you have run your first supercomputing program.
The Useful Links page contains a curated selection of pages from across HPC@Mines; it is highly recommended for building a general body of computing knowledge.
Two items on that list are of particular importance: the File System Policy page and the Charge/Accounting page. Please read both before beginning to use our systems.