Cluster computing on the Sun Grid Engine
Theory
- MPI is an environment for running parallel jobs (which can communicate to each other). There are many implementations of MPI, most notably LAM-MPI and OpenMPI. I am using LAM-MPI here.
- MPI works if you run parallel stuff on your own, at a home cluster. With colleagues, you need a scheduler like SGE, SLURM, or Torque.
- In R, you can use one of a few packages to run your programs in parallel. snowfall seems the most simple to me, because you can run your script on the supercluster as well as on your local machine (in parallel or sequential), without too much code changing.
- Alternatives would have been snow (but snowfall is just a convenient wrapper for snow), and Rmpi (I’m not sure if you can run your code on a single computer with this package). Also, snow/snowfall use Rmpi functionalities if you run it in MPI mode.
With snowfall, you use list functions (lapply etc). Ideally, have a list with chunks of data, so each worker gets a chunk of the data.
Write your R script
Set up an R script:
Testing
To test your script, use a small subset of data and instead of qsub, which submits the job to the scheduler, run it on qrsh
(with appropriate parameters, i.e. qrsh -pe lammpi 10-32 myscript.sh
, which is an “interactive” shell that shows you stdout directly. If this runs through, proceed.
Set up a shell script
Then, set up a shell script that calls execution of this R script. The best way is to put SGE parameters in the script file as such:
Advanced
Check cores and RAM available
Use qfree
to see all nodes and what’s available. The parameter -l h_vmem
is valid for one single worker, not the complete thing.
Script should run on local machine and cluster
The command qsub
sets a few environment variables (see man qsub for which ones). You can query the value of this value, and if it is set, initialize a SGE cluster, but if it is unset, initialize a parallel environment on your local machines (with 4 cores in this example). Replace the sfInit()
call with this snippet:
Now the code runs regardless of whether you submit it to the SGE or run it locally.
Network random number generator
Start the network random number generator to make sure each cluster uses different seeds: sfClusterSetupRNG()
Load balancing
By using sfClusterApplyLB
instead of sfLapply
, faster machines are given further work without having to wait for the slowest machine to finish.
Saving/Restoring intermediate results
See ?sfClusterApplySR
for that.
Supply a range of cores, if number is not too important
Use #$ -pe lammpi 32-64
if you would like 64 slots, but would also be happy with 32 slots if that meant less waiting time.
High memory needed
Submit your job to the queue maxmem.q
: qsub -q maxmem.q myscript.sh
Lessons learned
- Learn how to use the scheduler.
qstat
,qfree
, and the likes. - Before running the big job, find out how much Memory and cores are free (
qfree
). Then find out how much memory your single jobs need: - Run the job on a small subset (or with
-l h_rt=00:30:00
), then after it aborts, runqstat -j
to see how much max vmem it needed. Use this value to run the giant job with.