On ARC (Advanced Research Computing) machines ARCUS and ARCUS-B
Getting onto the machine
For access to any of the ARC machines, you will need to register with Advanced Research Computing (ARC, formerly the Oxford Supercomputing Centre). The easiest way to join is to register for a user account with an existing project - take a look at the list of projects on the registration page and talk to the person responsible for the one you would like to join.
ARC runs a few training courses each year for new users. The notes for these courses are available online, and might be worth a look.
These instructions were last checked by Jochen in Spring 2015 (using cell-based Chaste on ARCUS and ARCUS-B). Changes may be required to get Chaste running on other machines.
Setting the environment
You will need SCons and the Intel compiler to compile code, and RNV to compile CellML files into Chaste-compatible cell models.
To compile and run Chaste tests and executables, you need to set up the Chaste dependencies in your user profile by adding the following lines to your $HOME/.bash_profile file:
if [[ `hostname -f` = *arcus.osc.local ]]
then
    # this section contains all the commands to be run on arcus
    module load scons/2.3.4
    module load PETSc/openmpi-1.6.5/3.5_icc-2013
    module load python/2.7
    module load vtk/5.8.0

    ### These should match with python/hostconfig/machines/arcus.py
    # Xerces
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/system/software/linux-x86_64/xerces-c/3.3.1/lib
    # Szip
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/system/software/linux-x86_64/lib/szip/2.1/lib
    # Boost
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/system/software/linux-x86_64/lib/boost/1_56_0/lib
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/include
    export CPATH=${LD_LIBRARY_PATH}:/usr/include
else
    # this section contains all the commands to be run on arcusb
    module load scons/2.3.4
    module load python/2.7
    module load vtk/5.10.1
    module unload intel-compilers/2013 intel-mkl/2013
    module load PETSc/mvapich2-2.0.1/3.5_icc-2015
    module load intel-mkl/2015
    module load hdf5-parallel/1.8.14_mvapich2_intel

    # All of these are defined in arcus_b.py already for the linker to find them but the
    # environment variables still seem to be necessary to run the executables
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/xerces-c/3.3.1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/boost/1_56_0/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/xsd/3.3.0-1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/szip/2.1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/vtk/5.10.1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/arcus-b/lib/parmetis/4.0.3/mvapich2-2.0.1__intel-2015/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/arcus-b/lib/sundials/mvapich2-2.0.1/2.5.0/double/lib:$LD_LIBRARY_PATH
fi

# Add chaste libraries - you may need to change this depending on where you installed (or plan to install) Chaste
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${DATA}/Chaste/lib
The exact configuration depends on the cluster, hence the if statement above; it may need to be extended when we start using other ARC clusters. Check that the modules have been loaded correctly with the module list command. Although this configuration sets up the Chaste dependencies on both clusters (ARCUS and ARCUS-B), it is advisable not to mix the binaries they generate: either use a different Chaste folder on each cluster, or compile and run Chaste code on only one of them. At present the configuration does not set up RNV (which is required to run PyCML) or CVODE. CVODE is discussed later on this page.
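The hostname test and the advice about separate Chaste folders can be combined in a small helper. The following is only a sketch: the function names and the Chaste-arcus / Chaste-arcus-b directory layout are hypothetical, and only the *arcus.osc.local pattern comes from the profile above. A case statement is used because it is easy to extend with a further pattern if another cluster is added.

```shell
#!/bin/bash
# Map a fully qualified hostname to a short cluster name.
# The *arcus.osc.local pattern is the one used in the profile;
# every other host falls through to ARCUS-B, mirroring the else branch.
cluster_for_host() {
    case "$1" in
        *arcus.osc.local) echo "arcus" ;;
        *)                echo "arcus-b" ;;
    esac
}

# Hypothetical layout: one Chaste checkout per cluster under $DATA,
# so binaries built on one machine are never run on the other.
chaste_dir_for_host() {
    echo "${DATA:-$HOME}/Chaste-$(cluster_for_host "$1")"
}
```

Calling chaste_dir_for_host "$(hostname -f)" would then give the folder to use on the current machine. If you adopt a layout like this, the final LD_LIBRARY_PATH line of the profile needs to point at the matching lib directory instead of ${DATA}/Chaste/lib.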
Getting Chaste
It makes sense to download Chaste into your $DATA area, where there is ample space to store code, meshes and output.
cd $DATA
# Check out code base (takes a few minutes)
svn co https://chaste.cs.ox.ac.uk/svn/chaste/trunk Chaste --username jmpf@comlab.ox.ac.uk
# (the last bit will, of course, be your Chaste login)
# Check out a user project
svn co https://chaste.cs.ox.ac.uk/svn/chaste/projects/jmpf Chaste/projects/jmpf
# (the last parts will, of course, be your Chaste project name)
Compiling a test
On the head node, only compile; do not attempt to run programs there. As of r15416 the SCons build system should automatically pick up a configuration file based on previous configurations. Be careful not to compile the same binaries on different clusters, i.e. use either ARCUS or ARCUS-B, not both, to compile and run Chaste. If you would like to compile optimised (faster) builds, replace build=Intel with build=IntelProduction_hpc.
cd $DATA/Chaste
# Compiling a simple parallel test
scons build=Intel compile_only=1 test_suite=global/test/TestPetscTools.hpp
# Compiling PyCml test
scons b=Intel co=1 ts=heart/test/ionicmodels/TestPyCml.hpp
# Compiling a user project test
scons b=Intel co=1 ts=projects/jmpf/test/TestVtk.hpp
# Compiling the main Chaste executable
scons b=Intel co=1 exe=1 chaste_libs=1 apps
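Each compiled test ends up as a "Runner" binary, and the job scripts on this page call those binaries by path. The sketch below derives the runner path from a test suite path. The mapping is inferred only from the three examples on this page (build=Intel producing a build/intel directory); the function name is hypothetical, and the directory name for other build types such as IntelProduction_hpc is an assumption you would need to check on disk.

```shell
#!/bin/bash
# Derive the runner binary path from a test suite path, following the
# pattern seen in the examples on this page:
#   <component>/test/<subdirs>/<Name>.hpp
#     -> ./<component>/build/intel/<subdirs>/<Name>Runner
# The second argument overrides the build directory name (default: intel).
runner_for_test_suite() {
    local suite=$1 build_dir=${2:-intel}
    local component=${suite%%/test/*}   # text before the first "/test/"
    local rest=${suite#*/test/}         # text after the first "/test/"
    echo "./${component}/build/${build_dir}/${rest%.hpp}Runner"
}
```

For example, runner_for_test_suite heart/test/ionicmodels/TestPyCml.hpp prints ./heart/build/intel/ionicmodels/TestPyCmlRunner, which is the path used in the job scripts.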
Running code on ARCUS
Here is an example script which runs the above tests and the Chaste executable on ARCUS. Save it as, for example, run_Chaste.sh.
#!/bin/bash --login

# Name of the job
#PBS -N TestChaste

# Use 1 node with 32 cores = 32 MPI legs
#PBS -l nodes=1:ppn=32

# Kill after one hour
#PBS -l walltime=01:00:00

# Send me email at the beginning and the end of the run
#PBS -m be
#PBS -M your_address_not_jmpf@cs.ox.ac.uk

# Join output and error files
#PBS -j oe

# Copy all environmental variables
#PBS -V

# Set up MPI
cd $PBS_O_WORKDIR

##### The appropriate include for the machine:
# . enable_hal_mpi.sh
. enable_arcus_mpi.sh

# Switch to Chaste directory
cd ${DATA}/Chaste

# A parallel test
mpirun $MPI_HOSTS ./global/build/intel/TestPetscToolsRunner
# A PyCML test
mpirun $MPI_HOSTS ./heart/build/intel/ionicmodels/TestPyCmlRunner
# A user project test
mpirun $MPI_HOSTS ./projects/jmpf/build/intel/TestVtkRunner
# A test of the executable
mpirun $MPI_HOSTS apps/src/Chaste apps/texttest/weekly/Propagation1d/ChasteParameters.xml
Submit the script and check the state of the queue:
qsub run_Chaste.sh
qstat
More information on the Torque job scheduler is available here.
Running code on ARCUS-B
ARCUS-B uses a different job scheduler called SLURM. Information about using the SLURM scheduler can be found at http://www.arc.ox.ac.uk/content/arcus-phase-b and http://www.arc.ox.ac.uk/content/slurm-job-scheduler.
A sample SLURM script would be:
#!/bin/bash --login

# Name of the job
#SBATCH --job-name=TestChaste

# Use 1 node with 32 cores = 32 MPI legs
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32

# Kill after one hour
#SBATCH --time=01:00:00

# Send me email at the beginning and the end of the run, and if it is aborted
# (I prefer the FAIL option - only send emails when the process gets aborted)
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_address_not_jmpf@cs.ox.ac.uk

# Joining output and error files is done automatically by SLURM, as well as copying the environment variables,
# and the change of working directory

# Set up MPI using the appropriate include for the machine:
. enable_arcus_b_mpi.sh

# Switch to Chaste directory
cd ${DATA}/Chaste

# A parallel test
mpirun $MPI_HOSTS ./global/build/intel/TestPetscToolsRunner
# A PyCML test
mpirun $MPI_HOSTS ./heart/build/intel/ionicmodels/TestPyCmlRunner
# A user project test
mpirun $MPI_HOSTS ./projects/jmpf/build/intel/TestVtkRunner
# A test of the executable
mpirun $MPI_HOSTS apps/src/Chaste apps/texttest/weekly/Propagation1d/ChasteParameters.xml
You can submit the script and see the state of the queue using:
sbatch SCRIPT_NAME.sh
squeue
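The job name, the number of tasks per node and the wall time are the values most likely to change between runs, so it can be convenient to generate the script rather than edit it by hand. The helper below is a minimal sketch and not part of Chaste or ARC: make_slurm_job is a hypothetical name, the email directives are left out for brevity, and it assumes a single node and the ${DATA}/Chaste location used above.

```shell
#!/bin/bash
# Print a SLURM job script for ARCUS-B to standard output.
# Usage: make_slurm_job JOB_NAME TASKS_PER_NODE WALLTIME COMMAND...
# Each COMMAND is launched with mpirun, as in the sample script.
make_slurm_job() {
    local name=$1 tasks=$2 walltime=$3
    shift 3
    # \${DATA} is escaped so it is expanded when the job runs, not now.
    cat <<EOF
#!/bin/bash --login
#SBATCH --job-name=${name}
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=${tasks}
#SBATCH --time=${walltime}

# Set up MPI using the include for ARCUS-B
. enable_arcus_b_mpi.sh

cd \${DATA}/Chaste
EOF
    local cmd
    for cmd in "$@"; do
        # Single quotes keep \$MPI_HOSTS literal in the generated script.
        printf 'mpirun $MPI_HOSTS %s\n' "$cmd"
    done
}
```

A typical use would be to redirect the output to a file, for example make_slurm_job TestChaste 32 01:00:00 ./global/build/intel/TestPetscToolsRunner > run_Chaste.sh, and then submit that file with sbatch.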
Using CVODE
There are some problems with using the default config file found in python/hostconfig/machines if you want to use CVODE. To get around these problems, copy the machine configuration file to python/hostconfig/local.py and replace the CVODE section at the end with the following. Note: This has not been tested on ARCUS or ARCUS-B, and the exact paths probably need to be changed.
# Chaste may also optionally link against CVODE.
# CVODE is not installed - line below is now set to "True"
use_cvode = int(prefs.get('use-cvode', True))
if use_cvode:
    # Look for the version of CVODE in the folder where it is located (part of PETSc)
    DetermineCvodeVersion('/system/software/hal/lib/PETSc/petsc-3.0.0-p12/icc-2011/include')
    # Now add the CVODE libraries to the list
    other_libraries.extend(['sundials_cvode', 'sundials_nvecserial'])
Troubleshooting
If you receive a Python error of the kind:
python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory
go into your Chaste directory and copy libpython2.7.so.1.0 into the ./lib directory:
cp /system/software/linux-x86_64/python/2.7.8/lib/libpython2.7.so.1.0 ./lib
This may not be the cleanest way of fixing this issue; other suggestions are welcome!
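When diagnosing this kind of error it helps to know whether any directory on LD_LIBRARY_PATH actually contains the library. The helper below is a sketch with a hypothetical name (find_on_ld_path). It only inspects LD_LIBRARY_PATH, whereas the dynamic loader also searches system default locations, so a miss here is a hint rather than proof.

```shell
#!/bin/bash
# Print the first directory on LD_LIBRARY_PATH that contains the named
# library file; return non-zero if no directory on the path contains it.
find_on_ld_path() {
    local lib=$1 dir
    local IFS=:   # split the path list on colons inside this function only
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}
```

For example, find_on_ld_path libpython2.7.so.1.0 prints the directory that provides the library, or returns a non-zero status if none does. An alternative to copying the file, untested here, might be to add /system/software/linux-x86_64/python/2.7.8/lib to LD_LIBRARY_PATH in your profile.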