On ARC (Advanced Research Computing) machines ARCUS and ARCUS-B

Getting onto the machine

For access to any of the ARC machines, you will need to register with Advanced Research Computing (ARC, formerly the Oxford Supercomputing Centre). The easiest way to join is to register for a user account with an existing project - take a look at the list of projects on the registration page and talk to the person responsible for the one you would like to join.

ARC runs a few training courses each year for new users. The notes for these courses are available online, and might be worth a look.

These instructions were last checked by Jochen in Spring 2015 (using cell-based Chaste on ARCUS and ARCUS-B). Changes may be required to get Chaste running on other machines.

Setting the environment

You will need SCons and the Intel compiler to compile code, and RNV so that PyCML can convert CellML files into Chaste-compatible cell models.
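
Before editing your profile you can check which of these tools are provided as environment modules. A quick sketch (the module names match those used below; the available versions may differ):

module avail scons
module avail PETSc
module avail vtk
module list    # shows what is currently loaded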

In order to compile and run Chaste tests and executables, you will need to set up the Chaste dependencies in your user profile. This is done by adding the following lines to your $HOME/.bash_profile file:

if [[ `hostname -f` = *arcus.osc.local ]]
then
    # this section contains all the commands to be run on arcus
    module load scons/2.3.4
    module load PETSc/openmpi-1.6.5/3.5_icc-2013 
    module load python/2.7
    module load vtk/5.8.0
    ### These should match with python/hostconfig/machines/arcus.py
    # Xerces
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/system/software/linux-x86_64/xerces-c/3.3.1/lib
    # Szip
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/system/software/linux-x86_64/lib/szip/2.1/lib
    # Boost
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/system/software/linux-x86_64/lib/boost/1_56_0/lib
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/include
    export CPATH=${CPATH}:/usr/include
else
    # this section contains all the commands to be run on arcusb
    module load scons/2.3.4
    module load python/2.7
    module load vtk/5.10.1
    module unload intel-compilers/2013 intel-mkl/2013
    module load PETSc/mvapich2-2.0.1/3.5_icc-2015
    module load intel-mkl/2015
    module load hdf5-parallel/1.8.14_mvapich2_intel
    # All of these are already defined in arcus_b.py so that the linker can find them, but the
    # environment variables still seem to be necessary to run the executables
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/xerces-c/3.3.1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/boost/1_56_0/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/xsd/3.3.0-1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/szip/2.1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/linux-x86_64/lib/vtk/5.10.1/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/arcus-b/lib/parmetis/4.0.3/mvapich2-2.0.1__intel-2015/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/system/software/arcus-b/lib/sundials/mvapich2-2.0.1/2.5.0/double/lib:$LD_LIBRARY_PATH
fi

# Add Chaste libraries - you may need to change this depending on where you installed (or plan to install) Chaste
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${DATA}/Chaste/lib

Note that the exact configuration depends on the cluster, hence the if statement above; it may need to be extended when we start using other ARC clusters. You should check that the modules have been loaded correctly using the module list command. Even though this configuration sets up the Chaste dependencies on both clusters (ARCUS and ARCUS-B), it is advisable not to mix the binaries generated on them, i.e. either use a different Chaste folder on each cluster or only compile and run Chaste code on one of them. At present the configuration does not set up RNV (which is required to run PyCML) or CVODE; CVODE is discussed later on this page.
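
After editing $HOME/.bash_profile you can verify the setup in a fresh login shell (or by sourcing the file). A minimal check might look like:

source $HOME/.bash_profile
module list            # should include scons, python, vtk and PETSc
which scons
echo $LD_LIBRARY_PATH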

Getting Chaste

It makes sense to download Chaste into your $DATA area, where there is ample space to store code, meshes and output.

cd $DATA
# Check out code base (takes a few minutes)
svn co https://chaste.cs.ox.ac.uk/svn/chaste/trunk Chaste --username jmpf@comlab.ox.ac.uk
#                                                         (the last bit will, of course, be your Chaste login)
# Check out a user project
svn co https://chaste.cs.ox.ac.uk/svn/chaste/projects/jmpf Chaste/projects/jmpf
#                                                    (the last parts will, of course, be your Chaste project name)
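
To bring an existing checkout up to date later, run svn update in both working copies. A minimal sketch (substitute your own project name):

cd $DATA/Chaste
svn update
cd projects/jmpf
svn update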

Compiling a test

It's important to compile only, and not to attempt to run programs, on the head node. As of r15416 the SCons build system should automatically pick up an appropriate configuration file (see python/hostconfig/machines). Be careful not to compile the same binaries on different clusters, i.e. use either only ARCUS or only ARCUS-B to compile and run Chaste. If you would like faster, optimised builds, replace build=Intel with build=IntelProduction_hpc (an example is given after the commands below).

cd $DATA/Chaste

# Compiling a simple parallel test
scons build=Intel compile_only=1 test_suite=global/test/TestPetscTools.hpp
# Compiling a PyCML test
scons b=Intel co=1 ts=heart/test/ionicmodels/TestPyCml.hpp
# Compiling a user project test
scons b=Intel co=1 ts=projects/jmpf/test/TestVtk.hpp

# Compiling the main Chaste executable
scons b=Intel co=1 exe=1 chaste_libs=1 apps
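
As an illustration of the optimised build mentioned above, the same parallel test would be compiled with:

scons build=IntelProduction_hpc compile_only=1 test_suite=global/test/TestPetscTools.hpp

Note that optimised binaries end up in a different build directory (not build/intel), so the executable paths in the run scripts below would need adjusting accordingly.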

Running code on ARCUS

Here is an example script which runs the above tests and the Chaste executable on ARCUS. Save it as, for example, run_Chaste.sh.

#!/bin/bash --login

# Name of the job 
#PBS -N TestChaste

# Use 1 node with 32 cores = 32 MPI processes
#PBS -l nodes=1:ppn=32

# Kill after one hour 
#PBS -l walltime=01:00:00

# Send me email at the beginning and the end of the run
#PBS -m be
#PBS -M your_address_not_jmpf@cs.ox.ac.uk 

# Join output and error files
#PBS -j oe

# Copy all environmental variables
#PBS -V 

# Set up MPI
cd $PBS_O_WORKDIR
##### Source the appropriate MPI setup script for the machine:
# . enable_hal_mpi.sh
. enable_arcus_mpi.sh
# Switch to Chaste directory
cd ${DATA}/Chaste

# A parallel test
mpirun $MPI_HOSTS ./global/build/intel/TestPetscToolsRunner
# A PyCML test
mpirun $MPI_HOSTS ./heart/build/intel/ionicmodels/TestPyCmlRunner
# A user project test
mpirun $MPI_HOSTS ./projects/jmpf/build/intel/TestVtkRunner

# A test of the executable
mpirun $MPI_HOSTS apps/src/Chaste apps/texttest/weekly/Propagation1d/ChasteParameters.xml

Submit the script and see the state of the queue using

qsub run_Chaste.sh
qstat

More information on the Torque job scheduler is available on the ARC website.
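
A couple of other Torque commands are often useful (the job ID below is just a placeholder):

qstat -u $USER    # show only your own jobs
qdel 123456       # cancel a job by its ID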

Running code on ARCUS-B

ARCUS-B uses a different job scheduler called SLURM. Information about using SLURM can be found at http://www.arc.ox.ac.uk/content/arcus-phase-b and http://www.arc.ox.ac.uk/content/slurm-job-scheduler.

A sample SLURM script would be

#!/bin/bash --login

# Name of the job 
#SBATCH --job-name=TestChaste

# Use 1 node with 32 cores = 32 MPI processes
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32

# Kill after one hour 
#SBATCH --time=01:00:00

# Send me email at the beginning, at the end, and on failure of the run
# (I prefer the FAIL option, which only sends email if the job gets aborted)
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_address_not_jmpf@cs.ox.ac.uk 

# Joining output and error files, copying the environment variables, and changing to the
# submission directory are all done automatically by SLURM

# Set up MPI by sourcing the appropriate script for the machine:
. enable_arcus_b_mpi.sh

# Switch to Chaste directory
cd ${DATA}/Chaste

# A parallel test
mpirun $MPI_HOSTS ./global/build/intel/TestPetscToolsRunner
# A PyCML test
mpirun $MPI_HOSTS ./heart/build/intel/ionicmodels/TestPyCmlRunner
# A user project test
mpirun $MPI_HOSTS ./projects/jmpf/build/intel/TestVtkRunner

# A test of the executable
mpirun $MPI_HOSTS apps/src/Chaste apps/texttest/weekly/Propagation1d/ChasteParameters.xml

You can submit the script and see the state of the queue using

sbatch SCRIPT_NAME.sh
squeue
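
As with Torque, a couple of extra SLURM commands can be handy (the job ID below is just a placeholder):

squeue -u $USER    # show only your own jobs
scancel 123456     # cancel a job by its ID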

Using CVODE

There are some problems with using the default config files found in python/hostconfig/machines if you want to use CVODE. To get around these problems, copy the machine configuration file to python/hostconfig/local.py (an example copy command is sketched below) and replace the CVODE section at the end of that file with the Python snippet that follows. Note: this has not been tested on ARCUS or ARCUS-B, and the exact paths probably need to be changed.
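
A possible copy step, assuming arcus.py is the relevant machine file (use arcus_b.py on ARCUS-B):

cd $DATA/Chaste
cp python/hostconfig/machines/arcus.py python/hostconfig/local.py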

    # Chaste may also optionally link against CVODE.
    # The machine config assumes CVODE is not installed; the line below now defaults use_cvode to True
    use_cvode = int(prefs.get('use-cvode', True))
    if use_cvode:
        # Look for the version of CVODE in the folder where it is located (part of PETSc)
        DetermineCvodeVersion('/system/software/hal/lib/PETSc/petsc-3.0.0-p12/icc-2011/include')
        # Now add the CVODE libraries to the list
        other_libraries.extend(['sundials_cvode', 'sundials_nvecserial'])
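
If the CVODE shared libraries cannot be found at run time, their lib directory may also need adding to LD_LIBRARY_PATH in $HOME/.bash_profile, in the same way as the other dependencies above (the ARCUS-B profile already exports a sundials path). A sketch with a placeholder path that must be replaced by the actual location on the cluster:

# Placeholder path - replace with the actual sundials/CVODE lib directory
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/path/to/sundials/lib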

Troubleshooting

If you receive a Python error of the kind

python: error while loading shared libraries: libpython2.7.so.1.0: cannot open shared object file: No such file or directory

just go into your Chaste directory and copy libpython2.7.so.1.0 into the ./lib directory:

cp /system/software/linux-x86_64/python/2.7.8/lib/libpython2.7.so.1.0 ./lib

This may not be the cleanest way of fixing this issue; other suggestions are welcome!
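
One alternative (untested) would be to add the Python library directory to LD_LIBRARY_PATH in $HOME/.bash_profile instead of copying the file:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/system/software/linux-x86_64/python/2.7.8/lib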