Troubleshooting

This page stores a list of common problems and issues that occasionally crop up, together with their solutions. Please update problems and solutions if needed, and add new problems as they crop up.

Table of Contents

  1. Troubleshooting
  2. Running tests or executables
    1. An exception occurs but the exception message is not shown
    2. The directory output was written to seems empty
    3. Semaphore error / error about semget and setnum
    4. Running an already compiled test executable manually (instead of compiling …
    5. Appear to get incorrect answers when solving a PDE
    6. Lots of HDF5 errors followed immediately by the program aborting
    7. Pure virtual method called terminate called without an active exception
    8. Error using checkpointing: "unregistered class"
  3. Scons errors
    1. No SConstruct file found
    2. IndexError: string index out of range
    3. "Found dependency cycle(s)"
  4. PyCml errors
    1. ConfigurationError: No transmembrane potential found; check your …
    2. ConfigurationError: No stimulus current found; you'll have trouble …
    3. TranslationError: Cannot convert ionic current from amps to uA/cm2
  5. Compilation and linking errors
    1. The compiler complains that a variable has not been defined when it …
    2. Get a mental, extremely long, set of compilation errors, appears to be …
    3. Undefined reference errors on linking
    4. A "comparison between signed and unsigned integer expressions" error …
    5. "Error: void <YOUR_TEST_CLASS>::<YOUR_TEST_METHOD>() is private"
    6. Compiler says c_vector or c_matrix was not declared (perhaps even if …
    7. Get a "Assertion 'petsc_is_initialised' failed" error
    8. "Fatal error; unknown error handler. May be MPI call before MPI_INIT. …
    9. Boost/serialization/vector.hpp
    10. glibc detected free(): invalid pointer
    11. Invalid application of 'sizeof' to incomplete type …
    12. "function definition does not declare parameters"
    13. "error: expected primary-expression before '>' token"
    14. Test... "does not name a type"
    15. You are trying to #include a CVODE cardiac cell and the class isn't …
    16. Compile error complaining about methods being hidden
  6. Other
    1. 'A repository hook failed' error message when trying to commit
    2. Rolling back an accidental commit

Running tests or executables

An exception occurs but the exception message is not shown

Add throw (Exception) after the method declaration in the failing test, i.e.

    void TestSomething() throw (Exception)
    {
        // test
    }

The directory output was written to seems empty

If, in a terminal, you are in an output directory, and rerun a test that involves that directory being wiped and recreated, the terminal might act as if there's nothing in the directory, i.e. an ls displays no files. Just cd out and back into the directory again.

Semaphore error / error about semget and setnum

Problem: The results of a test are summarised as a 'Semaphore error', or an 'MPI semaphore error', with the actual test result containing 'semget failed for setnum = 0', e.g.

Running 9 testssemget failed for setnum =  0

This happens occasionally due to shortcomings in MPICH. Rerun the test. If that fails, run the MPICH cleanipcs script, e.g. by doing:

~/mpi/sbin/cleanipcs

Running an already compiled test executable manually (instead of compiling and running with scons) fails

Some environment variables are set up by the SCons scripts, so you need to export them yourself:

cdchaste
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`/lib
./component/build/whatever/TestXRunner

If files are output to the current directory rather than /tmp/chaste/testoutput/, also do

export CHASTE_TEST_OUTPUT=/tmp/chaste/testoutput/

(or /tmp/whatever/testoutput).

Appear to get incorrect answers when solving a PDE

If you solve a simple PDE with Neumann boundary conditions and a known solution, but appear to get incorrect answers, you may have made one of two mistakes about what the Neumann value represents. For the PDE "du/dt = div (D grad_u) + f", where D is a matrix, the Neumann boundary condition that you specify (using BoundaryConditionsContainer and ConstBoundaryConditions etc) is the value of "(D grad_u) dot n" (where n is the unit outward facing normal), not the value of du/dn and not du/dx in 1D. In particular, in 1D, on the left-hand side n=-1 and you need to specify "-D du/dx", not just du/dx. See UserTutorials/SolvingLinearPdes for an example.
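As a minimal sketch of the 1D sign convention described above, the value to supply is D du/dx multiplied by the outward normal. The helper NeumannValue1d and the scalar diffusion coefficient are assumptions made for illustration; they are not part of the Chaste API.

```cpp
#include <stdexcept>

// Hypothetical helper, not part of Chaste: returns the Neumann value
// "(D du/dx) * n" for a 1D problem with a scalar diffusion coefficient D.
// outwardNormal must be -1 (left-hand end) or +1 (right-hand end).
double NeumannValue1d(double diffusionCoefficient, double dudx, int outwardNormal)
{
    if (outwardNormal != -1 && outwardNormal != 1)
    {
        throw std::invalid_argument("outward normal in 1D must be -1 or +1");
    }
    return diffusionCoefficient * dudx * outwardNormal;
}
```

For example, with D = 2 and du/dx = 3 at both ends, the value to specify is -6 at the left-hand end and 6 at the right-hand end, not 3.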

Lots of HDF5 errors followed immediately by the program aborting

The errors look something like:

HDF5-DIAG: Error detected in HDF5 (1.8.4) MPI-process 0:
 #000: ../../../src/H5F.c line 1954 in H5Fclose(): decrementing file ID failed
   major: Object atom
   minor: Unable to close file
 #001: ../../../src/H5F.c line 1754 in H5F_close(): can't close file
   major: File accessability
   minor: Unable to close file
 #002: ../../../src/H5F.c line 1900 in H5F_try_close(): unable to flush cache
   major: Object cache
   minor: Unable to flush data from cache
 #003: ../../../src/H5F.c line 1679 in H5F_flush(): unable to flush metadata cache
   major: Object cache
   minor: Unable to flush data from cache
 #004: ../../../src/H5AC.c line 955 in H5AC_flush(): Can't flush cache.
   major: Object cache
   minor: Unable to flush data from cache
 #005: ../../../src/H5C.c line 3966 in H5C_flush_cache(): dirty pinned entry flush failed.
   major: Object cache
   minor: Unable to flush data from cache
 #006: ../../../src/H5C.c line 10763 in H5C_flush_single_entry(): unable to flush entry
   major: Object cache
   minor: Unable to flush data from cache
 #007: ../../../src/H5Fsuper_cache.c line 733 in H5F_sblock_flush(): unable to write superblock
   major: Low-level I/O
   minor: Write failed
 #008: ../../../src/H5FDint.c line 185 in H5FD_write(): driver write request failed
   major: Virtual File Layer
   minor: Write failed
 #009: ../../../src/H5FDmpio.c line 1820 in H5FD_mpio_write(): MPI_File_write_at failed
   major: Internal error (too specific to document in detail)
   minor: Some MPI function failed
 #010: ../../../src/H5FDmpio.c line 1820 in H5FD_mpio_write(): MPI_ERR_IO: input/output error
   major: Internal error (too specific to document in detail)
   minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.4) MPI-process 0:
 #000: ../../../src/H5F.c line 1512 in H5Fopen(): unable to open file
   major: File accessability
   minor: Unable to open file
 #001: ../../../src/H5F.c line 1307 in H5F_open(): unable to read superblock
   major: File accessability
   minor: Read failed
 #002: ../../../src/H5Fsuper.c line 305 in H5F_super_read(): unable to find file signature
   major: File accessability
   minor: Not an HDF5 file
 #003: ../../../src/H5Fsuper.c line 153 in H5F_locate_signature(): unable to find a valid file signature
   major: Low-level I/O
   minor: Unable to initialize object

Chaste error: io/src/reader/Hdf5DataReader.cpp:83: Hdf5DataReader could not open <filename.h5>
*** An error occurred in MPI_Barrier
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
Abort after MPI_FINALIZE completed successfully; not able to guarantee that all other processes were killed!
HDF5: infinite loop closing library
    D,A,S,T,F,F,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,...

This seems to occur when using OpenMPI and an NFS file system. It’s not specific to Chaste (or even to HDF5), but is due to a bug in OpenMPI (see e.g.  https://bugzilla.redhat.com/show_bug.cgi?id=712300). We haven't yet found a workaround. Probably your HDF5 output will be ok when running on a single process, but won’t work at all in parallel.

Pure virtual method called terminate called without an active exception

If you get this when running a test:

pure virtual method called 
terminate called without an active exception
(possibly followed by loads of errors)

there are two main causes. If you are using Boost versions 1.44-1.46.1 and the error occurs after the test has finished, then a Boost bug is the likely cause. The bug is harmless: check whether the test did actually pass. If so, you can either ignore these messages or try installing a different version of Boost.

If not, then it is (probably) because you are trying to call a virtual method from inside an abstract class constructor. For example, doing

ConcreteClass(arg) : AbstractClass(arg)
{
}

AbstractClass(arg)
{
    AbstractClass::Method(arg); // which in turn calls some virtual methods
}

will lead to the error.

Really you shouldn't call virtual methods, even indirectly, from a constructor, as the object may not be in a fully constructed state and you may get unexpected behaviour even if the error above isn't seen (see below for technical details). Hence the best solution is to change your design so that this isn't necessary. However, if you know that there's never going to be a class derived from yours then you can do the following, although it's not robust to future code changes:

ConcreteClass(arg) : AbstractClass(arg)
{
    AbstractClass::Method(arg); // which in turn calls some virtual methods
}

AbstractClass(arg)
{
}

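Technical details on why this happens: while a base class constructor is running, the derived part of the object has not been constructed yet, so the object behaves as an instance of the base class. A virtual call made at that point resolves to the base class version of the method, never to the derived class override. If the base class version is pure virtual there is nothing to call, and with GCC the runtime aborts with the message above.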
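The dispatch behaviour during construction can be seen in a small sketch. The classes below are hypothetical and use a non-pure virtual method so that the program does not abort; with a pure virtual method the same call pattern would typically produce the error discussed in this section.

```cpp
#include <string>

// Hypothetical classes for illustration only.
class DemoBase
{
public:
    DemoBase()
    {
        // While this constructor runs the object is still a DemoBase,
        // so this resolves to DemoBase::Name(), never DemoDerived::Name().
        mNameSeenInConstructor = Name();
    }

    virtual ~DemoBase() {}

    virtual std::string Name() const
    {
        return "base";
    }

    // Safer pattern: call virtual methods once construction has finished.
    std::string NameAfterConstruction() const
    {
        return Name();
    }

    std::string mNameSeenInConstructor;
};

class DemoDerived : public DemoBase
{
public:
    virtual std::string Name() const
    {
        return "derived";
    }
};
```

Constructing a DemoDerived leaves mNameSeenInConstructor equal to "base", whereas NameAfterConstruction() returns "derived". Moving virtual calls out of constructors and into a separate method called after construction avoids the problem.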

Error using checkpointing: "unregistered class"

This error arises when trying to load (or sometimes save) an archive containing an object of a class that hasn't been "seen" by the serialization system. Unfortunately the serialization library doesn't tell you in the error which class is the culprit, so you have to make an educated guess. There are two common cases.

The first case occurs most often if you are saving and loading a simulation in different source or test files. Ensure that you have the same list of includes in both cases, so that all classes "seen" when saving are also "seen" when loading.

On older versions of Boost (before 1.37) this can also arise if you are using a "non-standard" cell model in a cardiac simulation, i.e. one that is hard-coded but not one of those available via the XML configuration file. To resolve this you unfortunately have to include the header for the relevant cell model in CardiacSimulationArchiver.cpp; after the inclusion of HeartConfigRelatedCellFactory.hpp is a sensible location.

See also the Boost Serialization guide.

Scons errors

No SConstruct file found

The following error

scons: *** No SConstruct file found.
File "/usr/lib/scons/SCons/Script/Main.py", line 825, in _main

is, as it says, due to no SConstruct file being found, probably because you are in the wrong directory. You should run scons from the main Chaste directory.

IndexError: string index out of range

If you get the following error

scons  test_suite=heart/test/bidomain/TestBidomainProblem.hpp 
IndexError: string index out of range:
  File "/usr/lib/scons/SCons/Script/Main.py", line 1171:
    _exec_main(parser, values)
  File "/usr/lib/scons/SCons/Script/Main.py", line 1144:
    _main(parser)
  File "/usr/lib/scons/SCons/Script/Main.py", line 880:
    if a[0] == '-':

it could be because of the two spaces (!!!) between "scons" and "test_suite=" on the command line (only an issue if running scons through Eclipse).

"Found dependency cycle(s)"

scons: done building targets.

scons: *** Found dependency cycle(s):
  Internal Error: no cycle found for node global/build/debug/src/Version.o (<SCons.Node.FS.File instance at 0xdacc20>) in state executed

File "/usr/lib64/python2.5/site-packages/SCons/Taskmaster.py", line 797, in cleanup

If scons reports that it has found a dependency cycle, it probably has conflicting versions of things lying around. Cleaning the build (scons -c) and then rebuilding seems to sort it out.

PyCml errors

ConfigurationError: No transmembrane potential found; check your configuration

This occurs when PyCml cannot determine which variable in the model represents the transmembrane potential. You can either add a for_model stanza to the PyCml config file or, better, annotate the CellML file to specify the variable. See ChasteGuides/CodeGenerationFromCellML for more information on how to do this, in particular the section on Model annotation with RDF.

ConfigurationError: No stimulus current found; you'll have trouble generating Chaste code

PyCml needs to know which variable in the model represents the stimulus current in order to replace the model's stimulus with the one defined by Chaste. You should annotate the relevant variable as described in ChasteGuides/CodeGenerationFromCellML (see the section on model annotation). If the model doesn't have a stimulus (e.g. a sino-atrial node model) then annotate the model itself accordingly.

TranslationError: Cannot convert ionic current from amps to uA/cm2 without knowing which variable in the cell model represents the membrane capacitance

Some cell models give ionic currents in units of amps (or microamps etc.) whereas Chaste expects amps normalised by area (microamps per centimetre squared). In order to convert between the two, PyCml needs to use the membrane capacitance from the model, and hence needs to know which variable represents this. You should annotate the relevant variable as described in ChasteGuides/CodeGenerationFromCellML (see the section on model annotation).

Compilation and linking errors

The compiler complains that a variable has not been defined when it clearly has been, in a base class

If class A and class B are both templated, and B<DIM> inherits from A<DIM>, then the compiler won't be able to locate variables defined in the base class if used in the child unless the base is specified. The solution is to write this-> before the variable in question. i.e.

#include <iostream>

template<int DIM>
class A
{
public:
  double x;
};

template<int DIM>
class B : public A<DIM>
{
  void run()
  {
    std::cout << x; // error: x is not looked up in the templated base class
  }
};

won't compile, whereas

#include <iostream>

template<int DIM>
class A
{
public:
  double x;
};

template<int DIM>
class B : public A<DIM>
{
  void run()
  {
    std::cout << this->x; // this-> defers the lookup until A<DIM> is known
  }
};

will do. See AccessingMemberVariablesInTemplatedSuperclasses.
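An alternative to writing this-> at every use is a using-declaration in the child class, which makes the inherited name visible once. The sketch below shows both forms side by side. The class names DemoParent and DemoChild are hypothetical and chosen only for this illustration.

```cpp
// Hypothetical classes for illustration only.
template<int DIM>
class DemoParent
{
public:
    DemoParent() : x(0.0) {}
    double x;
};

template<int DIM>
class DemoChild : public DemoParent<DIM>
{
public:
    // Makes the inherited name visible, so plain "x" can be used below.
    using DemoParent<DIM>::x;

    double GetViaUsing() const
    {
        return x;
    }

    double GetViaThis() const
    {
        return this->x;
    }
};
```

Both accessors return the same member. The using-declaration is convenient when a variable is used many times, while this-> keeps the dependency on the base class visible at each use.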

Get a mental, extremely long, set of compilation errors, appears to be something to do with Boost (mpl or ublas)

Try putting

#include "UblasIncludes.hpp"

as the first include in the source file being compiled (generally either a cpp file or the test being run).

It may be that the ublas include needs to come after any serialization include; I'm not sure.

Undefined reference errors on linking

Problem: Undefined reference errors on linking, such as:

undefined reference to `Node<1u>::AddElement(unsigned int)'

This may mean that you are using a templated class which has been written using Explicit Instantiation, and you are using a value for one of the template parameters for which there isn't an instantiation. To reduce compilation time, not all classes are instantiated for all dimensions. Look at the bottom of the cpp file for the class in question (e.g. Node.cpp in the example above) and check whether there is a line like

template class Node<1>;

If there is no such line for the dimension you need, add one.

Note that for some classes (e.g. BoundaryConditionsContainer) the situation is more complex, since we don't just template over dimension. In such cases there should be a file named like BoundaryConditionsContainerImplementation.hpp which you can include, either in the hpp file (if you do not provide a cpp) or in your cpp file (and add further explicit instantiations there). See StaticAndDynamicPolymorphism#Explicitinstantiation for more info.
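As a minimal single-file sketch of the explicit instantiation pattern: DemoNode is a hypothetical class, not Chaste's Node, and in real code the declaration and the definitions would be split between an hpp file and a cpp file as the comments indicate.

```cpp
// Normally in the hpp file: declarations only.
template<unsigned DIM>
class DemoNode
{
public:
    unsigned GetDimension() const;
};

// Normally in the cpp file: the definitions...
template<unsigned DIM>
unsigned DemoNode<DIM>::GetDimension() const
{
    return DIM;
}

// ...followed by one explicit instantiation per supported dimension.
// Code in another file that uses DemoNode<4> would fail to link with an
// "undefined reference" error until a matching line is added here.
template class DemoNode<1>;
template class DemoNode<2>;
template class DemoNode<3>;
```

Because other source files only see the declarations, the compiler cannot generate the missing code for them. Only the dimensions listed at the bottom of the cpp file exist at link time.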

A "comparison between signed and unsigned integer expressions" error (possibly) in a file in the CxxTest folder

The following error (or actually, warning, which is then taken as an error)

cxxtest/cxxtest/TestSuite.h: In function 'bool CxxTest::equals(X, Y) [with X = long unsigned int, Y = int]':
cxxtest/cxxtest/TestSuite.h:58:   instantiated from 'void CxxTest::doAssertEquals(const char*, unsigned int, const char*, X, const char*, Y, const char*) [with X = long unsigned int, Y = int]'
.<FILE_NAME>:<LINE_NUMBER>   instantiated from here
cxxtest/cxxtest/TestSuite.h:49: warning: comparison between signed and unsigned integer expressions

is usually due to a comparison between an unsigned variable and a hardcoded number (taken as an int by the compiler) in a TS_ASSERT_EQUALS, for example

unsigned my_var = 10;
TS_ASSERT_EQUALS(my_var, 10);

To tell the compiler to treat the (second) 10 as an unsigned, do the following

TS_ASSERT_EQUALS(my_var, 10u);

"Error: void <YOUR_TEST_CLASS>::<YOUR_TEST_METHOD>() is private"

Remember all test methods need to be declared as public.

Compiler says c_vector or c_matrix was not declared (perhaps even if you are including <boost/numeric/ublas/matrix.hpp>)

These are ublas vectors and matrices. Do

#include "UblasIncludes.hpp"

Note that it is not enough to do

#include <boost/numeric/ublas/matrix.hpp>

as then you would have to write boost::numeric::ublas::c_matrix<double,2,2> instead of c_matrix<double,2,2>.

Get a "Assertion 'petsc_is_initialised' failed" error

If the following is printed

global/src/DistributedVector.cpp:47: static void DistributedVector::CheckForPetsc(): Assertion `petsc_is_initialised' failed.

it is probably because you have forgotten to do

#include "PetscSetupAndFinalize.hpp"

in your test.

"Fatal error; unknown error handler. May be MPI call before MPI_INIT. Error message is MPI_COMM_RANK and code is 197"

You may have included PetscSetupAndFinalize.hpp in a source file. It should only be included in test files (and must be included in all test files that use PETSc).

Boost/serialization/vector.hpp

Note: this error should no longer occur, as we include a workaround within Chaste.

(#1024) If you see this:

/usr/include/boost/serialization/vector.hpp:126: error: redefinition of 'struct boost::serialization::implementation_level<std::vector<long int, std::allocator<long int> > >'
/usr/include/boost/serialization/vector.hpp:126: error: previous definition of 'struct boost::serialization::implementation_level<std::vector<long int, std::allocator<long int> > >'
/usr/include/boost/serialization/vector.hpp:126: error: redefinition of 'struct boost::serialization::implementation_level<std::vector<long unsigned int, std::allocator<long unsigned int> > >'
/usr/include/boost/serialization/vector.hpp:126: error: previous definition of 'struct boost::serialization::implementation_level<std::vector<long unsigned int, std::allocator<long unsigned int> > >'
scons: building terminated because of errors.

put

#include <climits>

before

#include "boost/serialization/vector.hpp"

glibc detected free(): invalid pointer

If you get an error starting something like:

** glibc detected *** heart/build/debug/mechanics/TestCardiacElectroMechanicsProblemRunner: free(): invalid pointer: 0x00000000019ff758 ***
======= Backtrace: =========
/lib/libc.so.6[0x7f32485ae08a]
/lib/libc.so.6(cfree+0x8c)[0x7f32485b1c1c]

then it may be because you haven't included a virtual destructor in the base class of your inheritance hierarchy, and you're deleting a derived class via a pointer of base class type. Another situation that may provoke this error is when you make use of CellwiseData but forget to call ReallocateMemory().
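The virtual destructor case can be sketched as follows. The classes DemoShape and DemoCircle are hypothetical. The flag is there only so that the effect of the virtual destructor can be observed.

```cpp
// Hypothetical classes for illustration only.
class DemoShape
{
public:
    // Without "virtual" here, deleting a DemoCircle through a DemoShape*
    // is undefined behaviour.
    virtual ~DemoShape() {}
};

class DemoCircle : public DemoShape
{
public:
    explicit DemoCircle(bool* pDestroyedFlag)
        : mpDestroyedFlag(pDestroyedFlag)
    {
    }

    ~DemoCircle()
    {
        *mpDestroyedFlag = true; // records that the derived destructor ran
    }

private:
    bool* mpDestroyedFlag;
};

// Creates a derived object and deletes it through a base class pointer.
// Returns true if the derived class destructor was run.
bool DeleteThroughBasePointer()
{
    bool derived_destroyed = false;
    DemoShape* p_shape = new DemoCircle(&derived_destroyed);
    delete p_shape;
    return derived_destroyed;
}
```

With the virtual destructor in place, DeleteThroughBasePointer() returns true. If the virtual keyword is removed from the base class, the delete has undefined behaviour, of which an invalid pointer error from glibc is one possible symptom.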

Invalid application of 'sizeof' to incomplete type 'boost::STATIC_ASSERTION_FAILURE<false>'

If you get this error

/usr/include/boost/archive/detail/oserializer.hpp:566: error: invalid application of 'sizeof' to incomplete type 'boost::STATIC_ASSERTION_FAILURE<false>' 

then you need to archive a const pointer to your object instead of just a pointer, e.g. instead of

Electrodes<3>* p_electrodes = new Electrodes<3>(mesh,false,1,0,10,magnitude,duration);
output_arch << p_electrodes;

put

Electrodes<3>* const p_electrodes = new Electrodes<3>(mesh,false,1,0,10,magnitude,duration);
output_arch << p_electrodes;

"function definition does not declare parameters"

If you get this sort of error:

cell_based/src/tissue/cell/TissueCell.cpp:172: error: function definition does not declare parameters
scons: *** [cell_based/build/debug/src/tissue/cell/TissueCell.os] Error 1
scons: building terminated because of errors.

then the relevant function definition is incorrectly typed. Check you have two colons, e.g. TissueCell::AddCell().

"error: expected primary-expression before '>' token"

If you are trying to call a method that is templated, like this:

if (cell_iter->rGetCellPropertyCollection().HasProperty<CellLabel>())

but get this sort of error:

error: expected primary-expression before '>' token
error: expected primary-expression before ')' token

then this may be because the compiler doesn't know that the method is a template itself, and so parses the < character as "less than" and gets confused later on. In this case, you need to explicitly tell the compiler that the method is a template so that it parses < as the opening bracket of a template parameter list, like this:

if (cell_iter->rGetCellPropertyCollection().template HasProperty<CellLabel>())
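The keyword is needed when the type of the object on which the method is called itself depends on a template parameter. A self-contained sketch of that situation is given below. DemoLabel, DemoCollection and HasLabel are hypothetical names standing in for the Chaste classes above.

```cpp
// Hypothetical classes for illustration only.
struct DemoLabel
{
};

class DemoCollection
{
public:
    // A member template, like HasProperty<CellLabel>() in the text above.
    template<class PROPERTY>
    bool HasProperty() const
    {
        return true;
    }
};

template<class COLLECTION>
bool HasLabel(const COLLECTION& rCollection)
{
    // The type of rCollection depends on a template parameter, so the
    // compiler needs "template" to parse "<" as the start of a template
    // argument list rather than as "less than".
    return rCollection.template HasProperty<DemoLabel>();
}
```

Calling HasLabel with a DemoCollection compiles and returns true. Removing the template keyword from the return statement would be expected to give errors of the kind shown above.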

Test... "does not name a type"

If you get an error like this:

projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp:23:8: error: 'TestSensitivityAnalysis' does not name a type
projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp:26:178: error: 'suite_TestSensitivityAnalysis' was not declared in this scope
projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp: In member function 'virtual void TestDescription_TestSensitivityAnalysis_TestSensitivityAnalysisRunner::runTest()':
projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp:31:19: error: 'suite_TestSensitivityAnalysis' was not declared in this scope
scons: *** [projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.o] Error 1

it probably means that the file CxxTest is trying to use is empty. This can happen if, for instance, the whole file is in a block surrounded by

#ifdef CHASTE_CVODE

#endif

and CHASTE_CVODE is not defined. You will need to make the guard true (in this case by enabling CVODE in the hostconfig file), or rewrite the code so that it isn't a problem.
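One possible way to rewrite the code is to move the guard inside the test method, so that the test class always exists. This is only a sketch: the macro DEMO_OPTIONAL_FEATURE is a hypothetical stand-in for CHASTE_CVODE, and a plain class stands in for a CxxTest suite.

```cpp
// The class is always declared, so a generated runner can always name it.
class DemoOptionalSuite
{
public:
    // Returns true if the optional body was compiled in, false otherwise.
    bool TestOptionalFeature()
    {
#ifdef DEMO_OPTIONAL_FEATURE
        // Code that needs the optional library would go here.
        return true;
#else
        // The optional library is not available: the test does nothing.
        return false;
#endif
    }
};
```

When DEMO_OPTIONAL_FEATURE is not defined the method still compiles and simply reports that the body was skipped, instead of the whole file becoming empty.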

You are trying to #include a CVODE cardiac cell and the class isn't recognised

i.e. compilation fails with the usual 'does not name a type' kind of error that happens when you e.g. mis-spell a class name.

This will happen if you try to use a CVODE cell and your hostconfig doesn't have CVODE (which is optional) set up. Run scons ts=global/test/TestChasteBuildInfo.hpp to see whether SUNDIALS has a version number. If so, CVODE is installed and linked to Chaste; if not, you will need to follow InstallCvode.

Compile error complaining about methods being hidden

E.g. that virtual void ParentClass::Foo() was hidden by virtual void ChildClass::Foo()

Make sure that the Foo()s in each class have exactly the same signature (return types, constness, parameters etc.).
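A sketch of a matching pair of signatures is given below, using the placeholder names ParentClass and ChildClass from the message above. The parameter and return types are made up for illustration, and the override keyword assumes a compiler with C++11 support.

```cpp
// Placeholder classes for illustration only.
class ParentClass
{
public:
    virtual ~ParentClass() {}

    virtual int Foo(unsigned index) const
    {
        return static_cast<int>(index);
    }
};

class ChildClass : public ParentClass
{
public:
    // Same return type, parameter types and constness as ParentClass::Foo,
    // so this overrides it. Dropping "const" or changing "unsigned" to "int"
    // would instead declare a new method that hides the parent's one, and
    // "override" would then turn that mistake into a compile error.
    int Foo(unsigned index) const override
    {
        return static_cast<int>(index) + 1;
    }
};
```

Calling Foo(1) through a reference of type ParentClass that refers to a ChildClass object returns 2, which confirms that the child method overrides the parent one and does not hide it.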

Other

'A repository hook failed' error message when trying to commit

If, when you try to commit changes to the repository, SVN gives the error message

A repository hook failed
svn: Commit failed (details follow):
svn: 'pre-commit' hook failed with error output:
Branch or project commit with no indication in log message

this is because, when committing to your project (or a branch), you must start the commit message with the project (or branch) name in square brackets, e.g. [JohnD] implemented ...

Rolling back an accidental commit

Suppose that changeset r21486 was committed accidentally, and should be rolled back. This can be done using the commands

svn merge -c -21486 .
svn ci -m 'Rolling back r21486'