Troubleshooting
This page stores a list of common problems and issues that occasionally crop up, together with their solutions. Please update problems and solutions if needed, and add new problems as they arise.
Table of Contents
- Troubleshooting
- Running tests or executables
- An exception occurs but the exception message is not shown
- The directory output was written to seems empty
- Semaphore error / error about semget and setnum
- Running an already compiled test executable manually (instead of compiling …
- Appear to get incorrect answers when solving a PDE
- Lots of HDF5 errors followed immediately by the program aborting
- Pure virtual method called terminate called without an active exception
- Error using checkpointing: "unregistered class"
- Scons errors
- PyCml errors
- Compilation and linking errors
- The compiler complains that a variable has not been defined when it …
- Get a mental, extremely long, set of compilation errors, appears to be …
- Undefined reference errors on linking
- A "comparison between signed and unsigned integer expressions" error …
- "Error: void <YOUR_TEST_CLASS>::<YOUR_TEST_METHOD>() is private"
- Compiler says c_vector or c_matrix was not declared (perhaps even if …
- Get a "Assertion 'petsc_is_initialised' failed" error
- "Fatal error; unknown error handler. May be MPI call before MPI_INIT. …
- Boost/serialization/vector.hpp
- glibc detected free(): invalid pointer
- Invalid application of 'sizeof' to incomplete type …
- "function definition does not declare parameters"
- "error: expected primary-expression before '>' token"
- Test... "does not name a type"
- You are trying to #include a CVODE cardiac cell and the class isn't …
- Compile error complaining about methods being hidden
- Other
Running tests or executables
An exception occurs but the exception message is not shown
Add throw (Exception) after the method declaration in the failing test, i.e.
void TestSomething() throw (Exception)
{
    // test
}
The directory output was written to seems empty
If, in a terminal, you are in an output directory, and rerun a test that involves that directory being wiped and recreated, the terminal might act as if there's nothing in the directory, i.e. an ls displays no files. Just cd out and back into the directory again.
Semaphore error / error about semget and setnum
Problem: The results of a test are summarised as a 'Semaphore error', or an 'MPI semaphore error', with the actual test result containing 'semget failed for setnum = 0', e.g.
Running 9 testssemget failed for setnum = 0
This happens occasionally due to shortcomings in MPICH. Rerun the test. If that fails run the MPICH cleanipcs script, e.g. by doing:
~/mpi/sbin/cleanipcs
Running an already compiled test executable manually (instead of compiling and running with scons) fails
Some environment variables are set up by the SCons scripts, so you need to export them yourself when running the executable directly.
cd chaste
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:`pwd`/lib
./component/build/whatever/TestXRunner
If files are output to the current directory rather than /tmp/chaste/testoutput/, also do
export CHASTE_TEST_OUTPUT=/tmp/chaste/testoutput/
(or /tmp/whatever/testoutput).
Appear to get incorrect answers when solving a PDE
If you try to solve a simple PDE with Neumann boundary conditions and a known solution, but appear to get incorrect answers, check how you have specified the boundary condition. For the PDE "du/dt = div (D grad_u) + f" (or its steady-state, elliptic counterpart), where D is a matrix, the Neumann boundary condition that you specify (using BoundaryConditionsContainer and ConstBoundaryConditions etc) is the value of "(D grad_u) dot n" (where n is the unit outward-facing normal), not the value of du/dn and not du/dx in 1D. In particular, in 1D, on the left-hand side n=-1 and you need to specify "-D du/dx", not just du/dx. See UserTutorials/SolvingLinearPdes for an example.
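The sign convention can be illustrated with a small stand-alone sketch. NeumannValue1d is a hypothetical helper written for this page, not part of Chaste; it only shows which number you would pass to the boundary condition object in 1D, assuming a scalar diffusion coefficient.

```cpp
#include <cassert>

// Hypothetical helper (not a Chaste function): returns the value to specify
// as the 1D Neumann boundary condition, i.e. (D du/dx) * n, where the unit
// outward normal n is -1 at the left-hand end and +1 at the right-hand end.
double NeumannValue1d(double diffusionCoefficient, double dudx, bool isLeftBoundary)
{
    const double outward_normal = isLeftBoundary ? -1.0 : 1.0;
    return diffusionCoefficient * dudx * outward_normal;
}

// Usage sketch: with D = 2 and du/dx = 3 at both ends, the values to
// specify differ in sign between the two ends.
void CheckNeumannValue1d()
{
    assert(NeumannValue1d(2.0, 3.0, true) == -6.0);  // left end: -D du/dx
    assert(NeumannValue1d(2.0, 3.0, false) == 6.0);  // right end: +D du/dx
}
```

Note that neither value equals du/dx itself, which is the mistake described above.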
Lots of HDF5 errors followed immediately by the program aborting
The errors look something like:
HDF5-DIAG: Error detected in HDF5 (1.8.4) MPI-process 0:
  #000: ../../../src/H5F.c line 1954 in H5Fclose(): decrementing file ID failed
    major: Object atom
    minor: Unable to close file
  #001: ../../../src/H5F.c line 1754 in H5F_close(): can't close file
    major: File accessability
    minor: Unable to close file
  #002: ../../../src/H5F.c line 1900 in H5F_try_close(): unable to flush cache
    major: Object cache
    minor: Unable to flush data from cache
  #003: ../../../src/H5F.c line 1679 in H5F_flush(): unable to flush metadata cache
    major: Object cache
    minor: Unable to flush data from cache
  #004: ../../../src/H5AC.c line 955 in H5AC_flush(): Can't flush cache.
    major: Object cache
    minor: Unable to flush data from cache
  #005: ../../../src/H5C.c line 3966 in H5C_flush_cache(): dirty pinned entry flush failed.
    major: Object cache
    minor: Unable to flush data from cache
  #006: ../../../src/H5C.c line 10763 in H5C_flush_single_entry(): unable to flush entry
    major: Object cache
    minor: Unable to flush data from cache
  #007: ../../../src/H5Fsuper_cache.c line 733 in H5F_sblock_flush(): unable to write superblock
    major: Low-level I/O
    minor: Write failed
  #008: ../../../src/H5FDint.c line 185 in H5FD_write(): driver write request failed
    major: Virtual File Layer
    minor: Write failed
  #009: ../../../src/H5FDmpio.c line 1820 in H5FD_mpio_write(): MPI_File_write_at failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #010: ../../../src/H5FDmpio.c line 1820 in H5FD_mpio_write(): MPI_ERR_IO: input/output error
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.4) MPI-process 0:
  #000: ../../../src/H5F.c line 1512 in H5Fopen(): unable to open file
    major: File accessability
    minor: Unable to open file
  #001: ../../../src/H5F.c line 1307 in H5F_open(): unable to read superblock
    major: File accessability
    minor: Read failed
  #002: ../../../src/H5Fsuper.c line 305 in H5F_super_read(): unable to find file signature
    major: File accessability
    minor: Not an HDF5 file
  #003: ../../../src/H5Fsuper.c line 153 in H5F_locate_signature(): unable to find a valid file signature
    major: Low-level I/O
    minor: Unable to initialize object
Chaste error: io/src/reader/Hdf5DataReader.cpp:83: Hdf5DataReader could not open <filename.h5>
*** An error occurred in MPI_Barrier
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
Abort after MPI_FINALIZE completed successfully; not able to guarantee that all other processes were killed!
HDF5: infinite loop closing library
  D,A,S,T,F,F,AC,FD,P,FD,P,FD,P,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,FD,...
This seems to occur when using OpenMPI and an NFS file system. It’s not specific to Chaste (or even to HDF5), but is due to a bug in OpenMPI (see e.g. https://bugzilla.redhat.com/show_bug.cgi?id=712300). We haven't yet found a workaround. Probably your HDF5 output will be ok when running on a single process, but won’t work at all in parallel.
Pure virtual method called terminate called without an active exception
If you get this when running a test:
pure virtual method called
terminate called without an active exception
(possibly followed by loads of errors)
there are two main causes. If you are using Boost versions 1.44-1.46.1, and the error occurs after the test has finished, then a Boost bug is the likely cause. The bug is harmless, so check whether the test did actually pass. If so, you can either ignore these messages or try installing a different version of Boost.
If not, then it is (probably) because you are trying to call a virtual method from inside an abstract class constructor. For example, doing
ConcreteClass::ConcreteClass(arg)
    : AbstractClass(arg)
{
}

AbstractClass::AbstractClass(arg)
{
    AbstractClass::Method(arg); // which in turn calls some virtual methods
}
will lead to the error.
Really you shouldn't call virtual methods, even indirectly, from a constructor, as the object may not be in a fully constructed state and you may get unexpected behaviour even if the error above isn't seen (see below for technical details). Hence the best solution is to change your design so that this isn't necessary. However, if you know that there's never going to be a class derived from yours then you can do the following, although it's not robust to future code changes:
ConcreteClass::ConcreteClass(arg)
    : AbstractClass(arg)
{
    AbstractClass::Method(arg); // which in turn calls some virtual methods
}

AbstractClass::AbstractClass(arg)
{
}
Technical details on why this happens:
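- In what follows, Base plays the role of AbstractClass, Derived the role of ConcreteClass, and f() is a pure virtual method of Base. The problem in the code above is that the compiler will allow you to instantiate an object of type Derived (as it is not abstract: all virtual methods are implemented).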
- The construction of a class starts with the construction of all the bases, in this case Base.
- The compiler will generate the virtual method table for type Base, where the entry for f() is 0 (not implemented in base).
- The code in the Base constructor is then executed.
- After the Base part has completely been constructed, construction of the Derived element part starts.
- The compiler will change the virtual table so that the entry for f() points to Derived::f().
- If you try calling the method f() while still constructing Base, the entry in the virtual method table is still null and the application crashes.
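The ordering described above can be observed safely with a virtual method that is not pure. The following is a minimal sketch using made-up classes Base and Derived (not Chaste classes); it assumes nothing beyond the standard library.

```cpp
#include <cassert>
#include <string>

class Base
{
public:
    Base()
    {
        // While Base's constructor runs, the object is still "a Base", so
        // this call resolves to Base::Name(), never to Derived::Name().
        mNameSeenInConstructor = Name();
    }
    virtual ~Base() {}
    virtual std::string Name() const { return "Base"; }
    std::string NameSeenInConstructor() const { return mNameSeenInConstructor; }

private:
    std::string mNameSeenInConstructor;
};

class Derived : public Base
{
public:
    virtual std::string Name() const { return "Derived"; }
};

// Usage sketch: the override is only used once construction has finished.
void CheckDispatchDuringConstruction()
{
    Derived derived;
    assert(derived.NameSeenInConstructor() == "Base");
    assert(derived.Name() == "Derived");
}
```

If Name() were pure virtual in Base, there would be no Base version to fall back on, which is the situation that produces the "pure virtual method called" message.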
Error using checkpointing: "unregistered class"
This error arises when trying to load (or sometimes save) an archive containing an object of a class that hasn't been "seen" by the serialization system. Unfortunately the serialization library doesn't tell you in the error which class is the culprit, so you have to make an educated guess. There are two common cases.
The first case occurs most often if you are saving and loading a simulation in different source or test files. Ensure that you have the same list of includes in both cases, so that all classes "seen" when saving are also "seen" when loading.
On older versions of Boost (before 1.37) this can also arise if you are using a "non-standard" cell model in a cardiac simulation, i.e. one that is hard-coded but not one of those available via the XML configuration file. To resolve this you unfortunately have to include the header for the relevant cell model in CardiacSimulationArchiver.cpp; after the inclusion of HeartConfigRelatedCellFactory.hpp is a sensible location.
See also the Boost Serialization guide.
Scons errors
No SConstruct file found
The following error
scons: *** No SConstruct file found.
File "/usr/lib/scons/SCons/Script/Main.py", line 825, in _main
occurs, as it says, because no SConstruct file was found, probably because you are in the wrong directory. You should run scons from the main Chaste directory.
IndexError: string index out of range
If you get the following error
scons  test_suite=heart/test/bidomain/TestBidomainProblem.hpp
IndexError: string index out of range:
  File "/usr/lib/scons/SCons/Script/Main.py", line 1171:
    _exec_main(parser, values)
  File "/usr/lib/scons/SCons/Script/Main.py", line 1144:
    _main(parser)
  File "/usr/lib/scons/SCons/Script/Main.py", line 880:
    if a[0] == '-':
it could be because of the two spaces (!!!) between "scons" and "test_suite=" in "scons  test_suite=heart/test/bidomain/TestBidomainProblem.hpp" (only an issue if running scons through Eclipse).
"Found dependency cycle(s)"
scons: done building targets.
scons: *** Found dependency cycle(s):
Internal Error: no cycle found for node global/build/debug/src/Version.o (<SCons.Node.FS.File instance at 0xdacc20>) in state executed
File "/usr/lib64/python2.5/site-packages/SCons/Taskmaster.py", line 797, in cleanup
If scons reports that it has found a dependency cycle, it probably has conflicting versions of build files lying around. Cleaning the build (scons -c) and then rebuilding seems to sort it out.
PyCml errors
ConfigurationError: No transmembrane potential found; check your configuration
This occurs when PyCml cannot determine which variable in the model represents the transmembrane potential. You can either add a for_model stanza to the PyCml config file or, better, annotate the CellML file to specify the variable. See ChasteGuides/CodeGenerationFromCellML for more information on how to do this, in particular the section on Model annotation with RDF.
ConfigurationError: No stimulus current found; you'll have trouble generating Chaste code
PyCml needs to know which variable in the model represents the stimulus current in order to replace the model's stimulus with the one defined by Chaste. You should annotate the relevant variable as described in ChasteGuides/CodeGenerationFromCellML. If the model doesn't have a stimulus (e.g. a sino-atrial node model) then annotate the model itself accordingly.
TranslationError: Cannot convert ionic current from amps to uA/cm2 without knowing which variable in the cell model represents the membrane capacitance
Some cell models give ionic currents in units of amps (or microamps etc.) whereas Chaste expects amps normalised by area (microamps per centimetre squared). In order to convert between the two, PyCml needs to use the membrane capacitance from the model, and hence needs to know which variable represents this. You should annotate the relevant variable as described in ChasteGuides/CodeGenerationFromCellML.
Compilation and linking errors
The compiler complains that a variable has not been defined when it clearly has been, in a base class
If class A and class B are both templated, and B<DIM> inherits from A<DIM>, then the compiler won't be able to locate variables defined in the base class if used in the child unless the base is specified. The solution is to write this-> before the variable in question. i.e.
#include <iostream>

template<int DIM>
class A
{
public:
    double x;
};

template<int DIM>
class B : public A<DIM>
{
    void run()
    {
        std::cout << x; // error: x is not looked up in the templated base
    }
};
won't compile, whereas
#include <iostream>

template<int DIM>
class A
{
public:
    double x;
};

template<int DIM>
class B : public A<DIM>
{
    void run()
    {
        std::cout << this->x; // this-> makes the name dependent on A<DIM>
    }
};
will do. See AccessingMemberVariablesInTemplatedSuperclasses.
Get a mental, extremely long, set of compilation errors, appears to be something to do with Boost (mpl or ublas)
Try putting
#include "UblasIncludes.hpp"
as the first include in the source file being compiled (generally either a cpp file or the test being run).
It may be that the ublas include needs to come after any serialization include; I'm not sure.
Undefined reference errors on linking
Problem: Undefined reference errors on linking, such as:
undefined reference to `Node<1u>::AddElement(unsigned int)'
This may mean that you are using a templated class which has been written using Explicit Instantiation, and you are using a value for one of the template parameters for which there isn't an instantiation. Not all classes are instantiated for all dimensions, in order to reduce compilation time. Look at the bottom of the cpp file for the class in question (e.g. Node.cpp in the example above) and check whether there is a line like
template class Node<1>;
for the dimension you need. If not, add the missing dimension.
Note that for some classes (e.g. BoundaryConditionsContainer) the situation is more complex, since we don't just template over dimension. In such cases there should be a file named like BoundaryConditionsContainerImplementation.hpp which you can include, either in the hpp file (if you do not provide a cpp) or in your cpp file (and add further explicit instantiations there). See StaticAndDynamicPolymorphism#Explicitinstantiation for more info.
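As a rough illustration of the pattern, here is a minimal single-file sketch. Point is a made-up class, not a Chaste class, and the comments mark which parts would normally live in the hpp and cpp files.

```cpp
#include <cassert>

// Would normally be in Point.hpp: the declaration only.
template<unsigned DIM>
class Point
{
public:
    unsigned GetDimension() const;
};

// Would normally be in Point.cpp: the definition, which other source files
// never see, so they cannot instantiate it themselves.
template<unsigned DIM>
unsigned Point<DIM>::GetDimension() const
{
    return DIM;
}

// Explicit instantiations at the bottom of Point.cpp. Only these dimensions
// would be available to the linker. In a real multi-file build, using
// Point<3> from another file would give an undefined reference error until
// "template class Point<3>;" is added here. (In this single-file sketch the
// definition is visible, so that error cannot be reproduced.)
template class Point<1>;
template class Point<2>;

// Usage sketch for the instantiated dimensions.
void CheckInstantiatedDimensions()
{
    assert(Point<1>().GetDimension() == 1u);
    assert(Point<2>().GetDimension() == 2u);
}
```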
A "comparison between signed and unsigned integer expressions" error (possibly) in a file in the CxxTest folder
The following error (or actually, warning, which is then taken as an error)
cxxtest/cxxtest/TestSuite.h: In function 'bool CxxTest::equals(X, Y) [with X = long unsigned int, Y = int]':
cxxtest/cxxtest/TestSuite.h:58: instantiated from 'void CxxTest::doAssertEquals(const char*, unsigned int, const char*, X, const char*, Y, const char*) [with X = long unsigned int, Y = int]'
.<FILE_NAME>:<LINE_NUMBER> instantiated from here
cxxtest/cxxtest/TestSuite.h:49: warning: comparison between signed and unsigned integer expressions
is usually due to a comparison between an unsigned variable and a hardcoded number (taken as an int by the compiler) in a TS_ASSERT_EQUALS, for example
unsigned my_var = 10;
TS_ASSERT_EQUALS(my_var, 10);
To tell the compiler to treat the (second) 10 as an unsigned, do the following
TS_ASSERT_EQUALS(my_var, 10u);
"Error: void <YOUR_TEST_CLASS>::<YOUR_TEST_METHOD>() is private"
Remember all test methods need to be declared as public.
Compiler says c_vector or c_matrix was not declared (perhaps even if you are including <boost/numeric/ublas/matrix.hpp>)
These are ublas vectors and matrices. Do
#include "UblasIncludes.hpp"
Note that it is not enough to do
#include <boost/numeric/ublas/matrix.hpp>
as then you would have to write boost::numeric::ublas::c_matrix<double,2,2> instead of c_matrix<double,2,2>.
Get a "Assertion 'petsc_is_initialised' failed" error
If the following is printed
global/src/DistributedVector.cpp:47: static void DistributedVector::CheckForPetsc(): Assertion `petsc_is_initialised' failed.
it is probably because you have forgotten to do
#include "PetscSetupAndFinalize.hpp"
in your test.
"Fatal error; unknown error handler. May be MPI call before MPI_INIT. Error message is MPI_COMM_RANK and code is 197"
You may have included PetscSetupAndFinalize.hpp in a source file. It should only be included in test files (and must be included in all test files that use PETSc).
Boost/serialization/vector.hpp
Note: this error should no longer occur, as we include a workaround within Chaste.
(#1024) If you see this:
/usr/include/boost/serialization/vector.hpp:126: error: redefinition of 'struct boost::serialization::implementation_level<std::vector<long int, std::allocator<long int> > >'
/usr/include/boost/serialization/vector.hpp:126: error: previous definition of 'struct boost::serialization::implementation_level<std::vector<long int, std::allocator<long int> > >'
/usr/include/boost/serialization/vector.hpp:126: error: redefinition of 'struct boost::serialization::implementation_level<std::vector<long unsigned int, std::allocator<long unsigned int> > >'
/usr/include/boost/serialization/vector.hpp:126: error: previous definition of 'struct boost::serialization::implementation_level<std::vector<long unsigned int, std::allocator<long unsigned int> > >'
scons: building terminated because of errors.
put
#include <climits>
before
#include "boost/serialization/vector.hpp"
glibc detected free(): invalid pointer
If you get an error starting something like:
*** glibc detected *** heart/build/debug/mechanics/TestCardiacElectroMechanicsProblemRunner: free(): invalid pointer: 0x00000000019ff758 ***
======= Backtrace: =========
/lib/libc.so.6[0x7f32485ae08a]
/lib/libc.so.6(cfree+0x8c)[0x7f32485b1c1c]
then it may be because you haven't included a virtual destructor in the base class of your inheritance hierarchy, and you're deleting a derived class via a pointer of base class type. Another situation that may provoke this error is when you make use of CellwiseData but forget to call ReallocateMemory().
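For the first cause, a minimal sketch of the correct pattern is given below. AbstractThing and ConcreteThing are made-up names for illustration; the flag is only there so the effect can be checked.

```cpp
#include <cassert>

class AbstractThing
{
public:
    // The virtual destructor is what makes "delete p_base" safe. Without
    // it, deleting a ConcreteThing through an AbstractThing* is undefined
    // behaviour and can show up as the glibc error above.
    virtual ~AbstractThing() {}
};

class ConcreteThing : public AbstractThing
{
public:
    explicit ConcreteThing(bool* pDestroyedFlag)
        : mpDestroyedFlag(pDestroyedFlag)
    {
    }

    ~ConcreteThing()
    {
        *mpDestroyedFlag = true; // record that the derived destructor ran
    }

private:
    bool* mpDestroyedFlag;
};

// Returns true if deleting through the base pointer ran ~ConcreteThing().
bool DerivedDestructorRunsViaBasePointer()
{
    bool destroyed = false;
    AbstractThing* p_thing = new ConcreteThing(&destroyed);
    delete p_thing;
    return destroyed;
}
```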
Invalid application of 'sizeof' to incomplete type 'boost::STATIC_ASSERTION_FAILURE<false>'
If you get this error
/usr/include/boost/archive/detail/oserializer.hpp:566: error: invalid application of 'sizeof' to incomplete type 'boost::STATIC_ASSERTION_FAILURE<false>'
then you need to archive a const pointer to your object instead of just a pointer, e.g. instead of
Electrodes<3>* p_electrodes = new Electrodes<3>(mesh,false,1,0,10,magnitude,duration);
output_arch << p_electrodes;
put
Electrodes<3>* const p_electrodes = new Electrodes<3>(mesh,false,1,0,10,magnitude,duration);
output_arch << p_electrodes;
"function definition does not declare parameters"
If you get this sort of error:
cell_based/src/tissue/cell/TissueCell.cpp:172: error: function definition does not declare parameters
scons: *** [cell_based/build/debug/src/tissue/cell/TissueCell.os] Error 1
scons: building terminated because of errors.
then the relevant function definition is incorrectly typed. Check you have two colons, e.g. TissueCell::AddCell().
"error: expected primary-expression before '>' token"
If you are trying to call a method that is templated, like this:
if (cell_iter->rGetCellPropertyCollection().HasProperty<CellLabel>())
but get this sort of error:
error: expected primary-expression before '>' token
error: expected primary-expression before ')' token
then this may be because the compiler doesn't know that the method is a template itself, and so parses the < character as "less than" and gets confused later on. In this case, you need to explicitly tell the compiler that the method is a template so that it parses < as the opening bracket of a template parameter list, like this:
if (cell_iter->rGetCellPropertyCollection().template HasProperty<CellLabel>())
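The need for the keyword typically arises when the object's type itself depends on a template parameter. The sketch below uses made-up stand-ins (Label, Other, LabelOnlyCollection) rather than the real Chaste classes, and assumes only that the collection has a templated HasProperty method.

```cpp
#include <cassert>

struct Label {};
struct Other {};

// Tiny trait so the stand-in collection can answer HasProperty<T>().
template<class T> struct IsLabel { static const bool value = false; };
template<> struct IsLabel<Label> { static const bool value = true; };

// Stand-in for a property collection that only ever contains a Label.
class LabelOnlyCollection
{
public:
    template<class PROPERTY>
    bool HasProperty() const
    {
        return IsLabel<PROPERTY>::value;
    }
};

template<class COLLECTION>
bool IsLabelled(const COLLECTION& rCollection)
{
    // The type of rCollection depends on COLLECTION, so without 'template'
    // the compiler would parse '<' as the less-than operator.
    return rCollection.template HasProperty<Label>();
}

// Usage sketch.
void CheckTemplateDisambiguator()
{
    LabelOnlyCollection collection;
    assert(IsLabelled(collection));
    // Not a dependent type here, so no 'template' keyword is needed.
    assert(!collection.HasProperty<Other>());
}
```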
Test... "does not name a type"
If you get an error like this:
projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp:23:8: error: 'TestSensitivityAnalysis' does not name a type
projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp:26:178: error: 'suite_TestSensitivityAnalysis' was not declared in this scope
projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp: In member function 'virtual void TestDescription_TestSensitivityAnalysis_TestSensitivityAnalysisRunner::runTest()':
projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.cpp:31:19: error: 'suite_TestSensitivityAnalysis' was not declared in this scope
scons: *** [projects/JohnW/build/debug/Sensitivity_Analysis/TestSensitivityAnalysisRunner.o] Error 1
it probably means that the file CxxTest is trying to use is empty. This can happen if, for instance, the whole file is in a block surrounded by
#ifdef CHASTE_CVODE
#endif
and CHASTE_CVODE is not defined. You will need either to make sure it is defined (in this case, by enabling CVODE in the hostconfig file) or to rewrite the code so that this isn't a problem.
You are trying to #include a CVODE cardiac cell and the class isn't recognised
i.e. compilation fails with the usual 'does not name a type' kind of error that happens when you e.g. mis-spell a class name.
This will happen if you try to use a CVODE cell and your hostconfig doesn't have CVODE (which is optional) set up. Run scons ts=global/test/TestChasteBuildInfo.hpp to see whether SUNDIALS has a version number. If so, CVODE is installed and linked to Chaste; if not, you will need to install it (see InstallCvode).
Compile error complaining about methods being hidden
E.g. that virtual void ParentClass::Foo() was hidden by virtual void ChildClass::Foo()
Make sure that the Foo()s in each class have exactly the same signature (return types, constness, parameters etc.).
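A minimal sketch of a matching pair is shown below; ParentClass, ChildClass and the int return type are placeholders chosen for illustration.

```cpp
#include <cassert>

class ParentClass
{
public:
    virtual ~ParentClass() {}
    virtual int Foo() const { return 1; }
};

class ChildClass : public ParentClass
{
public:
    // Same return type, parameters and constness as ParentClass::Foo(), so
    // this overrides it. Dropping the 'const' here would instead declare an
    // unrelated method that hides the parent's one, which is what the
    // compiler is complaining about.
    virtual int Foo() const { return 2; }
};

// Calls Foo() through a reference to the parent type.
int CallFooViaParent(const ParentClass& rParent)
{
    return rParent.Foo();
}

// Usage sketch: the child's version is reached through the parent interface.
void CheckOverride()
{
    ChildClass child;
    assert(CallFooViaParent(child) == 2);
}
```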
Other
'A repository hook failed' error message when trying to commit
If, when you try to commit changes to the repository, SVN gives the error message
A repository hook failed
svn: Commit failed (details follow):
svn: 'pre-commit' hook failed with error output:
Branch or project commit with no indication in log message
this is because, when committing to your project (or a branch), you must start the commit message with the project (or branch) name in square brackets, e.g. [JohnD] implemented ...
Rolling back an accidental commit
Suppose that changeset r21486 was committed accidentally, and should be rolled back. This can be done using the commands
svn merge -c -21486 .
svn ci -m 'Rolling back r21486'