ChasteGuides/AutomatedBuildGuide

Fixing bad commits

Or, how to understand the automated emails telling you something's wrong, and what to do about them. Also covers how to fix nightly test failures.

See also ChasteGuides/BestPracticeGuide.

1. Is the "bad commit" my fault?

Broken builds

If you see the message Build failed (check build log for ": ***"), then this is a broken build. A broken build (that is, broken compilation) is more serious than broken tests: it may mean that there are no nightly test results, and it will probably interfere with the work of every other Chaste developer until it is fixed. Search the build log for the string ": ***" to find the error.

Broken tests

2. Nightly tests

Here we list some common causes of nightly test failures (with specific examples) and how to fix them.

Memory testing

A common cause of memory leaks is forgetting to delete an object allocated with new at the end of a test. We now provide an example of this type of memory leak and how to fix it. The memory testing nightly build for r13190 failed 1 out of 256 test suites: memory leaks were found in the TestForces test suite. Part of the output for this test suite is shown below:

160 bytes in 1 blocks are definitely lost in loss record 3 of 6
   at 0x4C23809: operator new(unsigned long) (vg_replace_malloc.c:230)
   by 0x7FB662: TestForces::TestRepulsionForceMethods() (TestForces.hpp:678)
   by 0x7FCD57: TestDescription_TestForces_TestRepulsionForceMethods::runTest() (TestForcesRunner.cpp:73)
   by 0x718E39: CxxTest::RealTestDescription::run() (RealDescriptions.cpp:96)
   by 0x742FDB: CxxTest::TestRunner::runTest(CxxTest::TestDescription&) (TestRunner.h:74)
   by 0x7430B7: CxxTest::TestRunner::runSuite(CxxTest::SuiteDescription&) (TestRunner.h:61)
   by 0x7C6704: CxxTest::TestRunner::runWorld() (TestRunner.h:46)
   by 0x7C67C0: CxxTest::TestRunner::runAllTests(CxxTest::TestListener&) (TestRunner.h:23)
   by 0x7C6846: CxxTest::ErrorFormatter::run() (ErrorFormatter.h:47)
   by 0x719135: main (TestForcesRunner.cpp:19)

From this output we can see that a memory leak occurred because an object was allocated with new within the test TestRepulsionForceMethods() (at line 678 of TestForces.hpp) but was not deleted at the end of the test. This memory leak was fixed in r13191.
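As a minimal sketch of this kind of leak and its fix (the class DummyForce and both functions are hypothetical stand-ins, not the actual code from TestForces.hpp):

```cpp
#include <cassert>

// Hypothetical stand-in for an object created inside a test.
// The static counter tracks how many instances are currently alive.
struct DummyForce
{
    static int msLiveCount;
    DummyForce() { ++msLiveCount; }
    ~DummyForce() { --msLiveCount; }
    double GetCutOff() const { return 1.5; }
};
int DummyForce::msLiveCount = 0;

// Leaky version: the object allocated with new is never deleted,
// so a memory checker would report its bytes as lost.
double LeakyTestBody()
{
    DummyForce* p_force = new DummyForce;
    return p_force->GetCutOff();
}

// Fixed version: delete the object before the test ends.
double FixedTestBody()
{
    DummyForce* p_force = new DummyForce;
    double cut_off = p_force->GetCutOff();
    delete p_force; // tidy up
    return cut_off;
}
```

The only difference between the two versions is the delete before returning; in a real test the delete usually goes at the very end of the test method.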

See also FixingMemoryTesting.

Coverage

A common cause of coverage failures is forgetting to add tests for all possible cases in an if/else or switch statement, or forgetting to test that any EXCEPTIONs are thrown under the correct circumstances. We now provide an example of this type of coverage failure and how to fix it. The coverage nightly build for r12897 failed 1 out of 604 test suites. A coverage failure was found in the file AbstractFeCableObjectAssembler.hpp. Part of the output for this file is shown below:

        1:  389:    if (mAssembleMatrix)
        -:  390:    {
        1:  391:        assemble_event = HeartEventHandler::ASSEMBLE_SYSTEM;
        -:  392:    }
        -:  393:    else
        -:  394:    {
    #####:  395:        assemble_event = HeartEventHandler::ASSEMBLE_RHS;
        -:  396:    }
        -:  397:
        1:  398:    if (mAssembleMatrix && mMatrixToAssemble==NULL)
        -:  399:    {
    #####:  400:        EXCEPTION("Matrix to be assembled has not been set");
        -:  401:    }
        1:  402:    if (mAssembleVector && mVectorToAssemble==NULL)
        -:  403:    {
    #####:  404:        EXCEPTION("Vector to be assembled has not been set");
        -:  405:    }

From this output we can see that further tests are required to cover the lines marked ##### (lines 395, 400 and 404), which were never executed. This coverage failure was fixed in r12914.
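A rough sketch of what covering every branch looks like is given here. ChooseEvent and CoverAllBranches are hypothetical, std::runtime_error stands in for Chaste's EXCEPTION macro, and a real Chaste test would use the CxxTest assertion macros rather than returning a bool:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in mirroring the structure of the uncovered code.
// std::runtime_error replaces Chaste's EXCEPTION macro here.
std::string ChooseEvent(bool assembleMatrix, const double* pMatrix)
{
    if (assembleMatrix && pMatrix == nullptr)
    {
        throw std::runtime_error("Matrix to be assembled has not been set");
    }
    if (assembleMatrix)
    {
        return "ASSEMBLE_SYSTEM";
    }
    else
    {
        return "ASSEMBLE_RHS";
    }
}

// Exercises every branch once; returns true only if all behave as expected.
bool CoverAllBranches()
{
    double matrix = 0.0;
    bool ok = (ChooseEvent(true, &matrix) == "ASSEMBLE_SYSTEM"); // 'if' branch
    ok = ok && (ChooseEvent(false, nullptr) == "ASSEMBLE_RHS");  // 'else' branch

    bool threw = false;
    try
    {
        ChooseEvent(true, nullptr); // exception branch
    }
    catch (const std::runtime_error&)
    {
        threw = true;
    }
    return ok && threw;
}
```

Each of the three calls reaches a line that would otherwise be reported as unexecuted.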

Doxygen coverage

A common cause of Doxygen coverage failures is forgetting to correctly document input arguments for a method. We now provide an example of this type of Doxygen coverage failure and how to fix it. The Doxygen coverage nightly build for r13261 failed 1 out of 722 test suites. A Doxygen coverage failure was found in the file SolidMechanicsProblemDefinition.hpp. Part of the output for this file is shown below:

/home/bob/eclipse/workspace/trunk-13261-2011-07-26-01_39_49/pde/src/problem/SolidMechanicsProblemDefinition.hpp:179: 
  Warning: argument `X' of command @param is not found in the argument list of 
  SolidMechanicsProblemDefinition< DIM >::EvaluateBodyForceFunction(c_vector< double, DIM > &rX, double t)
/home/bob/eclipse/workspace/trunk-13261-2011-07-26-01_39_49/pde/src/problem/SolidMechanicsProblemDefinition.hpp:179: 
  Warning: The following parameters of SolidMechanicsProblemDefinition::EvaluateBodyForceFunction(c_vector< double, DIM > &rX, double t) are not documented:
  parameter rX

From this output we can see that the input argument rX for the method SolidMechanicsProblemDefinition::EvaluateBodyForceFunction() was incorrectly documented as X. This Doxygen coverage failure was fixed in r13262.
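For illustration, a correctly documented method might look like the sketch below. EvaluateBodyForce is a hypothetical free function using std::vector rather than the real Chaste method; the point is only that each @param name matches the argument name exactly:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/**
 * Hypothetical function, used only to illustrate the documentation.
 *
 * Evaluate a body force at a point by scaling the position by the time.
 *
 * @param rX  the position (the name must match the argument exactly: rX, not X)
 * @param t  the current time
 * @return the force at the given position and time
 */
std::vector<double> EvaluateBodyForce(const std::vector<double>& rX, double t)
{
    std::vector<double> force(rX.size());
    for (std::size_t i = 0; i < rX.size(); ++i)
    {
        force[i] = t * rX[i];
    }
    return force;
}
```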

Parallel

One cause of parallel build failures is forgetting to add the macro EXIT_IF_PARALLEL to tests that are only intended to be run sequentially. We now provide an example of this type of parallel build failure and how to fix it. The parallel nightly build for r13407 failed 4 out of 264 test suites. Failures were found in the test suites TestCellBasedSimulationWithBuskeForces, TestForcesNotForRelease, TestWritingPdeSolversTutorial and TestPeriodicForces. Part of the output for the test suite TestCellBasedSimulationWithBuskeForces is shown below:

 ***** TestCellBasedSimulationWithBuskeForces.hpp *****

Entering TestSimpleMonolayerWithBuskeAdhesiveForce
Entering TestSimpleMonolayerWithBuskeAdhesiveForce
TestCellBasedSimulationWithBuskeForcesRunner: cell_based/src/mesh/HoneycombMeshGenerator.cpp:41: HoneycombMeshGenerator::HoneycombMeshGenerator(unsigned int, unsigned int, unsigned int, double): Assertion `PetscTools::IsSequential()' failed.
TestCellBasedSimulationWithBuskeForcesRunner: cell_based/src/mesh/HoneycombMeshGenerator.cpp:41: HoneycombMeshGenerator::HoneycombMeshGenerator(unsigned int, unsigned int, unsigned int, double): Assertion `PetscTools::IsSequential()' failed.
TestCellBasedSimulationWithBuskeForcesRunner: cell_based/src/mesh/HoneycombMeshGenerator.cpp:41: HoneycombMeshGenerator::HoneycombMeshGenerator(unsigned int, unsigned int, unsigned int, double): Assertion `PetscTools::IsSequential()' failed.
/home/bob/mpi/bin/mpirun.ch_shmem: line 91: 12384 Aborted                 /home/bob/eclipse/workspace/trunk-13407-2011-08-10-04_06_44/notforrelease_cell_based/build/debug_fpe/simulation/TestCellBasedSimulationWithBuskeForcesRunner

From this output we can see that the class HoneycombMeshGenerator is not intended to be used in parallel, and the EXIT_IF_PARALLEL macro (defined in the header file PetscTools.hpp) should be added to the start of TestSimpleMonolayerWithBuskeAdhesiveForce. This parallel build failure was fixed in r13416.
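The idea can be sketched without Chaste or PETSc as follows. Everything here is a hypothetical stand-in: SKETCH_EXIT_IF_PARALLEL imitates EXIT_IF_PARALLEL on the assumption that the real macro makes the test return early when run on more than one process, gNumProcesses fakes the process count, and the test body returns a bool only so that the effect is observable:

```cpp
#include <cassert>

// Pretend process count, for illustration only.
static int gNumProcesses = 1;

// Stand-in for PetscTools::IsSequential().
bool IsSequential() { return gNumProcesses == 1; }

// Stand-in for EXIT_IF_PARALLEL (assumed to return early in parallel).
#define SKETCH_EXIT_IF_PARALLEL if (!IsSequential()) { return false; }

// Stand-in for a class that asserts it is only used sequentially.
struct SequentialOnlyGenerator
{
    SequentialOnlyGenerator() { assert(IsSequential()); }
};

// Stand-in test body: returns true if the body ran, false if it was skipped.
bool TestBody()
{
    SKETCH_EXIT_IF_PARALLEL; // first statement in the test
    SequentialOnlyGenerator generator; // never reached in parallel
    return true;
}

// Helper that runs the test body with a given pretend process count.
bool RunWithProcesses(int numProcesses)
{
    gNumProcesses = numProcesses;
    return TestBody();
}
```

Because the guard is the first statement, the sequential-only class is never constructed in a parallel run, so its assertion cannot fire.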

Profiling

When running a profiling build (to identify areas for potential optimisation) the compiler performs extra code analysis, and hence can spot further potential errors in your code. Typically these are variables that the compiler thinks might be used before being assigned a value, for example in this case:

heart/src/odes/AbstractRushLarsenCardiacCell.cpp: In member function 'virtual OdeSolution AbstractRushLarsenCardiacCell::Compute(double, double, double)':
heart/src/odes/AbstractRushLarsenCardiacCell.cpp:81: warning: 'curr_time' may be used uninitialized in this function
scons: *** [heart/build/profile_ndebug/src/odes/AbstractRushLarsenCardiacCell.o] Error 1

These can be fixed by altering the code logic, most commonly by initialising the variable to a dummy value when it is declared.
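A small sketch of the usual fix follows. FinalTime is a hypothetical function, loosely modelled on a time-stepping loop and not taken from AbstractRushLarsenCardiacCell.cpp:

```cpp
#include <cassert>

// Hypothetical time-stepping loop.
// If 'curr_time' were declared without an initial value, a compiler that
// cannot prove the loop runs at least once might warn that it may be used
// uninitialized when it is returned.
double FinalTime(double tStart, double dt, unsigned numSteps)
{
    double curr_time = tStart; // initialise at declaration
    for (unsigned i = 0; i < numSteps; ++i)
    {
        curr_time = tStart + (i + 1) * dt;
    }
    return curr_time;
}
```

Here the initial value is also the sensible result when no steps are taken; where no natural value exists, any dummy value silences the warning.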

Another particularly common case is when a variable is only used within an assert. Since the profile builds define NDEBUG, assertions are turned off, and hence such variables will trigger an 'unused variable' error. For example here:

notforrelease_cell_based/src/population/mechanics/GeneralisedPeriodicLinearSpringForce.cpp: In member function 'boost::numeric::ublas::c_vector<double, DIM> GeneralisedPeriodicLinearSpringForce<DIM>::CalculateForceBetweenNodes(unsigned int, unsigned int, AbstractCellPopulation<U>&) [with unsigned int DIM = 1u]':
notforrelease_cell_based/src/population/mechanics/GeneralisedPeriodicLinearSpringForce.cpp:192:   instantiated from here
notforrelease_cell_based/src/population/mechanics/GeneralisedPeriodicLinearSpringForce.cpp:110: warning: unused variable 'ageA'
notforrelease_cell_based/src/population/mechanics/GeneralisedPeriodicLinearSpringForce.cpp:111: warning: unused variable 'ageB'

To circumvent the error, either test the value directly inside the assert rather than storing it in a variable, or assign the variable to itself (e.g. ageA=ageA;).
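The first of these options might look like the following sketch, in which GetAge and RestLength are hypothetical and not taken from GeneralisedPeriodicLinearSpringForce.cpp:

```cpp
#include <cassert>

// Hypothetical helper standing in for a call that returns a cell's age.
double GetAge(unsigned index) { return 1.0 + index; }

// Problematic pattern (shown as a comment so that this sketch compiles cleanly):
//     double ageA = GetAge(indexA);
//     assert(ageA >= 0.0); // with NDEBUG defined, 'ageA' is never used
//
// Fix: test the value directly inside the assert, so that no variable is
// left unused when assertions are compiled out.
double RestLength(unsigned indexA, unsigned indexB)
{
    assert(GetAge(indexA) >= 0.0);
    assert(GetAge(indexB) >= 0.0);
    return (indexA == indexB) ? 0.0 : 1.0;
}
```

Note that with NDEBUG defined the expression inside the assert is not evaluated at all, so it must not have side effects that the rest of the code relies on.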

Windows

Windows builds are run using CMake on scratch.cs, but it is best to run specific tests yourself in MS Visual Studio.

Helpful debugging hints:

Copyrights

All Chaste source and test files must include the standard Chaste copyright notice. If you omit this, the special 'Copyrights' test will fail. Fix this by adding the copyright comment to the top of your file.

Orphaned tests and duplicate file names

All tests must be listed within a test pack file. These are text files in the test folders named like "SomethingTestPack.txt", and define groups of tests to run in standard builds (see also TestingStrategy). Any tests not listed will cause the special 'OrphanedTests' test to fail. Simply add your test to a suitable test pack (e.g. 'Continuous', 'Nightly', or 'Weekly') to fix this.

Similarly, all Chaste source and test files must have a unique name, or the build system will get confused. If you've created a file with a duplicate name, the special 'DuplicateFileNames' test will fail. Rename your file to fix this.

Tests killed off

Most of the automated builds apply a run time limit to tests; the exact limit depends on the kind of build. Tests that run for too long will be killed, and a message like "Test killed due to exceeding time limit of 180 seconds" will appear in the output.

A common culprit is a continuous test that takes too long, although this may not cause the continuous build itself to fail. Problems may also show up in the MemoryTesting or lofty nightly builds in particular.

Fix this problem by making the test shorter! For example, run a simulation for less time (or optimise the code). However, make sure that you don't break coverage by doing so. Consider whether tests could be refactored to reduce duplication.