Cardiac Checkpointing And Restarting

This tutorial is automatically generated from TestCardiacCheckpointingAndRestartingTutorial.hpp at revision 1dfba06b4d82. Note that the code is given in full at the bottom of the page.

Checkpointing and restarting cardiac simulations

In this tutorial we show how to save and reload cardiac simulations.

CardiacSimulationArchiver is the main class that takes care of checkpointing cardiac simulations.

#include <cxxtest/TestSuite.h>
#include "CardiacSimulationArchiver.hpp"
#include "BidomainProblem.hpp"
#include "LuoRudy1991.hpp"
#include "PetscSetupAndFinalize.hpp"
#include "PlaneStimulusCellFactory.hpp"

class TestCardiacCheckpointingAndRestartingTutorial : public CxxTest::TestSuite
{
public:

First, the checkpointing test.

    void TestCheckpointing()
    {

We set up exactly the same simulation as in the Another Bidomain Simulation tutorial.

        HeartConfig::Instance()->Reset();

        PlaneStimulusCellFactory<CellLuoRudy1991FromCellML,2> cell_factory(-2000000);
        HeartConfig::Instance()->SetSimulationDuration(5.0); //ms
        HeartConfig::Instance()->SetOutputDirectory("BidomainCheckpointingTutorial");
        HeartConfig::Instance()->SetOutputFilenamePrefix("results");
        HeartConfig::Instance()->SetMeshFileName("mesh/test/data/2D_0_to_1mm_800_elements", cp::media_type::Orthotropic);

        double scale = 2;
        HeartConfig::Instance()->SetIntracellularConductivities(Create_c_vector(1.75*scale, 0.19*scale));
        BidomainProblem<2> bidomain_problem( &cell_factory );

        bidomain_problem.Initialise();
        bidomain_problem.Solve();

To save the entire simulation, use the CardiacSimulationArchiver class, as shown below. Note that BidomainProblem<2> is the template parameter. The output directory is relative to CHASTE_TEST_OUTPUT.

        CardiacSimulationArchiver<BidomainProblem<2> >::Save(bidomain_problem, "BidomainCheckpointingTutorial/saved_simulation");
    }

Next, the restarting test, which resumes the saved simulation.

    void TestRestarting()
    {

To restart from the saved simulation directory we use the CardiacSimulationArchiver class, as shown below. Note that BidomainProblem<2> is again the template parameter. The dimension (2) must match that of the problem saved in the archive directory. The directory is again relative to CHASTE_TEST_OUTPUT.

        BidomainProblem<2>* p_bidomain_problem = CardiacSimulationArchiver<BidomainProblem<2> >::Load("BidomainCheckpointingTutorial/saved_simulation");

The simulation duration has to be amended. Note that the duration is always given with respect to the origin of the first solve. This means that we are running from t=5 ms (the end of the previous simulation) to t=10 ms. The output files are concatenated so that they appear to be made by a single simulation running from t=0 ms to t=10 ms. Note: loading an archive also loads HeartConfig options, so HeartConfig calls such as this one must appear after CardiacSimulationArchiver::Load().

        HeartConfig::Instance()->SetSimulationDuration(10); //ms

One reason to checkpoint and restart is that there may be something we want to change during the course of an experiment. Here we change the conductivity.

        HeartConfig::Instance()->SetIntracellularConductivities(Create_c_vector(3.0, 0.3));

        p_bidomain_problem->Solve();

Note that the pointer p_bidomain_problem exists in the scope of this test and that the object which was unarchived was created on the CardiacSimulationArchiver::Load() line above. We are therefore responsible for deleting the memory.

        delete p_bidomain_problem;
    }
};
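
The ownership point made for the restarting test can be sketched in isolation. The following is a minimal illustration, not Chaste code: ToyProblem and LoadToyProblem are hypothetical stand-ins for a problem class and for a loader that, like CardiacSimulationArchiver::Load() in the tutorial, returns a raw pointer the caller must delete. Wrapping the pointer in std::unique_ptr as soon as it is returned means the object is deleted on every exit path, including when Solve() throws.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a problem class. It counts live instances
// so that a leak would be observable.
struct ToyProblem
{
    static int msLiveCount;
    bool mShouldThrow;

    explicit ToyProblem(bool shouldThrow) : mShouldThrow(shouldThrow) { ++msLiveCount; }
    ~ToyProblem() { --msLiveCount; }
    ToyProblem(const ToyProblem&) = delete;
    ToyProblem& operator=(const ToyProblem&) = delete;

    void Solve()
    {
        if (mShouldThrow)
        {
            throw std::runtime_error("solve failed");
        }
    }
};
int ToyProblem::msLiveCount = 0;

// Hypothetical loader: returns a raw pointer that the caller owns,
// mirroring the contract described for Load() in the tutorial.
ToyProblem* LoadToyProblem(bool shouldThrow)
{
    return new ToyProblem(shouldThrow);
}

// Takes ownership immediately. The unique_ptr destructor deletes the
// object whether Solve() returns normally or throws.
void ResumeAndSolve(bool shouldThrow)
{
    std::unique_ptr<ToyProblem> p_problem(LoadToyProblem(shouldThrow));
    assert(p_problem != nullptr);
    p_problem->Solve();
}
```

Under the same assumption about ownership, the pointer returned by CardiacSimulationArchiver<BidomainProblem<2> >::Load() could be passed to a std::unique_ptr<BidomainProblem<2> > constructor in the same way, which would make the explicit delete unnecessary.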

Notes

  • Making a checkpoint does add a significant overhead at present, in particular because the mesh is written out to disk at each checkpoint. This is to ensure that each checkpoint directory contains everything needed to resume the simulation. The mesh written out will be in permuted form if it was partitioned for a parallel simulation.

  • To make this process slightly more efficient, Chaste will copy the original mesh files if the mesh was loaded from disk and hasn’t been modified (e.g. by permuting). Because of this, if you modify the mesh in memory, e.g. by setting element attributes as in the bidomain-with-bath tutorial, then you need to inform Chaste by calling mesh.SetMeshHasChangedSinceLoading(), so your modifications aren’t lost.

  • Meshes written in checkpoints use a binary form of the Triangle/Tetgen mesh format. This makes checkpoints significantly smaller but will cause portability problems if checkpoints are moved between little-endian systems (e.g. x86) and big-endian systems (e.g. PowerPC).

  • Checkpoints may be resumed on any number of processes; you are not restricted to the number on which they were saved. However, the mesh will not be re-partitioned if loaded on a different number of processes, so the parallel efficiency of the simulation may be significantly reduced in this case.

  • Resuming a checkpoint will attempt to extend the original results HDF5 file if it exists (specified by HeartConfig::SetOutputDirectory and HeartConfig::SetOutputFilenamePrefix), so that the file contains the complete simulation results. If this file does not exist, a new file will be created containing just the results from the resume time onwards.

  • When checkpointing, the progress_status.txt file only reports the percentage of time to go until the next checkpoint, not until the end of the simulation. This makes it slightly less useful; however, the presence of the checkpoint directories (1ms, 2ms, etc.) provides overall progress information instead.

  • On all Boost versions, if you are saving and loading a simulation in different source or test files, ensure that you have the same list of includes in both files. Otherwise the serialization code may not know about all classes when loading, and will give an "unregistered class" error.
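
The byte-order caveat in the notes above can be illustrated with a small standalone sketch. This is not how Chaste reads or writes its binary mesh files, and all function names here are hypothetical. It only shows why copying raw bytes into an integer gives different values on little-endian and big-endian hosts, whereas a reader that assembles the value byte by byte gives the same answer everywhere.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true if the host stores the least significant byte first.
bool IsLittleEndianHost()
{
    const std::uint32_t probe = 1u;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1u;
}

// Reverses the byte order of a 32-bit unsigned integer.
std::uint32_t SwapBytes32(std::uint32_t value)
{
    return ((value & 0x000000FFu) << 24) |
           ((value & 0x0000FF00u) << 8)  |
           ((value & 0x00FF0000u) >> 8)  |
           ((value & 0xFF000000u) >> 24);
}

// Naive reader: copies four raw bytes straight into an integer, so the
// result depends on the byte order of the host doing the reading.
std::uint32_t ReadNative32(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof(value));
    return value;
}

// Portable reader: always interprets the four bytes as little-endian,
// whatever the host byte order is.
std::uint32_t ReadLittleEndian32(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    return  static_cast<std::uint32_t>(bytes[0])        |
           (static_cast<std::uint32_t>(bytes[1]) << 8)  |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

For example, the bytes 0x20 0x03 0x00 0x00 are the little-endian encoding of 800. ReadLittleEndian32 recovers 800 on any host, while ReadNative32 recovers 800 only on a little-endian host and the byte-swapped value on a big-endian one.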

Full code

#include <cxxtest/TestSuite.h>
#include "CardiacSimulationArchiver.hpp"
#include "BidomainProblem.hpp"
#include "LuoRudy1991.hpp"
#include "PetscSetupAndFinalize.hpp"
#include "PlaneStimulusCellFactory.hpp"

class TestCardiacCheckpointingAndRestartingTutorial : public CxxTest::TestSuite
{
public:
    void TestCheckpointing()
    {
        HeartConfig::Instance()->Reset();

        PlaneStimulusCellFactory<CellLuoRudy1991FromCellML,2> cell_factory(-2000000);
        HeartConfig::Instance()->SetSimulationDuration(5.0); //ms
        HeartConfig::Instance()->SetOutputDirectory("BidomainCheckpointingTutorial");
        HeartConfig::Instance()->SetOutputFilenamePrefix("results");
        HeartConfig::Instance()->SetMeshFileName("mesh/test/data/2D_0_to_1mm_800_elements", cp::media_type::Orthotropic);

        double scale = 2;
        HeartConfig::Instance()->SetIntracellularConductivities(Create_c_vector(1.75*scale, 0.19*scale));
        BidomainProblem<2> bidomain_problem( &cell_factory );

        bidomain_problem.Initialise();
        bidomain_problem.Solve();

        CardiacSimulationArchiver<BidomainProblem<2> >::Save(bidomain_problem, "BidomainCheckpointingTutorial/saved_simulation");
    }

    void TestRestarting()
    {
        BidomainProblem<2>* p_bidomain_problem = CardiacSimulationArchiver<BidomainProblem<2> >::Load("BidomainCheckpointingTutorial/saved_simulation");

        HeartConfig::Instance()->SetSimulationDuration(10); //ms

        HeartConfig::Instance()->SetIntracellularConductivities(Create_c_vector(3.0, 0.3));

        p_bidomain_problem->Solve();

        delete p_bidomain_problem;
    }
};