Chaste  Release::2017.1
Hdf5DataWriter Class Reference

#include <Hdf5DataWriter.hpp>


Public Member Functions

 Hdf5DataWriter (DistributedVectorFactory &rVectorFactory, const std::string &rDirectory, const std::string &rBaseName, bool cleanDirectory=true, bool extendData=false, std::string datasetName="Data", bool useCache=false)
 
virtual ~Hdf5DataWriter ()
 
void DefineFixedDimension (long dimensionSize)
 
void DefineFixedDimension (const std::vector< unsigned > &rNodesToOuput, long vecSize)
 
void DefineUnlimitedDimension (const std::string &rVariableName, const std::string &rVariableUnits, unsigned estimatedLength=1)
 
void AdvanceAlongUnlimitedDimension ()
 
int DefineVariable (const std::string &rVariableName, const std::string &rVariableUnits)
 
bool IsInDefineMode ()
 
virtual void EndDefineMode ()
 
void PossiblyExtend ()
 
void EmptyDataset ()
 
void PutVector (int variableID, Vec petscVector)
 
void PutStripedVector (std::vector< int > variableIDs, Vec petscVector)
 
bool GetUsingCache ()
 
void WriteCache ()
 
void PutUnlimitedVariable (double value)
 
void Close ()
 
int GetVariableByName (const std::string &rVariableName)
 
bool ApplyPermutation (const std::vector< unsigned > &rPermutation, bool unsafeExtendingMode=false)
 
void DefineFixedDimensionUsingMatrix (const std::vector< unsigned > &rNodesToOuput, long vecSize)
 
void SetFixedChunkSize (const unsigned &rTimestepsPerChunk, const unsigned &rNodesPerChunk, const unsigned &rVariablesPerChunk)
 
void SetTargetChunkSize (hsize_t targetSize)
 
void SetAlignment (hsize_t alignment)
 
- Public Member Functions inherited from AbstractHdf5Access
 AbstractHdf5Access (const std::string &rDirectory, const std::string &rBaseName, const std::string &rDatasetName, bool makeAbsolute=true)
 
 AbstractHdf5Access (const FileFinder &rDirectory, const std::string &rBaseName, const std::string &rDatasetName)
 
virtual ~AbstractHdf5Access ()
 
bool IsDataComplete ()
 
std::vector< unsigned > GetIncompleteNodeMap ()
 
std::string GetUnlimitedDimensionName ()
 
std::string GetUnlimitedDimensionUnit ()
 

Private Member Functions

void CheckVariableName (const std::string &rName)
 
void CheckUnitsName (const std::string &rName)
 
void ComputeIncompleteOffset ()
 
void OpenFile ()
 
hsize_t CalculateNumberOfChunks ()
 
void CalculateChunkDims (unsigned targetSize, unsigned *pChunkSizeInBytes, bool *pAllOneChunk)
 
void SetChunkSize ()
 

Private Attributes

DistributedVectorFactory & mrVectorFactory
 
const bool mCleanDirectory
 
const bool mUseExistingFile
 
bool mIsInDefineMode
 
bool mIsFixedDimensionSet
 
unsigned mEstimatedUnlimitedLength
 
unsigned mFileFixedDimensionSize
 
unsigned mDataFixedDimensionSize
 
unsigned mLo
 
unsigned mHi
 
unsigned mNumberOwned
 
unsigned mOffset
 
bool mNeedExtend
 
bool mUseMatrixForIncompleteData
 
std::vector< DataWriterVariable > mVariables
 
long unsigned mCurrentTimeStep
 
Mat mSinglePermutation
 
Mat mDoublePermutation
 
Mat mSingleIncompleteOutputMatrix
 
Mat mDoubleIncompleteOutputMatrix
 
bool mUseOptimalChunkSizeAlgorithm
 
hsize_t mChunkSize [DATASET_DIMS]
 
hsize_t mNumberOfChunks
 
hsize_t mFixedChunkSize [DATASET_DIMS]
 
hsize_t mChunkTargetSize
 
hsize_t mAlignment
 
bool mUseCache
 
long unsigned mCacheFirstTimeStep
 
std::vector< double > mDataCache
 

Friends

class TestHdf5DataWriter
 

Additional Inherited Members

- Protected Member Functions inherited from AbstractHdf5Access
bool DoesDatasetExist (const std::string &rDatasetName)
 
void SetUnlimitedDatasetId ()
 
void SetMainDatasetRawChunkCache ()
 
- Protected Attributes inherited from AbstractHdf5Access
std::string mBaseName
 
std::string mDatasetName
 
FileFinder mDirectory
 
bool mIsDataComplete
 
std::string mUnlimitedDimensionName
 
std::string mUnlimitedDimensionUnit
 
bool mIsUnlimitedDimensionSet
 
std::vector< unsigned > mIncompleteNodeIndices
 
hid_t mFileId
 
hid_t mUnlimitedDatasetId
 
hid_t mVariablesDatasetId
 
hsize_t mDatasetDims [DATASET_DIMS]
 
- Static Protected Attributes inherited from AbstractHdf5Access
static const unsigned DATASET_DIMS =3
 

Detailed Description

A concrete HDF5 data writer class.

Definition at line 48 of file Hdf5DataWriter.hpp.

Constructor & Destructor Documentation

Hdf5DataWriter::Hdf5DataWriter ( DistributedVectorFactory &  rVectorFactory,
const std::string &  rDirectory,
const std::string &  rBaseName,
bool  cleanDirectory = true,
bool  extendData = false,
std::string  datasetName = "Data",
bool  useCache = false 
)

Constructor.

Parameters
rVectorFactory  the factory to use in creating PETSc Vec and DistributedVector objects.
rDirectory  the directory in which to write the data to file, relative to chaste test output.
rBaseName  the name of the file in which to write the data
cleanDirectory  whether to clean the directory (defaults to true)
extendData  whether to try opening an existing file and appending to it.
datasetName  The name of the HDF5 dataset to write, defaults to "Data".
useCache  Whether to cache writes so only whole chunks are written to disk.

The extendData parameter allows us to add to an existing dataset. It only really makes sense if the existing file has an unlimited dimension which we can extend. It also only makes sense if cleanDirectory is false, otherwise there won't be a file there to read...

Todo:
1300 We can't set mDataFixedDimensionSize, because the information isn't in the input file. This means that checking the size of input vectors in PutVector and PutStripedVector is impossible.

Definition at line 52 of file Hdf5DataWriter.cpp.

References AdvanceAlongUnlimitedDimension(), ComputeIncompleteOffset(), AbstractHdf5Access::DATASET_DIMS, AbstractHdf5Access::DoesDatasetExist(), EXCEPTION, DistributedVectorFactory::GetLocalOwnership(), mCacheFirstTimeStep, mChunkSize, mCleanDirectory, mCurrentTimeStep, mDataCache, mDataFixedDimensionSize, AbstractHdf5Access::mDatasetDims, AbstractHdf5Access::mDatasetName, mFileFixedDimensionSize, AbstractHdf5Access::mFileId, mFixedChunkSize, AbstractHdf5Access::mIncompleteNodeIndices, AbstractHdf5Access::mIsDataComplete, mIsFixedDimensionSet, mIsInDefineMode, AbstractHdf5Access::mIsUnlimitedDimensionSet, mLo, mNumberOwned, mOffset, mrVectorFactory, AbstractHdf5Access::mUnlimitedDatasetId, mUseCache, mUseExistingFile, DataWriterVariable::mVariableName, mVariables, AbstractHdf5Access::mVariablesDatasetId, DataWriterVariable::mVariableUnits, OpenFile(), and AbstractHdf5Access::SetUnlimitedDatasetId().

Hdf5DataWriter::~Hdf5DataWriter ( )
virtual

Member Function Documentation

bool Hdf5DataWriter::ApplyPermutation ( const std::vector< unsigned > &  rPermutation,
bool  unsafeExtendingMode = false 
)

Apply a permutation to all occurrences of PutVector. Should be called when in define mode.

Parameters
rPermutation  a forward/?reverse permutation
unsafeExtendingMode  is true when we are extending a file which requires a permutation to be applied to it. In particular we are extending a cardiac simulation with "original node ordering"
Returns
success value. A value "false" indicates that the permutation was empty or was the identity and was not applied

Definition at line 1295 of file Hdf5DataWriter.cpp.

References EXCEPTION, mDataFixedDimensionSize, mDoublePermutation, mFileFixedDimensionSize, mHi, mIsInDefineMode, mLo, mSinglePermutation, and PetscTools::SetupMat().

Referenced by AbstractCardiacProblem< ELEMENT_DIM, SPACE_DIM, PROBLEM_DIM >::InitialiseWriter().
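
The documented meaning of the "false" return value (empty or identity permutation, nothing applied) can be illustrated with a small stand-alone check. This is only a sketch: IsNonTrivialPermutation is a hypothetical helper, not part of Chaste, and the real method goes on to build PETSc matrices rather than just returning a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Chaste): returns false when the
// permutation is empty or the identity, mirroring the documented
// meaning of ApplyPermutation's "false" return value.
bool IsNonTrivialPermutation(const std::vector<unsigned>& rPermutation)
{
    for (std::size_t i = 0; i < rPermutation.size(); ++i)
    {
        if (rPermutation[i] != i)
        {
            return true; // at least one entry moves
        }
    }
    return false; // empty, or every entry maps to itself
}
```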

void Hdf5DataWriter::CalculateChunkDims ( unsigned  targetSize,
unsigned *  pChunkSizeInBytes,
bool *  pAllOneChunk 
)
private

Calculate (and save in the member variables) chunk dimensions based on a target size. Chunks are kept as "square" as possible while wasting as little space around the edges of the dataset as possible.

Parameters
[in]  targetSize  The target number of entries in each dimension in the chunk.
[out]  pChunkSizeInBytes  The size of the resulting chunk in bytes.
[out]  pAllOneChunk  Whether the dataset is spanned by a single chunk.

Definition at line 1406 of file Hdf5DataWriter.cpp.

References CeilDivide(), AbstractHdf5Access::DATASET_DIMS, mChunkSize, and AbstractHdf5Access::mDatasetDims.

Referenced by SetChunkSize().
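
One possible reading of "as square as possible with little waste" is sketched below: each dimension is split into the smallest number of pieces that keeps every piece at or below the target, and the entries are then spread evenly over those pieces. The free function SketchChunkDims, the dim_t typedef and the 8-bytes-per-entry figure are assumptions made for illustration; the actual method works on member variables and may differ in detail. All dataset dimensions are assumed to be non-zero.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned long long dim_t;    // stand-in for HDF5's hsize_t
const std::size_t DATASET_DIMS = 3;  // same value as AbstractHdf5Access::DATASET_DIMS

// Integer division rounding up (assumed behaviour of CeilDivide).
dim_t CeilDivide(dim_t numerator, dim_t denominator)
{
    assert(denominator > 0);
    return (numerator + denominator - 1) / denominator;
}

// Sketch: choose chunk dimensions with at most targetSize entries per dimension.
void SketchChunkDims(const dim_t datasetDims[DATASET_DIMS], dim_t targetSize,
                     dim_t chunkDims[DATASET_DIMS],
                     dim_t* pChunkSizeInBytes, bool* pAllOneChunk)
{
    dim_t bytes = 8; // assumption: entries are 8-byte doubles
    bool all_one_chunk = true;
    for (std::size_t i = 0; i < DATASET_DIMS; ++i)
    {
        // Number of pieces needed so that no piece exceeds targetSize entries
        dim_t pieces = CeilDivide(datasetDims[i], targetSize);
        // Spread the entries evenly over those pieces to limit edge waste
        chunkDims[i] = CeilDivide(datasetDims[i], pieces);
        bytes *= chunkDims[i];
        all_one_chunk = all_one_chunk && (pieces == 1);
    }
    *pChunkSizeInBytes = bytes;
    *pAllOneChunk = all_one_chunk;
}
```

For a 100 x 1000 x 2 dataset and a target of 32 entries, this sketch picks 25 x 32 x 2 rather than 32 x 32 x 2, so the first dimension divides exactly into four chunks.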

hsize_t Hdf5DataWriter::CalculateNumberOfChunks ( )
private

Little method to calculate the number of chunks resulting from given chunk dimensions.

Returns
The number of chunks resulting from chunk dimensions (stored in member variable).

Definition at line 1395 of file Hdf5DataWriter.cpp.

References CeilDivide(), AbstractHdf5Access::DATASET_DIMS, mChunkSize, and AbstractHdf5Access::mDatasetDims.

Referenced by SetChunkSize().
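
Given the references to CeilDivide(), mChunkSize and mDatasetDims, the count is presumably the product over the dimensions of the dataset extent divided by the chunk extent, rounded up. A minimal sketch under that assumption (SketchNumberOfChunks and dim_t are illustrative names, not Chaste code):

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned long long dim_t;    // stand-in for HDF5's hsize_t
const std::size_t DATASET_DIMS = 3;  // same value as AbstractHdf5Access::DATASET_DIMS

// Sketch: chunks needed to cover the dataset, counting partial edge chunks.
dim_t SketchNumberOfChunks(const dim_t datasetDims[DATASET_DIMS],
                           const dim_t chunkDims[DATASET_DIMS])
{
    dim_t num_chunks = 1;
    for (std::size_t i = 0; i < DATASET_DIMS; ++i)
    {
        assert(chunkDims[i] > 0);
        // Round up so a partly filled chunk at the edge still counts
        num_chunks *= (datasetDims[i] + chunkDims[i] - 1) / chunkDims[i];
    }
    return num_chunks;
}
```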

void Hdf5DataWriter::CheckUnitsName ( const std::string &  rName)
private

Check that the unit name is allowed, i.e. it contains only alphanumeric characters and underscores, and isn't blank.

Parameters
rName  unit name

Definition at line 558 of file Hdf5DataWriter.cpp.

References EXCEPTION.

Referenced by CheckVariableName(), and DefineVariable().

void Hdf5DataWriter::CheckVariableName ( const std::string &  rName)
private

Check that the variable name is allowed, i.e. it contains only alphanumeric characters and underscores, and isn't blank.

Parameters
rName  variable name

Definition at line 549 of file Hdf5DataWriter.cpp.

References CheckUnitsName(), and EXCEPTION.

Referenced by DefineVariable().
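
The naming rule shared by CheckVariableName and CheckUnitsName can be written as a stand-alone predicate. This is a sketch only: IsAllowedName is a hypothetical name, and it returns a bool where the Chaste methods raise an EXCEPTION.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-alone version of the documented rule: a name is
// allowed if it is non-empty and every character is alphanumeric or '_'.
bool IsAllowedName(const std::string& rName)
{
    if (rName.empty())
    {
        return false; // blank names are rejected
    }
    for (std::string::size_type i = 0; i < rName.size(); ++i)
    {
        // Cast first: std::isalnum requires a value representable as unsigned char
        const unsigned char c = static_cast<unsigned char>(rName[i]);
        if (!std::isalnum(c) && c != '_')
        {
            return false;
        }
    }
    return true;
}
```

Under this rule a unit such as "mV/ms" would be rejected because of the slash, while "mV_per_ms" would pass.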

void Hdf5DataWriter::ComputeIncompleteOffset ( )
private

void Hdf5DataWriter::DefineFixedDimension ( const std::vector< unsigned > &  rNodesToOuput,
long  vecSize 
)

Define the fixed dimension, assuming incomplete data output (subset of the nodes).

Parameters
rNodesToOuput  Node indexes to be output (precondition: must be monotonically increasing)
vecSize

Definition at line 413 of file Hdf5DataWriter.cpp.

References ComputeIncompleteOffset(), DefineFixedDimension(), EXCEPTION, mFileFixedDimensionSize, AbstractHdf5Access::mIncompleteNodeIndices, and AbstractHdf5Access::mIsDataComplete.
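
The idea of incomplete output, picking a subset of node values out of a full vector, is sketched below on a plain serial std::vector. Both functions are hypothetical and ignore the parallel (PETSc) layout the real writer deals with. The precondition is taken here to mean strictly increasing indexes that lie inside the full vector, which is an assumption about the intended check.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical check of the documented precondition: node indexes must be
// increasing (taken here as strictly increasing) and inside the full vector.
void CheckNodesToOutput(const std::vector<unsigned>& rNodes, long vecSize)
{
    for (std::size_t i = 0; i < rNodes.size(); ++i)
    {
        if (static_cast<long>(rNodes[i]) >= vecSize)
        {
            throw std::out_of_range("node index exceeds vector size");
        }
        if (i > 0 && rNodes[i] <= rNodes[i - 1])
        {
            throw std::invalid_argument("node indexes must be increasing");
        }
    }
}

// Sketch: pick the requested entries out of a full (serial) data vector.
std::vector<double> ExtractNodes(const std::vector<double>& rFull,
                                 const std::vector<unsigned>& rNodes)
{
    CheckNodesToOutput(rNodes, static_cast<long>(rFull.size()));
    std::vector<double> subset;
    subset.reserve(rNodes.size());
    for (std::size_t i = 0; i < rNodes.size(); ++i)
    {
        subset.push_back(rFull[rNodes[i]]);
    }
    return subset;
}
```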

void Hdf5DataWriter::DefineFixedDimensionUsingMatrix ( const std::vector< unsigned > &  rNodesToOuput,
long  vecSize 
)

Define the fixed dimension, assuming incomplete data output (subset of the nodes) and using a matrix to convert from full to incomplete output (rather than picking required data values out one at a time).

Parameters
rNodesToOuput  Node indexes to be output (precondition: must be monotonically increasing)
vecSize

Definition at line 438 of file Hdf5DataWriter.cpp.

References ComputeIncompleteOffset(), DefineFixedDimension(), EXCEPTION, mDataFixedDimensionSize, mDoubleIncompleteOutputMatrix, mFileFixedDimensionSize, mHi, AbstractHdf5Access::mIncompleteNodeIndices, AbstractHdf5Access::mIsDataComplete, mLo, mNumberOwned, mOffset, mSingleIncompleteOutputMatrix, mUseMatrixForIncompleteData, and PetscTools::SetupMat().

void Hdf5DataWriter::DefineUnlimitedDimension ( const std::string &  rVariableName,
const std::string &  rVariableUnits,
unsigned  estimatedLength = 1 
)

void Hdf5DataWriter::EmptyDataset ( )

Reset mCurrentTimeStep to 0 and resize the dataset and unlimited dataset to size 1 in the first dimension. Future writes will therefore overwrite the original contents.

THIS METHOD (EFFECTIVELY) DELETES THE DATASET!

Definition at line 1265 of file Hdf5DataWriter.cpp.

References mCurrentTimeStep, AbstractHdf5Access::mDatasetDims, mNeedExtend, and PossiblyExtend().

bool Hdf5DataWriter::GetUsingCache ( )

Whether we're caching writes

Returns
whether we're caching writes

Definition at line 1090 of file Hdf5DataWriter.cpp.

References mUseCache.

int Hdf5DataWriter::GetVariableByName ( const std::string &  rVariableName)

Get the id of the given variable. The variable must already exist, or an exception will be thrown.

Parameters
rVariableName  variable name to look up
Returns
HDF5 id for the given variable.

Definition at line 1275 of file Hdf5DataWriter.cpp.

References EXCEPTION, and mVariables.

Referenced by AbstractCardiacProblem< ELEMENT_DIM, SPACE_DIM, PROBLEM_DIM >::DefineExtraVariablesWriterColumns(), MonodomainPurkinjeProblem< ELEMENT_DIM, SPACE_DIM >::DefineWriterColumns(), BidomainProblem< DIM >::DefineWriterColumns(), ExtendedBidomainProblem< DIM >::DefineWriterColumns(), and AbstractCardiacProblem< ELEMENT_DIM, SPACE_DIM, PROBLEM_DIM >::DefineWriterColumns().

bool Hdf5DataWriter::IsInDefineMode ( )

Check whether writer is in define mode.

When in define mode, variables can be defined but no data may be written; outside define mode, data may be written but no variables can be defined. When extending, the writer will be in define mode if the dataset does not exist, and out of define mode if it does.

Returns
whether in define mode.

Definition at line 570 of file Hdf5DataWriter.cpp.

References mIsInDefineMode.
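
The define-mode rule can be modelled with a toy class. ToyWriter below is purely illustrative and is not the Chaste class: it has no file, no PETSc vectors and a single hypothetical PutValue method, and it only shows how a mode flag can gate the two groups of calls.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model (not the Chaste class) of the define-mode rule: variables may
// only be defined before EndDefineMode, data may only be written after it.
class ToyWriter
{
public:
    ToyWriter() : mIsInDefineMode(true) {}

    int DefineVariable(const std::string& rName)
    {
        if (!mIsInDefineMode)
        {
            throw std::logic_error("cannot define variables when not in define mode");
        }
        mNames.push_back(rName);
        return static_cast<int>(mNames.size()) - 1; // id of the new variable
    }

    void EndDefineMode() { mIsInDefineMode = false; }

    void PutValue(int variableId, double value)
    {
        if (mIsInDefineMode)
        {
            throw std::logic_error("cannot write data while in define mode");
        }
        if (variableId < 0 || variableId >= static_cast<int>(mNames.size()))
        {
            throw std::out_of_range("unknown variable id");
        }
        (void)value; // a real writer would send this to the file
    }

    bool IsInDefineMode() const { return mIsInDefineMode; }

private:
    bool mIsInDefineMode;
    std::vector<std::string> mNames;
};
```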

void Hdf5DataWriter::PossiblyExtend ( )

Extend the dataset to the correct dimensions if needed.

Definition at line 1255 of file Hdf5DataWriter.cpp.

References AbstractHdf5Access::mDatasetDims, mNeedExtend, AbstractHdf5Access::mUnlimitedDatasetId, and AbstractHdf5Access::mVariablesDatasetId.

Referenced by EmptyDataset(), PutStripedVector(), PutUnlimitedVariable(), and PutVector().

void Hdf5DataWriter::SetAlignment ( hsize_t  alignment)

Set the alignment parameter to pass through to H5Pset_alignment. Every file object will be aligned on the disk to a multiple of this parameter. See the H5P docs for more information.

Especially useful with SetTargetChunkSize to ensure each chunk (each chunk is a 'file object') is aligned to a disk stripe on a striped file system with minimal wastage.

This method only has an effect when creating a NEW HDF5 FILE. Must be called in define mode with extendData set to false.

Parameters
alignment  Alignment (bytes)

Definition at line 1488 of file Hdf5DataWriter.cpp.

References EXCEPTION, mAlignment, mIsInDefineMode, and mUseExistingFile.

Referenced by AbstractCardiacProblem< ELEMENT_DIM, SPACE_DIM, PROBLEM_DIM >::InitialiseWriter().
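
The "minimal wastage" remark can be made concrete with some arithmetic: if every object starts on a multiple of the alignment, the gap after each object is the distance to the next boundary. The helpers below are a simplified sketch with hypothetical names; they ignore the threshold argument that H5Pset_alignment also takes and any other detail of how HDF5 allocates file space.

```cpp
#include <cassert>

typedef unsigned long long dim_t; // stand-in for HDF5's hsize_t

// Smallest multiple of 'alignment' that is >= 'address'.
dim_t AlignUp(dim_t address, dim_t alignment)
{
    assert(alignment > 0);
    return ((address + alignment - 1) / alignment) * alignment;
}

// Bytes left unused after an object of 'objectSize' bytes when the next
// object must also start on an alignment boundary (simplifying assumption).
dim_t PaddingPerObject(dim_t objectSize, dim_t alignment)
{
    return AlignUp(objectSize, alignment) - objectSize;
}
```

Under this simplification, a 131072-byte chunk with a 1048576-byte alignment leaves 917504 bytes of padding per chunk, whereas a chunk of exactly 1048576 bytes leaves none, which is why the target chunk size and the alignment are best chosen together.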

void Hdf5DataWriter::SetChunkSize ( )
private

This method sets the chunk size by building up in each dimension until a threshold is reached, unless user-specified values have been set using SetFixedChunkSize. By default, chunks of 128 K are used, which seems to be a good compromise. For large problems performance will usually improve by increasing this value (to e.g. 1 M).

Note: The public method for altering the algorithm's target chunk size is SetTargetChunkSize.

Definition at line 1430 of file Hdf5DataWriter.cpp.

References CalculateChunkDims(), CalculateNumberOfChunks(), AbstractHdf5Access::DATASET_DIMS, mChunkSize, mChunkTargetSize, mFixedChunkSize, mNumberOfChunks, and mUseOptimalChunkSizeAlgorithm.

Referenced by EndDefineMode().
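
The "build up until a threshold is reached" step might look like the loop below: grow a per-dimension target until the resulting chunk reaches the byte threshold or spans the whole dataset, then step back once if the threshold was overshot. This is a sketch, not the Chaste source. SketchFindSide is a hypothetical name, and the chunk-dimension calculation is passed in as a callable so that the loop can be shown on its own.

```cpp
#include <cassert>
#include <functional>

typedef unsigned long long dim_t; // stand-in for HDF5's hsize_t

// Sketch of the threshold search. 'chunkBytesForSide(side, &allOneChunk)'
// stands for a chunk-dimension calculation: it returns the chunk size in
// bytes for a given per-dimension target and reports whether one chunk
// then spans the whole dataset.
dim_t SketchFindSide(dim_t targetBytes,
                     const std::function<dim_t(dim_t, bool*)>& chunkBytesForSide)
{
    dim_t side = 0;
    dim_t bytes = 0;
    bool all_one_chunk = false;
    // While chunks are too small, make them larger
    do
    {
        ++side;
        bytes = chunkBytesForSide(side, &all_one_chunk);
    } while (bytes < targetBytes && !all_one_chunk);

    // Step back once if the threshold was exceeded
    if (bytes > targetBytes && !all_one_chunk && side > 1)
    {
        --side;
    }
    return side;
}
```

With a cubic dataset of 100 entries per side, 8-byte entries and a 131072-byte threshold, the loop stops at side 26 (140608 bytes) and steps back to 25.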

void Hdf5DataWriter::SetFixedChunkSize ( const unsigned &  rTimestepsPerChunk,
const unsigned &  rNodesPerChunk,
const unsigned &  rVariablesPerChunk 
)

Use a particular chunk size, ignoring the algorithm that figures out a sensible value.

This method may be useful in very specific circumstances, e.g. if you want chunks to align perfectly with stripes on a Lustre file system, or if you have a parallel problem where every process can be assigned the same size and shape hyperslab of the HDF5 file (thus getting rid of contention between processes when reading/writing).

USE WITH CAUTION - as it can degrade performance.

Parameters
rTimestepsPerChunk  The number of unlimited dimension steps (usually time steps) per chunk.
rNodesPerChunk  The number of objects (usually nodes) per chunk in the second dimension.
rVariablesPerChunk  The number of objects (usually output variables) per chunk in the third dimension.

Definition at line 1383 of file Hdf5DataWriter.cpp.

References mFixedChunkSize, mIsInDefineMode, and mUseOptimalChunkSizeAlgorithm.

void Hdf5DataWriter::SetTargetChunkSize ( hsize_t  targetSize)

Adjust the target (max) chunk size in the chunking algorithm. Useful when one knows roughly how big a chunk should be but doesn't care about the exact dimensions or shape. Default is 128 K.

Especially useful with SetAlignment for ensuring each chunk gets its own stripe on striped file systems.

This method only has an effect when creating a NEW DATASET. Must be called in define mode.

Parameters
targetSize  Max chunk size (bytes)

Definition at line 1477 of file Hdf5DataWriter.cpp.

References EXCEPTION, mChunkTargetSize, and mIsInDefineMode.

Referenced by AbstractCardiacProblem< ELEMENT_DIM, SPACE_DIM, PROBLEM_DIM >::InitialiseWriter().

Member Data Documentation

hsize_t Hdf5DataWriter::mAlignment
private

User-provided alignment parameter

Definition at line 88 of file Hdf5DataWriter.hpp.

Referenced by OpenFile(), and SetAlignment().

long unsigned Hdf5DataWriter::mCacheFirstTimeStep
private

Coordinate to keep track of cache writes

Definition at line 91 of file Hdf5DataWriter.hpp.

Referenced by Hdf5DataWriter(), and WriteCache().

hsize_t Hdf5DataWriter::mChunkSize[DATASET_DIMS]
private

hsize_t Hdf5DataWriter::mChunkTargetSize
private

User-provided target chunk size (for the algorithm)

Definition at line 86 of file Hdf5DataWriter.hpp.

Referenced by SetChunkSize(), and SetTargetChunkSize().

const bool Hdf5DataWriter::mCleanDirectory
private

Whether to wipe the output directory

Definition at line 56 of file Hdf5DataWriter.hpp.

Referenced by Hdf5DataWriter(), and OpenFile().

long unsigned Hdf5DataWriter::mCurrentTimeStep
private

std::vector<double> Hdf5DataWriter::mDataCache
private

Cache results here before writing

Definition at line 92 of file Hdf5DataWriter.hpp.

Referenced by EndDefineMode(), Hdf5DataWriter(), PutStripedVector(), PutVector(), and WriteCache().

unsigned Hdf5DataWriter::mDataFixedDimensionSize
private

The size of the fixed dimension (size of the vector of nodes)

Definition at line 63 of file Hdf5DataWriter.hpp.

Referenced by ApplyPermutation(), DefineFixedDimension(), DefineFixedDimensionUsingMatrix(), Hdf5DataWriter(), PutStripedVector(), and PutVector().

Mat Hdf5DataWriter::mDoubleIncompleteOutputMatrix
private

Stores striped nodes to be output as a matrix

Definition at line 80 of file Hdf5DataWriter.hpp.

Referenced by DefineFixedDimensionUsingMatrix(), PutStripedVector(), and ~Hdf5DataWriter().

Mat Hdf5DataWriter::mDoublePermutation
private

Stores a permutation of a striped structure (u_0 v_0 u_1 v_1) as a matrix

Definition at line 77 of file Hdf5DataWriter.hpp.

Referenced by ApplyPermutation(), PutStripedVector(), and ~Hdf5DataWriter().

unsigned Hdf5DataWriter::mEstimatedUnlimitedLength
private

An estimate of the unlimited dimension length for performance reasons.

Definition at line 61 of file Hdf5DataWriter.hpp.

Referenced by AdvanceAlongUnlimitedDimension(), DefineUnlimitedDimension(), EndDefineMode(), and OpenFile().

unsigned Hdf5DataWriter::mFileFixedDimensionSize
private

The size of the fixed dimension (number of rows)

Definition at line 62 of file Hdf5DataWriter.hpp.

Referenced by ApplyPermutation(), DefineFixedDimension(), DefineFixedDimensionUsingMatrix(), EndDefineMode(), Hdf5DataWriter(), OpenFile(), PutStripedVector(), and PutVector().

hsize_t Hdf5DataWriter::mFixedChunkSize[DATASET_DIMS]
private

User-provided chunk size

Definition at line 85 of file Hdf5DataWriter.hpp.

Referenced by Hdf5DataWriter(), SetChunkSize(), and SetFixedChunkSize().

unsigned Hdf5DataWriter::mHi
private

Local ownership of a PETSc vector of size mFileFixedDimensionSize

Definition at line 65 of file Hdf5DataWriter.hpp.

Referenced by ApplyPermutation(), ComputeIncompleteOffset(), DefineFixedDimension(), and DefineFixedDimensionUsingMatrix().

bool Hdf5DataWriter::mIsFixedDimensionSet
private

Is the fixed dimension set

Definition at line 59 of file Hdf5DataWriter.hpp.

Referenced by DefineFixedDimension(), EndDefineMode(), and Hdf5DataWriter().

unsigned Hdf5DataWriter::mLo
private

bool Hdf5DataWriter::mNeedExtend
private

Used so that the data set is only extended when data is written

Definition at line 69 of file Hdf5DataWriter.hpp.

Referenced by AdvanceAlongUnlimitedDimension(), EmptyDataset(), and PossiblyExtend().

hsize_t Hdf5DataWriter::mNumberOfChunks
private

The total number of chunks in the dataset

Definition at line 84 of file Hdf5DataWriter.hpp.

Referenced by SetChunkSize().

unsigned Hdf5DataWriter::mNumberOwned
private

unsigned Hdf5DataWriter::mOffset
private

DistributedVectorFactory& Hdf5DataWriter::mrVectorFactory
private

The factory to use in creating PETSc Vec and DistributedVector objects.

Definition at line 54 of file Hdf5DataWriter.hpp.

Referenced by DefineFixedDimension(), and Hdf5DataWriter().

Mat Hdf5DataWriter::mSingleIncompleteOutputMatrix
private

Stores nodes to be output as a matrix

Definition at line 79 of file Hdf5DataWriter.hpp.

Referenced by DefineFixedDimensionUsingMatrix(), PutVector(), and ~Hdf5DataWriter().

Mat Hdf5DataWriter::mSinglePermutation
private

Stores a permutation as a matrix

Definition at line 76 of file Hdf5DataWriter.hpp.

Referenced by ApplyPermutation(), PutVector(), and ~Hdf5DataWriter().

bool Hdf5DataWriter::mUseCache
private

const bool Hdf5DataWriter::mUseExistingFile
private

Whether we are using an existing file (for extending existing dataset, or adding a new one)

Definition at line 57 of file Hdf5DataWriter.hpp.

Referenced by Hdf5DataWriter(), OpenFile(), and SetAlignment().

bool Hdf5DataWriter::mUseMatrixForIncompleteData
private

Whether to use a matrix format for incomplete data

Definition at line 70 of file Hdf5DataWriter.hpp.

Referenced by DefineFixedDimensionUsingMatrix(), PutStripedVector(), and PutVector().

bool Hdf5DataWriter::mUseOptimalChunkSizeAlgorithm
private

Whether to use the built-in algorithm for optimal chunk size

Definition at line 82 of file Hdf5DataWriter.hpp.

Referenced by SetChunkSize(), and SetFixedChunkSize().

std::vector<DataWriterVariable> Hdf5DataWriter::mVariables
private

The data variables

Definition at line 72 of file Hdf5DataWriter.hpp.

Referenced by DefineVariable(), EndDefineMode(), GetVariableByName(), Hdf5DataWriter(), and OpenFile().


The documentation for this class was generated from the following files: