VTK
9.2.6
|
VTK 10.0 introduces a new mechanism for representing data hierarchies using vtkPartitionedDataSetCollection and vtkDataAssembly. This document describes the design details.
The design is based on three classes:
vtkPartitionedDataSet is simply a collection of datasets that are to be treated as a logical whole. In data-parallel applications, each dataset may represent a partition of the complete dataset on the current worker process, rank, or thread. Each dataset in a vtkPartitionedDataSet is called a partition, implying it is only a part of a whole.
All non-null partitions have similar field and attribute arrays. For example, if a vtkPartitionedDataSet comprises of vtkDataSet subclasses, all will have exactly the same number of point data/cell data arrays, with same names, same number of components, and same data types.
vtkPartitionedDataSetCollection is a collection of vtkPartitionedDataSet. Thus, it is simply a mechanism to group multiple vtkPartitionedDataSet instances together. Since each vtkPartitionedDataSet represents a whole dataset (not be confused with vtkDataSet), we can refer to each item in a vtkPartitionedDataSetCollection as a partitioned-dataset.
Unlike items in the vtkPartitionedDataSet, there are no restrictions of consistency between each items, partitioned-datasets, in the vtkPartitionedDataSetCollection. Thus, in the multiblock-dataset parlance, each item in this collection can be thought of as a block.
vtkDataAssembly is a means to define an hierarchical organization of items in a vtkPartitionedDataSetCollection. This is literally a tree made up of named nodes. Each node in the tree can have associated dataset-indices. For a vtkDataAssembly is associated with a vtkPartitionedDataSetCollection, each of the dataset-indices is simply the index of a partitioned-dataset in the vtkPartitionedDataSetCollection. A dataset-index can be associated with multiple nodes in the assembly, however, a dataset-index cannot be associated with the same node more than once.
An assembly provides an ability to define a more complex view of the raw data blocks in a more application-specific form. This is not much different than what could be achieved using simply a vtkMultiBlockDataSet. However, there are several advantages to this separation of storage (vtkPartitionedDataSetCollection) and organization (vtkDataAssembly). These will become clear as we cover different use-cases.
While nodes in the data-assembly have unique ids, public facing algorithm APIs should not use them. For example an extract-block filter that allows users to choose which blocks (rather partitioned-datasets) to extract from vtkPartitionedDataSetCollection can expose an API that lets users provide path-expression to identify nodes in the associated data-assembly using their names.
Besides accessing nodes by querying using their names, vtkDataAssembly also supports a mechanism to iterate over all nodes in depth-first or breadth-first order using a visitor. vtkDataAssemblyVisitor defines a API that can be implemented to do custom action as each node in the tree is visited.