NWB Helpers#
Collection of Pydantic models and helper functions for configuring dataset IO parameters for different backends.
- class BackendConfiguration(/, **data: 'Any') → 'None' [source]#
Bases: BaseModel
A model for matching collections of DatasetConfigurations to a specific backend.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- backend: ClassVar[Literal['hdf5', 'zarr']]#
- pretty_backend_name: ClassVar[Literal['HDF5', 'Zarr']]#
- model_config: ClassVar[ConfigDict] = {'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- dataset_configurations: dict[str, neuroconv.tools.nwb_helpers._configuration_models._base_dataset_io.DatasetIOConfiguration]#
- __str__() → str [source]#
Not overriding __repr__ as this is intended to render only when wrapped in print().
- classmethod model_json_schema(**kwargs) → dict[str, Any] [source]#
Generates a JSON schema for a model class.
- Args:
by_alias: Whether to use attribute aliases or not.
ref_template: The reference template.
schema_generator: To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications.
mode: The mode in which to generate the schema.
- Returns:
The JSON schema for the given model class.
- find_locations_requiring_remapping(nwbfile: NWBFile) → dict[str, neuroconv.tools.nwb_helpers._configuration_models._base_dataset_io.DatasetIOConfiguration] [source]#
Find locations of objects with mismatched IDs in the file.
This function identifies neurodata objects in the nwbfile that have matching locations with the current configuration but different object IDs. It returns a dictionary of remapped DatasetIOConfiguration objects for these mismatched locations.
- Parameters
nwbfile (pynwb.NWBFile) – The NWBFile object to check for mismatched object IDs.
- Returns
A dictionary where:
* Keys: Locations in the NWBFile of objects with mismatched IDs.
* Values: New DatasetIOConfiguration objects corresponding to the updated object IDs.
- Return type
dict[str, DatasetIOConfiguration]
Notes
- This function only checks for objects with the same location but different IDs.
- It does not identify objects missing from the current configuration.
- The returned DatasetIOConfiguration objects are copies of the original configurations with updated object_id fields.
- build_remapped_backend(locations_to_remap: dict[str, neuroconv.tools.nwb_helpers._configuration_models._base_dataset_io.DatasetIOConfiguration]) → Self [source]#
Build a remapped backend configuration by updating mismatched object IDs.
This function takes a dictionary of new DatasetIOConfiguration objects (as returned by find_locations_requiring_remapping) and updates a copy of the current configuration with these new configurations.
- Parameters
locations_to_remap (dict) – A dictionary mapping locations in the NWBFile to their corresponding new DatasetIOConfiguration objects with updated IDs.
- Returns
A new instance of the backend configuration class with updated object IDs for the specified locations.
- Return type
Self
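Example (a hedged sketch of the remapping workflow; the TimeSeries name and shapes are illustrative, and the pynwb mock-file test utility is assumed):
import numpy as np
from pynwb import TimeSeries
from pynwb.testing.mock.file import mock_NWBFile
from neuroconv.tools.nwb_helpers import get_default_backend_configuration

# Capture a configuration against one in-memory file.
original_nwbfile = mock_NWBFile()
original_nwbfile.add_acquisition(TimeSeries(name="trace", data=np.ones(shape=(100,)), unit="a.u.", rate=30.0))
backend_configuration = get_default_backend_configuration(nwbfile=original_nwbfile, backend="hdf5")

# Re-creating the same file yields new object IDs at the same locations.
recreated_nwbfile = mock_NWBFile()
recreated_nwbfile.add_acquisition(TimeSeries(name="trace", data=np.ones(shape=(100,)), unit="a.u.", rate=30.0))
locations_to_remap = backend_configuration.find_locations_requiring_remapping(nwbfile=recreated_nwbfile)
remapped_configuration = backend_configuration.build_remapped_backend(locations_to_remap=locations_to_remap)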
- class HDF5BackendConfiguration(/, **data: 'Any') → 'None' [source]#
Bases: BackendConfiguration
A model for matching collections of DatasetConfigurations specific to the HDF5 backend.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- backend: ClassVar[Literal['hdf5']] = 'hdf5'#
- pretty_backend_name: ClassVar[Literal['HDF5']] = 'HDF5'#
- dataset_configurations: dict[str, neuroconv.tools.nwb_helpers._configuration_models._hdf5_dataset_io.HDF5DatasetIOConfiguration]#
- model_config: ClassVar[ConfigDict] = {'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- class ZarrBackendConfiguration(/, **data: 'Any') → 'None' [source]#
Bases: BackendConfiguration
A model for matching collections of DatasetConfigurations specific to the Zarr backend.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- backend: ClassVar[Literal['zarr']] = 'zarr'#
- pretty_backend_name: ClassVar[Literal['Zarr']] = 'Zarr'#
- data_io_class#
alias of ZarrDataIO
- dataset_configurations: dict[str, neuroconv.tools.nwb_helpers._configuration_models._zarr_dataset_io.ZarrDatasetIOConfiguration]#
- number_of_jobs: int#
- model_config: ClassVar[ConfigDict] = {'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- class DatasetIOConfiguration(/, **data: 'Any') → 'None' [source]#
Bases: BaseModel, ABC
A data model for configuring options about an object that will become an HDF5 or Zarr Dataset in the file.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- model_config: ClassVar[ConfigDict] = {'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- object_id: str#
- location_in_file: str#
- dataset_name: Literal['data', 'timestamps']#
- dtype: dtype#
- full_shape: tuple[int, ...]#
- chunk_shape: Optional[tuple[int, ...]]#
- buffer_shape: Optional[tuple[int, ...]]#
- compression_method: Optional[Union[str, FilterRefBase, Codec]]#
- compression_options: Optional[dict[str, Any]]#
- abstract get_data_io_kwargs() → dict[str, Any] [source]#
Fetch the properly structured dictionary of input arguments.
Should be passed directly as dynamic keyword arguments (**kwargs) into an H5DataIO or ZarrDataIO.
- __str__() → str [source]#
Not overriding __repr__ as this is intended to render only when wrapped in print().
The reason is two-fold: a standard repr is intended to be slightly more machine-readable, a more basic representation of the true object state; and an iterable of these objects, such as a list[DatasetConfiguration], would print the nested representations, which only look good when using the basic repr (this fancy string print-out does not look good when nested in another container).
- classmethod model_json_schema(**kwargs) → dict[str, Any] [source]#
Generates a JSON schema for a model class.
- Args:
by_alias: Whether to use attribute aliases or not.
ref_template: The reference template.
schema_generator: To override the logic used to generate the JSON schema, as a subclass of GenerateJsonSchema with your desired modifications.
mode: The mode in which to generate the schema.
- Returns:
The JSON schema for the given model class.
- classmethod from_neurodata_object(neurodata_object: Container, dataset_name: Literal['data', 'timestamps'], builder: Optional[BaseBuilder] = None) → Self [source]#
Construct an instance of a DatasetIOConfiguration for a dataset in a neurodata object in an NWBFile.
- Parameters
neurodata_object (hdmf.Container) – The neurodata object containing the field that will become a dataset when written to disk.
dataset_name (“data” or “timestamps”) – The name of the field that will become a dataset when written to disk. Some neurodata objects can have multiple such fields, such as pynwb.TimeSeries which can have both data and timestamps, each of which can be configured separately.
builder (hdmf.build.builders.BaseBuilder, optional) – The builder object that would be used to construct the NWBFile object. If None, the dataset is assumed to NOT have a compound dtype.
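Example (a hedged sketch; the TimeSeries and the pynwb mock-file test utility are illustrative assumptions):
import numpy as np
from pynwb import TimeSeries
from pynwb.testing.mock.file import mock_NWBFile
from neuroconv.tools.nwb_helpers import HDF5DatasetIOConfiguration

nwbfile = mock_NWBFile()
time_series = TimeSeries(name="trace", data=np.ones(shape=(1000, 4)), unit="a.u.", rate=30_000.0)
nwbfile.add_acquisition(time_series)

# One configuration per field; here the "data" field of the TimeSeries.
dataset_configuration = HDF5DatasetIOConfiguration.from_neurodata_object(
    neurodata_object=time_series, dataset_name="data"
)
print(dataset_configuration.full_shape, dataset_configuration.chunk_shape)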
- class HDF5DatasetIOConfiguration(/, **data: 'Any') → 'None' [source]#
Bases: DatasetIOConfiguration
A data model for configuring options about an object that will become an HDF5 Dataset in the file.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- compression_method: Optional[Union[Literal['gzip', 'lzf', 'szip', 'Bitshuffle', 'Blosc', 'Blosc2', 'BZip2', 'FciDecomp', 'LZ4', 'Sperr', 'SZ', 'SZ3', 'Zfp', 'Zstd'], FilterRefBase]]#
- compression_options: Optional[dict[str, Any]]#
- get_data_io_kwargs() → dict[str, Any] [source]#
Fetch the properly structured dictionary of input arguments.
Should be passed directly as dynamic keyword arguments (**kwargs) into an H5DataIO or ZarrDataIO.
- model_config: ClassVar[ConfigDict] = {'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- class ZarrDatasetIOConfiguration(/, **data: 'Any') → 'None' [source]#
Bases: DatasetIOConfiguration
A data model for configuring options about an object that will become a Zarr Dataset in the file.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- compression_method: Optional[Union[Literal['gzip', 'blosc', 'lz4', 'delta', 'lzma', 'fletcher32', 'categorize', 'packbits', 'bz2', 'zstd', 'zlib', 'jenkins_lookup3', 'shuffle'], Codec]]#
- compression_options: Optional[dict[str, Any]]#
- filter_methods: Optional[list[Union[str, InstanceOf[Codec]]]]#
- filter_options: Optional[list[dict[str, Any]]]#
- get_data_io_kwargs() → dict[str, Any] [source]#
Fetch the properly structured dictionary of input arguments.
Should be passed directly as dynamic keyword arguments (**kwargs) into an H5DataIO or ZarrDataIO.
- model_config: ClassVar[ConfigDict] = {'validate_assignment': True}#
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- get_default_backend_configuration(nwbfile: NWBFile, backend: Literal['hdf5', 'zarr']) → Union[HDF5BackendConfiguration, ZarrBackendConfiguration] [source]#
Fill a default backend configuration to serve as a starting point for further customization.
- get_default_dataset_io_configurations(nwbfile: NWBFile, backend: Union[None, Literal['hdf5', 'zarr']] = None) → Generator[DatasetIOConfiguration, None, None] [source]#
Generate DatasetIOConfiguration objects for wrapping NWB file objects with a specific backend.
This method automatically detects all objects in an NWB file that can be wrapped in an hdmf.DataIO. If the NWB file is in append mode, it supports auto-detection of the backend. Otherwise, it requires a backend specification.
- Parameters
nwbfile (pynwb.NWBFile) – An in-memory NWBFile object, either generated from the base class or read from an existing file of any backend.
backend (“hdf5” or “zarr”) – Which backend format type you would like to use in configuring each dataset’s compression methods and options.
- Yields
DatasetIOConfiguration – A summary of each detected object that can be wrapped in an hdmf.DataIO.
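Example (a minimal sketch, assuming an in-memory nwbfile such as the one built in the sketches above):
from neuroconv.tools.nwb_helpers import get_default_dataset_io_configurations

for dataset_io_configuration in get_default_dataset_io_configurations(nwbfile=nwbfile, backend="zarr"):
    print(dataset_io_configuration.location_in_file, dataset_io_configuration.dtype)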
- configure_backend(nwbfile: NWBFile, backend_configuration: Union[HDF5BackendConfiguration, ZarrBackendConfiguration]) → None [source]#
Configure all datasets specified in the backend_configuration with their appropriate DataIO and options.
- Parameters
nwbfile (pynwb.NWBFile) – The in-memory pynwb.NWBFile object to configure.
backend_configuration (HDF5BackendConfiguration or ZarrBackendConfiguration) – The configuration model to use when configuring the datasets for this backend.
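Example (an end-to-end sketch: fetch the defaults, adjust one dataset, apply, and write; the location key "acquisition/trace/data" is an assumption matching the illustrative TimeSeries above):
from pynwb import NWBHDF5IO
from neuroconv.tools.nwb_helpers import configure_backend, get_default_backend_configuration

backend_configuration = get_default_backend_configuration(nwbfile=nwbfile, backend="hdf5")

# Adjust a single dataset before applying; keys follow the "location/in/file" format.
dataset_configuration = backend_configuration.dataset_configurations["acquisition/trace/data"]
dataset_configuration.compression_method = "gzip"

configure_backend(nwbfile=nwbfile, backend_configuration=backend_configuration)
with NWBHDF5IO(path="output.nwb", mode="w") as io:
    io.write(nwbfile)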
- add_device_from_metadata(nwbfile: NWBFile, modality: str = 'Ecephys', metadata: Optional[dict] = None)[source]#
Add device information from metadata to an NWBFile object.
Will always ensure the nwbfile has at least one device; if the metadata list contains multiple devices, all of them will be created.
- Parameters
nwbfile (NWBFile) – NWBFile to which the new device information is to be added.
modality (str) – Type of data recorded by the device. Options: "Ecephys" (default), "Icephys", "Ophys", or "Behavior".
metadata (dict, optional) – Metadata for constructing the NWBFile. Should be of the format:
metadata[modality]['Device'] = [{'name': my_name, 'description': my_description}, ...]
Missing keys in an element of metadata['Ecephys']['Device'] will be auto-populated with defaults.
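Example (a short sketch of the expected metadata shape; the device name and description are illustrative):
from pynwb.testing.mock.file import mock_NWBFile
from neuroconv.tools.nwb_helpers import add_device_from_metadata

nwbfile = mock_NWBFile()
metadata = dict(
    Ecephys=dict(
        Device=[dict(name="ProbeA", description="32-channel silicon probe.")],
    ),
)
add_device_from_metadata(nwbfile=nwbfile, modality="Ecephys", metadata=metadata)
print(nwbfile.devices)  # {'ProbeA': ...}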
- configure_and_write_nwbfile(nwbfile: NWBFile, output_filepath: str, backend: Optional[Literal['hdf5']] = None, backend_configuration: Optional[BackendConfiguration] = None) → None [source]#
Write an NWB file using a specific backend or backend configuration.
A backend or a backend_configuration must be provided. To use the default backend configuration for the specified backend, provide only backend. To use a custom backend configuration, provide backend_configuration. If both are provided, backend must match backend_configuration.backend.
- Parameters
nwbfile (NWBFile)
output_filepath (str)
backend ({"hdf5"}, optional) – The type of backend used to create the file. This option uses the default backend_configuration for the specified backend. If no backend is specified, the backend_configuration is used.
backend_configuration (BackendConfiguration, optional) – Specifies the backend type and the chunking and compression parameters of each dataset. If no backend_configuration is specified, the default configuration for the specified backend is used.
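Example (a hedged sketch of the two call patterns; the output path is illustrative and nwbfile is assumed to be an in-memory pynwb.NWBFile):
from neuroconv.tools.nwb_helpers import configure_and_write_nwbfile, get_default_backend_configuration

# Either let the function build the default configuration for a backend ...
configure_and_write_nwbfile(nwbfile=nwbfile, output_filepath="session.nwb", backend="hdf5")

# ... or (for a fresh in-memory file) pass a customized configuration explicitly:
# backend_configuration = get_default_backend_configuration(nwbfile=nwbfile, backend="hdf5")
# configure_and_write_nwbfile(nwbfile=nwbfile, output_filepath="session.nwb", backend_configuration=backend_configuration)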
- get_default_nwbfile_metadata() → DeepDict [source]#
Return structure with defaulted metadata values required for an NWBFile.
These standard defaults are:
metadata["NWBFile"]["session_description"] = "no description"
metadata["NWBFile"]["identifier"] = str(uuid.uuid4())
Proper conversions should override these fields prior to calling NWBConverter.run_conversion().
- Returns
A dictionary containing default metadata values for an NWBFile, including session description, identifier, and NeuroConv version information.
- Return type
DeepDict
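Example (a quick sketch of overriding the defaults before a conversion):
from neuroconv.tools.nwb_helpers import get_default_nwbfile_metadata

metadata = get_default_nwbfile_metadata()
metadata["NWBFile"]["session_description"] = "Example session."  # override the default
print(metadata["NWBFile"]["identifier"])  # a fresh UUID4 string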
- get_module(nwbfile: NWBFile, name: str, description: str = None)[source]#
Check if processing module exists. If not, create it. Then return module.
- Parameters
nwbfile (NWBFile) – The NWB file to check or add the module to.
name (str) – The name of the processing module.
description (str, optional) – Description of the module. Only used if creating a new module.
- Returns
The existing or newly created processing module.
- Return type
ProcessingModule
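Example (a minimal sketch, assuming the pynwb mock-file test utility):
from pynwb.testing.mock.file import mock_NWBFile
from neuroconv.tools.nwb_helpers import get_module

nwbfile = mock_NWBFile()
behavior_module = get_module(nwbfile=nwbfile, name="behavior", description="Processed behavioral data.")
# A second call with the same name returns the existing module.
assert get_module(nwbfile=nwbfile, name="behavior") is behavior_module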
- make_nwbfile_from_metadata(metadata: dict) → NWBFile [source]#
Make NWBFile from available metadata.
- Parameters
metadata (dict) – Dictionary containing metadata for creating the NWBFile. Must contain an ‘NWBFile’ key with required fields.
- Returns
A newly created NWBFile object initialized with the provided metadata.
- Return type
NWBFile
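Example (a minimal sketch; field values are illustrative, and a session_start_time is assumed to be required as for pynwb.NWBFile):
from datetime import datetime, timezone
from neuroconv.tools.nwb_helpers import make_nwbfile_from_metadata

metadata = dict(
    NWBFile=dict(
        session_description="Example session.",
        identifier="session-001",
        session_start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    ),
)
nwbfile = make_nwbfile_from_metadata(metadata=metadata)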
- make_or_load_nwbfile(nwbfile_path: Optional[Path] = None, nwbfile: Optional[NWBFile] = None, metadata: Optional[dict] = None, overwrite: bool = False, backend: Literal['hdf5', 'zarr'] = 'hdf5', verbose: bool = False)[source]#
Context for automatically handling decision of write vs. append for writing an NWBFile.
- Parameters
nwbfile_path (FilePath) – Path for where to write or load (if overwrite=False) the NWBFile. If specified, the context will always write to this location.
nwbfile (NWBFile, optional) – An in-memory NWBFile object to write to the location.
metadata (dict, optional) – Metadata dictionary with information used to create the NWBFile when one does not exist or overwrite=True.
overwrite (bool, default: False) – Whether to overwrite the NWBFile if one exists at the nwbfile_path. The default is False (append mode).
backend (“hdf5” or “zarr”, default: “hdf5”) – The type of backend used to create the file.
verbose (bool, default: False) – If nwbfile_path is specified, informs the user after a successful write operation.
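Example (a hedged usage sketch; the path and metadata values are illustrative):
from datetime import datetime, timezone
from neuroconv.tools.nwb_helpers import make_or_load_nwbfile

metadata = dict(
    NWBFile=dict(
        session_description="Example session.",
        identifier="session-001",
        session_start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    ),
)
with make_or_load_nwbfile(nwbfile_path="session.nwb", metadata=metadata, overwrite=True, backend="hdf5") as nwbfile:
    # Add data to nwbfile here; the context writes the file on successful exit.
    pass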