Path Expansion

Helper classes for expanding file or folder paths on a system, given an f-string-style format rule for matching patterns.

class AbstractPathExpander(/, *args, **kwargs)

Bases: ABC

Abstract base class for expanding file paths and extracting metadata.

This class provides methods to extract metadata from file paths within a directory and to expand paths based on a specified data specification. It is designed to be subclassed, with the list_directory method needing to be implemented by any subclass to provide the specific logic for listing files in a directory.

extract_metadata(base_directory: Path, format_: str)

Uses the parse library to extract metadata from file paths in the base_directory.

This method iterates over files in base_directory, parsing each file path according to format_. The format string is adjusted to the current operating system’s path separator. The method yields each file path and its corresponding parsed metadata. To constrain each metadata match to the name of a single file or folder, the method checks that no extracted value contains the OS path separator (e.g., ‘/’ or ‘\’).

Parameters
  • base_directory (DirectoryPath) – The base directory from which to list files for metadata extraction. It should be a path-like object that is convertible to a pathlib.Path.

  • format_ (str) – The format string used for parsing the file paths. This string can represent a path in any OS format, and is adjusted internally to match the current OS’s path separator.

Yields

tuple[Path, dict[str, Any]] – A tuple containing the file path as a Path object and a dictionary of the named metadata extracted from the file path.
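The matching rule described above can be sketched with the standard library alone. This is not the library's implementation: it swaps the parse library for a regular expression, and the names format_to_regex and extract_metadata_sketch are hypothetical. It only handles plain {name} fields (no format specifiers) and assumes the candidate paths are already given as strings.

```python
import re


def format_to_regex(format_: str, sep: str = "/") -> re.Pattern:
    """Compile a '{name}'-style path format into a regex (hypothetical helper)."""
    # Accept either separator style in the format string and normalize it to `sep`.
    normalized = format_.replace("\\", sep).replace("/", sep)
    pattern = ""
    seen = set()
    # re.split with a capturing group alternates literal text and field names.
    for index, piece in enumerate(re.split(r"\{(\w+)\}", normalized)):
        if index % 2 == 0:
            pattern += re.escape(piece)
        elif piece in seen:
            # A repeated field must match the same text as its first occurrence.
            pattern += f"(?P={piece})"
        else:
            seen.add(piece)
            # A field may not contain the path separator, so each value is
            # confined to a single file or folder name.
            pattern += f"(?P<{piece}>[^{re.escape(sep)}]+)"
    return re.compile(pattern)


def extract_metadata_sketch(paths, format_: str, sep: str = "/"):
    """Yield (path, metadata) for every path that fully matches the format."""
    regex = format_to_regex(format_, sep)
    for path in paths:
        match = regex.fullmatch(path)
        if match:
            yield path, match.groupdict()


candidates = [
    "sub-001/sub-001_ses-A",
    "sub-001/notes/sub-001_ses-A",  # extra folder level: no match
    "sub-002/sub-003_ses-B",  # inconsistent subject IDs: no match
]
matches = list(
    extract_metadata_sketch(
        candidates, "sub-{subject_id}/sub-{subject_id}_ses-{session_id}"
    )
)
```

Only the first candidate survives, with subject_id "001" and session_id "A". Excluding the separator from each field is what keeps a value from spanning several folder levels.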

abstract list_directory(base_directory: Path) -> Iterable[Path]

List all folders and files in a directory recursively.

Parameters

base_directory (DirectoryPath) – The base directory whose contents will be iterated recursively.

Yields

sub_paths (iterable of strings) – Generator that yields all sub-paths of files and folders under the common root base_directory.

expand_paths(source_data_spec: dict[str, dict]) -> list[neuroconv.utils.dict.DeepDict]

Match paths in a directory to specs and extract metadata from the paths.

Parameters

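source_data_spec (dict) – Source specification. As in the example below, each key names a data source (e.g. spikeglx) and maps to a dict holding a base_directory to search and the path templates to match (e.g. file_path).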

Returns

deep_dicts – The matched paths together with the metadata extracted from them.

Return type

list of DeepDict objects

Examples

>>> path_expander.expand_paths(
...     dict(
...         spikeglx=dict(
...             base_directory="source_folder",
...             paths=dict(
...                 file_path="sub-{subject_id}/sub-{subject_id}_ses-{session_id}"
...             )
...         )
...     )
... )

class LocalPathExpander(/, *args, **kwargs)

Bases: AbstractPathExpander

Class for expanding file paths and extracting metadata on a local filesystem.

See https://neuroconv.readthedocs.io/en/main/user_guide/expand_path.html for more information.

list_directory(base_directory: Path) -> Iterable[Path]

List all folders and files in a directory recursively.

Parameters

base_directory (DirectoryPath) – The base directory whose contents will be iterated recursively.

Yields

sub_paths (iterable of strings) – Generator that yields all sub-paths of files and folders under the common root base_directory.
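A recursive listing of this kind can be sketched with pathlib. The function name list_directory_sketch is hypothetical, and the sketch assumes that sub-paths are reported relative to base_directory as strings. It is not necessarily how LocalPathExpander does it.

```python
from pathlib import Path
from typing import Iterator


def list_directory_sketch(base_directory) -> Iterator[str]:
    """Yield every file and folder below base_directory, relative to it."""
    base_directory = Path(base_directory)
    if not base_directory.is_dir():
        raise NotADirectoryError(f"{base_directory} is not a directory")
    # rglob("*") walks the whole tree and returns both files and folders.
    for path in base_directory.rglob("*"):
        yield str(path.relative_to(base_directory))
```

Because the function is a generator, the directory check runs when iteration starts rather than when the function is called.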

construct_path_template(path: str, *, subject_id: str, session_id: str, **metadata_kwargs) -> str

Construct a path template by replacing specific parts of a given path with placeholders.

This function takes a real path example and replaces the occurrences of subject ID, session ID, and any additional metadata values with their respective placeholders.

Parameters
  • path (str) – The path string containing actual data that needs to be templated.

  • subject_id (str) – The subject ID in the path that will be replaced with the ‘{subject_id}’ placeholder.

  • session_id (str) – The session ID in the path that will be replaced with the ‘{session_id}’ placeholder.

  • **metadata_kwargs (dict) – Additional key-value pairs where the key is the placeholder name and the value is the actual data in the path that should be replaced by the placeholder.

Returns

The path string with specified parts replaced by placeholders.

Return type

str

Raises

ValueError – If subject_id, session_id, or any value in metadata_kwargs is an empty string, or if the subject_id or session_id value is not found in the path.

Examples

>>> construct_path_template(
...     "/data/subject456/session123/file.txt",
...     subject_id="subject456",
...     session_id="session123"
... )
'/data/{subject_id}/{session_id}/file.txt'
>>> construct_path_template(
...     "/data/subject789/session456/image.txt",
...     subject_id="subject789",
...     session_id="session456",
...     file_type="image"
... )
'/data/{subject_id}/{session_id}/{file_type}.txt'
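
The substitution described above can be approximated with plain string replacement. The following is a minimal sketch under that assumption, not the library's implementation, and the name construct_path_template_sketch is hypothetical.

```python
def construct_path_template_sketch(
    path: str, *, subject_id: str, session_id: str, **metadata_kwargs: str
) -> str:
    """Replace concrete values in an example path with '{name}' placeholders."""
    replacements = {
        "subject_id": subject_id,
        "session_id": session_id,
        **metadata_kwargs,
    }
    # Reject empty values: replacing "" would insert a placeholder everywhere.
    for name, value in replacements.items():
        if not value:
            raise ValueError(f"'{name}' must be a non-empty string")
    # The two required values must actually occur in the example path.
    for name in ("subject_id", "session_id"):
        if replacements[name] not in path:
            raise ValueError(f"{name} value {replacements[name]!r} not found in path")
    template = path
    for name, value in replacements.items():
        template = template.replace(value, "{" + name + "}")
    return template


template = construct_path_template_sketch(
    "/data/subject789/session456/image.txt",
    subject_id="subject789",
    session_id="session456",
    file_type="image",
)
```

A naive replacement like this can misfire when one value is a substring of another value or of an already inserted placeholder name, so short values such as "id" need extra care.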