datreant: persistent, Pythonic trees for heterogeneous data
Abstract
In science, the filesystem often serves as a de facto database, with
directory trees being the zeroth-order scientific data structure. But it can
be tedious and error-prone to work directly with the filesystem to retrieve
and store heterogeneous datasets. datreant makes working with directory
structures and files Pythonic with Treants: specially marked directories with
distinguishing characteristics that can be discovered, queried, and filtered.
Treants can be manipulated individually and in aggregate, with mechanisms for
granular access to the directories and files in their trees. Disparate
datasets stored in any format (CSV, HDF5, NetCDF, Feather, etc.) scattered
throughout a filesystem can thus be manipulated as meta-datasets of Treants.
datreant is modular and extensible by design to allow specialized applications
to be built on top of it, with MDSynthesis as an example for working with
molecular dynamics simulation data.
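
To make the idea of "specially marked directories" concrete, the following is a minimal sketch using only the Python standard library. It is not datreant's API or its on-disk format: the marker file name `.treant_sketch.json` and the helpers `mark`, `discover`, and `filter_by_tag` are hypothetical names chosen for illustration. The sketch shows how directories carrying a small metadata file can be discovered by walking a tree and then filtered by their distinguishing characteristics.

```python
import json
import os
import tempfile

# Hypothetical marker file name for this sketch; not datreant's own format.
MARKER = ".treant_sketch.json"


def mark(path, tags=(), categories=None):
    """Create the directory if needed and write a metadata marker file into it."""
    os.makedirs(path, exist_ok=True)
    meta = {"tags": sorted(set(tags)), "categories": dict(categories or {})}
    with open(os.path.join(path, MARKER), "w") as fh:
        json.dump(meta, fh)
    return path


def discover(root):
    """Walk root and return {directory: metadata} for every marked directory."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if MARKER in filenames:
            with open(os.path.join(dirpath, MARKER)) as fh:
                found[dirpath] = json.load(fh)
    return found


def filter_by_tag(found, tag):
    """Return the sorted paths of marked directories that carry the given tag."""
    return sorted(path for path, meta in found.items() if tag in meta["tags"])


# Build a small throwaway tree: two marked directories and one unmarked one.
root = tempfile.mkdtemp()
mark(os.path.join(root, "sim", "run1"),
     tags=["protein", "equilibrated"], categories={"temperature": 300})
mark(os.path.join(root, "sim", "run2"),
     tags=["protein"], categories={"temperature": 310})
os.makedirs(os.path.join(root, "scratch"))  # never marked, so never discovered

found = discover(root)
equilibrated = filter_by_tag(found, "equilibrated")
```

Because the metadata lives inside each directory rather than in a central index, a marked directory in this sketch can be moved or copied without losing its tags and categories. The data files stored alongside the marker can be in any format.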