Pydra - a flexible and lightweight dataflow engine for scientific analyses

Abstract

This paper presents Pydra, a new lightweight dataflow engine written in Python. Pydra is developed as an open-source project in the neuroimaging community, but it is designed as a general-purpose dataflow engine to support any scientific domain. The paper describes the architecture of the software, as well as several features that make Pydra a customizable and powerful dataflow engine. Two examples are presented to demonstrate the syntax and properties of the package.

Keywords: dataflow engine, scientific workflows, reproducibility