Blaze: Building A Foundation for Array-Oriented Computing in Python

Abstract

We present the motivation and architecture of Blaze, a library for cross-backend data-oriented computation. Blaze provides a standard interface to connect users familiar with NumPy and Pandas to other data analytics libraries like SQLAlchemy and Spark. We motivate the use of these projects through Blaze and discuss the benefits of standard interfaces on top of an increasingly varied software ecosystem. We give an overview of the Blaze architecture and then demonstrate its use on a typical problem. We use the abstract nature of Blaze to quickly benchmark and compare the performance of a variety of backends on a standard problem.

Keywords:array programmingbig datanumpyscipypandas