Abstract

In this work we discuss gpustats, a new Python library for assisting in “big data” statistical computing applications, particularly Monte Carlo-based inference algorithms. The library provides a general code generation / metaprogramming framework for easily implementing discrete and continuous probability density functions and random variable samplers. These functions can be used to achieve more than 100x speedup over their CPU equivalents. We demonstrate their use in a Bayesian MCMC application and discuss avenues for future work.

Keywords: GPU, CUDA, OpenCL, Python, statistical inference, statistics, metaprogramming, sampling, Markov Chain Monte Carlo (MCMC), PyMC, big data