Abstract

The generation of random variates is an important tool that is required in many applications. Various software programs or packages contain generators for standard distributions like the normal, exponential or Gamma, e.g., the programming language R and the packages SciPy and NumPy in Python. However, it is not uncommon that sampling from new/non-standard distributions is required. Instead of deriving specific generators in such situations, so-called automatic or black-box methods have been developed. These allow the user to generate random variates from fairly large classes of distributions by only specifying some properties of the distributions (e.g. the density and/or cumulative distribution function). In this note, we describe the implementation of such methods from the C library UNU.RAN in the Python package SciPy and provide a brief overview of the functionality.

Keywords: numerical inversion, generation of random variates

Introduction

The generation of random variates is an important tool that is required in many applications. Various software programs or packages contain generators for standard distributions, e.g., R (R Core Team, 2021) and, in Python, SciPy (Virtanen et al., 2020) and NumPy (Harris et al., 2020). Standard references for these algorithms are the books by Devroye (1986), Dagpunar (1988), Gentle (2003), and Knuth (2014), in which the interested reader will find many references to the vast existing literature. While relying on general methods such as the rejection principle, the algorithms for well-known distributions are often specifically designed for a particular distribution. This is also the case in the module stats in SciPy, which contains more than 100 distributions, and the module random in NumPy, with more than 30 distributions. However, there are also so-called automatic or black-box methods for sampling from large classes of distributions with a single piece of code. For such algorithms, information about the distribution such as the density, potentially together with its derivative, the cumulative distribution function (CDF), and/or the mode must be provided. See Hörmann et al. (2004) for a comprehensive overview of these methods. Although the development of such methods was originally motivated by the need to generate variates from non-standard distributions, these universal methods have advantages that make their use attractive even for sampling from standard distributions. We mention some of the important properties (see Leydold & Hörmann (2000), Hörmann et al. (2004), Derflinger et al. (2010)):

  • The algorithms can be used to sample from truncated distributions.
  • For inversion methods, the structural properties of the underlying uniform random number generator are preserved and the numerical accuracy of the methods can be controlled by a parameter. Therefore, inversion is usually the only method applied for simulations using quasi-Monte Carlo (QMC) methods.
  • Depending on the use case, one can choose between a fast setup with slow marginal generation time and vice versa.

The latter point is important depending on the use case: if a large number of samples is required for a given distribution with fixed shape parameters, a slower setup that only has to be run once can be accepted if the marginal generation times are low. If small to moderate sample sizes are required for many different shape parameters, then it is important to have a fast setup. The former situation is referred to as the fixed-parameter case and the latter as the varying-parameter case.

Implementations of various methods are available in the C library UNU.RAN (Hörmann & Leydold, 2007) and in the associated R package Runuran (https://cran.r-project.org/web/packages/Runuran/index.html, Tirler & Leydold (2003)). The aim of this note is to introduce the Python implementation in the SciPy package that makes some of the key methods in UNU.RAN available to Python users in SciPy 1.8.0. These general tools can be seen as a complement to the existing specific sampling methods: they might lead to better performance in specific situations compared to the existing generators, e.g., if a very large number of samples is required for a fixed parameter of a distribution or if the implemented sampling method relies on a slow default that is based on numerical inversion of the CDF. For advanced users, they also offer various options that allow fine-tuning the generators (e.g., to control the time needed for the setup step).

Automatic algorithms in SciPy

Many of the automatic algorithms described in Hörmann et al. (2004) and Derflinger et al. (2010) are implemented in the ANSI C library UNU.RAN (Universal Non-Uniform RANdom variate generators). Our goal was to provide a Python interface to the most important methods from UNU.RAN to generate univariate discrete and continuous non-uniform random variates. The following generators have been implemented in SciPy 1.8.0:

  • NumericalInversePolynomial (PINV): numerical inversion with polynomial interpolation of the inverse CDF,
  • NumericalInverseHermite (HINV): numerical inversion with Hermite interpolation of the inverse CDF,
  • TransformedDensityRejection (TDR): rejection with a hat function obtained from a transformed density,
  • SimpleRatioUniforms (SROU): the simple ratio-of-uniforms method,
  • DiscreteGuideTable (DGT): discrete inversion using guide tables,
  • DiscreteAliasUrn (DAU): the alias method for discrete distributions.

Before describing the implementation in SciPy, we give a short introduction to random variate generation.

A very brief introduction to random variate generation

It is well-known that random variates can be generated by inversion of the CDF $F$ of a distribution: if $U$ is a uniform random number on $(0,1)$, then $X := F^{-1}(U)$ is distributed according to $F$. Unfortunately, the inverse CDF can be expressed in closed form only for very few distributions, e.g., the exponential or Cauchy distribution. Otherwise, one needs to rely on implementations of special functions to compute the inverse CDF of standard distributions like the normal, Gamma or beta distributions, or on numerical methods for inverting the CDF. Such procedures, however, have the disadvantage that they may be slow or inaccurate, and developing fast and robust inversion algorithms such as HINV and PINV is a non-trivial task. HINV relies on Hermite interpolation of the inverse CDF and requires the CDF and PDF as input. PINV only requires the PDF. The algorithm then computes the CDF via adaptive Gauss-Lobatto integration and an approximation of the inverse CDF using Newton's polynomial interpolation. Note that an approximation of the inverse CDF can be achieved by interpolating the points $(F(x_i), x_i)$ for points $x_i$ in the domain of $F$, i.e., no evaluation of the inverse CDF is required.
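
As a simple illustration of the inversion principle, consider the exponential distribution, one of the few cases where the inverse CDF is available in closed form (a minimal sketch):

import numpy as np

# inversion with a closed-form inverse CDF: for the exponential
# distribution with rate lam, F(x) = 1 - exp(-lam * x), so
# F^{-1}(u) = -log(1 - u) / lam
rng = np.random.default_rng(123)
lam = 2.0
u = rng.uniform(size=100_000)
x = -np.log1p(-u) / lam

print(x.mean())  # should be close to 1 / lam = 0.5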

For discrete distributions, $F$ is a step function. To compute the inverse CDF $F^{-1}(U)$, the simplest idea would be to apply sequential search: if $X$ takes the values $0, 1, 2, \dots$ with probabilities $p_0, p_1, p_2, \dots$, start with $j = 0$ and keep incrementing $j$ until $F(j) = p_0 + \dots + p_j \ge U$. When the search terminates, $X = j = F^{-1}(U)$. Clearly, this approach is generally very slow and more efficient methods have been developed: if $X$ takes $L$ distinct values, DGT realizes very fast inversion using so-called guide tables / hash tables to find the index $j$. In contrast, DAU is not an inversion method but uses the alias method, i.e., tables are precomputed to write $X$ as an equi-probable mixture of $L$ two-point distributions (the alias values).
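
A didactic sketch of this sequential search (the linear cost that DGT and DAU are designed to avoid) could look as follows:

import numpy as np

def discrete_inversion(u, pv):
    """Return the smallest j with p_0 + ... + p_j >= u."""
    cdf = 0.0
    for j, p in enumerate(pv):
        cdf += p
        if cdf >= u:
            return j
    return len(pv) - 1  # guard against floating-point round-off

rng = np.random.default_rng(123)
pv = [0.2, 0.5, 0.3]  # probabilities of the values 0, 1, 2
sample = [discrete_inversion(rng.uniform(), pv) for _ in range(5)]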

The rejection method was suggested in Von Neumann (1951). In its simplest form, assume that $f$ is a bounded density on $[a,b]$, i.e., $f(x) \le M$ for all $x \in [a,b]$. Sample two independent uniform random variates $U$ on $[0,1]$ and $V$ on $[a,b]$ until $M \cdot U \le f(V)$. Note that the accepted points $(V, M \cdot U)$ are uniformly distributed in the region between the x-axis and the graph of the PDF. Hence, $X := V$ has the desired distribution $f$. This is a special case of the general version: if $f, g$ are two densities on an interval $J$ such that $f(x) \le c \cdot g(x)$ for all $x \in J$ and a constant $c \ge 1$, sample $U$ uniformly distributed on $[0,1]$ and $X$ distributed according to $g$ until $c \cdot U \cdot g(X) \le f(X)$. Then $X$ has the desired distribution $f$. It can be shown that the expected number of iterations before the acceptance condition is met is equal to $c$. Hence, the main challenge is to find hat functions $g$ for which $c$ is small and from which random variates can be generated efficiently. TDR solves this problem by applying a transformation $T$ to the density such that $x \mapsto T(f(x))$ is concave. A hat function can then be found by computing tangents at suitable design points. Note that, by its nature, a rejection method does not always require the same number of uniform variates to generate one non-uniform variate; this makes the use of QMC and of some variance reduction methods more difficult or impossible. On the other hand, rejection is often the fastest choice for the varying-parameter case.
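
A minimal sketch of the simplest form of the rejection method for a bounded density on an interval could look as follows:

import numpy as np

def rejection_sample(pdf, a, b, M, size, seed=None):
    """Rejection sampling from a density f on [a, b] with f <= M."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < size:
        v = rng.uniform(a, b)  # candidate point on [a, b]
        u = rng.uniform()      # uniform height in [0, 1]
        if M * u <= pdf(v):    # accept if (v, M*u) lies below the graph
            out.append(v)
    return np.array(out)

# example: the (unnormalized) standard normal density truncated
# to [-1, 1]; its maximum there is f(0) = 1, so M = 1
x = rejection_sample(lambda t: np.exp(-0.5 * t**2),
                     -1.0, 1.0, 1.0, 10_000, seed=123)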

The Ratio-of-Uniforms method (ROU, Kinderman & Monahan (1977)) is another general method that relies on rejection. The underlying principle is that if $(U, V)$ is uniformly distributed on the set $A_f := \lbrace (u, v) : 0 < v \le \sqrt{f(u/v)}, a < u/v < b \rbrace$, where $f$ is a PDF with support $(a, b)$, then $X := U/V$ follows a distribution according to $f$. In general, it is not possible to sample uniform values on $A_f$ directly. However, if $A_f \subset R := [u_-, u_+] \times [0, v_+]$ for finite constants $u_-, u_+, v_+$, one can apply the rejection method: generate uniform values $(U, V)$ on the bounding rectangle $R$ until $(U, V) \in A_f$ and return $X = U/V$. Automatic methods relying on the ROU method such as SROU and automatic ROU (Leydold, 2000) need a setup step to find a suitable region $S \subset \mathbb{R}^2$ such that $A_f \subset S$ and such that one can generate $(U, V)$ uniformly on $S$ efficiently.
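
For illustration, a basic (non-automatic) ROU sampler for the standard normal distribution, where the bounding rectangle of the unnormalized density $f(x) = e^{-x^2/2}$ can be computed by hand ($v_+ = 1$, $u_\pm = \pm\sqrt{2/e}$), might look as follows:

import numpy as np

def rou_standard_normal(size, seed=None):
    """Ratio-of-uniforms sampler for the standard normal
    distribution based on the unnormalized density
    f(x) = exp(-x**2 / 2)."""
    rng = np.random.default_rng(seed)
    u_plus = np.sqrt(2.0 / np.e)  # u_- = -u_plus, v_+ = 1
    out = []
    while len(out) < size:
        u = rng.uniform(-u_plus, u_plus)
        v = rng.uniform(0.0, 1.0)
        # accept if v <= sqrt(f(u/v)) = exp(-(u/v)**2 / 4)
        if v > 0 and v <= np.exp(-(u / v) ** 2 / 4):
            out.append(u / v)
    return np.array(out)

x = rou_standard_normal(10_000, seed=123)
print(x.mean(), x.std())  # approx 0 and 1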

Description of the SciPy interface

SciPy provides an object-oriented API to UNU.RAN’s methods. To initialize a generator, two steps are required:

  1. creating a distribution class and object,
  2. initializing the generator itself.

In step 1, a distribution object must be created that implements required methods (e.g., pdf, cdf). This can either be a custom object or a distribution object from the classes rv_continuous or rv_discrete in SciPy. Once the generator is initialized from the distribution object, it provides a rvs method to sample random variates from the given distribution. It also provides a ppf method that approximates the inverse CDF if the initialized generator uses an inversion method. The following example illustrates how to initialize the NumericalInversePolynomial (PINV) generator for the standard normal distribution:

import numpy as np
from scipy.stats import sampling
from math import exp

# create a distribution class with implementation
# of the PDF. Note that the normalization constant
# is not required
class StandardNormal:
    def pdf(self, x):
        return exp(-0.5 * x**2)

# create a distribution object and initialize the
# generator
dist = StandardNormal()
rng = sampling.NumericalInversePolynomial(dist)

# sample 100,000 random variates from the given
# distribution
rvs = rng.rvs(100000)

As the NumericalInversePolynomial generator uses an inversion method, it also provides a ppf method that approximates the inverse CDF:

# evaluate the approximate PPF at a few points
ppf = rng.ppf([0.1, 0.5, 0.9])

It is also easy to sample from a truncated distribution by passing a domain argument to the constructor of the generator. For example, to sample from a truncated standard normal distribution:

# truncate the distribution by passing a
# `domain` argument
rng = sampling.NumericalInversePolynomial(
   dist, domain=(-1, 1)
)

While the default options of the generators should work well in many situations, we point out that there are various parameters that the user can modify, e.g., to provide further information about the distribution (such as mode or center) or to control the numerical accuracy of the approximated PPF (u_resolution). Details can be found in the SciPy documentation: https://docs.scipy.org/doc/scipy/reference/. The above code can easily be generalized to sample from parametrized distributions using instance attributes in the distribution class. For example, to sample from the Gamma distribution with shape parameter alpha, we can create the distribution class with the parameter as an instance attribute:

class Gamma:
    def __init__(self, alpha):
        self.alpha = alpha

    def pdf(self, x):
        return x**(self.alpha-1) * exp(-x)

    def support(self):
        return 0, np.inf

# initialize a distribution object with varying
# parameters
dist1 = Gamma(2)
dist2 = Gamma(3)

# initialize a generator for each distribution
rng1 = sampling.NumericalInversePolynomial(dist1)
rng2 = sampling.NumericalInversePolynomial(dist2)

In the above example, the support method is used to set the domain of the distribution. This can alternatively be done by passing a domain parameter to the constructor.
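
For instance, instead of implementing support in the Gamma class above, the domain could be passed when creating the generator (a minimal sketch):

# equivalent to implementing the `support` method:
# pass the domain to the constructor instead
rng1 = sampling.NumericalInversePolynomial(
   dist1, domain=(0, np.inf)
)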

In addition to continuous distributions, two UNU.RAN methods have been added to SciPy to sample from discrete distributions. In this case, the distribution can either be represented by a probability vector (which is passed to the constructor as a Python list or NumPy array) or by a Python object implementing the probability mass function. In the latter case, a finite domain must be passed to the constructor or the object must implement the support method[1].

# Probability vector to represent a discrete
# distribution. Note that the probability vector
# need not be normalized
pv = [0.1, 9.0, 2.9, 3.4, 0.3]

# PCG64 uniform RNG with seed 123
urng = np.random.default_rng(123)
rng = sampling.DiscreteAliasUrn(
   pv, random_state=urng
)

# sample from the given discrete distribution
rvs = rng.rvs(100000)
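
To illustrate the second variant, the following sketch uses a hypothetical class implementing the probability mass function of a binomial distribution, together with an explicitly passed finite domain and the DiscreteGuideTable (DGT) method:

from math import comb

# discrete distribution defined by a PMF
# (binomial with n=10 and p=0.3)
class Binomial10:
    def pmf(self, k):
        return comb(10, int(k)) * 0.3**k * 0.7**(10 - k)

# a finite domain must be passed since the PMF
# alone does not determine it
rng = sampling.DiscreteGuideTable(
   Binomial10(), domain=(0, 10), random_state=urng
)
rvs = rng.rvs(1000)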

Underlying uniform pseudo-random number generators

NumPy provides several generators for uniform pseudo-random numbers[2]. It is highly recommended to use NumPy’s default random number generator np.random.PCG64 for better speed and performance, see O'Neill (2014) and https://numpy.org/doc/stable/reference/random/bit_generators/index.html. To change the uniform random number generator, a random_state parameter can be passed as shown in the example below:

# 64-bit PCG random number generator in NumPy
urng = np.random.Generator(np.random.PCG64())
# The above line can also be replaced by:
# ``urng = np.random.default_rng()``
# as PCG64 is the default generator starting
# from NumPy 1.19.0

# change the uniform random number generator by
# passing the `random_state` argument
rng = sampling.NumericalInversePolynomial(
   dist, random_state=urng
)

We also point out that the PPF of inversion methods can be applied to sequences of quasi-random numbers. SciPy provides different sequences in its QMC module (scipy.stats.qmc).
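
For example, the points of a Sobol' sequence from scipy.stats.qmc can be mapped through the approximate PPF of an inversion generator (a small sketch reusing the rng generator for the standard normal distribution created above):

from scipy.stats import qmc

# 2**10 points of a scrambled Sobol' sequence in [0, 1)
engine = qmc.Sobol(d=1, seed=123)
u = engine.random(n=2**10)

# transform the low-discrepancy points to the target
# distribution via the approximate inverse CDF
qmc_sample = rng.ppf(u.ravel())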

NumericalInverseHermite provides a qrvs method which generates random variates using QMC methods present in SciPy (scipy.stats.qmc) as uniform random number generators[3]. The next example illustrates how to use qrvs with a generator created directly from a SciPy distribution object.

from scipy import stats
from scipy.stats import qmc

# 1D Halton sequence generator.
qrng = qmc.Halton(d=1)

rng = sampling.NumericalInverseHermite(stats.norm())

# generate quasi random numbers using the Halton
# sequence as uniform variates
qrvs = rng.qrvs(size=100, qmc_engine=qrng)

Benchmarking

To analyze the performance of the implementation, we tested the methods applied to several standard distributions against the generators in NumPy and the original UNU.RAN C library. In addition, we selected one non-standard distribution to demonstrate that substantial reductions in the runtime can be achieved compared to other implementations. All the benchmarks were carried out using NumPy 1.22.4 and SciPy 1.8.1 running on a single core on Ubuntu 20.04.3 LTS with an Intel(R) Core(TM) i7-8750H CPU (2.20GHz clock speed, 16GB RAM). We ran the benchmarks with NumPy's MT19937 (Mersenne Twister) and PCG64 random number generators (np.random.MT19937 and np.random.PCG64) in Python and used NumPy's C implementation of MT19937 in the UNU.RAN C benchmarks. As explained above, the use of PCG64 is recommended; MT19937 is only included to compare the speed of the Python implementation and the C library by relying on the same uniform random number generator (i.e., differences in the performance of the uniform number generation are not taken into account). The code for all the benchmarks can be found at https://github.com/tirthasheshpatel/unuran_benchmarks.
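
To give an idea of the setup, a simplified sketch of such a timing loop is shown below; the time_ms helper is hypothetical and not part of the linked benchmark repository:

from timeit import repeat

def time_ms(setup_stmt, stmt):
    """Average time of `stmt` in ms over 7 repetitions;
    the setup statement is executed once per repetition
    and is not timed."""
    times = repeat(stmt, setup=setup_stmt, repeat=7, number=1,
                   globals=globals())
    return 1000 * sum(times) / len(times)

# assumes `sampling`, `np` and the StandardNormal `dist`
# from the earlier examples are in scope
setup = ("rng = sampling.NumericalInversePolynomial("
         "dist, random_state=np.random.default_rng())")
print(time_ms(setup, "rng.rvs(1_000_000)"))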

The methods used in NumPy to generate normal, gamma, and beta random variates are:

  • the ziggurat algorithm Marsaglia & Tsang, 2000 to sample from the standard normal distribution,
  • the rejection algorithms in Chapter XII.2.6 in Devroye (1986) if $\alpha < 1$ and in Marsaglia & Tsang (2000) if $\alpha > 1$ for the Gamma distribution,
  • Jöhnk's algorithm (Jöhnk (1964), Section IX.3.5 in Devroye (1986)) if $\max \{ \alpha, \beta \} \le 1$, otherwise a ratio of two Gamma variates with shape parameters $\alpha$ and $\beta$ (see Section IX.4.1 in Devroye (1986)) for the beta distribution.

Benchmarking against the normal, gamma, and beta distributions

Table 1 compares the performance of the different methods for the standard normal, Gamma, and beta distributions. We recall that the density of the Gamma distribution with shape parameter $a > 0$ is given by $x \in (0, \infty) \mapsto x^{a-1} e^{-x}$ and the density of the beta distribution with shape parameters $\alpha, \beta > 0$ is given by $x \in (0, 1) \mapsto \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}$, where $B(\cdot, \cdot)$ denotes the beta function.

Table 1: Average time taken (reported in milliseconds, unless mentioned otherwise) to sample 1 million random variates from the given distribution. The mean is computed over 7 iterations. Standard deviations are not reported as they were very small (less than 1% of the mean in the large majority of cases). Note that not all methods can always be applied, e.g., TDR cannot be applied to the Gamma distribution if $a < 1$ since the PDF is not log-concave in that case. As NumPy uses rejection algorithms with precomputed constants, no setup time is reported.

| Distribution | Method | Python setup | Python sampling (PCG64) | Python sampling (MT19937) | C setup | C sampling (MT19937) |
|---|---|---|---|---|---|---|
| Standard normal | PINV | 4.6 | 29.6 | 36.5 | 0.27 | 32.4 |
| | HINV | 2.5 | 33.7 | 40.9 | 0.38 | 36.8 |
| | TDR | 0.2 | 37.3 | 47.8 | 0.02 | 41.4 |
| | SROU | 8.7 µs | 2510 | 2160 | 0.5 µs | 232 |
| | NumPy | - | 17.6 | 22.4 | - | - |
| Gamma(0.05) | PINV | 196.0 | 29.8 | 37.2 | 37.9 | 32.5 |
| | HINV | 24.5 | 36.1 | 43.8 | 1.9 | 40.7 |
| | NumPy | - | 55.0 | 68.1 | - | - |
| Gamma(0.5) | PINV | 16.5 | 31.2 | 38.6 | 2.0 | 34.5 |
| | HINV | 4.9 | 34.2 | 41.7 | 0.6 | 37.9 |
| | NumPy | - | 86.4 | 99.2 | - | - |
| Gamma(3.0) | PINV | 5.3 | 30.8 | 38.7 | 0.5 | 34.6 |
| | HINV | 5.3 | 33 | 40.6 | 0.4 | 36.8 |
| | TDR | 0.2 | 38.8 | 49.6 | 0.03 | 44 |
| | NumPy | - | 36.5 | 47.1 | - | - |
| Beta(0.5, 0.5) | PINV | 21.4 | 33.1 | 39.9 | 2.4 | 37.3 |
| | HINV | 2.1 | 38.4 | 45.3 | 0.2 | 42 |
| | NumPy | - | 101 | 112 | - | - |
| Beta(0.5, 1.0) | HINV | 0.2 | 37 | 44.3 | 0.01 | 41.1 |
| | NumPy | - | 125 | 138 | - | - |
| Beta(1.3, 1.2) | PINV | 15.7 | 30.5 | 37.2 | 1.7 | 34.3 |
| | HINV | 4.1 | 33.4 | 40.8 | 0.4 | 37.1 |
| | TDR | 0.2 | 46.8 | 57.8 | 0.03 | 45 |
| | NumPy | - | 74.3 | 97 | - | - |
| Beta(3.0, 2.0) | PINV | 9.7 | 30.2 | 38.2 | 0.9 | 33.8 |
| | HINV | 5.8 | 33.7 | 41.2 | 0.4 | 37.4 |
| | TDR | 0.2 | 42.8 | 52.8 | 0.02 | 44 |
| | NumPy | - | 72.6 | 92.8 | - | - |

We summarize our main observations:

  1. The setup step in Python is substantially slower than in C due to expensive Python callbacks, especially for PINV and HINV. However, the time taken for the setup is low compared to the sampling time if large samples are drawn. Note that, as expected, SROU has a very fast setup, so this method is well suited for the varying-parameter case.
  2. The sampling time in Python is slightly higher than in C for the MT19937 random number generator. If the recommended PCG64 generator is used, the sampling time in Python is slightly lower. The only exception is SROU: due to Python callbacks, the performance is substantially slower than in C. However, as the main advantage of SROU is the fast setup time, its main use case is the varying-parameter case (i.e., the method is not supposed to be used to generate large samples).
  3. PINV, HINV, and TDR are at most about 2x slower than the specialized NumPy implementation for the normal distribution. For the Gamma and beta distributions, they even perform better for some of the chosen shape parameters. These results underline the strong performance of these black-box approaches even for standard distributions.
  4. While the application of PINV requires bounded densities, no issues are encountered for $\alpha = 0.05$ since the unbounded part is cut off by the algorithm (see the short example below). However, the setup can fail for very small values of $\alpha$.
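
To illustrate the last point, the Gamma class defined above can be used directly; this small check (not part of the benchmark code) shows that the setup succeeds despite the singularity of the density at zero:

# the Gamma(0.05) density x**(alpha-1) * exp(-x) is
# unbounded at zero, but the PINV setup still succeeds
dist = Gamma(0.05)
rng = sampling.NumericalInversePolynomial(dist)
rvs = rng.rvs(1000)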

Benchmarking against a non-standard distribution

We benchmark the performance of PINV for sampling from the generalized normal distribution (Subbotin, 1923), whose density is given by $x \in (-\infty, \infty) \mapsto \frac{p e^{-|x|^p}}{2\Gamma(1/p)}$, against the method proposed in Nardon & Pianca (2009) and against the implementation of SciPy's gennorm distribution. The approach in Nardon & Pianca (2009) relies on transforming Gamma variates to the generalized normal distribution, whereas SciPy relies on computing the inverse of the CDF of the Gamma distribution (https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.gammainccinv.html). The results for different values of $p$ are shown in Table 2.

Table 2: Comparing SciPy's implementation and a specialized method against PINV when sampling 1 million variates from the generalized normal distribution for different values of the parameter $p$. Times are reported in milliseconds. The mean is computed over 7 iterations.

| Method | p = 0.25 | p = 0.45 | p = 0.75 | p = 1 | p = 1.5 | p = 2 | p = 5 | p = 8 |
|---|---|---|---|---|---|---|---|---|
| Nardon and Pianca (2009) | 100 | 101 | 101 | 45 | 148 | 120 | 128 | 122 |
| SciPy's gennorm distribution | 832 | 1000 | 1110 | 559 | 5240 | 6720 | 6230 | 5950 |
| Python (PINV method, PCG64 URNG) | 50 | 47 | 45 | 41 | 40 | 37 | 38 | 38 |

PINV is usually about twice as fast as the specialized method and about 15-150 times faster than SciPy's implementation[4]. We also found an R package pgnorm (https://cran.r-project.org/web/packages/pgnorm/) that implements various approaches from Kalke & Richter (2013). In that case, PINV is usually about 70-200 times faster. This clearly shows the benefit of using a black-box algorithm.
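
For completeness, the PINV generator used in this comparison can be set up along the following lines (a sketch with a hypothetical GeneralizedNormal class; as before, the normalization constant of the density can be omitted):

from math import exp, inf

class GeneralizedNormal:
    def __init__(self, p):
        self.p = p

    def pdf(self, x):
        # unnormalized density of the generalized
        # normal distribution
        return exp(-abs(x)**self.p)

    def support(self):
        return -inf, inf

dist = GeneralizedNormal(2.0)
rng = sampling.NumericalInversePolynomial(dist)
rvs = rng.rvs(1_000_000)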

Conclusion

The interface to UNU.RAN in SciPy provides easy access to different algorithms for non-uniform variate generation for large classes of univariate continuous and discrete distributions. We have shown that the methods are easy to use and that the algorithms perform very well both for standard and non-standard distributions. A comprehensive documentation suite, a tutorial and many examples are available at https://docs.scipy.org/doc/scipy/reference/stats.sampling.html and https://docs.scipy.org/doc/scipy/tutorial/stats/sampling.html. Various methods have been implemented in SciPy, and if specific use cases require additional functionality from UNU.RAN, the methods can easily be added to SciPy given the flexible framework that has been developed. Another area of further development is to better integrate SciPy’s QMC generators for the inversion methods.

Finally, we point out that other sampling methods like Markov Chain Monte Carlo and copula methods are not part of SciPy. Relevant Python packages in that context are PyMC (Patil et al., 2010), PyStan relying on Stan (Stan Team, 2021), Copulas (https://sdv.dev/Copulas/) and PyCopula (https://blent-ai.github.io/pycopula/).

Acknowledgments

The authors wish to thank Wolfgang Hörmann and Josef Leydold for agreeing to publish the library under a BSD license and for helpful feedback on the implementation and this note. In addition, we thank Ralf Gommers, Matt Haberland, Nicholas McKibben, Pamphile Roy, and Kai Striega for their code contributions, reviews, and helpful suggestions. The second author was supported by the Google Summer of Code 2021 program[5].

References
  1. R Core Team. (2021). R: A Language and Environment for Statistical Computing.
  2. Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., & others. (2020). SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods, 1–12. 10.1038/s41592-019-0686-2
  3. Harris, C. R., Millman, K. J., van der Walt, S. J., Gommers, R., Virtanen, P., Cournapeau, D., Wieser, E., Taylor, J., Berg, S., Smith, N. J., Kern, R., Picus, M., Hoyer, S., van Kerkwijk, M. H., Brett, M., Haldane, A., del Río, J. F., Wiebe, M., Peterson, P., … Oliphant, T. E. (2020). Array programming with NumPy. Nature, 585(7825), 357–362. 10.1038/s41586-020-2649-2
  4. Devroye, L. (1986). Non-Uniform Random Variate Generation. Springer-Verlag. 10.1007/978-1-4613-8643-8
  5. Dagpunar, J. (1988). Principles of random variate generation. Oxford University Press, USA.