# Monaco: A Monte Carlo Library for Performing Uncertainty and Sensitivity Analyses

## Abstract¶

This paper introduces *monaco*, a Python library for conducting Monte Carlo simulations of computational models, and performing uncertainty analysis (UA) and sensitivity analysis (SA) on the results. UA and SA are critical to effective and responsible use of models in science, engineering, and public policy, however their use is uncommon. By providing a simple, general, and rigorous-by-default library that wraps around existing models, *monaco* makes UA and SA easy and accessible to practitioners with a basic knowledge of statistics.

## Introduction¶

Computational models form the backbone of decision-making processes in science, engineering, and public policy. However, our increased reliance on these models stands in contrast to the difficulty in understanding them as we add increasing complexity to try and capture ever more of the fine details of real-world interactions. Practitioners will often take the results of their large, complex model as a point estimate, with no knowledge of how uncertain those results are Ferretti *et al.*, 2016. Multiple-scenario modeling (e.g. looking at a worst-case, most-likely, and best-case scenario) is an improvement, but a complete global exploration of the input space is needed. That gives insight into the overall distribution of results (UA) as well as the relative influence of the different input factors on the ouput variance (SA). This complete understanding is critical for effective and responsible use of models in any decision-making process, and policy papers have identified UA and SA as key modeling practices Azzini *et al.*, 2020EPA, 2009.

Despite the importance of UA and SA, recent literature reviews show that they are uncommon – in 2014 only 1.3% of all published papers Ferretti *et al.*, 2016 using modeling performed any SA. And even when performed, best practices are usually lacking – amongst papers which specifically claimed to perform sensitivity analysis, a 2019 review found only 21% performed global (as opposed to local or zero) UA, and 41% performed global SA Saltelli *et al.*, 2019.

Typically, UA and SA are done using Monte Carlo simulations, for reasons explored in the following section. There are Monte Carlo frameworks available, however existing options are largely domain-specific, focused on narrow sub-problems (i.e. integration), tailored towards training neural nets, or require a deep statistical background to use. See Olivier *et al.* (2020), Razavi *et al.* (2021), and Douglas-Smith *et al.* (2020) for an overview of the currently available Python tools for performing UA and SA. For the domain expert who wants to perform UA and SA on their existing models, there is not an easy tool to do both in a single shot. *monaco* was written to address this gap.

## Motivation for Monte Carlo Approach¶

### Mathematical Grounding¶

Randomized Monte Carlo sampling offers a cure to the curse of dimensionality: consider an investigation of the output from $k$ input factors $y = f(x_1, x_2, ..., x_k)$ where each factor is uniformly sampled between 0 and 1, $x_i \in U[0, 1]$. The input space is then a $k$-dimensional hypercube with volume 1. If each input is varied one at a time (OAT), then the volume $V$ of the convex hull of the sampled points forms a hyperoctahedron with volume $V = \frac{1}{k!}$ (or optimistically, a hypersphere with $V = \frac{\pi^{k/2}}{2^k \Gamma(k/2 + 1)}$), both of which decrease super-exponentially as $k$ increases. Unless the model is known to be linear, this leaves the input space wholly unexplored. In contrast, the volume of the convex hull of $n \to \infty$ random samples as is obtained with a Monte Carlo approach will converge to $V=1$, with much better coverage within that volume as well Dyer *et al.*, 1992. See Figure 2.

### Benefits and Drawbacks of Basic Monte Carlo Sampling¶

*monaco* focuses on forward uncertainty propagation with basic Monte Carlo sampling. This has several benefits:

- The method is conceptually simple, lowering the barrier of entry and increasing the ease of communicating results to a broader audience.
- The same sample points can be used for UA and SA. Generally, Bayesian methods such as Markov Chain Monte Carlo provide much faster convergence on UA quantities of interest, but their undersampling of regions that do not contribute to the desired quantities is inadequate for SA and complete exploration of the input space. The author’s experience aligns with Saltelli
*et al.*(2019) in that there is great practical benefit in broad sampling without pigeonholing one’s purview to particular posteriors, through uncovering bugs and edge cases in regions of input space that were not being previously considered. - It can be applied to domains that are not data-rich. See for example NASA’s use of Monte Carlo simulations during rocket design prior to collecting test flight data Hanson & Beard, 2010.

However, basic Monte Carlo sampling is subject to the classical drawbacks of the method such as poor sampling of rare events and the slow $\sigma / \sqrt{n}$ convergence on quantities of interest. If the outputs and regions of interest are firmly known at the outset, then other sampling methods will be more efficient Kroese *et al.*, 2013.

Additionally, given that any conclusions are conditional on the correctness of the underlying model and input parameters, the task of validation is critical to confidence in the UA and SA results. However, this is currently out of scope for the library and must be performed with other tools. In a data-poor domain, hypothesis testing or probabilistic prediction measures like loss scores can be used to anchor the outputs against a small number of real-life test data. More generally, the “inverse problem” of model and parameter validation is a deep field unto itself and Council & others (2012) and Shiffrin *et al.* (2008) are recommended as overviews of some methods. If *monaco*’s scope is too limited for the reader’s needs, the author recommends `UQpy`

Olivier *et al.* (2020) for UA and SA, and `PyMC`

Salvatier *et al.*, 2016 or `Stan`

Carpenter *et al.*, 2017 as good general-purpose probabilistic programming Python libraries.

### Workflow¶

UA and SA of any model follows a common workflow. Probability distributions for the model inputs are defined, and randomly sampled values for a large number of cases are fed to the model. The outputs from each case are collected and the full set of inputs and outputs can be analyzed. Typically, UA is performed by generating histograms, scatter plots, and summary statistics for the output variables, and SA is performed by looking at the effect of input on output variables through scatter plots, performing regressions, and calculating sensitivity indices. These results can then be compared to real-world test data to validate the model or inform revisions to the model and input variables. See Figure 3.

Note that with model and input parameter validation currently outside *monaco*’s scope, closing that part of the workflow loop is left up to the user.

*monaco* Structure¶

### Overall Structure¶

Broadly, each input factor and model output is a *variable* that can be thought of as lists (rows) containing the full range of randomized *values*. *Cases* are slices (columns) that take the *i*’th input and output value for each variable, and represent a single run of the model. Each case is run on its own, and the output values are collected into output variables. Figure 4 shows a visual representation of this.

### Simulation Setup¶

The base of a *monaco* simulation is the `Sim`

object. This object is formed by passing it a name, the number of random cases `ncases`

, and a dict `fcns`

of the handles for three user-defined functions detailed in the next section. A random seed that then seeds the entire simulation can also be passed in here, and is highly recommended for repeatability of results.

Input variables then need to be defined. *monaco* takes in the handle to any of `scipy.stat`

’s continuous or discrete probability distributions, as well as the required arguments for that probability distribution Virtanen *et al.*, 2020. If nonnumeric inputs are desired, the method can also take in a `nummap`

dictionary which maps the randomly drawn integers to values of other types.

At this point the sim can be run. The randomized drawing of input values, creation of cases, running of those cases, and extraction of output values are automatically executed.

### User-Defined Functions¶

The user needs to define three functions to wrap *monaco*’s Monte Carlo structure around their existing computational model. First is a `run`

function which either calls or directly implements their model. Second is a `preprocess`

function which takes in a `Case`

object, extracts the randomized inputs, and structures them with any other invariant data to pass to the `run`

function. Third is a `postprocess`

function which takes in a `Case`

object as well as the results from the model, and extracts the desired output values. The Python call chain is as:

`postprocess(case, *run(*preprocess(case)))`

Or equivalently to expand the Python star notation into pseudocode:

```
siminput = (siminput1, siminput2, ...)
= preprocess(case)
simoutput = (simoutput1, simoutput2, ...)
= run(*siminput)
= run(siminput1, siminput2, ...)
_ = postprocess(case, *simoutput)
= postprocess(case, simoutput1, simoutput2, ...)
```

These three functions must be passed to the simulation in a dict with keys `'run'`

, `'preprocess'`

, and `'postprocess'`

. See the example code at the end of the paper for a simple worked example.

### Examining Results¶

After running, users should generally do all of the following UA and SA tasks to get a full picture of the behavior of their computational model.

- Plot the results (UA & SA).
- Calculate statistics for input or output variables (UA).
- Calculate sensitivity indices to rank importance of the input variables on variance of the output variables (SA).
- Investigate specific cases with outlier or puzzling results.
- Save the results to file or pass them to other programs.

### Data Flow¶

A summary of the process and data flow:

Instantiate a

`Sim`

object.Add input variables to the sim with specified probability distributions.

Run the simulation. This executes the following:

- Random percentiles $p_i \in U[0,1]$ are drawn
`ndraws`

times for each of the input variables. - These percentiles are transformed into random values via the inverse cumulative density function of the target probability distribution $x_i = F^{-1}(p_i)$.
- If nonnumeric inputs are desired, the numbers are converted to objects via a
`nummap`

dict. `Case`

objects are created and populated with the input values for each case.- Each case is run by structuring the inputs values with the
`preprocess`

function, passing them to the`run`

function, and collecting the output values with the`postprocess`

function. - The output values are collected into output variables and saved back to the sim. If the values are nonnumeric, a
`valmap`

dict assigning numbers to each unique value is automatically generated.

- Random percentiles $p_i \in U[0,1]$ are drawn
Calculate statistics & sensitivities for input & output variables.

Plot variables, their statistics, and sensitivities.

### Incorporating into Existing Workflows¶

If the user wants to use existing workflows for generating, running, post-processing, or examining results, any combination of *monaco*’s major steps can be replaced with external tooling by saving and loading input and output variables to file. For example, *monaco* can be used only for its parallel processing backend by importing existing randomly drawn input variables, running the simulation, and exporting the output variables for outside analysis. Or, it can be used only for its plotting and analysis capabilities by feeding it inputs and outputs generated elsewhere.

### Resource Usage¶

Note that *monaco*’s computational and storage overhead in creating easily-interrogatable objects for each variable, value, and case makes it an inefficient choice for computationally simple applications with high $n$, such as Monte Carlo integration. If the preprocessed sim input and raw output for each case (which for some models may dominate storage) is not retained, then the storage bottleneck will be the creation of a `Val`

object for each case’s input and output values with minimum size 0.5 kB. The maximum $n$ will be driven by the size of the RAM on the host machine being capable of holding at least $0.5 * n(k_{in} + k_{out})$ kB. On the computational bottleneck side, *monaco* is best suited for models where the model runtime dominates the random variate generation and the few hundred microseconds of `dask.delayed`

task switching time.

## Technical Features¶

### Sampling Methods¶

Random sampling of the percentiles for each variable can be done using scipy’s pseudo-random number generator (PRNG), or with any of the low-discrepancy methods from the `scip.stats.qmc`

quasi-Monte Carlo (QMC) module. QMC in general provides faster $O(\log(n)^k n^{-1})$ convergence compared to the $O(n^{-1/2})$ convergence of random sampling Caflisch, 1998. Available low-discrepancy options are regular or scrambled Sobol sequences, regular or scrambled Halton sequences, or Latin Hypercube Sampling. In general, the `'sobol_random'`

method that generates scrambled Sobol sequences Sobol, 1967Owen, 2020 is recommended in nearly all cases as the sequence with the fastest QMC convergence Christensen *et al.*, 2018, balanced integration properties as long as the number of cases is a power of 2, and a fairly flat frequency spectrum (though sampling spectra are rarely a concern) Perrier *et al.*, 2018. See Figure 5 for a visual comparison of some of the options.

### Order Statistics, or, How Many Cases to Run?¶

How many Monte Carlo cases should one run? One answer would be to choose $n \geq 2^k$ with a sampling method that implements a (t,m,s) digital net (such as a Sobol or Halton sequence), which guarantees that there will be at least one sample point in every hyperoctant of the input space Joe & Kuo, 2008. This should be considered a lower bound for SA, with the number of cases run being some integer multiple of $2^k$.

Along a similar vein, Dyer *et al.* (1992) suggests that with random sampling $n \geq 2.136^k$ is sufficient to ensure that the volume fraction $V$ approaches 1. The author hypothesizes that for a digital net, the $n \geq \lambda^k$ condition will be satisfied with some $\lambda \leq 2$, and so $n \geq 2^k$ will suffice for this condition to hold. However, these methods of choosing the number of cases may undersample for low $k$ and be infeasible for high $k$.

A rigorous way of choosing the number of cases is to first choose a statistical interval (e.g. a confidence interval for a percentile, or a tolerance interval to contain a percent of the population), and then use order statistics to calculate the minimum $n$ required to obtain that result at a desired confidence level. This approach is independent of $k$, making UA of high-dimensional models tractable. *monaco* implements order statistics routines for calculating these statistical intervals with a distribution-free approach that makes no assumptions about the normality or other shape characteristics of the output distribution. See Chapter 5 of Hahn & Meeker (1991) for background.

A more qualitative UA method would simply be to choose a reasonably high $n$ (say, $n=2^{10}$), manually examine the results to ensure high-interest areas are not being undersampled, and rely on bootstrapping of the desired variable statistics to obtain the required confidence levels.

### Variable Statistics¶

For any input or output variable, a statistic can be calculated for the ensemble of values. *monaco* builds in some common statistics (mean, percentile, etc), or alternatively the user can pass in a custom one. To obtain a confidence interval for this statistic, the results are resampled with replacement using the `scipy.stats.bootstrap`

module. The number of bootstrap samples is determined using an order statistic approach as outlined in the previous section, and multiplying that number by a scaling factor (default 10x) for smoothness of results.

### Sensitivity Indices¶

Sensitivity indices give a measure of the relationship between the variance of a scalar output variable to the variance of each of the input variables. In other words, they measure which of the input ranges have the largest effect on an output range. It is crucial that sensitivity indices are global rather than local measures – global sensitivity has the stronger theoretical grounding and there is no reason to rely on local measures in scenarios such as automated computer experiments where data can be easily and arbitrarily sampled Saltelli *et al.*, 2008Puy *et al.*, 2022.

With computer-designed experiments, it is possible to construct a specially constructed sample set to directly calculate global sensitivity indices such as the Total-Order Sobol index Sobol, 2001, or the IVARS100 index Razavi & Gupta, 2016. However, this special construction requires either sacrificing the desirable UA properties of low-discrepancy sampling, or conducting an additional Monte Carlo analysis of the model with a different sample set. For this reason, *monaco* uses the D-VARS approach to calculating global sensitivity indices, which allows for using a set of given data Sheikholeslami & Razavi, 2020. This is the first publically available implementation of the D-VARS algorithm.

### Plotting¶

*monaco* includes a plotting module that takes in input and output variables and quickly creates histograms, empirical CDFs, scatter plots, or 2D or 3D “spaghetti plots” depending on what is most appropriate for each variable. Variable statistics and their confidence intervals are automatically shown on plots when applicable.

### Vector Data¶

If the values for an output variable are length $s$ lists, NumPy arrays, or Pandas dataframes, they are treated as timeseries with $s$ steps. Variable statistics for these variables are calculated on the ensemble of values at each step, giving time-varying statistics.

The plotting module will automatically plot size $(1, s)$ arrays against the step number as 2-D lines, size $(2, s)$ arrays as 2-D parametric lines, and size $(3, s)$ arrays as 3-D parametric lines.

### Parallel Processing¶

*monaco* uses *dask.distributed* Rocklin, 2015 as a parallel processing backend, and supports preprocessing, running, and postprocessing cases in a parallel arrangement. Users familiar with *dask* can extend the parallelization of their simulation from their single machine to a distributed cluster.

For simple simulations such as the example code at the end of the paper, the overhead of setting up a *dask* server may outweigh the speedup from parallel computation, and in those cases *monaco* also supports running single-threaded in a single for-loop.

### The Median Case¶

A “nominal” run is often useful as a baseline to compare other cases against. If desired, the user can set a flag to force the first case to be the median 50th percentile draw of all the input variables prior to random sampling.

### Debugging Cases¶

By default, all the raw results from each case’s simulation run prior to postprocessing are saved to the corresponding `Case`

object. Individual cases can be interrogated by looking at these raw results, or by indicating that their results should be highlighted in plots. If some cases fail to run, *monaco* will mark them as incomplete and those specific cases can be rerun without requiring the full set of cases to be recomputed. A `debug`

flag can be set to not skip over failed cases and instead stop at a breakpoint or dump the stack trace on encountering an exception.

### Saving and Loading to File¶

The base `Sim`

object and the `Case`

objects can be serialized and saved to or loaded from `.mcsim`

and `.mccase`

files respectively, which are stored in a results directory. The Case objects are saved separately since the raw results from a run of the simulation may be arbitrarily large, and the Sim object can be comparatively lightweight. Loading the Sim object from file will automatically attempt to load the cases in the same directory, but can also stand alone if the raw results are not needed.

Alternatively, the numerical representations for input and output variables can be saved to and loaded from `.json`

or `.csv`

files. This is useful for interfacing with external tooling, but discards the metadata that would be present by saving to *monaco*’s native objects.

## Example¶

Presented here is a simple example showing a Monte Carlo simulation of rolling two 6-sided dice and looking at their sum.

The user starts with their `run`

function which here directly implements their computational model. They must then create `preprocess`

and `postprocess`

functions to feed in the randomized input values and collect the outputs from that model.

```
# The 'run' function, which implements the
# existing computational model (or wraps it)
def example_run(die1, die2):
dicesum = die1 + die2
return (dicesum, )
# The 'preprocess' function grabs the random
# input values for each case and structures it
# with any other data in the format the 'run'
# function expects
def example_preprocess(case):
die1 = case.invals['die1'].val
die2 = case.invals['die2'].val
return (die1, die2)
# The 'postprocess' function takes the output
# from the 'run' function and saves off the
# outputs for each case
def example_postprocess(case, dicesum):
case.addOutVal(name='Sum', val=dicesum)
case.addOutVal(name='Roll Number',
val=case.ncase)
return None
```

The *monaco* simulation is initialized, given input variables with specified probability distributions (here a random integer between 1 and 6), and run.

```
import monaco as mc
from scipy.stats import randint
# dict structure for the three input functions
fcns = {'run' : example_run,
'preprocess' : example_preprocess,
'postprocess': example_postprocess}
# Initialize the simulation
ndraws = 1024 # Arbitrary for this example
seed = 123456 # Recommended for repeatability
sim = mc.Sim(name='Dice Roll', ndraws=ndraws,
fcns=fcns, seed=seed)
# Generate the input variables
sim.addInVar(name='die1', dist=randint,
distkwargs={'low': 1, 'high': 6+1})
sim.addInVar(name='die2', dist=randint,
distkwargs={'low': 1, 'high': 6+1})
# Run the Simulation
sim.runSim()
```

The results of the simulation can then be analyzed and examined. Figure 6 shows the plots this code generates.

```
# Calculate the mean and 5-95th percentile
# statistics for the dice sum
sim.outvars['Sum'].addVarStat('mean')
sim.outvars['Sum'].addVarStat('percentile',
{'p':[0.05, 0.95]})
# Plots a histogram of the dice sum
mc.plot(sim.outvars['Sum'])
# Creates a scatter plot of the sum vs the roll
# number, showing randomness
mc.plot(sim.outvars['Sum'],
sim.outvars['Roll Number'])
# Calculate the sensitivity of the dice sum to
# each of the input variables
sim.calcSensitivities('Sum')
sim.outvars['Sum'].plotSensitivities()
```

## Case Studies¶

These two case studies are toy models meant as illustrative of potential uses, and not of expertise or rigor in their respective domains. Please see https://

### Baseball¶

This case study models the trajectory of a baseball in flight after being hit for varying speeds, angles, topspins, aerodynamic conditions, and mass properties. From assumed initial conditions immediately after being hit, the physics of the ball’s ballistic flight are calculated over time until it hits the ground.

Figure 7 shows some plots of the results. A baseball team might use analyses like this to determine where outfielders should be placed to catch a ball for a hitter with known characteristics, or determine what aspect of a hit a batter should focus on to improve their home run potential.

### Election¶

This case study attempts to predict the result of the 2020 US presidential election, based on polling data from FiveThirtyEight 3 weeks prior to the election FiveThirtyEight, 2020.

Each state independently casts a normally distributed percentage of votes for the Democratic, Republican, and Other candidates, based on polling. Also assumed is a uniform ±3% national swing due to polling error which is applied to all states equally. That summed percentage is then normalized so the total for all candidates is 100%. The winner of each state’s election assigns their electoral votes to that candidate, and the candidate that wins at least 270 of the 538 electoral votes is the winner.

The calculated win probabilities from this simulation are 93.4% Democratic, 6.2% Republican, and 0.4% Tie. The 25–75th percentile range for the number of electoral votes for the Democratic candidate is 281–412, and the actual election result was 306 electoral votes. See Figure 8.

## Conclusion¶

This paper has introduced the ideas underlying Monte Carlo analysis and discussed when it is appropriate to use for conducting UA and SA. It has shown how *monaco* implements a rigorous, parallel Monte Carlo process, and how to use it through a simple example and two case studies. This library is geared towards scientists, engineers, and policy analysts that have a computational model in their domain of expertise, enough statistical knowledge to define a probability distribution, and a desire to ensure their model will make accurate predictions of reality. The author hopes this tool will help contribute to easier and more widespread use of UA and SA in improved decision-making.

## Further Information¶

*monaco* is available on PyPI as the package `monaco`

, has API documentation at https://

- Ferretti, F., Saltelli, A., & Tarantola, S. (2016). Trends in sensitivity analysis practice in the last decade.
*Science of the Total Environment*,*568*, 666–670. 10.1016/j.scitotenv.2016.02.133 - Azzini, I., Listorti, G., Mara, T., & Rosati, R. (2020). Uncertainty and Sensitivity Analysis for policy decision making.
*An Introductory Guide. Joint Research Centre, European Commission, Luxembourg*. 10.2760/922129 - EPA, U. (2009).
*Guidance on the Development, Evaluation, and Application of Environmental Models (EPA/100/K-09/003)*. Office of the Science Advisor, Council for Regulatory Environmental Modeling. https://nepis.epa.gov/Exe/ZyPDF.cgi?Dockey=P1003E4R.PDF - Saltelli, A., Aleksankina, K., Becker, W., Fennell, P., Ferretti, F., Holst, N., Li, S., & Wu, Q. (2019). Why so many published sensitivity analyses are false: A systematic review of sensitivity analysis practices.
*Environmental Modelling & Software*,*114*, 29–39. 10.1016/j.envsoft.2019.01.012 - Olivier, A., Giovanis, D. G., Aakash, B. S., Chauhan, M., Vandanapu, L., & Shields, M. D. (2020). UQpy: A general purpose Python package and development environment for uncertainty quantification.
*Journal of Computational Science*,*47*, 101204. 10.1016/j.jocs.2020.101204