Simulation-based probability: Monte Carlo for “hard” events
Many probability questions become difficult when the event is complicated (custom rules, thresholds, multi-step logic,
or random variables that do not have a convenient closed-form distribution). A standard tool in computational probability is
Monte Carlo simulation, where we repeatedly simulate the experiment and estimate the probability from frequencies.
1) Core idea: estimate probability by relative frequency
Suppose you define an event \(A\) (“success”) for a single randomized trial. For each trial \(i\), define an indicator:
\[
I_i=\mathbf{1}\{A \text{ occurs on trial } i\} =
\begin{cases}
1, & \text{if }A \text{ occurs}\\
0, & \text{otherwise}
\end{cases}
\]
After \(n\) independent trials, the Monte Carlo estimate is:
\[
\hat{p}=\frac{1}{n}\sum_{i=1}^{n} I_i.
\]
By the Law of Large Numbers, \(\hat{p}\to p\) as \(n\to\infty\). In practice, larger \(n\) means a more stable estimate.
2) How the script works (safe mini-language)
This calculator lets you write a small scenario script that is evaluated once per trial.
You can create variables and finish with an event condition:
\[
\texttt{d1 = dice(6)}\quad
\texttt{d2 = dice(6)}\quad
\texttt{event = (d1 + d2) >= 7}
\]
Rules:
- One statement per line. Use
x = expression.
- Finish with
event = ( ... ) (or end with a boolean expression line).
- Allowed operators: \(+,-,\times,/,^,\) comparisons (\(<,<=,>,>=,==,!=\)) and logic (
and/or/not or && || !).
Built-in random generators (each call creates a new random draw in that trial):
dice(sides[,count]): sum of count dice with given sides (default count=1).
randint(a,b): integer uniformly from \(a\) to \(b\) inclusive.
uniform(a,b): uniform real on \([a,b]\).
bernoulli(p): returns 1 with probability \(p\), else 0.
normal(mu,sigma): normal draw \(N(\mu,\sigma^2)\) (Box–Muller internally).
The script is parsed and evaluated without executing arbitrary JavaScript, so it stays safe and predictable.
3) Confidence bounds for \(\hat{p}\)
If the event is “success/failure”, then \(S=\sum I_i\) is a binomial count (under independence).
A simple error idea is that the standard deviation of \(\hat{p}\) scales like \(1/\sqrt{n}\).
This calculator reports Wilson score confidence bounds, which are generally more stable than the plain
normal approximation when \(p\) is near 0 or 1:
\[
z=\Phi^{-1}\!\left(\frac{1+\text{conf}}{2}\right),\quad
\text{center}=\frac{\hat{p}+\frac{z^2}{2n}}{1+\frac{z^2}{n}},\quad
\text{half}=\frac{z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}+\frac{z^2}{4n^2}}}{1+\frac{z^2}{n}}.
\]
\[
\text{CI}=[\text{center}-\text{half},\ \text{center}+\text{half}].
\]
4) Reading the graph
The plot shows the running estimate \(\hat{p}(t)\) as trials accumulate. Early on it can move a lot,
but it stabilizes as \(t\) grows. The optional confidence band shows running Wilson bounds.
The “trial outcome cloud” is a visual sample of individual successes (near 1) and failures (near 0).
5) Practical tips for better simulations
- Increase \(n\) until the estimate stops changing meaningfully for your purpose.
- Use a deterministic seed when you want reproducible demos or debugging.
- Start with a simpler event, confirm it works, then add complexity step-by-step.
6) University note: variance reduction
Monte Carlo error decreases like \(O(1/\sqrt{n})\), which can be slow. Advanced methods reduce variance without increasing \(n\) as much:
- Antithetic variates: pair negatively correlated trials to cancel noise.
- Control variates: subtract a correlated quantity with known expectation.
- Importance sampling: sample more often in “important” regions and reweight.
- Stratified sampling: ensure coverage across subregions of the sample space.
If you’re estimating extremely rare events, naive Monte Carlo may need enormous \(n\). Importance sampling is often the right tool.