Why Python’s Random Numbers Aren’t Really Random (And Why It Matters)

Kshitij Kutumbe
4 min read · Dec 20, 2024

When you run a Python program to generate random numbers, you might imagine some mystical process creating pure chaos inside your machine. Unfortunately, that’s not what’s happening. What you’re getting from libraries like NumPy or Python’s random module isn’t "true randomness"—it’s an illusion crafted by clever algorithms. For most cases, this illusion works just fine, but in certain critical scenarios, it can fail spectacularly.

Let’s uncover why most Python libraries don’t offer true randomness, why that’s usually okay, and when it’s not.

The Two Flavors of Randomness

1. True Randomness

True randomness is chaos in its purest form. It comes from unpredictable physical phenomena, like the decay of radioactive atoms or atmospheric noise. Think of flipping a coin or rolling dice; these outcomes depend on countless tiny factors and can’t be calculated beforehand.

  • Examples of True Randomness:
  • Radioactive decay (used in quantum physics)
  • Thermal noise in circuits
  • Atmospheric noise (used by services like Random.org)

True randomness is ideal for cryptography, secure key generation, and scenarios where absolute unpredictability is crucial.

2. Pseudo-Randomness

Pseudo-randomness, on the other hand, is an elaborate magic trick. It’s generated by algorithms that take an initial number, called a “seed,” and apply mathematical formulas to produce numbers that look random. But underneath, it’s all predictable and repeatable.

  • Key Characteristics of Pseudo-Randomness:
  • Deterministic: The same seed will always produce the same sequence.
  • Fast and computationally cheap.
  • Passes statistical tests for randomness but lacks true unpredictability.

While pseudo-randomness works for most tasks, it can be problematic when real-world chaos or high security is required.
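To make the "seed plus formula" idea concrete, here is a minimal sketch of one of the simplest PRNG designs, a linear congruential generator. The helper name toy_lcg is hypothetical, and the constants are the commonly published "Numerical Recipes" parameters; this is an illustration of the principle, not the algorithm that NumPy or Python's random module uses.

```python
# A toy linear congruential generator (LCG), for illustration only.
# toy_lcg is a hypothetical helper; real libraries use far stronger algorithms.
def toy_lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Return `count` pseudo-random floats in [0, 1) derived from `seed`."""
    state = seed
    values = []
    for _ in range(count):
        state = (a * state + c) % m   # the entire "randomness" is this formula
        values.append(state / m)      # scale the integer state into [0, 1)
    return values

first_run = toy_lcg(seed=42, count=3)
second_run = toy_lcg(seed=42, count=3)
other_seed = toy_lcg(seed=43, count=3)

print(first_run == second_run)  # True: same seed, same sequence
print(first_run == other_seed)  # False: different seed, different sequence
```

Nothing in the function depends on anything outside its arguments, which is exactly why the output is repeatable.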

How Python Libraries Generate “Random” Numbers

NumPy and the Mersenne Twister

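NumPy, Python’s most popular library for numerical computing, has historically used a pseudo-random number generator (PRNG) called the Mersenne Twister. It still backs the legacy np.random.seed and np.random.rand functions, as well as Python’s built-in random module. (NumPy’s newer np.random.default_rng() uses a different PRNG, PCG64, by default.) The Mersenne Twister is powerful and widely used because of its: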

  • Speed: It generates random numbers in bulk efficiently.
  • Periodicity: Its sequence is astronomically long, with a period of $2^{19937} - 1$, so it doesn’t repeat for practical purposes.
  • Reproducibility: You can set a seed to guarantee the same sequence every time.

import numpy as np

np.random.seed(42)        # fix the seed of NumPy's legacy global generator
print(np.random.rand(3))  # prints the same three floats on every run

The ability to reproduce results is a feature, not a bug. Scientists and engineers rely on reproducibility for experiments and debugging.
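The same reproducibility is available without NumPy. The sketch below uses the standard library's random.Random class, which is also based on the Mersenne Twister; the variable names are just for illustration. Keeping a separate generator object per experiment, rather than seeding the module-wide one, is one way to stop unrelated code from disturbing your sequence.

```python
import random

# Two independent generator objects seeded with the same value.
rng_a = random.Random(42)
rng_b = random.Random(42)

draws_a = [rng_a.random() for _ in range(3)]
draws_b = [rng_b.random() for _ in range(3)]
print(draws_a == draws_b)  # True: identical seeds give identical draws

# Re-seeding rewinds the generator to the start of its sequence.
rng_a.seed(42)
print(rng_a.random() == draws_a[0])  # True
```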

Why “Fake” Randomness is Usually Good Enough

For most applications, pseudo-randomness does the job. Here’s why:

  1. Statistical Randomness: PRNGs like the Mersenne Twister pass most standard statistical tests for randomness, making them suitable for simulations, machine learning, and gaming.
  2. Speed Matters: True randomness, derived from hardware or external services, is slower and harder to scale.
  3. Control is Key: With PRNGs, you can recreate experiments exactly by setting a seed, which is essential in research and data science.

When Pseudo-Randomness Becomes a Problem

Despite its benefits, pseudo-randomness can lead to trouble in these situations:

Cryptography

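PRNGs are predictable. If an attacker knows the algorithm and the seed, they can reproduce the entire sequence. With the Mersenne Twister, observing 624 consecutive 32-bit outputs is enough to reconstruct its internal state and predict everything that follows. For encryption keys and other secrets, an unpredictable, cryptographically secure source is non-negotiable.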
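A small sketch of what this looks like in practice. Everything here is hypothetical: weak_token stands in for application code that wrongly builds tokens from a seeded PRNG, and the seed values imitate a timestamp the attacker can roughly guess.

```python
import random

def weak_token(seed):
    """Hypothetical token generator that (wrongly) relies on a seeded PRNG."""
    rng = random.Random(seed)
    return "".join(rng.choice("0123456789abcdef") for _ in range(16))

# The "server" seeds from a timestamp-like value (an assumption for this demo).
secret_seed = 1_700_000_123
issued_token = weak_token(secret_seed)

# The "attacker" tries every seed in a plausible window of 1,000 values
# and keeps the first one that reproduces the observed token.
recovered_seed = next(
    (guess for guess in range(1_700_000_000, 1_700_001_000)
     if weak_token(guess) == issued_token),
    None,
)
print(recovered_seed == secret_seed)
```

Once the seed is recovered, every past and future token from that generator can be recomputed, which is why tokens and keys should never come from random or np.random.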

High-Stakes Simulations

Subtle patterns in PRNGs can bias results. For example, in Monte Carlo simulations or financial modeling, even small deviations from true randomness might skew outcomes.
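One inexpensive sanity check is to repeat a simulation under several seeds and look at how much the answer moves. The sketch below does this for a textbook Monte Carlo estimate of pi; the function name and sample counts are arbitrary choices for illustration. It does not prove the generator is unbiased, since every run shares the same algorithm, but a result that swings widely between seeds is a sign that the conclusion depends on the random stream.

```python
import random
import statistics

def estimate_pi(seed, samples=20_000):
    """Monte Carlo estimate of pi from random points in the unit square."""
    rng = random.Random(seed)
    inside = sum(
        1 for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0  # inside quarter-circle
    )
    return 4 * inside / samples

# Repeat the same experiment with ten different seeds.
estimates = [estimate_pi(seed) for seed in range(10)]
print("mean estimate:", statistics.mean(estimates))
print("spread (stdev):", statistics.stdev(estimates))
```

A stronger version of the same check would rerun the experiment with a generator from a different algorithm family and compare the two results.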

Gaming and Lotteries

Imagine if a casino used a predictable PRNG for slot machines. It would be a hacker’s dream.

How to Get True Randomness in Python

1. The secrets Module

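Python provides the secrets module for cryptographically secure randomness. Unlike NumPy’s PRNG, secrets pulls from the operating system’s entropy source (such as /dev/urandom on Linux), which is seeded from unpredictable hardware events. Strictly speaking, this is a cryptographically secure PRNG rather than raw physical randomness, but its output is designed to be unpredictable in practice.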

import secrets

# Generate a secure random integer in the range [0, 100)
secure_random = secrets.randbelow(100)
print(secure_random)
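Beyond single integers, secrets covers the most common security tasks directly. The following sketch shows a few of its helpers; the variable names and the chosen lengths are illustrative assumptions, not recommendations for any particular system.

```python
import secrets
import string

# URL-safe token built from 32 random bytes, e.g. for a password-reset link.
reset_token = secrets.token_urlsafe(32)

# Hex-encoded key material: 16 random bytes become 32 hex characters.
api_key = secrets.token_hex(16)

# A random 12-character password drawn from letters and digits.
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))

# Constant-time comparison helps avoid timing side channels when checking tokens.
print(secrets.compare_digest(api_key, api_key))  # True
print(len(api_key), len(password))               # 32 12
```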

2. Hardware Random Number Generators (HRNGs)

Modern CPUs often include hardware random number generators that leverage physical processes like thermal noise. Intel’s RDRAND instruction is a good example.
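Python's standard library has no direct call for RDRAND, so in practice you reach hardware entropy through the operating system. A minimal sketch using os.urandom follows; whether a given platform actually mixes RDRAND or similar hardware sources into the bytes it returns depends on the OS and is an assumption here.

```python
import os

# Ask the operating system for 8 bytes from its entropy source.
raw_bytes = os.urandom(8)

# Interpret the bytes as an unsigned 64-bit integer.
as_integer = int.from_bytes(raw_bytes, "big")

print(raw_bytes.hex(), as_integer)
```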

3. Random.org

If you need true randomness without hardware, you can use an online service like Random.org, which generates numbers based on atmospheric noise.

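import requests

# Ask Random.org for 5 integers between 1 and 10, returned as plain text
url = "https://www.random.org/integers/?num=5&min=1&max=10&col=1&base=10&format=plain&rnd=new"
response = requests.get(url, timeout=10)  # avoid hanging if the service is slow
response.raise_for_status()               # fail loudly on HTTP errors
print([int(value) for value in response.text.split()])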

This approach ensures high entropy but introduces latency and requires an internet connection.

Why Python Doesn’t Default to True Randomness

  1. Performance: Generating true randomness is slower and more resource-intensive.
  2. Scalability: Physical entropy sources produce bits at a limited rate, which makes them a poor fit for the volume of numbers required in large-scale computations.
  3. Sufficient for Most Use Cases: Pseudo-random numbers are “good enough” for shuffling data, initializing neural networks, and running simulations.
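The performance gap is easy to measure for yourself. This sketch times the Mersenne Twister against secrets.SystemRandom, which asks the operating system for every draw; the draw count is arbitrary and the actual numbers will vary by machine, so no expected output is shown.

```python
import random
import secrets
import timeit

# SystemRandom draws from the operating system's entropy source on every call.
system_rng = secrets.SystemRandom()

prng_seconds = timeit.timeit(random.random, number=50_000)
os_seconds = timeit.timeit(system_rng.random, number=50_000)

print(f"Mersenne Twister: {prng_seconds:.4f} s for 50,000 draws")
print(f"OS-backed source: {os_seconds:.4f} s for 50,000 draws")
```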

Takeaways for Developers

  • Know Your Needs: For most projects, NumPy’s pseudo-randomness is sufficient. But for cryptography or high-stakes applications, look elsewhere.
  • Use the Right Tool: When security or unpredictability matters, use Python’s secrets module, HRNGs, or Random.org.
  • Understand the Limits: Pseudo-random numbers aren’t magic; they’re a practical compromise.

Final Thoughts

Randomness in programming is more nuanced than it seems. Libraries like NumPy aren’t flawed for using pseudo-randomness — they’re optimized for performance and reproducibility. But as developers, we need to recognize when “random enough” isn’t enough.

Stay tuned if you are interested in topics like this, as well as deep-dive code implementations in the NLP and Generative AI space.

Written by Kshitij Kutumbe

Data Scientist | NLP | GenAI | RAG | AI agents | Knowledge Graph | Neo4j kshitijkutumbe@gmail.com www.linkedin.com/in/kshitijkutumbe/
