Gang Tao

Calculating Pi with Random Stream: A Monte Carlo Simulation Approach

Pi, written with the Greek letter π, is the ratio of a circle's circumference to its diameter. It is one of the most fundamental constants in mathematics. π is also equal to the ratio of the area of a circle to the square of its radius, and it is essential for calculating the circumference and area of circles as well as the volume of spheres. π is an irrational number, meaning its decimal expansion is infinite and non-repeating. In everyday life, it is commonly approximated as 3.14 for basic calculations. Today, I will use a probabilistic algorithm, the "Monte Carlo method," to calculate Pi with a single streaming SQL query.



Monte Carlo is a district of the Principality of Monaco known for its casinos, and the name evokes chance and probability. The Monte Carlo method was developed in the 1940s by the mathematicians Stanislaw Ulam and John von Neumann while they worked on nuclear weapons research at Los Alamos, the laboratory established for the U.S. Manhattan Project, which developed the atomic bomb. The principle of this method is to use a large number of random samples to understand a system and thereby calculate the desired value.



To compute Pi using the Monte Carlo method, imagine a circle inscribed in a square. The ratio of the areas of the circle and the square is π/4. By randomly generating n points within the square (with these points uniformly distributed), we can check if their distance from the center is less than the radius of the circle to determine whether they fall inside the circle. By calculating the ratio of points inside the circle to n, then multiplying this ratio by 4, we get the value of π. Theoretically, the larger n, the more accurate the calculation of π will be.
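The area argument above can be written out as a short derivation. The symbols r (the circle's radius) and m (the number of sampled points that land inside the circle) are notation introduced here for illustration, and \text assumes the amsmath package:

```latex
\frac{A_{\text{circle}}}{A_{\text{square}}}
  = \frac{\pi r^{2}}{(2r)^{2}}
  = \frac{\pi}{4}
\qquad\Longrightarrow\qquad
\pi \approx 4 \cdot \frac{m}{n}
```

The same ratio holds for a quarter circle of radius 1 inside a unit square, since (π/4) / 1 = π/4. That is the setup sampled by the random stream below.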



Leveraging Timeplus’s streaming SQL, users can easily run such simulations with simple SQL.


First, let's generate a random stream in Timeplus. The following DDL creates a long-running, randomly generated stream in which each event simulates a point in the square; x and y are uniformly distributed random numbers, each spanning an interval of length 1 (see the note below the DDL for why the range of y is flipped).


CREATE RANDOM STREAM points
(
 `x` float DEFAULT rand_uniform(0.0, 1.0),
 `y` float DEFAULT rand_uniform(0.0, -1.0)
)

Note: Timeplus evaluates identical expressions only once within a single SQL statement, so two identical rand_uniform calls would generate the same value for x and y. To avoid this, I changed the second call's upper bound to -1. This does not affect the simulation: y is squared when computing the distance, so you can think of it as simulating in a different quadrant.
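To see why the note matters, here is a minimal Python sketch of the two situations. It is only an analogy and does not reproduce how Timeplus evaluates expressions; the helper name estimate_pi and the fixed seed are assumptions made for this illustration. When x and y are the same value, every point lies on the diagonal and the estimate drifts toward 4 / sqrt(2) instead of π, while flipping the sign of an independent y leaves the estimate unaffected.

```python
import random


def estimate_pi(points):
    """Return 4 * (fraction of points with x^2 + y^2 < 1)."""
    points = list(points)
    inside = sum(1 for x, y in points if x * x + y * y < 1.0)
    return 4.0 * inside / len(points)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 100_000

# Failure mode: one random value reused for both coordinates (x == y).
same_value = [(u, u) for u in (rng.random() for _ in range(n))]

# Independent coordinates, with y drawn from (-1, 0] as in the DDL.
fourth_quadrant = [(rng.random(), -rng.random()) for _ in range(n)]

print(estimate_pi(same_value))       # tends toward 4 / sqrt(2), about 2.83
print(estimate_pi(fourth_quadrant))  # tends toward pi
```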


Then we can run the whole simulation with a single SQL query to estimate Pi.


WITH simulated_points AS
(
  SELECT
    x, y,
    sqrt((x * x) + (y * y)) AS distance_to_center,
    if(distance_to_center < 1, 1, 0) AS within_circle
  FROM
    points
)
SELECT
  count() AS total,
  sum(within_circle) AS points_in_circle,
  (points_in_circle / total) * 4 AS Pi
FROM
  simulated_points

Explanation of the above SQL:

  1. sqrt((x * x) + (y * y)) is the distance from the generated point to the origin (0, 0), which serves as the center of the circle

  2. when the distance is less than 1, we mark the point as within the circle, using if(distance_to_center < 1, 1, 0)

  3. The query is a global aggregation, so it aggregates over all the points accumulated since the query started.

  4. count() returns the total number of simulated points

  5. sum(within_circle) returns the number of points that fall inside the circle

  6. (points_in_circle / total) * 4 is the simulated value of Pi
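For readers who want to follow the steps above outside Timeplus, here is a rough Python sketch of the same logic. It mirrors the column names in the query, but the functions random_points and running_pi are hypothetical stand-ins written for this illustration, not Timeplus APIs. A generator plays the role of the unbounded random stream, and the running totals play the role of the global aggregation.

```python
import random
from itertools import islice


def random_points(seed=None):
    """Endless source of (x, y) pairs, standing in for the random stream."""
    rng = random.Random(seed)
    while True:
        yield rng.uniform(0.0, 1.0), rng.uniform(-1.0, 0.0)


def running_pi(points):
    """Yield (total, points_in_circle, pi_estimate) after every point."""
    total = 0
    points_in_circle = 0
    for x, y in points:
        distance_to_center = (x * x + y * y) ** 0.5
        within_circle = 1 if distance_to_center < 1 else 0
        total += 1
        points_in_circle += within_circle
        yield total, points_in_circle, points_in_circle / total * 4


# Take the running result after 20,000 points.
*_, last = islice(running_pi(random_points(seed=7)), 20_000)
print(last)
```

Because the result is updated after every point, the estimate can be read at any moment, much as the streaming query keeps refining its output while it runs.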


When you run the above SQL in the Timeplus web console, you will get something like this:



With more simulated points, the estimate should get closer to Pi. In the above sample run, with about 20K simulated points, we got a close result: 3.1368.
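How close should we expect to get for a given number of points? Each point lands inside the circle with probability π/4, so the count of hits follows a binomial distribution, and the standard error of the estimate shrinks in proportion to 1 / sqrt(n). The sketch below computes that standard error; the function name is made up for this illustration.

```python
import math


def pi_estimate_standard_error(n):
    """Standard error of 4 * (hits / n), where each hit has probability pi / 4."""
    p = math.pi / 4
    return 4 * math.sqrt(p * (1 - p) / n)


for n in (1_000, 20_000, 1_000_000):
    print(n, round(pi_estimate_standard_error(n), 4))
```

For 20,000 points the standard error is roughly 0.012, so a result of 3.1368, which is about 0.005 away from π, is well within the expected range. Halving the error takes about four times as many points.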


To try this, run this query on the Timeplus public demo server.


 

Summary


In today’s blog, I have demonstrated how to use a random stream and a single SQL query to perform a Monte Carlo simulation. The random stream is a very useful tool provided by Timeplus; it is widely used in our showcases, functional tests, and performance tests, and it is super simple to use. There are other use cases too, such as using a random stream as a timer to help with stream analysis or monitoring, which I can share later.



 

