§ A visualization of Pi digits
Following tradition, as a tribute to pi day, I will attempt to recreate a particular visual representation of the digits of \(\pi\). The original creation is due to Martin Krzywinski, and can be found here. The digits of \(\pi\) were computed using the Chudnovsky algorithm. Due to the CPU-intensive nature of such a computation, I precomputed the digits and saved to them a .dat
file for reuse. This is less time consuming than computing the digits every time when visualizing them (or if you write buggy programs like I do). Below is the python script that computes the \(n\) first digits of \(\pi\) and saves as them to .dat
file:
def pi_digits(n):
"""
Computes PI using the Chudnovsky algorithm from
http://stackoverflow.com/questions/28284996/python-pi-calculation
"""
# Set precision
getcontext().prec = n
t = Decimal(0)
pi = Decimal(0)
d = Decimal(0)
# Chudnovsky algorithm
for k in range(n):
t = ((-1)**k)*(factorial(6*k))*(13591409+545140134*k)
d = factorial(3*k)*(factorial(k)**3)*(640320**(3*k))
pi += Decimal(t) / Decimal(d)
pi = pi * Decimal(12) / Decimal(640320**(Decimal(1.5)))
pi = 1 / pi
return str(pi)
Since the digits will be displayed on a grid, the python script takes as input the dimensions of the grid, the number of rows and number of columns respectively.
Now that we have computed the \(n = rc\) digits of \(\pi\), we want to
visualize them. The visualization was done using scatter
from matplotlib
where the digits are laid out on a \(r \times c\) evenly-spaced meshgrid and the values of the digits are used to fill the colors of the points on the grid, these colors are taken from a discrete color map of size \(10\). The python script used to read the digits from a file and visualize them as explained above is as follows:
def discrete_cmap(N, base_cmap=None):
"""Create an N-bin discrete colormap from the specified input map from:
https://gist.github.com/jakevdp/91077b0cae40f8f8244a
By Jake VanderPlas
License: BSD-style
"""
# Note that if base_cmap is a string or None, you can simply do
# return plt.cm.get_cmap(base_cmap, N)
# The following works for string, None, or a colormap instance:
base = plt.cm.get_cmap(base_cmap)
color_list = base(np.linspace(0, 1, N))
cmap_name = base.name + str(N)
return color_list, base.from_list(cmap_name, color_list, N)
if __name__ == '__main__':
...
...
r, c = np.shape(pi_digits) # pi digits in meshgrid
x, y = np.meshgrid(np.arange(r), np.arange(c))
# initialize figure
fig, axes = plt.subplots(figsize=(8, 4))
color_list, dcmap = discrete_cmap(n, "terrain")
# swap y and x
scat = axes.scatter(
y, x, c=digits[x, y], s=3, cmap=dcmap
)
axes.axis('off')
legend1 = axes.legend(*scat.legend_elements(), loc='upper center',
bbox_to_anchor=(0.5, 1.1), ncol=n, fontsize=4.5,
markerscale=.5, frameon=False)
axes.add_artist(legend1)
# show figure
plt.tight_layout()
plt.show()
Above code snippet should be able to reproduce something along these lines
What would be even more visually appealing is if we could connect the same-valued digits that are one point away from each other in either direction with edges as in the original poster. For example, for a \(3\) by \(3\) grid
\begin{align} &3,1,4 \newline &1,5,9 \newline &2,6,6 \end{align}
In this case the \(1\)s across the diagonal from the first row and second row will be connected with an edge. Similarly the \(6\)s along the third row will be connected with an edge.
To do this, we need to able to find the indices of the connected clusters of numbers across the whole grid and use them to draw the edges between the corresponding grid points. Scouring around the internet, I found that such a computation is related to a concept in graph theory, which is the concept of connected components of an undirected graph. Finding the connected components of all the digits from \(0-9\) across the grid will give us what we seek. At first sight, this seemed like a daunting task and I was apprehensive of whether a vanilla python implementation would be up to the task. However, it turns out this task is related to labeling components in a pixel array. Consider the following \(3\times3\) pixel array.
\begin{align} &0,1,0 \newline &1,0,1 \newline &0,1,0 \end{align}
The four \(1\)s in the grid would be deemed as a label/component and we would have
only one component in the grid. The scipy
library has a high performance C implementation,
scipy.ndimage.label
, to label the components and subsequently extract their
indices. The single downfall of this method is that it also considers isolated
groups (\(1\)s surrounded by \(0\)s in all directions) as connected components.
However, one can filter out isolated groups from the original array as done here. Putting this all together, we can produce the stunning visuals below. Here are the visualizations for \(23\times23\)
The complete source code and even more higher quality images can be found here.