Introducing πrate — Pi Day 2020

This Pi Day, πrate is going to end privacy. All the personal information that has ever existed or will ever exist, the PII of every earthling, can be searched from this website, which harnesses the almighty power of Pi. You can find the identity information that makes you, you: say, your name, date of birth, CVV number and even your unique super-duper secret PIN!

Want to get freaked out? https://www.0x48piraj.com/projects/pirate

That's scary, πrate stores my personal information!?

We do not share your personal information; we just display the locations where your information sits, just like a GPS navigator. That means all your personal information is still there, quietly sitting in π. It's never going away; you can't scrub the digits, can you? And not only yours. That's right: so is the information of every person you've ever met, or that anyone else has met or will meet!

Data Protection Laws? All of it is in π! They were always there!

So I've looked up my personal information in π, what did I gain?

Nothing.

Why did you create this?

Think of a scenario where you forget your credit card number. What will you do? You will have to get the card revoked and so on. But what if you had written the number down somewhere? You could use a piece of paper or a Post-it note, but your friendly neighbourhood Billy can look it up and cause trouble. Use πrate to create a unique sequence instead: no one will understand what the sequence means, and by merely using πrate again, you can retrieve your credit card number! Amazeballs, right?

Why chunk the data?

Well, this is just an initial prototype. We all know that it can take a while to find a long sequence of digits in π, so for practical reasons we break long strings up into smaller chunks that can be found more readily. But don't worry, there's always Moore's law!
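To make the chunking idea concrete, here is a minimal Python sketch. It is not πrate's actual code: the function names (pi_decimals, encode, decode), the two-digit chunk size, the 2000-digit search window and the convention that position 1 is the first digit after the decimal point are all assumptions made for illustration. The digits come from Machin's formula using plain integer arithmetic.

```python
def pi_decimals(n: int) -> str:
    """Return the first n decimal digits of pi (without the leading 3)."""
    guard = 10                      # extra digits to absorb rounding error
    scale = 10 ** (n + guard)

    def arctan_inverse(x: int) -> int:
        # Fixed-point series: arctan(1/x) = 1/x - 1/(3x^3) + 1/(5x^5) - ...
        power = scale // x
        total = power
        divisor, sign = 3, -1
        while power:
            power //= x * x
            total += sign * (power // divisor)
            divisor += 2
            sign = -sign
        return total

    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 16 * arctan_inverse(5) - 4 * arctan_inverse(239)
    return str(pi_scaled)[1 : n + 1]


def chunk(text: str, size: int) -> list:
    """Split text into pieces of at most `size` characters."""
    return [text[i : i + size] for i in range(0, len(text), size)]


def encode(secret: str, decimals: str, size: int = 2) -> list:
    """Map each chunk of `secret` to its first 1-based position in `decimals`."""
    positions = []
    for piece in chunk(secret, size):
        index = decimals.find(piece)
        if index < 0:
            raise ValueError(f"chunk {piece!r} not found; use more digits")
        positions.append(index + 1)
    return positions


def decode(positions: list, decimals: str, length: int, size: int = 2) -> str:
    """Read `size` digits at each position, then trim to the original length."""
    joined = "".join(decimals[p - 1 : p - 1 + size] for p in positions)
    return joined[:length]


DECIMALS = pi_decimals(2000)
card = "4111111111111111"           # a common test number, not a real account
sequence = encode(card, DECIMALS)
print(sequence)
print(decode(sequence, DECIMALS, len(card)))
```

Smaller chunks are found sooner but give a longer list of positions; larger chunks shorten the list but need far more digits of π before every chunk turns up.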

Why Pi?

Pi's ubiquity goes beyond math. The number crops up in the natural world, too. It appears everywhere there's a circle, of course, such as the disk of the sun, the spiral of the DNA double helix, the pupil of the eye, the concentric rings that travel outward from splashes in ponds. Pi also appears in the physics that describes waves, such as ripples of light and sound. It even enters into the equation that defines how precisely we can know the state of the universe, known as Heisenberg's uncertainty principle.

Finally, pi emerges in the shapes of rivers. A river's windiness is determined by its "meandering ratio," or the ratio of the river's actual length to the distance from its source to its mouth as the crow flies. Rivers that flow straight from source to mouth have small meandering ratios, while ones that lollygag along the way have high ones. Turns out, the average meandering ratio of rivers approaches — you guessed it — pi.

Albert Einstein was the first to explain this fascinating fact. He used fluid dynamics and chaos theory to show that rivers tend to bend into loops. The slightest curve in a river will generate faster currents on the outer side of the curve, which will cause erosion and a sharper bend. This process will gradually tighten the loop, until chaos causes the river to suddenly double back on itself, at which point it will begin forming a loop in the other direction.

Because the length of a near-circular loop is like the circumference of a circle, while the straight-line distance from one bend to the next is diameter-like, it makes sense that the ratio of these lengths would be pi-like.

What are the chances?

100% for everything that's chunked. And nearly 100% for things like your name and your date of birth.

Where do these numbers come from, and how can you compute them?

Let's say you're searching for a single digit in Pi, and pretend that the digits of Pi are random. If you pick a digit between 0 and 9 at random, the chance that it equals your search digit is 1 in 10 (10%, or 0.1).

 

That's pretty simple, but what happens if you want to search for a two digit string? Well, you can approximate this by picking two numbers. If the first doesn't match, then it's over. But if the first does match, you have to try to match the second, too. Each of these has a probability of 0.1, and we'll assume that the numbers are completely independent. So 10% of the time the first number matches, and 10% of 10% of the time, both numbers match, which is just 1%, or a probability of 0.01. We'd have a 1 in 1000 chance (0.001) of finding a three digit search string, and so on.

If we assume that Pi is random, the above formula gives us the chance that any particular position matches. So for a two digit search string, there's a 1% chance that it matches at position 1, a 1% chance that it matches at position 2, and so on. So the chance of finding the search string at all is equal to the chance of finding it at any of those positions. How do we figure that out?

Let's turn the problem on its head. The chance of finding it is simply the opposite of the chance of not finding the search string. "Well, duh, that's obvious!" you may say -- but wait. We already figured out this kind of chance earlier. How do we not find something? Well, we first have to not find it at position 1, and then not find it at position 2, ... and keep on going all the way to the end of our digits. This is just like what we did earlier to figure out the chance of one position matching!

If we have a 10% chance of matching at any position, then we have a 90% chance of not matching. So the odds of not matching anywhere in the digits of Pi are 90% of 90% of 90% of ... and so on, for each digit of Pi that we have. Mathematically, this is 0.9 to the power of N ($0.9^N$) if we have N digits. And then the odds of finding the string are just $1 - 0.9^N$.

Putting that all together, we know that the chance of finding a search string at any given position is $0.1^d$, where d is the length of our search string. So the entire probability is $1 - (1 - 0.1^d)^N$.
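As a rough sketch of that calculation in Python: the function name chance_of_finding is made up for this example, and it keeps the same simplifications as the text (random digits, independent positions, and N positions for N digits). It goes through logarithms rather than raising a number very close to 1 to a huge power, which keeps the floating-point result accurate.

```python
import math


def chance_of_finding(d: int, n: int) -> float:
    """Chance that a d-digit string shows up at least once in n random digits."""
    p = 10.0 ** -d                  # chance of a match at one given position
    # 1 - (1 - p)^n, computed as -(exp(n * log(1 - p)) - 1) for accuracy
    return -math.expm1(n * math.log1p(-p))


for d in (1, 4, 8, 12):
    print(d, chance_of_finding(d, 100_000_000))
```

With an 8-digit search string and 100,000,000 digits, this comes out at roughly 0.632, or about a 63% chance.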

Continuing along the mathematical path, it turns out that we've accidentally stumbled into something called the binomial distribution. Binomial probabilities come about when you ask "what are the odds of getting some number k of heads out of n flips of a coin?" Just to make things tricky, let the coin be biased in some way: it lands heads with probability p (that is, if p = 0.6, then 60% of the time the coin lands heads).

Luckily for us, asking about zero occurrences of heads is easy, as the formula above showed. But we could ask other questions, like "what are the odds of finding my birthday twice in the first 100,000,000 digits of Pi?" These questions are harder (computationally) to answer than the zero case, because we have lots of different ways to find your birthday twice. We could find it once at position 1 and once at position 2, or once at position 1 and once at position three, and so on. Even very fast computers start to choke when the numbers get big. And then we could make it even worse -- what if we want to know how likely it is to find your birthday at least 100 times in Pi?

The solution to this problem is to use what's known as the Poisson approximation to the binomial, when the numbers are large. We can actually approximate the above formula as:

Odds (finding a string of length d in N digits of pi) = $1 - 1/e^{N \cdot 0.1^d}$

That looks a little complicated, until we realize that $0.1^d$ is really just one divided by the number of search strings that have d digits. So if d is three, there are 1000 strings (000, 001, 002, ..., 999). So $0.1^3 = 1/1000 = 0.001$. And N is just the number of digits of pi. Ah-ha! So what this really means is that we can calculate the odds simply as $1 - 1/e^{(\text{digits of pi}) / (\text{possible searches})}$. So if we have 100,000,000 digits of pi, and we can search for 100,000,000 possible strings (8 digit search strings), then our probability is simply $1 - 1/e$. With twice as many digits as search strings, the probability becomes $1 - 1/e^2$. And so on.
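Here is a small Python sketch of the Poisson approximation, extended to the "at least k times" questions raised earlier. The function name poisson_at_least is invented for this example, and the independence assumptions are the same as above. The expected number of matches is N divided by the number of possible d-digit strings, and the chance of at least k matches is one minus the chance of seeing fewer than k.

```python
import math


def poisson_at_least(k: int, d: int, n: int) -> float:
    """Approximate chance that a d-digit string appears at least k times in n digits."""
    expected = n * 10.0 ** -d       # digits of pi / possible search strings
    term = math.exp(-expected)      # Poisson chance of exactly 0 matches
    fewer_than_k = 0.0
    for i in range(k):
        fewer_than_k += term
        term *= expected / (i + 1)  # chance of exactly i + 1 matches
    return 1.0 - fewer_than_k


digits = 100_000_000
print(poisson_at_least(1, 8, digits))      # at least once
print(poisson_at_least(2, 8, digits))      # at least twice
print(poisson_at_least(1, 8, 2 * digits))  # twice as many digits
```

For k = 1 this reduces to the formula above. When the expected count is very large, the exponential underflows to zero and the function simply reports 1.0, which is the right answer to within floating-point precision.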

At this point, other people have explained the math far better than I, so I leave you to the good graces of the Internet.

Did you notice when this blog post was published? (Psst... 3:14 PM)

UPDATE:

Yeah, so, many folks were not able to grasp what the project is. They were actually generating random details, copying the randomly generated credit card numbers, and plastering them all over Netflix. Yes, folks pursuing undergraduate computer science degrees were doing this kind of stuff. I'm not talking about my grandma, come on!

I remember not starting with,

Shoutout to criminals with skimmers: you don't have to go to scary public ATMs and other POS systems to install your baby and steal credit card info, because you could catch the coronavirus. Thanks to πrate, you can steal everything just by sitting at home, sipping your latte.

I thought I did a good job of painting the picture, but I sure didn't. Here we go again,

πrate is not a credit card generator that you can use for transactions. It is like a search engine, as it is aptly titled on GitHub, which searches for your data in Pi. The random generation feature was added because, for obvious reasons, no user will want to enter their bank account details on a newly created, 'weird'-looking website. The random generator is there just to demonstrate the capabilities.