Reading and understanding new definitions is a crucial skill for any developer, whether you are picking up the next big library, working through a major version update to your favorite framework, or stepping into a new codebase with complex ERDs (entity relationship diagrams). As a math fan/evangelist, I want to put those mental muscles to work in a two-part series on random numbers. In today’s blog we will take a deeper look at the definition of a “random number” to understand how it can pose a problem in the world of computing. In a follow-up blog I will cover the techniques used by computer scientists to overcome these challenges.
Random Numbers in Software Development & Tech
Random numbers are a figment of your imagination. The term “random number” can be loosely defined as “a number generated in a way that *people* can’t predict”. Think of the ways you would generate a “random” number: rolling dice, flipping coins, spinning wheels, drawing cards from a deck. These all have a real-world, physical component that is separate from the number itself. This is because the randomness of a number has nothing to do with the number itself and everything to do with the way the number is generated. Simply put, if you didn’t see the dice, there’s no way to tell the number 20 generated randomly from the number 20 written on a piece of paper (ask any D&D player).
However, it is also a pretty bold claim to say that “random numbers don’t exist” when they show up in our lives and on our computers. So what exactly is the definition of a random number? If you do some googling for the definition of a “random number”, you might stumble on the one offered by whatis.techtarget.com, which states:
Random numbers are numbers that occur in a sequence such that two conditions are met: (1) the values are uniformly distributed over a defined interval or set, and (2) it is impossible to predict future values based on past or present ones.
Wolfram Alpha, the best place for mathematical definitions, gives a less concrete but more accurate answer:
A random number is a number chosen as if by chance from some specified distribution such that selection of a large set of these numbers reproduces the underlying distribution. Almost always, such numbers are also required to be independent, so that there are no correlations between successive numbers.
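The first half of that definition, that a large set of numbers reproduces the underlying distribution, can be checked empirically. The sketch below is only an illustration: it assumes a fair six-sided die simulated with Python’s standard `random` module, and the function name `roll_frequencies` and the sample size are my own choices rather than part of either definition. It does not test the independence requirement.

```python
import random
from collections import Counter

def roll_frequencies(num_rolls, sides=6, seed=None):
    """Roll a simulated die num_rolls times and return the share of each face."""
    # The seed is optional and only used to make a run repeatable.
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, sides) for _ in range(num_rolls))
    return {face: counts[face] / num_rolls for face in range(1, sides + 1)}

if __name__ == "__main__":
    # With many rolls, every share should sit close to 1/6.
    for face, share in roll_frequencies(60_000).items():
        print(f"face {face}: {share:.3f} (expected {1 / 6:.3f})")
```

With only a handful of rolls the shares can look badly lopsided; it is the large set that brings them back toward the underlying distribution.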
Defining Random Numbers in Software
Both definitions illustrate a paradox in the way humans think about random numbers. We want a random number to be able to take any value. However, we also want each one to be completely unrelated to the numbers that came before. Herein lies the problem: when numbers start to feel too related, humans start to assume that they are not random. To illustrate this point, math teachers around the world run the following experiment.
Imagine a classroom of fifty students. Split them into two groups. Have one group flip a coin 50 times and record the results. Have the second group write down what they think the results of 50 coin flips would look like. Then, while you aren’t looking, have the groups write their results on the board. You can tell them apart with the following statistical fact. In 50 flips of a fair coin, the chance of seeing heads or tails at least 5 times in a row is high, roughly 80%. The chance of seeing at least 10 in a row is small but real, on the order of a few percent. However, the chance that a human would write “tails” ten times in a row when faking is effectively zero (unless, of course, they have read this and are trying to ruin the experiment). The result is clear: our minds prefer deliberate difference over pure randomness.
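The streak probabilities in this experiment can be computed exactly rather than taken on faith. Here is a minimal sketch, assuming a fair coin and independent flips; the function name `prob_run_at_least` is hypothetical. It counts every sequence that never reaches the target streak length and subtracts that share from 1.

```python
def prob_run_at_least(num_flips, run_length):
    """Exact probability that num_flips fair coin flips contain a run of
    at least run_length identical results (all heads or all tails)."""
    if num_flips < 1:
        return 0.0
    if run_length <= 1:
        return 1.0  # any single flip is already a run of length 1
    # ways[r] = number of sequences so far whose current streak has length r
    # and that have never reached run_length.
    ways = [0] * run_length
    ways[1] = 2  # first flip: heads or tails, streak of 1
    for _ in range(num_flips - 1):
        nxt = [0] * run_length
        nxt[1] = sum(ways)  # flip differs from the previous one: streak resets
        for r in range(1, run_length - 1):
            nxt[r + 1] = ways[r]  # flip matches the previous one: streak grows
        ways = nxt
    return 1 - sum(ways) / 2 ** num_flips

if __name__ == "__main__":
    for k in (5, 10):
        print(f"run of at least {k} in 50 flips: {prob_run_at_least(50, k):.3f}")
```

Running it for 50 flips lets you check the figures for streaks of 5 and 10 yourself, and changing the arguments shows how quickly long streaks become likely as the number of flips grows.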
If you don’t have the coin, or enough students who will do your bidding for credit, the same principle can be illustrated with a thought experiment. I have a machine that prints off numbers. It is your job to determine whether the numbers are random or not. How many times in a row would it need to print the number 3 before you decide that it is definitely not random? If you were then asked to explain how you know it’s not random, how would you do it? Your only argument would be the number of times you got the same result. However, no matter how unlikely, there would still be a chance that the machine generated it randomly that many times.
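To put numbers on the thought experiment, here is a small sketch. It assumes, purely for illustration, that the machine prints single digits from 0 to 9 with equal probability; the text does not say what range the machine uses, and `prob_same_value_streak` is a made-up name. Exact fractions are used so that the result never rounds down to zero.

```python
from fractions import Fraction

def prob_same_value_streak(streak_length, num_outcomes=10):
    """Probability that a fair source with num_outcomes equally likely values
    prints one specific value streak_length times in a row."""
    return Fraction(1, num_outcomes) ** streak_length

if __name__ == "__main__":
    for n in (1, 5, 20):
        p = prob_same_value_streak(n)
        print(f"{n} threes in a row: {float(p):.3e}")
```

However long the streak, the probability shrinks but stays strictly greater than zero, which is exactly why the count of repeats alone can never prove the machine is not random.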
A better argument might be that the numbers were never random to begin with. A machine made them. It wasn’t rolling dice or spinning wheels; it was running code. If it runs the same code again, you should get the same numbers again, because code runs the same way every time. That is what makes it code. Hence, our exploration of the definition of random has left us with an even deeper question about computers. How can you get code that runs the same way every time to generate numbers that a human will think are random every time?
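The “same code, same numbers” point is easy to see for yourself. The sketch below uses Python’s standard `random` module, which is a deterministic pseudo-random generator; the `generate` helper and the seed value 2024 are arbitrary choices for illustration. Starting the generator from the same seed twice produces the same “random” dice rolls twice.

```python
import random

def generate(seed, count=5):
    """Return count pseudo-random die rolls from a generator started at seed."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]

if __name__ == "__main__":
    first = generate(seed=2024)
    second = generate(seed=2024)
    # Both lists come from the same starting state, so they are identical.
    print(first)
    print(second)
    print("identical:", first == second)
```

Each list looks unpredictable on its own, yet the second one is fully determined by the first run’s starting point.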
Stay tuned for part 2 of this article series, titled “This Blog isn’t Random at All”, for the answers. In the meantime, check out this article about DevOps and tech by Spire’s Director of DevOps.