## RE: Back to Basics

Just re-reading this old thread and noticed your post didn’t get a response.

The mathematical definition of randomness is very subtle. As you point out, if it’s generated by an algorithm, it ain’t random, it’s deterministic.

These days, I think Kolomogorov complexity is the most common way to formalize the notion of “randomness”. It works like this…

Pick a programming language — it doesn’t really matter which one, for reasons that I’ll explain in a moment. You just need a way to be able to express programs. Now, consider all possible self-contained programs whose output is a given string. One of those programs is the shortest of the set. Let the length of that program be the “complexity” of the string.

So, simple, repetitive strings are represented by short programs… the string of 100 ones, in Ruby, is something like:

``puts '1'*100``

Simple repetition is also obviously represented by a short program, as are patterns.

The particulars of the programming language don’t really matter, as we’re really looking at the relative sizes of programs — the overhead of having to do `puts 'some string'` vs the string itself is constant, though it might be a different constant for each programming language. Likewise the overhead of doing a loop varies with each language, but is just a small constant, compared to the length of the string the loop can output.

But for some strings, the shortest program that can generate that string is basically `puts 'the string itself'` — which is (except for some overhead introduced by the programming language) as long as the string itself. The string can’t be reduced in size.

So, Kolomogorov’s idea of randomness is that a random string is one that can’t be represented by a program shorter than the string itself.

There’s another, somewhat related notion, from Information Theory, which says that a string is random if no prefix of the string contains information about the remainder of the string, or, equivalently, if knowledge of the symbols that have come before a particular point in the string cannot help predict the next symbol.

I kind of like the Information Theory version better, because it uses probabilities, which allows for a Bayesian interpretation (instead of frequentist), which essentially says: There’s no such thing as randomness in the real world. What we perceive as randomness is nothing more than a lack of knowledge (or, more precisely, information) about the underlying mechanisms.

Thread parent sort order: