Benford's law, also called the first-digit law, states that in lists of numbers from many real-life sources of data, the leading digit is 1 almost one-third of the time, and further, larger numbers occur as the leading digit with less and less frequency as they grow in magnitude, to the point that 9 is the leading digit less than one time in twenty.
This counter-intuitive result applies to a wide variety of figures from the natural world or of social significance - including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature).
It is named after physicist Frank Benford, who stated it in 1938. However, it was earlier stated by Simon Newcomb, in 1881.
Did you get all that? Benford's law says that if you take a set of numbers from many kinds of real-world data and find the distribution of the leading digit, it should look like this:
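For reference, the curve comes from a simple formula: a leading digit d is expected with probability log10(1 + 1/d). Here is a minimal Python sketch that prints those expected shares. The function name `benford_expected` is just something I made up for illustration.

```python
import math

def benford_expected():
    """Return {digit: probability} for leading digits 1-9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

expected = benford_expected()

# Print each leading digit with its expected share as a percentage
for digit, share in expected.items():
    print(f"{digit}: {share:.1%}")
```

The nine shares add up to 1, with 1 leading about 30% of the time and 9 leading a bit under 5% of the time.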
While that's all well and good, it doesn't mean much until you see it. The manager on my team showed me the results from a previous Benford analysis they had performed, and the curve matched nearly perfectly. I thought that was just amazing, so I tried it myself. The first thing I tried it on was some employee expense reporting data.
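If you want to try this yourself, the tally is easy to script. Here is a rough Python sketch of one way to do it. The helper names (`leading_digit`, `leading_digit_shares`) are hypothetical, and the amounts are made-up placeholders rather than real expense data.

```python
from collections import Counter

def leading_digit(value):
    """Return the first nonzero digit of a number, or None if there is none."""
    # str() puts the mantissa first, so the first nonzero digit character
    # is the leading digit even for values like 0.0042 or 1e-05.
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None

def leading_digit_shares(values):
    """Return {digit: share of values with that leading digit} for digits 1-9."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

# Made-up amounts, for illustration only
amounts = [29.88, 30, 104.5, 0.75, 1200, 18.2]
shares = leading_digit_shares(amounts)
print(shares)
```

Zeros are skipped because they have no leading digit, and the sign is ignored. With a real data set you would compare these shares against the expected Benford curve.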
While it's not perfect, it still follows the general trend. I think the problem here is that people tend to round their expenses. If someone had an out-of-pocket expense of $29.88, they're probably just going to report it as $30. This was skewing my data, so I wanted to try something with little conscious human interaction. This led me to a random data set online showing the number of faculty at various colleges in 1995.
As you can see, it's a much closer and smoother line than the expense reports, but it still isn't as close as the results I saw on our test at work. It's still pretty impressive though. I thought I'd try one final test, so I pulled some completely random census data from the state of Illinois. This provided me a really large population to sample from.
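One rough way to put a number on "how close" a data set is to the Benford curve is to take the largest gap between the observed and expected share for any digit. The sketch below is only one simple measure, not necessarily what a formal Benford analysis would use. The function name `benford_gap` is hypothetical, and it assumes the input is a list of integers. Since I can't share the real data, it uses powers of 2, a standard example of a sequence that follows Benford's law.

```python
import math
from collections import Counter

def benford_gap(values):
    """Largest absolute gap between observed and Benford leading-digit shares.

    Assumes `values` are integers; zeros are skipped.
    """
    digits = [int(str(abs(v))[0]) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    total = len(digits)
    return max(
        abs(counts[d] / total - math.log10(1 + 1 / d))
        for d in range(1, 10)
    )

# Stand-in data: the first 1000 powers of 2
powers = [2 ** n for n in range(1, 1001)]
gap = benford_gap(powers)
print(f"largest gap: {gap:.4f}")
```

A smaller gap means a closer match to the curve. A data set where every number starts with the same digit would score very badly.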
It's fairly spot on. It's crazy how it doesn't really matter where your numbers come from: they will (almost) always follow this same trend. I'm sure this has no bearing on your life, but it does on mine and I find it really interesting, so I thought I would share.