Logarithms: Mobile Phones, Modelling & Statistics?


Logarithms were perhaps once thought of as just an old-fashioned way to do sums on slide rules. But they underpin much of modern life, from modelling the COVID pandemic to Claude Shannon’s mathematical theory of information (which makes mobile phones a reality) and making sense of Cristiano Ronaldo’s crazy Instagram follower numbers.

This lecture will explore the basics and history of logarithms, and then show how they are a natural way to represent many models and datasets.


Logarithms: Mobile Phones, Modelling and Statistics

Professor Oliver Johnson

22 May 2024

 

Introduction: Sums and pandemic data

You might be surprised if I claimed that the apparently terrifying sum 2,147,483,648 x 2,199,023,255,552 = 4,722,366,482,869,645,213,696 was as easy and as natural as the very simple one 31+41=72. However, the aim of this lecture is to explain that in a certain sense this is true, thanks to the mathematical concept of the logarithm – which was first developed and popularised by the work of early Gresham Professors.

Further, it might come as a surprise that this same idea also helps us understand topics such as the coronavirus pandemic. When, in the early foothills of the second wave of autumn 2020, Chris Whitty and Patrick Vallance held a press conference at which they shared the following slide, they were subject to a certain amount of derision in the press and on social media.

The reason for this is that the slide depicts exponential growth, which is not a natural process for many of us to visualize and extrapolate. One reason for this difficulty in interpretation is the way in which the data is presented – using a linear scale on the y-axis. This is perhaps the standard scaling for numerical data, where each step up represents a constant amount of additive growth (an increase by 10,000 each time in this case).

However, pandemics do not tend to grow in this kind of linear way. The equations which govern their evolution suggest that multiplicative change is a more natural way to think about growth in infections. This can be represented using a different type of y-axis scale, a so-called log scale, where effectively we plot the logarithm of the number of cases rather than the raw number itself. Using this scale, each vertical step represents a constant amount of multiplicative growth (a doubling for example).

In the lecture, I argue that had Whitty and Vallance plotted their data on a logarithmic scale then the second wave growth would have seemed much more natural and predictable. Cases at the time plotted in this way lay close to a straight line, which is easy to extrapolate by eye. Had the data been represented using logarithmic axes, I believe that politicians and the public could have appreciated the danger of the second wave much sooner, and acted in ways that could have reduced the risk of further lockdowns.
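To illustrate why a log scale turns this kind of growth into a straight line, here is a minimal Python sketch. The starting value of 1,000 cases and the weekly doubling are made-up illustrative numbers, not the data from the press conference slide.

```python
import math

# Hypothetical series: 1,000 cases, doubling every week (illustrative only).
cases = [1000 * 2 ** week for week in range(8)]

# On a linear scale, the week-to-week steps keep getting bigger...
linear_steps = [b - a for a, b in zip(cases, cases[1:])]

# ...but on a log scale (base 2 here) every step is the same size,
# which is what makes the plot a straight line.
log_cases = [math.log2(c) for c in cases]
log_steps = [b - a for a, b in zip(log_cases, log_cases[1:])]

print(linear_steps)
print([round(step, 6) for step in log_steps])
```

The constant step size on the log scale is exactly what the eye picks up as a straight line, and a straight line is easy to extend forwards.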

 

A brief history of logarithms

Logarithms were first introduced by John Napier (1550-1617), but it is interesting to note that Napier did not give the simplest and most common form of the function. This development was due to Henry Briggs (1561-1630), who was the first Gresham Professor of Geometry and who described the logarithm in the modern form in which it is generally applied today.

There is a further Gresham connection through the kind of logarithmic scale that I described above. Such representations were developed by Edmund Gunter (1581-1626), the third Gresham Professor of Astronomy, as a means of representing numerical data. Gunter’s work was built on by William Oughtred (1574-1660), who turned these log scales into a physical device, the slide rule.

In order to understand how the slide rule works, the lecture reviews the basics of logarithms. The key is that for example 3 is the logarithm of 8 because we can obtain 8 by multiplying 2 by itself 3 times. Similarly 2 is the logarithm of 4 because multiplying 2 by itself 2 times gives this answer.[1]

Then, we can understand the sum 4x8=32 by thinking that in total there are 2+3=5 factors of 2 on the left-hand side of the equation, and 2 multiplied by itself 5 times gives the 32 on the right. In other words, when we multiply numbers together, their logarithms add together. This is the key relationship which defines the effect of the log, and which underpins all the kinds of real-world examples we see below.
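The same relationship explains the large multiplication from the introduction. As a small check in Python (a sketch using only the standard library; the variable names are illustrative):

```python
import math

a = 2_147_483_648        # 31 factors of 2
b = 2_199_023_255_552    # 41 factors of 2

log_a = math.log2(a)     # logarithm to base 2 of a
log_b = math.log2(b)     # logarithm to base 2 of b

# Multiplying the numbers corresponds to adding their logarithms.
product = a * b
log_product = math.log2(product)

print(log_a, log_b, log_product)
print(product == 2 ** 72)
```

So the multiplication of the two large numbers is, on the level of logarithms, just the addition 31 + 41 = 72.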

 

Application: Logarithms and epidemics

As I have already mentioned, logarithms gave us insight into the progression of the COVID pandemic. This was not a coincidence. As I’ve described, logarithmic scales are natural in scenarios where processes tend to evolve in a multiplicative way – and this is exactly how pandemics tend to behave. At a very crude level, at least early in an epidemic, each infected person will tend to infect a similar number of people (this value being the famous R number). As a result, total numbers of infections will double at a fairly consistent rate, with the move from 100 to 200 infections taking roughly as long as the move from 100,000 to 200,000. Using the logarithm can tame this wild kind of growth and help us extrapolate such growth out into the future.
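The claim about consistent doubling can be made concrete with a short Python sketch. It assumes idealised exponential growth and a doubling time of 7 days; that value is purely an illustrative assumption, not a figure from the lecture.

```python
import math

def days_to_reach(target, start, doubling_time_days):
    """Days for idealised exponential growth to go from `start` to `target`."""
    return doubling_time_days * math.log2(target / start)

doubling_time = 7.0  # assumed value, for illustration only

small = days_to_reach(200, 100, doubling_time)
large = days_to_reach(200_000, 100_000, doubling_time)
thousandfold = days_to_reach(100_000, 100, doubling_time)

print(small, large)   # the two doublings take the same time
print(thousandfold)   # about ten doubling times, since 2**10 = 1024
```

Because only the ratio between target and start matters, the calculation is the same whether the numbers are in the hundreds or the hundreds of thousands.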

Of course, exponential growth doesn’t last forever. Indeed, it’s clear that such repeated doubling would run out of people to infect in due course. A more delicate analysis was performed in the classic “SIR” paper of Kermack and McKendrick, published in 1927. Under this model, everyone is considered to be in one of three categories – Susceptible (not yet infected), Infected (and hence able to infect others) and Recovered (now immune). By analysing the interactions between numbers of people in these different categories at any given time, Kermack and McKendrick were able to deduce the existence of a Herd Immunity Threshold (short of the whole population) at which time epidemics would naturally peak on their own.
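The threshold can be illustrated with a small simulation. The sketch below integrates the standard textbook form of the SIR equations with a simple forward-Euler step. The parameter values (R0 = 3, a recovery rate of 0.2 per day, one initial infection per 10,000 people) are illustrative assumptions rather than figures from the lecture or the 1927 paper, and the formula 1 - 1/R0 for the threshold is the standard result for this simple model.

```python
def simulate_sir(r0=3.0, gamma=0.2, i0=1e-4, dt=0.01, days=200):
    """Forward-Euler integration of the SIR equations (population fractions).

    dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I,
    with beta = r0 * gamma. All parameter values are illustrative.
    """
    beta = r0 * gamma
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(0.0, s, i, r)]
    for step in range(1, round(days / dt) + 1):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history

r0 = 3.0
history = simulate_sir(r0=r0)

# Find the moment at which the infected fraction is largest.
t_peak, s_peak, i_peak, r_peak = max(history, key=lambda row: row[2])
threshold = 1 - 1 / r0

print(f"infections peak on day {t_peak:.1f}")
print(f"fraction no longer susceptible at the peak: {1 - s_peak:.3f}")
print(f"herd immunity threshold 1 - 1/R0: {threshold:.3f}")
```

In this sketch the peak arrives when the fraction of people no longer susceptible is close to 1 - 1/R0, well short of the whole population, although infections continue (at a falling rate) after that point.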

Further, as I described in the lecture, by recasting these equations in terms of the logarithm, it is possible to understand that straight line growth on a logarithmic scale is in some sense the default behaviour for a pandemic. While the numbers will eventually deviate from such a trajectory, in practice this can take far longer than we might like.
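One way to see this, written in the standard textbook notation for the SIR model (the symbols are assumptions of this sketch: S and I are the susceptible and infected fractions of the population, β is the infection rate and γ the recovery rate), is to divide the equation for I by I itself:

```latex
\frac{dI}{dt} = \beta S I - \gamma I
\quad\Longrightarrow\quad
\frac{d}{dt}\log I = \frac{1}{I}\,\frac{dI}{dt} = \beta S - \gamma .
```

While almost everyone is still susceptible (S close to 1), the right-hand side is approximately the constant β − γ, so log I grows along a straight line. The slope only changes noticeably once S has fallen appreciably, and it reaches zero when S = γ/β.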

A particular example of this came in the second wave of the virus in Autumn 2020. As I’ve described above, simple projections based on exponential growth (represented as a straight line on a log scale) captured the evolution of cases and hospital admissions to a high degree of accuracy for a period of weeks at a time. In particular, in the lecture I showed how my crude prediction (in mid September) that hospital admissions in the Northwest of England would reach the levels of the Spring 2020 wave at around Hallowe’en was borne out by the actual data.

 

Application: Logarithms and information

Another setting where logarithms naturally arise is in the mathematical study of information. It might seem surprising that information itself can be treated like a physical resource, being quantified and processed, but many deep insights by the American mathematician and engineer Claude Shannon in his 1948 paper A Mathematical Theory of Communication created a framework which allows us to do this.

Shannon argued that the amount of information that we learn from the fact that a particular event has occurred should be some function of its probability. Rare events should bring more information (I learn more about the state of the world from spotting a rare animal than a common one) which helps fix what kind of function should be used. By thinking further about the fact that the amount of information that we gain from two independent events (such as a coin flip and a dice roll) should be the sum of the amount of information we gain from each one occurring, Shannon was able to argue that the right function to use was the logarithm.
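The additivity argument can be checked numerically. This minimal Python sketch measures information as minus the logarithm (to base 2) of the probability, which is the standard form of Shannon's measure; the function name is illustrative.

```python
import math

def information_bits(probability):
    """Information, in bits, gained from an event of the given probability."""
    return -math.log2(probability)

p_heads = 1 / 2   # a fair coin shows heads
p_six = 1 / 6     # a fair die shows a six

# For independent events the probabilities multiply,
# so the logarithm makes the information add.
joint = information_bits(p_heads * p_six)
separate = information_bits(p_heads) + information_bits(p_six)

print(information_bits(p_heads), information_bits(p_six))
print(joint, separate)
```

The rarer event (the six) carries more information than the more common one (the head), and the information from the pair matches the sum of the two parts.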

This insight allowed Shannon to define the concept of information-theoretic entropy, which is a measure of surprise. It quantifies the idea that some random processes are more random than others. Further, Shannon showed that this entropy governs the amount of compression that can be achieved when trying to squash down a set of data (such as zipping a computer file into a smaller format for easy storage on a hard drive). Indeed, Shannon introduced the terminology of a bit[2], which is the natural unit in which his entropy quantity should be measured.
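As a rough illustration of the idea that some random processes are more random than others, the sketch below computes the entropy, in bits, of a few simple distributions using the standard formula (the average of −log2 p, weighted by p). The example distributions are illustrative choices.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits: the average of -log2(p), weighted by p."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

fair_coin = entropy_bits([0.5, 0.5])     # 1 bit per flip
biased_coin = entropy_bits([0.9, 0.1])   # about 0.47 bits per flip
fair_die = entropy_bits([1 / 6] * 6)     # log2(6), about 2.58 bits per roll

print(fair_coin, biased_coin, fair_die)
```

The biased coin is more predictable than the fair one, so its entropy is lower, which is why a long record of its flips can in principle be compressed further.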

These brilliant insights of Shannon, with the logarithm at their heart, mean that his development of information theory underpins much of the modern world. The quantities in which we deal now are very much larger, as we talk in terms of megabytes, gigabytes and terabytes that are formed as huge multiples of Shannon’s basic unit. However, every time you negotiate a mobile phone contract, check your broadband speed or decide what size of memory card to buy, you are following a programme of technology set on its way by Shannon’s original work of the 1940s.

 

Application: Logarithms and social media followers

One common scenario in which we work with the logarithm of data rather than the data itself is when the numbers span many orders of magnitude. For example, the standard Richter, pH and decibel scales are all logarithmic scales – it is much easier to comprehend numbers confined to a narrow range than to deal with millions, billions or trillions. Each step up on the Richter scale corresponds to an increase by a factor of 10 in the measured amplitude of the shaking (and a factor of roughly 32 in the energy released). This means that an earthquake quoted as 9 on the Richter scale produces shaking with an amplitude ten million times (10 multiplied by itself 7 times) larger than one with a Richter value of 2.

Another scenario where numbers vary hugely is when comparing numbers of followers on social media. For example, footballer Cristiano Ronaldo has over 600 million followers on Instagram, whereas some people have fewer than 60 – again a factor of ten million. It’s natural to wonder whether there are laws underpinning the evolution of networks that can allow us to understand why these kinds of wide variations might arise, and to see whether the logarithm gives insight into them.

A simple model for randomly evolving networks was proposed by the Hungarian mathematicians Erdős and Rényi in the 1950s, but it does not capture the kind of heavy hitter behaviour seen in the Instagram numbers mentioned above. In this model, everyone has the same probability of following anyone else, but in practice we know that this is very far from the case. Follower networks often evolve according to what is known as the Matthew Effect, after the Bible verse “for whoever has will be given more” (Matthew 25:29). For example, a Twitter user with many followers is likely to gain more retweets, meaning that their content is generally more visible on the network, and they can build yet more followers.

There are mathematical models of networks, developed in the 1990s, which seek to capture this kind of behaviour, and which theoretically result in the kind of ‘power law’ behaviour that is often claimed to exist in real-life network data. However, it’s worth noting that these claimed power laws are somewhat controversial in statistical circles! There are robust arguments even to this day about the exact structure of the graphs of follower counts. The one thing that everyone agrees on is that logarithmic scales are the right way to settle these arguments: taking log transformations of the numbers and looking for straight lines in the resulting graphs.
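To give a feel for the difference between the two kinds of model, here is a toy Python simulation. It is a simplified illustration of the “rich get richer” idea, not a reproduction of any particular published model: each new account follows exactly one existing account, chosen either uniformly at random or with probability proportional to its current follower count plus one. The network size, the “plus one” rule and the function names are all assumptions made for this sketch.

```python
import random
from collections import Counter

def grow_network(n_accounts, preferential, seed=0):
    """Grow a toy follower network; each new account follows one existing one.

    preferential=True:  target chosen with probability proportional to
                        (current followers + 1), a toy Matthew Effect.
    preferential=False: target chosen uniformly at random.
    Returns the list of follower counts, one per account.
    """
    rng = random.Random(seed)
    followers = [0]   # follower count of each account
    tickets = [0]     # account k appears (followers[k] + 1) times
    for new_account in range(1, n_accounts):
        if preferential:
            target = rng.choice(tickets)
        else:
            target = rng.randrange(new_account)
        followers[target] += 1
        tickets.append(target)        # one extra ticket per new follower
        followers.append(0)
        tickets.append(new_account)   # the newcomer's own starting ticket
    return followers

def counts_by_power_of_two(followers):
    """Number of accounts whose follower count lies in [2**k, 2**(k+1))."""
    bins = Counter(f.bit_length() - 1 for f in followers if f > 0)
    return dict(sorted(bins.items()))

rich_get_richer = grow_network(20_000, preferential=True)
uniform = grow_network(20_000, preferential=False)

print(max(rich_get_richer), max(uniform))
print(counts_by_power_of_two(rich_get_richer))
print(counts_by_power_of_two(uniform))
```

Grouping the counts by powers of two is a crude form of log scale: under the preferential rule the bins typically stretch out to much larger follower counts than under the uniform rule, which is the heavy hitter behaviour described above.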

 

Conclusion

While introduced by Napier in 1614 and popularized and developed by the work of Gresham’s own Professors Briggs and Gunter, logarithms remain a key tool in making sense of the modern world. As I have described, they are naturally suited to making sense of epidemics, of information and of the properties of social networks. However, these are only a few of the applications where understanding logarithms and exponential growth can give insights into real-life situations.

As I describe in my book Numbercrunch, exponential growth is a natural model for many scenarios involving finance, due to the compounding effect of inflation and interest rates. For example, just as plotting pandemic data on logarithmic scales led to good predictions of COVID numbers, such representations give a natural description of football transfer fee records. Similarly exponential growth describes well the stratospheric progress in computing power we have seen in the last 70 or so years, with this closely following a trajectory of constant rate doubling known as Moore’s Law. We can expect to see exponential falls in the price of renewable energy and exponential growth in the numbers of electric vehicles, giving us hope that technology can provide solutions to climate change beyond those predicted by naïve linear extrapolations.

Logarithms are a powerful tool. The more you use them, the more you’ll spot them, and as with the epidemic and social media examples, there are often structural reasons why they give a natural representation of data.

 

© Professor Oliver Johnson, 2024

 

References and Further Reading

I describe these mathematical ideas and others in more detail in my book Numbercrunch, Oliver Johnson (Heligo Books, 2023)

I write about how mathematics can help make sense of the world at my blog, bristoliver.substack.com

 


 

[1] Strictly speaking these are logarithms to base 2, because 2 is the number we multiply by each time. Other choices of base are possible – for example logarithms to base 10 underlie the Richter scale in seismology as we will see later.

[2] Bit is short for binary digit: a bit can be a zero or a one.

Professor Oliver Johnson

Oliver Johnson is Professor of Information Theory in the School of Mathematics at the University of Bristol, where he is Director of the Institute for...
