Topics in the History of Financial Mathematics: Early commerce to chaos in modern stock markets

Friday, 25 April 2008
Barnard’s Inn Hall

The event looks at the development of mathematics in a commercial and financial context, starting with early commerce and moving through the development of double-entry bookkeeping (Amatino Manucci), the development of accounting (Luca Pacioli), the theory of speculation (Louis Bachelier), the history of optimisation, cash and foreign exchange, Black-Scholes option pricing models, and chaos and the misbehaviour of markets, with interesting twists. The event is chaired by Michael Mainelli and Simon Gardiner. The emphasis is on the historical development of mathematical techniques. This event will be of interest to students of accountancy, actuarial sciences and financial mathematics; historians of mathematics and commerce; practising accountants, actuaries and academics; and interested members of the public.

This event was held by the British Society for the History of Mathematics jointly with Gresham College.

Part One: 'The Influence of Amatino Manucci and Luca Pacioli' and 'Louis Bachelier and his Theory of Speculation'

This is the first part of a study day. It includes the following talks:
     Introduction by Professor Michael Mainelli
     The Influence of Amatino Manucci and Luca Pacioli by Dr Fenny Smith
     Louis Bachelier and his Theory of Speculation by Professor Mark Davis

Part Two: 'History of Optimization'

This is the second part of a study day. It includes the following talks: 
   Introduction by Simon Gardiner 
   History of Optimization by Professor William Shaw

Transcript of the lecture

HISTORY OF OPTIMISATION

Professor William Shaw

 

First of all, I want to thank both Gresham College and the Society for inviting me to give this presentation.  The theory of optimisation is a truly massive subject, so I'm going to tread a certain path through it, but just to give you the scope of what the subject is about and what I'm going to talk about, optimisation can refer to many, many things.  We can talk about profit and risk, and that will be the main theme of this discussion, but there are all kinds of problems.  One of the classic ones that is often thought about is the classic travelling salesman problem, which these days finds its expression in the analysis of complex vehicle routing problems.  In relation to Mark's talk, you can think about using utility maximisation philosophies to actually go into the pricing of options.  I will actually focus on something that I've been involved with, with the investment banking community, which is largely on portfolio theory - what assets to buy or sell, and how much of each to have.  So this is very much a history coloured by my own experience with the investment banking community over the last 25 years, supplemented by some interesting cultural themes and also, perhaps, coloured by my interest in computational finance.  So I'm going to leave several things out, but hopefully give you a pretty good idea of a certain part of the theory.

Now, being an applied mathematician, I'm going to run a common trick, which is to work on two different timescales.  The first timescale is really going to be about the last 50 years, and I'm going to start with something that a good number of you may have even studied at school, and that's the basic idea of linear programming.  It's these days something you find in many high school syllabuses.

I'll start with a very simple example.  You've got a farmer who can think about keeping x hundred sheep or y hundred goats.  The farmer will make so much money out of keeping each type of animal, but there are various costs and constraints associated with keeping these animals and feeding them, and there's fields to be grazed on, and one tries to maximise one's profit subject to certain constraints.  Now, when we looked at this at school, we drew some very simple things: we drew some graphs to show what is the feasible region.  So in this idea, what we're going to have is that the farmer can, say, keep up to 400 sheep, or 300 goats, and these are real animals, okay - you can't go short goats, you can't graze negative goats on your field!  So, in the early history of this, all of these were thought of as real quantities which are non-negative, and the classic theory of optimisation starts off with real objects.  I noted from Fenny's talk we got into this idea of positive and negative quantities.  The early history of optimisation actually looks at real objects which are actual positive real variables and, in many cases, integers.  For various reasons, the farmer has some constraints.  He can keep up to 400 sheep - you can't have negative sheep - 300 goats - not negative goats - and then maybe, because of the size of the field, there's a constraint on the total number of animals.

So we can build a feasible region which is this polygon here, bounded by the zeros, the three, the four, and a constraint, say, from the size of the field.  That's a very simple idea.  What happens next?

Well, it turns out that sheep are twice as profitable as goats, so the profit structure for this is essentially 2x + y, twice sheep plus goats, and so you get a profit structure which is a multiple of 2x + y.  We can look at the lines of constant profit, and you have an increasing function going up in this direction.  Every configuration on here, on one of these lines, is equal profit. 

Now, how do you maximise your profit?  Well, you combine those two structures, the profit structure and the constraints structure, and you work along the profit lines looking at where you finally hit the edge of the region, and it's at this point here, where you can't increase the profit any further by going outside the so-called feasible region.

Now, a very simple idea here: we're looking at maximising a linear function, linear functions never have zero gradient, so there are never stationary values, so maximisation is all about edge conditions, boundary conditions, where do you get to the edge.  You can see here that, in this case, we've actually hit a vertex of the bounding polygon.  That's an absolutely typical situation for linear programming.
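The sheep-and-goats problem is small enough to solve by listing the corners of the feasible region and evaluating the profit 2x + y at each one. The sketch below does that in plain Python. The talk gives the limits of 400 sheep and 300 goats but not the field constraint, so the limit x + y <= 5 (hundreds of animals) is an assumption made purely for illustration, as are the function names.

```python
from itertools import combinations

# Constraints written as a*x + b*y <= c (x, y in hundreds of animals).
# The field limit x + y <= 5 is an assumed figure, not one from the talk.
CONSTRAINTS = [
    (-1.0, 0.0, 0.0),  # x >= 0: no negative sheep
    (0.0, -1.0, 0.0),  # y >= 0: no negative goats
    (1.0, 0.0, 4.0),   # x <= 4: up to 400 sheep
    (0.0, 1.0, 3.0),   # y <= 3: up to 300 goats
    (1.0, 1.0, 5.0),   # x + y <= 5: assumed field capacity
]

def feasible(point, tol=1e-9):
    """True if the point satisfies every constraint."""
    x, y = point
    return all(a * x + b * y <= c + tol for a, b, c in CONSTRAINTS)

def vertices():
    """Intersect every pair of boundary lines and keep the feasible points."""
    found = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel lines never meet
        x = (c1 * b2 - c2 * b1) / det + 0.0  # "+ 0.0" turns -0.0 into 0.0
        y = (a1 * c2 - a2 * c1) / det + 0.0
        if feasible((x, y)):
            found.add((round(x, 9), round(y, 9)))
    return sorted(found)

def profit(point):
    """Sheep are twice as profitable as goats."""
    x, y = point
    return 2 * x + y

best = max(vertices(), key=profit)
print(best, profit(best))
```

Under these assumed limits the polygon has five corners and the best one is 400 sheep and 100 goats. Enumerating every corner only works for tiny problems; it is shown here because it makes the "optimum sits at a vertex" point concrete.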

Now, that simple idea gave rise to a whole industry in the period starting around the end of the Second World War.  Linear functions never have stationary values, so the extreme values in the linear case are always on a boundary and typically on a vertex, as in that example.  That leads you to the idea of searching amongst the vertices of the problem, and an idea for an algorithm is to find one that works and then work your way around the bounding polygon increasing your profit at each move from vertex to vertex.  That idea was finally crystallised into a very important algorithm by George Dantzig just after the Second World War, the late 1940s, and it's the so-called simplex algorithm, listed by one computational journal as one of the top 10 algorithms of the 20th Century.
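The vertex-to-vertex idea can be illustrated in two dimensions, where each corner of the polygon has exactly two neighbours. The sketch below is only a caricature of that idea, not Dantzig's simplex algorithm itself (which works on a tableau of equations in many dimensions). The polygon assumes the limits of 400 sheep and 300 goats from the talk plus a hypothetical field limit of 500 animals in total.

```python
# Vertices of the feasible polygon listed anticlockwise. They assume the
# limits 0 <= x <= 4, 0 <= y <= 3 and a hypothetical field limit x + y <= 5.
POLYGON = [(0, 0), (4, 0), (4, 1), (2, 3), (0, 3)]

def profit(vertex):
    """Profit 2x + y: sheep are twice as profitable as goats."""
    x, y = vertex
    return 2 * x + y

def walk_to_optimum(polygon, objective, start=0):
    """Move to a better neighbouring vertex until neither neighbour improves."""
    n = len(polygon)
    current = start
    path = [polygon[current]]
    while True:
        neighbours = [(current - 1) % n, (current + 1) % n]
        best = max(neighbours, key=lambda i: objective(polygon[i]))
        if objective(polygon[best]) <= objective(polygon[current]):
            return path  # no neighbour improves: this vertex is optimal
        current = best
        path.append(polygon[current])

path = walk_to_optimum(POLYGON, profit)
print(path)
```

Starting from the origin, the walk visits (4, 0) and stops at (4, 1). Because the region is convex and the profit is linear, a vertex that beats both neighbours is the global optimum, which is what makes this local search safe.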

The basic idea of course is very simple, as in those graphs.  Those of you that have never looked at this, when I teach this, I refer my students to the book by Bill Press and colleagues, Numerical Recipes, which gives a wonderfully practical discussion of it.  Another early contributor to this was John von Neumann, who also introduced the concept of duality into these problems, so finding a complementary problem that you could also think of solving with the same solution.

Now, what happened next?  Several people in the post-War period have worked on either improving the simplex method or doing it differently.  There are a great many contributors - I want to just highlight a few.

At the beginning of the 1970s, Klee and Minty showed that Dantzig's method, if you construct a suitably annoying feasible set, can be forced to visit all of the vertices of this big n-dimensional polygon.  So that leads to certain efficiency questions.  Having to search through all the vertices, the problem becomes very big as the problem size increases.  Dantzig started off with I think around...figuring out how to optimise 70 people doing 70 jobs, something like that.  As the problem gets much, much bigger, you get into potentially quite large computation times.  There was great interest in finding methods that brought down the growth of the problem as the number of variables increases, and there was important work by Khachiyan in 1979.  In 1984, I remember I was a junior research fellow in Cambridge, and in the applicable maths community in Cambridge there was great excitement because Karmarkar had introduced a whole new way of thinking about these problems which involves moving through the interior of the feasible region, trying to find this optimal boundary point, and that led to a competing algorithm with better growth as the problem size increases. I'm not - I sort of looked at this again in preparing for this lecture, and it's not clear to me that the competition between the classic simplex method and its evolution and the Karmarkar method, it's...they get pretty similar results on a large class of problems.  You can find problems that are better for one and not the other, and so on.

Now, these days, the basic idea of the simplex method is built into many computational tools.  So I can, for example, give that problem to this little computer here, and change the profit structure, fiddle around with it and change the optimal points, and you can just get off-the-shelf tools that will do that.

Now, using that, I want to show you an interesting feature.  I'm going to draw a little picture here, and this is actually very important for the financial applications.  You could think about changing the relative profitability of sheep and goats, and this you could think about changing the expected return structure in a financial portfolio optimisation problem.  As we do this, I'm going to change the relative profitability of the sheep and the goats.  Now, at the moment, these profit lines are wandering off increasing in this direction, and the boundary of my polygon is up here.  Now, as I change my relative profitability, what happens when we get to here is that the solution of the optimisation problem jumps discontinuously to another vertex.  In fact, at this level point, the optimum could be anywhere along that bounding line.  So we can see right away that there's an instability in the optimal configuration as a function of the problem parameters.  As we change the profit structure again, the optimal point jumps once more, and if I were to go to an extreme, it would jump to the bottom.  So we have this situation that these optimal configurations are, in some sense, unstable functions of the parameters.  Now, that is a recurring theme in this subject, and it's continuing to exercise people to this day in the context of more complicated problems, but it even occurs in the linear case.
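The jump from vertex to vertex can be reproduced numerically by re-solving the toy problem for a few different profit rates. In the sketch below the polygon again assumes a hypothetical field limit of 500 animals, and the profit rates are invented for illustration.

```python
# Vertices of the feasible polygon, assuming limits 0 <= x <= 4, 0 <= y <= 3
# and a hypothetical field limit x + y <= 5 (x = sheep, y = goats, in hundreds).
POLYGON = [(0, 0), (4, 0), (4, 1), (2, 3), (0, 3)]

def optimal_vertex(sheep_profit, goat_profit):
    """Return the vertex with the highest total profit."""
    return max(POLYGON, key=lambda v: sheep_profit * v[0] + goat_profit * v[1])

# Hypothetical (sheep, goat) profit rates per hundred animals.
PROFIT_RATES = [(2.0, 1.0), (1.1, 1.0), (0.9, 1.0), (-0.5, 1.0), (1.0, -0.5)]

optima = {rates: optimal_vertex(*rates) for rates in PROFIT_RATES}
for rates, vertex in optima.items():
    print(rates, "->", vertex)
```

Moving the sheep profit from 1.1 to 0.9, a small change, shifts the optimum from (4, 1) to (2, 3), that is, from 400 sheep and 100 goats to 200 sheep and 300 goats. At a sheep profit of exactly 1.0 the two vertices tie and every point on the edge between them is optimal.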

Now, this is a purely linear problem.  There's no risk here.  There's no idea of the sheep getting sick or anything like that.  It's a pure profit maximisation problem.  If we want to think about adding some risk, we then go into a much deeper problem, and indeed, to ideas that actually go back much further in history.  In mathematical finance, the idea of maximising a profit is, of course, very important, but it's modulated by the need to simultaneously manage the risk of loss, and of course this is something we're all acutely aware of in the present market conditions.  This all boils down to a sophisticated version of the very everyday idea of not putting all your eggs in one basket.

Now, I had a look just to see how many different places that idea comes up, and there's an astonishing number of cultural references in history to it.  In Don Quixote, written in the 17th Century, it said: "It is the part of a wise man to keep himself today for tomorrow and not venture all his eggs in one basket."  One of my favourite quotes on this actually introduces an altogether more sophisticated idea.  William Shakespeare, in the Merchant of Venice, cunningly introduced the idea of actually having temporal diversification in your asset structure, when Antonio, in a conversation with his friends, told his friends he wasn't spending sleepless nights worrying over his commodity investments, and he says: "My ventures are not in one bottom trusted, nor to one place, nor is my whole estate upon the fortune of this present year, therefore my merchandise makes me not sad."  Hidden in there is this idea of thinking about having a balance between the near future and the rather more distant future in terms of the return on investments.

So there it goes back a few hundred years.  In fact, as many of you will know, it goes back rather older than that, rather further than that, but let's just formulate the maths a little bit.  If you're going to spread your assets around into many different baskets, the question is which baskets are you going to use, and how much are you going to put into each one, and that's the critical question.  Mathematically, we think of this as a weight determination problem.  We have a collection of weights.  Initially, again, they're all thought of as positive.  In the old days, you couldn't go short; you held positive numbers of shares, and we would have a collection of weights, all non-negative, adding up to unity.  So that's represented by this equation here, and the question is how do you pick these w_i, the fractions of your investment that you put in each asset.  People were worried about this for a very long time.  Now, until recently, as I say, these weights were assumed to be non-negative.  These days, we explicitly allow for short selling conditions.
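The slide equation is not reproduced in the transcript. From the description (non-negative weights adding up to unity), it can presumably be written as follows, with n denoting the number of assets (a symbol introduced here, not taken from the talk):

```latex
\sum_{i=1}^{n} w_i = 1, \qquad w_i \ge 0 \quad (i = 1, \dots, n)
```

Allowing short selling corresponds to dropping the condition $w_i \ge 0$ while keeping the budget constraint.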

This - the solution to this is well-known.  I first heard of it as a result of a Newton Institute workshop talk a while back.  The oldest reference I'm aware of is the 4th Century by one Rabbi Isaac bar Aha.  You have to be able to read Aramaic if you want to see the original, but this is, for me at least - and I appreciate the correction if there is any - the earliest known written example of an asset allocation strategy, and it's the simple equal allocation strategy, and what you should do with your wealth, you should hold a third in land, a third in merchandise, and a third, as he put it, "at hand", which means in cash.  That is the so-called one over n strategy.  If you have n assets, you put an equal amount into each one.  That is actually widely studied, and is actually a very robust approach because, if you keep doing that, you're not necessarily turning over your portfolio very frequently when you move from one configuration to the other.  So it has certain attractive features, and you can find quite a lot of recent research on looking at that.
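As a minimal sketch of the one over n rule, the code below assigns equal weights and works out the trades needed to return to equal weights after prices have drifted. The holdings figures and function names are hypothetical.

```python
def equal_weights(asset_names):
    """One over n: every asset gets the same fraction of wealth."""
    n = len(asset_names)
    return {name: 1.0 / n for name in asset_names}

def rebalancing_trades(holdings):
    """Amount to buy (+) or sell (-) of each asset to restore equal weights."""
    total = sum(holdings.values())
    target = total / len(holdings)
    return {name: target - value for name, value in holdings.items()}

weights = equal_weights(["land", "merchandise", "cash"])

# Hypothetical current values after some market drift.
trades = rebalancing_trades({"land": 40.0, "merchandise": 35.0, "cash": 25.0})
print(weights)
print(trades)
```

The trades sum to zero, so rebalancing moves money between baskets without adding or removing wealth. The rule needs no estimate of returns or covariances, which is one reason it is insensitive to estimation error.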

Now, in preparing for this talk, I wondered what the Ancient Chinese had been up to, and, not being an expert, I asked Xunyu Zhou, one of the world's leading optimisation theorists, and Chinese, as to whether the Ancient Chinese had had any thoughts on this matter, and he said something very, very interesting to me a few days ago.  He said that, as far as he could tell, optimisation was largely absent from Ancient Chinese thought and that it was, in the main, a concept introduced to China from the West.  His view was that this might have something to do with Confucian culture, where you stay in the middle and never venture off to extremes, so that the idea of maximisation was somewhat alien to the culture.  I asked Xunyu this question, having rummaged around looking at various texts on the internet, and Confucius did not actually say anything along the lines of our Rabbi, but he did find one other thing which I thought was quite good fun, especially in current market conditions, and it translates very well: "He who will not economise will have to agonise," which is something that weighs heavily on my mind, just having had to remortgage.

I'm slightly surprised at Xunyu's response because the idea of, say, balancing what one does across several areas, a risk minimisation idea, seems to me to be consistent with that view of staying in the middle.  On the other hand, the optimisation process seems to be fighting with the principle, so I...will look into that.

Now, the joke goes that for the next one-and-a-half thousand years, people were busy with writing grant proposals and not getting very much real research done, and it had to wait for Harry Markowitz and co-workers after the Second World War to develop this idea further.  Harry Markowitz, at the end of World War Two, went to work for the Rand Corporation and met George Dantzig and got interested in optimisation.  This led to a great deal of fascinating work, which I will now drastically oversimplify.

The idea was that one should, in one version of the problem, minimise the combination of risk minus profit, and the relative weighting of those two is a parameter of the problem, lambda, that measures your trade-off between risk and return.  So, for example, in pension investments, certainly as one gets towards later life, one thinks about a very small value of lambda - you want to keep the risk down - whereas, perhaps when you're younger or you have a more balanced collection of investments, one thinks of a large value of lambda.  So this is the new idea.
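One common way to write this objective is shown below. The transcript does not include the slide, so the notation is an assumption: $w$ is the vector of weights, $\Sigma$ the covariance matrix of asset returns, $\mu$ the vector of expected returns, and $\lambda \ge 0$ the weighting parameter.

```latex
\min_{w} \; w^{\top} \Sigma\, w \;-\; \lambda\, \mu^{\top} w
\quad \text{subject to} \quad \sum_{i=1}^{n} w_i = 1
```

With this sign convention a small $\lambda$ puts almost all the emphasis on reducing risk, and a large $\lambda$ puts more emphasis on expected return, consistent with the pension example.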

Now, this conceptual idea and the mathematics of it was something that was introduced in Harry Markowitz's PhD thesis.  I was very amused by Mark's comment about the reaction to an earlier PhD thesis because, during Markowitz's thesis defence, the eminent economist Milton Friedman thought that these ideas for portfolio management were not economics, which is quite amusing because this work led to the so-called Nobel Prize for Economics or, I guess, as Mark raised the issue, the Nobel Memorial Prize for Economics, but let's not go down that route, let's not have that argument!  Anyway, as with all great innovators, there was initially some considerable scepticism on the matter.

Now, there are various related formulations of this problem.  You can think about maximising return subject to a [risk] bound.  You can - let me just blow this up a bit in fact - you can think about minimising risk subject to a return goal.
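In symbols, the two formulations mentioned here might be written as below. The notation is assumed rather than taken from the slides: $w$ is the weight vector, $\Sigma$ the covariance matrix, $\mu$ the expected returns, $\sigma_{\max}$ a risk bound and $r_{\text{target}}$ a return goal.

```latex
\max_{w} \; \mu^{\top} w
\quad \text{subject to} \quad w^{\top} \Sigma\, w \le \sigma_{\max}^{2}, \quad \sum_{i} w_i = 1

\min_{w} \; w^{\top} \Sigma\, w
\quad \text{subject to} \quad \mu^{\top} w \ge r_{\text{target}}, \quad \sum_{i} w_i = 1
```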

Here's something that fund managers are very fond of.  They like to work relative to an index.  So fund managers have this annoying habit of not working in terms of absolute returns but in relative returns, and this allows them, for example, to make comments that - the tech fund crisis was a good example of this.  So you would get a dreary economic statement saying that your technology ISA had lost 85% of its value, but compared to the Pacific stock technology index, we outperformed by 8%...!  So a lot of the investment theory is formulated in terms of returns relative to a background index, which is very convenient for the fund managers.

Either way, whether in terms of absolute or relative weights, the classic choice of optimisation function is a generalisation of the linear structure to some kind of quadratic structure.  Now, where's that quadratic structure come from?  It's come from the notion that the risk is characterised essentially by the variance of the configuration.  That's an important...it's an important idea, and it's a rather important limitation.  Markowitz was aware of its limitations in the early version of the theory, but it was only really with this particular structure that it was possible to make theoretical progress in those early studies.  So, for several decades, the work has focused on this type of risk structure.  You will note that, in trying to minimise this, you are simultaneously minimising upside deviation and downside deviation, okay - it's total risk.  The asymmetry between being happy about profit and being worried about loss isn't there in this form of the theory, but Markowitz was aware of that.

Now, in reality, there are various constraints, and in the early days, the weights were assumed to be non-negative, and you can inject into this all kinds of matrix equalities and inequalities, equalities and inequalities, and these could be to do with sector exposure, so you might not want to have more than 10% exposure to the oil sector, 5% to banks, and so on.  So these problems come with a fairly complicated constraint set, but we get the classical theory of quadratic programming out of this.

Now, in linear programming, the optimal solution is always at a boundary point, as we saw earlier.  In the quadratic case, it might also be in the middle, but it might also be on the boundaries.  So we now get a more interesting theory, where it's not just to do with boundaries, it's not just to do with differentiation to find stationary values; it's a hybrid of the two.

Let me just show you how that works a little bit because it's interesting.  What I'm just going to introduce here is a parameterised covariance matrix, three assets, and I'm actually going to work out some weights, and in the middle of the problem, if all the weights are non-zero, we can get a formula as a function of our correlation parameter, but if you look at the boundary structure, you see there is actually a more complicated interaction.  So, here's a classic QP, quadratic programming, solution for three assets, so I've got three assets - one, two, three.  For large negative values of the correlation, we're entirely in the first two assets and not in the third, and then as we change the correlation structure, what happens is that for some points, we're invested in all three assets, and then the contribution of one of them drops to zero and we remain in the other two.  We can visualise that a little bit - let me show you just what's going on here.

What I've got here is a diagram in which I've plotted the risk as a function of two of the three weights.  My first weight along here is 0-1, the second weight along here 0-1, and the third weight is obtained by solving the constraint, and the final constraint that the third weight is between 0 and one gives us this boundary line here, and this surface is the risk of the configuration.

Now, what I want to show you here is the geometry of this.  As we change the correlation structure, what happens initially is that the optimal point, that's the minimum of this surface subject to the constraints, rolls along the boundary - so it's a certain type of problem - and then it starts to migrate into the middle and finds the [minimum] of the surface.  Then, as we increase the correlation some more, it migrates to the outer edge and then just sticks there.  So you've got this interesting interplay between differentiation and boundary conditions.  Again, you can get quite big changes in whether or not, say, in this case, you don't hold Asset 2 at all, and in this case, you do hold Asset 2, so again, you have this idea of quite big changes in your optimal configuration coming out of the problem.
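The interplay between boundary and interior solutions can be reproduced with a brute-force search over long-only weights. The covariance matrix used in the talk is not given in the transcript, so the one below (unit variances, correlation rho between the first two assets, correlation 0.5 between each of them and the third) is an assumed stand-in, and a grid search replaces a proper quadratic programming solver.

```python
def covariance(rho):
    """Unit variances; assets 1 and 2 have correlation rho, and each has
    correlation 0.5 with asset 3. Positive semi-definite for -0.5 <= rho <= 1.
    This matrix is an illustrative assumption, not the one from the talk."""
    return [[1.0, rho, 0.5],
            [rho, 1.0, 0.5],
            [0.5, 0.5, 1.0]]

def variance(weights, cov):
    """Portfolio variance w' C w."""
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(3) for j in range(3))

def min_variance_weights(rho, grid=60):
    """Brute-force search over non-negative weights that sum to one."""
    cov = covariance(rho)
    best, best_risk = None, float("inf")
    for i in range(grid + 1):
        for j in range(grid + 1 - i):
            w = (i / grid, j / grid, (grid - i - j) / grid)
            risk = variance(w, cov)
            if risk < best_risk:
                best, best_risk = w, risk
    return best, best_risk

for rho in (-0.3, 0.2, 0.5):
    w, risk = min_variance_weights(rho)
    print(rho, [round(x, 3) for x in w], round(risk, 4))
```

With this assumed matrix, a negative rho leaves the third asset at zero weight (the minimum sits on the boundary), while a positive rho brings all three assets into the portfolio (the minimum moves into the interior). The grid of 60 steps is chosen so that the exact optima for these three values of rho fall on grid points.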

Much has happened since then, and I'm not going to pretend to survey it all.  I want to draw people's attention to a very nice contribution by Professor Michael Powell of DAMTP in Cambridge, one of the foremost numerical analysts of his generation.  Mike did something very interesting.  You see, at the background to all this is the notion that the risk of the structure is just characterised by the variance.  We know in reality that financial risk is not just the normal distribution.  The mean and the variance aren't everything.  You could have skew distributions, you could have kurtosis, you can have skewness, you can have all kinds of structure in there, or you could try and work not just with the moments but the whole distribution.  So one question is what do you do if you want to manage this risk/return balance problem in a situation where you're not just looking at quadratic functions.

Mike Powell, in 1989, wrote a Cambridge DAMTP report and a computer program which actually solved the problem of maximising non-linear functions, not necessarily quadratic, subject to linear constraint sets.  So in other words, what he did, though it was buried in a big pile of FORTRAN, was to actually allow a certain class of risk minimisation problems to be solved without the assumption of it being a quadratic structure.  Now that work was done for the computer vendor selling the IMSL numerical libraries, and they really didn't do very much with it.  Mike has said...was rather irritated, and said very publicly that they didn't do very much with it, and it was because you had to be able to differentiate the function and do certain things with it.  But it's, in my view, an important contribution because it takes us outside of this quadratic framework, so you can think about non-quadratic risk structures.

Another important practical point about this is that Powell was giving away source code, and in comparison with some of the commercial optimisation vendors, who wanted to charge you many thousands of pounds or dollars for black boxes, Mike Powell actually put his source in the public domain and the numerical analysis report was also there as well, so you can get at it, see how it works, you can write code which talks to it.  His algorithm allows for lower bounds on weights, upper bounds on weights, budget constraints, sector exposure and things.  Anyway, as the time is running short, I'm not going to sort of implement that in detail here, but if anyone is interested in getting that or seeing it working, get in touch with me at King's.  Anyway, there's some free optimiser software you can get.

Okay, let's go back to the theoretical developments.  Use of this full covariance structure is not the only way to go, and there are...there's a thread of developments in mathematical finance and economics which links to what it is one's trying to optimise.

So, if we go back to the quadratic notion, there is a collection of ideas based on the capital asset pricing model, and the idea here is - and this is viewed with my optimisation glasses on - not to use the full covariance to characterise the risk, but to try to think what are the key drivers of risk.  One model of the key drivers of risk is to say that assets are correlated through their link to the main market index.  The two key equations here in this framework - it was developed by Bill Sharpe, Treynor, Lintner, Mossin, and many others - say that asset returns, first of all, have a component beta times the index return, plus an asset-specific component, and then there is some noise in the structure.  Now, with this model and these being independent, you end up with a covariance structure where all of the correlation components are actually determined by these betas.  So if you take the variance of the index, all the correlation in the system is governed by a structure like this, and then there is a further diagonal term which is asset-specific volatility.  That's a one factor model.  Then, there are some - that's one factor, there's full covariance - and then there are many things in between that, where you can say, let's suppose that the returns and the risk are dependent on two or more factors, and those two or more factors could be purely abstract mathematical objects, defined by an eigenvalue analysis of the covariance structure, or they could be based on real, observable economic factors, like foreign exchange rates, oil prices, as well as index values.  So you get a family of models based on this, where the covariance structure is resolved into a kind of vector of betas, which gives you the multi-factor component, plus again some residual asset-specific risk.
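The slide equations are not in the transcript. In the standard one-factor form they would read roughly as follows, where the symbols are assumptions introduced here: $r_i$ is the return on asset $i$, $r_M$ the index return with variance $\sigma_M^2$, $\alpha_i$ the asset-specific component, and $\varepsilon_i$ independent noise with variance $\sigma_{\varepsilon,i}^2$.

```latex
r_i = \alpha_i + \beta_i\, r_M + \varepsilon_i,
\qquad
\operatorname{Cov}(r_i, r_j) = \beta_i \beta_j\, \sigma_M^{2} + \delta_{ij}\, \sigma_{\varepsilon,i}^{2}
```

A multi-factor version replaces the single beta per asset by a row of loadings, giving a covariance of the form $\Sigma = B\, \Sigma_F\, B^{\top} + D$, with $B$ the matrix of factor loadings, $\Sigma_F$ the factor covariance and $D$ a diagonal matrix of asset-specific variances.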

Now, one issue with all of this is that the full covariance approach, the CAPM approach, and this approach, which collectively goes by the name of APT, all give you different answers for the weights, so it's a little bit of an issue to decide what to do.

Let's just talk briefly about time dependence.  There are many types of generalisation of this idea where you fold in some time dependence.  One is to analyse the idea of rebalancing your portfolio periodically.  So, every three months, you want to create a new optimal configuration, and there, people think, for example, about market impact studies.  So if I've got one collection of weights and I need to go to another which is different, how do I get from A to B at minimal cost?  There's work by various people on that - Almgren & Chriss in 1999, a paper, did quite a lot on that.  There's the whole idea of doing - the whole [?] problem in continuous time, and that's been thought about by Xunyu Zhou and his collaborators in the last few years.  So there are a host of time dependent variations.

I want to talk briefly about so-called post-modern portfolio optimisation theory. This largely revolves around the idea that variance is not everything.  There's a bit more to it than that, but variance is not all of risk in the realistic non-normal world, so there are various types of realistic objective function under consideration: semi-variance, which was actually one of Harry Markowitz's original ideas, which was to think purely about the downside risk.  You can think about modulating risk by having not just variance but skewness and kurtosis, and you can avoid moments completely by having measures that are functions of the entire distribution.  You can find ideas - the Sortino ratio, Keating's Omega parameter, and so on - all of which give ideas for more interesting classes of objective function which allow you to combine both the sort of non-normality of the system together with the one-sided thing you do, which is worrying about a loss but not a profit.
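To make the one-sided idea concrete, the sketch below compares variance with downside semi-variance on two invented return series that are mirror images of each other about the same mean. The definitions follow one common convention (squared shortfall below a target return, averaged over all observations), which is an assumption; other conventions exist.

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(returns):
    """Ordinary (population) variance: penalises gains and losses alike."""
    m = mean(returns)
    return mean([(r - m) ** 2 for r in returns])

def semi_variance(returns, target=0.0):
    """Mean squared shortfall below the target; gains contribute nothing."""
    return mean([min(r - target, 0.0) ** 2 for r in returns])

def sortino_ratio(returns, target=0.0):
    """Excess mean return divided by downside deviation."""
    downside = math.sqrt(semi_variance(returns, target))
    if downside == 0.0:
        return math.inf  # no observed shortfall at all
    return (mean(returns) - target) / downside

# Hypothetical return series, mirror images about a mean of 1%.
small_losses_big_gain = [-0.01, -0.01, -0.01, 0.07]
small_gains_big_loss = [0.03, 0.03, 0.03, -0.05]

for series in (small_losses_big_gain, small_gains_big_loss):
    print(variance(series), semi_variance(series), sortino_ratio(series))
```

Both series have the same mean and the same variance, so a mean-variance optimiser cannot tell them apart. The semi-variance and the Sortino ratio do distinguish them, ranking the series with one large loss as the riskier of the two.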

Robustness?  Okay, this is very important.  There's been a lot of work really in the last few years on this.  You've seen, even in my little sheep and goats example, that the solutions to these problems are unstable.  Instability is going to come back again later I suspect.  But here we have a situation where sheep and goats, if you change the profitability, the optimal configuration changes.  We know, for example, that as we vary from a full covariance model of risk back down to CAPM or back up through APT, we can get different answers for the weights.  Why is that a problem?  It's just a fact about the system.

It's a problem for two reasons - at least two reasons.  Firstly, it costs money to change from one configuration to the next, so if your optimal configurations are jumping from one state to another, you're transacting for instability reasons.  The second thing is that we don't actually know the parameters of these functions we're trying to maximise that well.  We don't know the future returns precisely.  We just have estimates - don't have a clue.  The covariance structure is an estimate based on historical data.  There might not be enough historical data to get a covariance matrix that actually has the right properties. 

Now, Fenny cheered me up wonderfully by referring to Girolamo Cardano towards the end of her talk.  Cardano was the person who completed this sequence about thinking of different types of numbers.  You had whole numbers, you had fractions, and then you had reals and you had negative numbers, and when it came to Cardano, he finally came up with imaginary and complex numbers.  Now, one of my dubious claims to fame: I was the first person in the history of Nomura International, in the 1990s, to come up with complex numbers for option prices!  Now, how did I [struct that way]?  It's actually related to these covariance structure degeneracies, because if you have a covariance structure and you don't have enough data to define it properly, it can be degen - it can have zero eigenvalues in it, and that means that there can be a whole valley in the risk structure and you can move along there without changing your risk.  Not only could it be degenerate that way, if a trader has done a "what if" calculation and fiddled with a correlation, you can actually go to a covariance structure that's got negative eigenvalues.  When you apply this to multi-asset option pricing and diagonalise it, you end up with negative pseudo-volatility squares and imaginary volatilities, and then you end up with imaginary numbers in your Black-Scholes model, and of course that's ridiculous.  Fortunately, I was able to point the finger of blame at the trader who'd changed his covariance structure, so it was all alright, but it's a reflection of the fact that these covariance structures are really hard to deal with [carefully], and in particular, you don't know them that well, so near degeneracy in covariance can cause absolute chaos, and then the covariance is not that well-known anyway, and the question is what do you do about that uncertainty.
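The "what if" problem is easy to reproduce. In the sketch below a consistent three-asset correlation matrix has one entry changed by hand, after which it is no longer positive definite and a particular long-short combination has negative variance, so its volatility comes out imaginary. The numbers are invented for illustration and are not the ones from the episode described; the positive-definiteness test uses an attempted Cholesky factorisation.

```python
import cmath
import math

def cholesky_ok(matrix):
    """True if the symmetric matrix is positive definite, i.e. a Cholesky
    factorisation exists with strictly positive pivots."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False  # non-positive pivot: not positive definite
                lower[i][i] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return True

def correlation(r12, r13, r23):
    """3x3 correlation matrix built from its three off-diagonal entries."""
    return [[1.0, r12, r13], [r12, 1.0, r23], [r13, r23, 1.0]]

def portfolio_variance(weights, matrix):
    n = len(weights)
    return sum(weights[i] * matrix[i][j] * weights[j]
               for i in range(n) for j in range(n))

consistent = correlation(0.9, 0.9, 0.9)
fiddled = correlation(0.9, 0.9, -0.9)   # hypothetical "what if" edit to one entry

spread = (1.0, -1.0, -1.0)              # long asset 1, short assets 2 and 3
bad_variance = portfolio_variance(spread, fiddled)
print(cholesky_ok(consistent), cholesky_ok(fiddled))
print(bad_variance, cmath.sqrt(bad_variance))
```

The edited matrix says assets 2 and 3 both move closely with asset 1 yet move against each other, which no real set of returns can do. A check of this kind before the matrix reaches a pricing or optimisation routine catches the inconsistency early.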

So in the last few years, there's been work at Imperial College, managing it with various scenarios.  In the United States, Goldfarb-Iyengar have said these covariance numbers are not precise things - we'll have little balls, and they can live within those balls.  Some people have rectangular uncertainty domains, and there's work by some former colleagues of mine in Oxford, Raphael Hauser, Denis Zuev, Reha Tütüncü - I think he works for Goldman's now - all looking at this idea of robustifying the solution, and what they do is to introduce the idea of finding the best worst case.  So you think about all the bad things that could happen, and then you improve those as much as possible.  One thing that's going on with that is that most of the approaches I've seen on this keep the classic quadratic structure.  One thing that really needs to be done is to combine, in detail, this robust approach with a more realistic, you know, skew, one-sided optimisation structure.
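The "best worst case" idea can be sketched for two assets as follows. This is only an illustration under stated assumptions: the three scenarios, the risk-aversion figure and the grid search are all hypothetical, and the score keeps the classic quadratic (mean-variance) structure rather than any of the uncertainty sets used in the published work.

```python
def mean_variance_score(weight, returns, variances, correlation, risk_aversion=3.0):
    """Expected return minus a quadratic risk penalty for a two-asset portfolio."""
    w1, w2 = weight, 1.0 - weight
    expected = w1 * returns[0] + w2 * returns[1]
    covariance = correlation * (variances[0] * variances[1]) ** 0.5
    variance = (w1 * w1 * variances[0] + w2 * w2 * variances[1]
                + 2 * w1 * w2 * covariance)
    return expected - risk_aversion * variance

# Hypothetical scenarios: (expected returns, variances, correlation).
scenarios = [
    ((0.08, 0.04), (0.04, 0.01), 0.2),
    ((0.02, 0.04), (0.06, 0.01), 0.6),
    ((0.10, 0.03), (0.05, 0.02), -0.1),
]

def worst_case(weight):
    """Score of this weight under the least favourable scenario."""
    return min(mean_variance_score(weight, r, v, c) for r, v, c in scenarios)

# Search a grid of weights in the first asset and keep the best worst case.
candidates = [i / 100 for i in range(101)]
robust_weight = max(candidates, key=worst_case)

print(robust_weight, worst_case(robust_weight))
```

The grid search is chosen purely for transparency; a serious treatment would use a proper optimiser and a richer description of the uncertainty.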

Okay, so I think we're probably all getting close to gagging for lunch. I'll skip this last thing.  One thing that everybody's worrying about in the fund management business these days is the so-called 130/30 philosophy of investing, which involves, or not, various degrees of optimisation, but I will skip that and go on to my summary.

Optimisation is a topic that's under active investigation at the present.  We've got - and there's all kinds of different ways of looking at it.  Mathematicians are interested in the mathematical ideas - how should you characterise the risk structure?  Numerical analysts and computer scientists are interested in finding efficient algorithms.  Fund managers are interested in their bonus.  All of us, most of us here, will have pensions that we would like managed well and on a rational basis, so it's important to everybody that this gets done right and there's a lot of work on it. 

And then beyond the financial thing, you know, we would like to see food distributed around the country with minimal fuel costs, minimal environmental impact.  One of my old school colleagues runs an optimisation company solely devoted to vehicle routing on that basis.  Very important work.

Tasks need to be allocated efficiently amongst workers, and that was Dantzig's original problem; chips designed efficiently; if you're interested in home cinema, you want shortest possible video pathways in your AV amps, and so on and so on and so on.

So there are all kinds of wonderful problems linked to this, and it's a - this subject is really only 50 years old, and there's going to be a lot more to come.

 

 

©Professor William Shaw, Gresham College, 25 April 2008

Part Three: 'Mathematics and Foreign Exchange' and 'When Computing Met Finance'

This is the third part of a study day. It includes the following talks:
   Introduction by Robin Wilson, Gresham Professor of Geometry
   Mathematics of Currency and Foreign Exchange by Professor Norman Biggs 
   When Computing Met Finance by Dr Dietmar Maringer

Listen to the lecture

Download the video

Download the audio

Transcript of the lecture

MATHEMATICS OF CURRENCY AND FOREIGN EXCHANGE

 

Norman Biggs

 

I have to say that when I set out to prepare this talk, the material that I'm going to actually present today I imagined would be approximately a quarter of what I was going to talk about, but I got into this subject and I found that I was really rather dissatisfied with the conventional wisdom on some of these things, so what I'm going to talk about is - it says there - arithmetic at the end of the 13th Century, so this is definitely medieval, and I apologise for not going into the later developments in foreign exchange, which have a great deal of mathematical interest, but in a different way.

So the talk's actually going to be in three parts, as usual.  I'm going to start by talking a little bit about the financial, commercial background; I then want to talk a little bit about the technology, as it would be said, that is the arithmetical tools that were available; and then I want to see what you can deduce by putting them together.

So we begin with some money. If you were a Viking, at the beginning of the 10th Century, this is what you would have understood by money.  This is the Cuerdale Hoard, which was found - Cuerdale is in Lancashire, and all the objects are bits of silver - we now call them hacksilver - and among them, there are some rather special bits of silver which we call coins.  They were not of course Viking coins; they were Saxon coins, Anglo-Saxon coins. The Vikings took them over and treated them as part of their money system.

So what do we mean by money?  For our purposes today, 19th Century analysis will suffice.  A modern economist would give a series of lectures at this point on the functions of money and its uses, but Jevons wrote a book in 1875, which was very popular and very influential, and he found it worthwhile to distinguish from the beginning between a medium of exchange, by which we mean the money objects, the coins, the hacksilver, and so forth, and the accounting units which are used to measure value. He believed that money arose mainly because these devices avoided the inconvenience of barter.  Modern economists would think of other uses for money, and many other kinds of money of course, but let's just think about barter for a moment.

Here we have a picture from the 11th Century, which was printed by Salzman under the title of "A Simple Bargain".  So, we think we know what's supposed to be happening here.  There's a chap who appears to have a chicken, and a chap who appears to have something else, and they're trying to barter, but it's not as simple as that.  First of all, the something else looks a bit like an old sock, which wouldn't actually be much good for barter, and then, if you look a little more closely, you'll see that the chicken man has in his hand a coin, in other words money, but it's not clear whether he's received the money from the sock man or whether he's actually giving the money to the sock man.  I've looked at the original of this, or at least what Salzman says about it, and it's still totally unclear what this is supposed to represent.  Those of you who get the BSHM Bulletin will have read a paper by John, from the Open University, John Mason, who explains how barter came to cover an enormously complex operation in the later Middle Ages, with the intervention of money in all forms, money and credit, whether it was ready money or not.  So barter wasn't all that simple. 

Coins, however, did simplify the situation, but in order to use coins in trade, you've got to know certain things about them.  In medieval times, a coin was worth, that is, it was accounted for, in terms of the precious metal that it contained, and you could only measure that by actually doing two things.  You first of all had to ascertain the mass, which you could do by weighing, but then you had to work out the fineness, in other words, what proportion of the mass was actually the precious metal.  That second thing, the assaying, or finding the fineness, was rather difficult.  There are methods known, and have been known since antiquity, for assaying both silver and gold, but they are not something you can do on the spur of the moment.  There is a touchstone method, but it's extremely unreliable.

So weighing was normal in trade, and here is a rather depressing picture.  These are some poor slaves being sold, but you can see that the essence of the transaction is that the coins that are being paid for the slaves have to be weighed, and so they are not accounted for by the number of the coins - it is the actual amount of silver or gold that is concerned.

Some improvement came along in Western Europe, Christian Europe, when gold coins were reintroduced.  There had been gold coinage in Roman times, and gold coinage was reintroduced in Islam of course, where gold was in supply from Africa, but it wasn't reintroduced into Western Europe until the 12th and 13th Centuries.

So here we have the four representative coins from that time: starting off with the typical Islamic dinar; and then a coin, Castilian coin, imitating the dinar, the second one, obviously the same design, but with a different kind of lettering; and then we get into the more typical European-type coins, something called an Augustale of Frederick II; and then, playing a large part in the rest of this talk will be the florin, which was the Florentine coin which began to be minted in 1252.

Because there were a lot of these things coming into circulation, it was necessary to know the fineness of the coins.  I've mentioned the weight.  We assume that a merchant could ascertain the weight of an individual coin, but in order to ascertain its fineness, he either had to do a very complicated operation, which was beyond the bounds of practicality, or refer to some reference.  Here we have the list made by Pegolotti, a Florentine merchant, which was compiled around 1300, and these are just some of the gold coins that were in circulation at that time.  I've put little marks against the four that were actually depicted on the previous slide. 

So the first one is the florin, and the Florentines claimed that this was 24 carats fine, so you see something there which says the number 24 is the crucial one - 24 meant pure, so everything was divided into 24 parts, and instead of a percentage, which of course they didn't have, it was how many parts out of 24 were gold.  The Florentines believed that their florin was pure gold.  If you look down, you'll see that the Castilian coin was 23¾ carats fine; and the Allessandrian bisant, the Islamic coin, was believed to be 23½; and the poor old Augustale, or Agostantini as it's called here, was only 20 and something, 20 and a bit, carats fine.    In a transaction of course, large sums of these things were being used, and it was necessary to know this fineness.
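The carat reckoning can be made concrete with a short sketch. The fineness figures are the ones quoted above from Pegolotti's list; the payment of 100 units of mass is hypothetical, and the Augustale is left out because the talk gives its fineness only approximately.

```python
from fractions import Fraction

CARATS_PURE = 24  # 24 carats meant pure gold

def fine_gold(mass, carats):
    """Mass of pure gold in coin of the given mass and fineness in carats."""
    return mass * Fraction(carats) / CARATS_PURE

# Fineness figures as quoted in the talk from Pegolotti's list.
fineness = {
    "florin": Fraction(24),
    "Castilian coin": Fraction(95, 4),        # 23 3/4 carats
    "Allessandrian bisant": Fraction(47, 2),  # 23 1/2 carats
}

# Hypothetical payment: 100 units of mass in each type of coin.
for name, carats in fineness.items():
    print(name, fine_gold(Fraction(100), carats))
```

Working in exact fractions rather than decimals mirrors the parts-out-of-24 reckoning, and shows why the differences in fineness mattered once large sums were involved.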

So that's the starting point.  Now we move on to the international situation.

So, if you want to send your wool to Florence from London, then, well, how are the Florentines going to pay for it?  They could send florins, but that would be a risky business, sending a large amount of money by ship, and so in the 13th Century, the mechanism of the bill of exchange grew up.  I'm going to just very briefly explain how that worked.  This is based on the account given by the eminent medieval historian, Peter Spufford, who has made a study of all these subjects.

So this is the situation for sending a shipload of wool from London to Florence, and how it's going to be paid for using the bill of exchange mechanism.  So there are the four parties, two of them in London and two of them in Florence.  On the left-hand side, they're the principals if you like, and on the right-hand side, there are the bankers.  What happens is that the agent in London sends the shipload of wool to Florence, and various entries are made in the books to account for that.  The remitter, as it's called, in Florence then pays for the wool, but pays a banker, a drawer, in Florence, in Florentine currency, florins, and various accounting things are made.  In order to settle up, the banker then draws up the bill of exchange, which he sends back to the remitter, and also notifies an agent of his, the payer, in London, and the bill then goes back round the system, is presented by the payee in London to the payer, and the payer then pays the payee, this time in London money, sterling.  So the point is that, in order to carry out this transaction accurately, there is obviously going to be some mathematics involved, and the next step is to ask what technology, that is what mathematical tools, arithmetical tools, were available at this time in order to carry out such transactions.
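The sequence of steps can be laid out as a small sketch. The party names follow the description above; the value of the shipload (100 marks) and the day's rate (34 sterling pence to the florin) are hypothetical figures for illustration, and the 160 pence to the mark is the value given later in the talk.

```python
from fractions import Fraction

PENCE_PER_MARK = 160      # sterling pence in a mark, as stated later in the talk
pence_per_florin = 34     # hypothetical rate for the day
wool_price_marks = 100    # hypothetical value of the shipload

sterling_due = wool_price_marks * PENCE_PER_MARK
florins_due = Fraction(sterling_due, pence_per_florin)

# Each step: (where it happens, what happens, money handed over if any).
steps = [
    ("London", "payee ships the wool to the remitter in Florence", None),
    ("Florence", "remitter pays the drawer (banker)",
     f"{float(florins_due):.2f} florins"),
    ("Florence", "drawer writes the bill, hands it to the remitter, "
     "and notifies the payer", None),
    ("Florence to London", "remitter sends the bill to the payee", None),
    ("London", "payee presents the bill and the payer pays the payee",
     f"{sterling_due} sterling pence"),
]

for place, action, amount in steps:
    print(f"{place}: {action}" + (f" [{amount}]" if amount else ""))
```

The point the sketch brings out is that money changes hands only within each city; the only thing that travels, apart from the wool, is the bill.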

Now, I'm glad to say that Fenny Smith and I agree on several of the points I'm about to make, so excuse me if I'm repeating things that she said.  But we all know that you can't do sums with Roman numerals, and of course the Romans weren't stupid - they managed to manage their empire, large empire, for several hundred years without actually trying to do sums with Roman numerals.  For sums, they used the abacus, the calculus, the counters on a grid, and they became very adept at doing that.  The word "abacus" can be confusing because many people think of abacus as meaning some eastern object - sliding beads and so forth.  Abacus, in this context, means any grid or plan onto which the counters could be used.  The other problem with it is that of course the abacus would quite often be just scratched in the sand or drawn on a slate, and so very few of these things have survived, and the counters themselves, although there were undoubtedly lots of the counters, when they're dug up nowadays, the archaeologists tend to say that they were gaming counters or things of that kind, that they're quite often mis-classified.

There were other ways of doing arithmetic, and here's a particularly horrid one, moving into the medieval period now.  This comes from Bede, and this is finger reckoning. Although it's great fun, I cannot really believe that it was a very practical method of doing arithmetic, and it's really just there for interest.

However, when we come to the turn of the millennium, things begin to improve.  This is usually associated with the man who became Pope Sylvester II, who I shall pronounce as Gerbert - if anybody knows better, please tell me - who is credited with various steps in the improvement of arithmetic.  It's confusing, because not only is he credited with introducing an improved form of abacus, but he's also associated with the introduction of the Hindu-Arabic numerals.  So let's see what actually was available, and for this, I'm going to rely on two accounts, and I rely on these two because they are basically confirmatory - they both say roughly the same thing, although they don't appear to a great extent in the published accounts in the history of mathematics. 

The first one, this is a book by Florence Yeldham, published in 1926, called "The Story of Reckoning in the Middle Ages", which I would recommend.  She gives this manuscript from Ramsey Abbey, and she describes the accompanying instructions from the manuscript.  This is, if you like, an improved form of abacus.  The first thing to notice is you've got the Roman numerals heading the columns, so you've got...working from the left, you've got units, tens and hundreds, and then repeated again, units, tens and hundreds in the thousands, if you like, and so on up to...I think there are nine lots of the columns.  So that's Roman numerals.  But then, hidden amongst the arches at the top, you will see Hindu-Arabic numerals, and also, while we're talking about the writing, what's written along the bottom are in fact the Roman names for fractions, which indicates that division sums were being done here as well.  So, the other important thing, which is not apparent from this diagram, but which is confirmed by the descriptions, is that the counters that were used in Gerbert's abacus did in fact have the Hindu-Arabic numerals written upon them, and the calculations were done by moving these numbered counters around.
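The column layout can be modelled in a few lines. This is a sketch only: the grouping into units, tens and hundreds and the nine groups follow the description above, while the convention that an empty column simply holds no counter is an assumption made for the illustration.

```python
COLUMN_NAMES = ["units", "tens", "hundreds"]

def place_counters(number, groups=9):
    """Lay out `number` with at most one numbered counter per column,
    least significant column first. None marks an empty column (assumed)."""
    columns = []
    for position in range(3 * groups):
        number, digit = divmod(number, 10)
        label = f"{COLUMN_NAMES[position % 3]} of group {position // 3}"
        columns.append((label, digit if digit else None))
    return columns

def read_counters(columns):
    """Recover the number from the counters and their column positions."""
    return sum((digit or 0) * 10 ** i for i, (_, digit) in enumerate(columns))

layout = place_counters(1637)
for label, counter in layout[:6]:
    print(label, counter)
print(read_counters(layout))
```

Because the value of a counter comes from the column it sits in, the same procedures carry over directly to written numerals, which is the evolution described later in the talk.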

A similar account is given by Turchill, who was believed to be a clerk in the royal household, often said to be a clerk in the Exchequer.  That is slightly confusing.  The book which I use, rely on for this, is the account given in Poole's book on the Exchequer in the 12th Century.  Poole was of course mainly interested in the Exchequer, but the Exchequer is rather misleading, because although an abacus of this kind was undoubtedly used in the Exchequer, and indeed gave its name to the Exchequer - the chess board - it was a simpler kind of abacus for a quite different purpose.  It was for counting the sums of money that the sheriffs brought in from the counties, and it was only really...the sums that were done were only really additions and subtractions.  But Turchill I think must have been a clerk in a more senior position, if you like, and he wrote about the abacus in general, and his account tallies very much with the one that was given in the manuscript described by Yeldham.

So, what were the two distinctive features?  The grid was arranged in the columns, and the counters were labelled with Hindu-Arabic numerals.  This is - Gerbert was around the turn of the millennium.  In fact, I think he was Pope at the turn of the millennium.  These accounts are 1100, beginning of the 1100s, the 12th Century, so this is all quite some time before even Fibonacci, and my conclusion is that, in the higher realms of government, and perhaps in the large monastic estates, Gerbert's abacus, with all the frills, was well-known in the 12th Century.  Now, that shouldn't deflect us from noticing that of course older forms of the abacus, with plain counters, useful for doing additions and subtractions and accounting and so forth, were used by merchants and, as Fenny has said, we know that that carried on into I say the 17th Century - you may say the 18th Century.  People still used the jetons and the counters for doing simple arithmetic.  However, from a mathematical point of view, the real interest is that Gerbert's abacus evolved into a system where you weren't just moving the Hindu-Arabic numerals on the counters around, but actually using the same procedures, the same algorithms, for doing the calculations with the numerals themselves.  In other words, what we would often call pen-reckoning, but it doesn't have to be with a pen.  It could be traced in the sand with a stick, or it could be they were written on a slate, and for both of those reasons, it's unlikely that we'll find many extant examples of them, but nevertheless, my belief is that the Gerbert abacus and the pen-reckoning algorithms, using Hindu-Arabic numerals, were very similar, and in certain quarters, they were well-known by this time.

Here is one of the few examples of pen-reckoning, from 1320.  These are just long addition sums, but it's clear that that's using the sort of standard methods.  There are the carrying symbols and so forth, and the numerals are almost recognisable to a modern eye there, so you can check that out.

So, one reason why I think there's some confusion about what was going on, about the technology that was available, is that there are misleading clues.  One of the misleading clues is the oft-quoted fact that in 1299, the Florentine bankers' guild is said to have banned the use of the Hindu-Arabic numerals, and you will read this in a number of standard texts without really any further comment, but of course they weren't actually banning the use of the Hindu-Arabic numerals in the banks.  All they were saying was that you must carry on publishing your accounts in the traditional method, using the Roman numerals to record these things, so that everybody, or at least everybody who was interested, could actually understand them.  In other words, the general population who could understand Roman numerals could see what was going on.  They were not banning the use of Hindu-Arabic numerals in the banks, and in fact, as I say there, almost certainly, for about 100 years at that time, the Hindu-Arabic numerals had been in use.  And just to echo again something that Fenny said, we know that this led to the establishment of the schools in Florence and other Italian city states, where the children were taught - confusingly it says abaco, abaco and algorismo, but one shouldn't think of those as being distinct things. The algorithms for doing calculation are really what's concerned there.

I recommend this book by Alexander Murray, which I don't see quoted very much in the history of maths literature.  He approaches it from a medieval historian point of view, but in fact, he has several chapters which I find compelling and convincing on this subject.

Ah, there we go!  You've seen this one before, and like you, I think it's a source of confusion.  We have this smiling Boethius it's supposed to be, which of course is complete nonsense historically, doing his Hindu-Arabic sums there, algorithms; and we have poor old Pythagoras, looking very glum, trying to use an abacus, but he's using the plain counter sort of abacus and not enjoying it very much, by the look of it.  So this is to be thought of as a sort of modern Bostock and Chandler sort of book, which sort of tells you how to do things at a certain level but doesn't actually lead you on to the higher levels of thought.

Okay, well, let's see what happens when we try to put these two things together.

Now, there's a problem here, that in order to describe this succinctly, to a modern audience, I'm more or less forced to use modern notations, symbols and so forth.  We must not think that this was the way the medieval arithmeticians thought about it.  They didn't even have an equals sign - as we know, that was Robert Recorde 200 years later.  They certainly didn't have the idea of let x equal such-and-such, and then doing elementary algebra.  So it was all done in a more...verbal kind of way, and the arithmetics of the time would talk about the rule of three, for example, which to a modern eye, is just a simple proportion sum, but they would have a number of mechanisms for working out, if it was a is to b as c is to d, depending on which one was the unknown that you wanted to calculate, you had different rules and methods for doing it.  Okay, but I'm going to talk in these terms and hope that we can thereby elucidate what was going on, without overriding the mode of thinking that lay behind it.
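In modern terms, the rule of three is a one-line calculation. The sketch below solves a is to b as c is to d for the fourth term only; the function name and the cloth example are hypothetical, and the medieval arithmetics would have stated each case as a separate verbal rule.

```python
from fractions import Fraction

def rule_of_three(a, b, c):
    """Solve a : b = c : d for d, that is d = b * c / a."""
    return Fraction(b) * Fraction(c) / Fraction(a)

# Hypothetical example: if 3 ells of cloth cost 12 pence, what do 7 ells cost?
cost = rule_of_three(3, 12, 7)
print(cost)
```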

So, by 1300, of course there were many places, centres of population, where trade was flourishing.  Let's first of all think about a single place, x, and in this place - we'll go back to Jevons' distinction - there would be a money object, which I call Mx and there would be accounting units, which merchants would use to keep their books.  The relation between the two would be determined by the authorities in that place.  If it were London, even in earliest times, they were subject to the king for the coinage and so forth, so the king would say how much a money object that the king had issued ought to be accounted for, what it was worth, if you like, in accounting units.

Well, let's look at London.  So the accounting unit was the sterling penny, which is just an abstract object from this point of view.  It's an accounting unit.  Now, there was something called a penny, a coin, a money object, but it wouldn't necessarily be the same as an accounting unit.  It might have been clipped, and if you wanted to make sure you were getting your money's worth in any particular transaction, you might have to weigh out the silver coins and make sure you had the right weight of silver, and even then, you would be trusting that the coins were of the correct fineness, that is contained the correct amount of silver.  So for larger transactions actually, in London at this time, things were calculated in marks, and a mark was said to be worth 160 sterling pence, or if you prefer to use the next accounting unit, 12 of these sterling pence was an accounting unit called a shilling.  There was no coin called a shilling - that's something that's quite clear - no coin called a shilling until the 16th Century in this country, but the accounting unit of a shilling goes back much further.  So the mark is not a pound or a certain number of shillings; it's a certain number of pence - it's actually 13 shillings and 4 pence, a very odd amount - and when, in England, they started to mint gold coins, they tended to be actually in these proportions, nobles of 6 shillings and 8 pence, and things like that.
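The sterling figures can be checked with a short sketch. The 12 pence to the shilling and 160 pence to the mark are as stated above; the 20 shillings to the pound is the multiple mentioned later in the talk, and the function name is hypothetical.

```python
PENCE_PER_SHILLING = 12
SHILLINGS_PER_POUND = 20
PENCE_PER_MARK = 160

def pounds_shillings_pence(pence):
    """Break a whole number of pence into (pounds, shillings, pence)."""
    shillings, d = divmod(pence, PENCE_PER_SHILLING)
    pounds, s = divmod(shillings, SHILLINGS_PER_POUND)
    return pounds, s, d

print(pounds_shillings_pence(PENCE_PER_MARK))       # a mark: 13 shillings 4 pence
print(pounds_shillings_pence(PENCE_PER_MARK // 2))  # half a mark: 6 shillings 8 pence
```

The noble of 6 shillings and 8 pence is thus exactly half a mark, which is the proportion referred to above.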

In Florence, on the other hand, they had of course a different kind of penny - denari piccoli - small pennies, and they were small, and they were often not made of silver - well, the objects were not.  The accounting unit that they used, however, which was the abstract thing, was called the denari piccoli.  By this time - we're talking about 1300 - the gold florins were in use in Florence, and it was decreed that a gold florin was worth 348 of these accounting units - that's actually 29 times 12.  So the Florentine shilling was called a soldo, and a florin was 29 of those.

So suppose we're trying to send our shipload of wool from London to Florence.  Obviously, the transaction is going to be governed by what we would call an exchange rate, and I'm going to tell you how I see this from a modern viewpoint, and then we'll look at what the Florentine bankers wrote down about this.

So I would define the exchange rate as the number of florins that equals one mark; in other words, it's the relationship between money objects, actually coin - well, actually there wasn't a coin called a mark.  There might have been 160 pennies called a mark, but in fact, there might have been a weight of silver called a mark, or there might even be an ingot of silver called a mark.  The use of ingots is quite well documented.  So there's that exchange rate, which is the modern way of defining it, and of course there's the reciprocal one if you're going the other way, the number of marks that equals one florin, and so forth. Now, this is not a constant number, fixed for all time, because it depends on economic factors, which is a sort of catchall term for saying that if there was a shortage of gold or a shortage of silver, gold and silver are used for other things rather than making money objects, and therefore they have a value which is affected by shortages and so forth.  Other things as well can affect the exchange rate.  So this is variable, and if we're going to do any sums involving the exchange rate, we've got to have a range of exchange rates available so that we can work out which one to use on a particular occasion.

This is what Pegolotti did.  This is his table for the exchange between London and Florence, and it gives a range of values of a parameter alpha, and the corresponding values of a parameter beta.

So let's just look at one of the lines, focus on the one which has 34 in it, okay.  What that says is that when 34 sterling pennies go for a florin, then the mark, the sterling mark, goes for 6 lire, 16 soldi, and 5-and-11-seventeenths denari.  So each line is a statement of the form that I've written at the bottom.  You'll notice there's the confusion...not confusion, but at least the double use.  Some of the things here are the actual money objects that are going to be handed over, and some of them are the accounting units which are going to appear in the bankers' books.  So that's why it has to be written in that form.  So the exchange rate, e, that I had on the previous slide, has to be interpreted in terms of the Pegolotti table.  So Pegolotti is making a table of values of a parameter beta in terms of parameter alpha.
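The quoted line can be checked by direct reckoning. The sketch below uses only figures given in the talk: 160 sterling pence to the mark, 348 denari piccoli to the florin, 12 denari to the soldo and 20 soldi to the lira; the variable names are hypothetical.

```python
from fractions import Fraction

# The quoted line: at 34 sterling pence per florin,
# the mark goes for 6 lire, 16 soldi and 5 11/17 denari.
quoted_denari = (6 * 20 + 16) * 12 + Fraction(5) + Fraction(11, 17)

# Direct reckoning: a mark is 160 pence and a florin is 348 denari piccoli.
florins_per_mark = Fraction(160, 34)
reckoned_denari = florins_per_mark * 348

print(quoted_denari, reckoned_denari, quoted_denari == reckoned_denari)
```

The two figures agree exactly, so the awkward fraction in the table is simply what the division leaves over.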

Well, I think I've said one of these things already.  The first one is that, in order to avoid having large numbers of denari, you have these higher units, super units, of the soldi and the lire, which incidentally, are the same multiples, 12 and 20, as would have been used in London, except that in London, it was a sterling penny, a shilling, and a pound.  That's actually just a matter of convenience.  We've also noted that the two types of money, or two uses of money, are signified in the table, because when payments were made, money objects had to be involved, and when the books were kept, the accounting units were involved. 

Now, of course, we're already getting into the stage where we see new types of money being created, because the bill of exchange that I talked about is now becoming an object of money in itself, because it doesn't - if you'd got a bill of exchange which said somebody was going to pay you something, you didn't actually have to present it for payment yourself, you could try trading it on to somewhere else.  So we already begin to see this layer upon layer of financial operations beginning to take place.

Here are the sums, and once again the caveat about this is not the way a Florentine banker would have done it, but it tells us in modern terms what the sums were, and there's some point in doing that.  So, first of all, if the alpha in Pegolotti's table was the number of sterling pence that made a florin, then we can translate that into the number of marks that make a florin, and in terms of the e that I had before, that was the reciprocal, one over e.  Similarly, the beta, the answer that comes out - so the beta thing, to remind you, is the number of Florentine accounting units, denari piccoli, that are equivalent to a mark.  That comes out to be e times vf, vf being the 348 number, the number of denari piccoli that make a florin, and eliminating, doing the algebra, which of course medieval arithmeticians didn't do this way, you see the relationship between beta and alpha straightaway.  So in order to make up his table, Pegolotti had done this sum, and that's why those strange fractions came into it.  He had divided vl times vf by alpha - I've done this in general because the other thing perhaps to keep at the back of our minds is the fact that exchange was not just between two places. There was a whole network of places between which exchange could take place, and so we would make suitable substitutions for London and Florence, and we'd get the exchange between Florence and Bruges and places like that, for example.  So, that's what the table is.  It's a table basically of division sums.
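Written out in modern notation, the elimination runs as follows. The slide itself is not reproduced in the transcript, so this is a reconstruction from the spoken description: e is the number of florins equal to one mark, v_L = 160 is the number of sterling pence in a mark, v_F = 348 is the number of denari piccoli in a florin, and alpha and beta are the two parameters in Pegolotti's table.

```latex
\alpha = \frac{v_L}{e}, \qquad \beta = e\, v_F
\quad\Longrightarrow\quad
\beta = \frac{v_L\, v_F}{\alpha},
\qquad\text{for example}\quad
\beta = \frac{160 \times 348}{34} = 1637\tfrac{11}{17}\ \text{denari}
      = 6\ \text{lire}\ 16\ \text{soldi}\ 5\tfrac{11}{17}\ \text{denari}.
```

The worked value is the line of the table for alpha equal to 34 discussed earlier.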

But now, we can begin to see a little bit more what's going on - hopefully this has thrown some light on that.  So, if you're a clerk in the bank in Florence, what you want to know is how you're going to account for the shipload of wool, which is supposed to be worth so-many marks, and what you're going to put in your books, and the answer to that...  Well, no, first of all, the answer to that depends upon what is the exchange on [London], what is today's exchange rate, the parameter alpha.  So Pegolotti's table told you that if alpha was 34, then beta was whatever it was, okay, and in order to get the accounting answer that the clerk required, he would have to multiply the number of marks by the value of beta, the value of beta that corresponded to the given alpha.  So the clerks were doing multiplication sums - relatively simple, probably could be done with the abacus, the unnumbered abacus, the abacus with plain counters.  Certainly, the later books tell us how we could have done that kind of thing.  But in order to draw up the table, either Pegolotti, or the person who calculated it for him, had to do a division sum, and quite a complicated division sum.  He had to divide this number by alpha in order to obtain the correct value of beta.  So, this seems to me to make it very clear that even by this time, there were two different levels - perhaps more than two, but anyway, at least two different levels of expertise in arithmetic.  You had the level of Bostock and Chandler calculus, where people can do the sums - in other words, when you know that the derivative of x squared is 2x or whatever it is - and you had some other people who really understood a bit more and who could do more complicated calculations.  
The second group of people, who would be, if you like, in the back room at the bank, would by this time probably not be using Gerbert's abacus, but actually using the Hindu-Arabic numerals, pen-reckoning and so forth, but their expertise, their methodology, had developed from the Gerbert version.
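The two levels of work can be set side by side in a short sketch: the division sums that build the table, and the multiplication the clerk does with it. The figures 160 and 348 come from the talk; the range of alpha, the shipload of 50 marks and the function names are hypothetical.

```python
from fractions import Fraction

V_L = 160   # sterling pence per mark
V_F = 348   # denari piccoli per florin

def beta(alpha):
    """Denari piccoli per mark when alpha sterling pence go for a florin."""
    return Fraction(V_L * V_F, alpha)

def lire_soldi_denari(denari):
    """Break an exact amount in denari into (lire, soldi, denari)."""
    whole, fraction = divmod(denari, 1)
    soldi, d = divmod(whole, 12)
    lire, s = divmod(soldi, 20)
    return int(lire), int(s), d + fraction

# Back room: the division sums that make up the table (hypothetical range).
table = {alpha: beta(alpha) for alpha in range(32, 37)}

# Front desk: one multiplication for a shipload worth a hypothetical 50 marks.
marks = 50
entry = lire_soldi_denari(marks * table[34])

for alpha, value in table.items():
    print(alpha, lire_soldi_denari(value))
print(entry)
```

Once the table exists, the clerk never needs to divide; all the hard arithmetic has been done in advance by whoever drew it up.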

So, to round off...  There are a large number of conclusions, and if you go to that book by Alexander Murray, which I mentioned, you will find a very useful attempt to put all this in the context of the growth of numerate thinking in the later Middle Ages, asking such questions as to what extent the growth in mathematics - Cardano, Tartaglia and so forth, the solution of algebraic equations - to what extent that depended upon the base that had been established earlier on for commercial reasons.  Of course we shouldn't think that commercial arithmetic was the only stimulus to the development of arithmetic.  Astronomy was equally important, and the calculations that had to be done by the astronomers would have used, I believe, similar sophisticated methods and certainly weren't done with the plain counters that the merchants used. 

So, well, let's just, as I say, summarise what's here.  Different levels of expertise, and it's not always apparent from the evidence that's in front of us.  The evidence can be unbalanced because we have the printed books from the 16th Century, which are telling us all about how merchants can do their calculations and so forth, and there's a lot of that.  Evidence for the pen-reckoning method is harder to come by because it was essentially ephemeral, and whether it was done by scratching the figures in the sand or by writing them on slates, it tended to disappear.

And then finally, the generalisation about the numerate thinking, that there is in fact evidence that the commercial considerations were important in developing numerate thinking, possibly only indirectly, however, in that the abacus schools and so forth led to generations of Italians in particular, but other countries as well, where numerate thinking and so forth was part of the training.  There were some things that didn't happen, and it's always...the dog in the night-time is always an interesting speculation.  It doesn't seem that there was any serious progress in the concept of number at this stage.  You might think that, given the fact that the exchange rate, for example, small changes in the exchange rate could lead to quite significant differences in the way the sums came out, you might think that that would lead to the idea of the number as a continuum, but in fact, the standard method throughout the Middle Ages of dealing with smaller quantities was to invent smaller units, so you had farthings, and then, at one point, they invented something called a mite, which was a fictitious thing equal to one twenty-fourth of a penny, but there was no idea of what would lead to the decimal notation, that is, having tenths, and then tenths of tenths, and tenths of tenths of tenths and so on, having the same multiplier over and over again.  Although that was, in the Hindu-Arabic system, used for multiples, it didn't seem to come in for fractions, sub-multiples, until the 16th Century, as we know, and without that, it's hard to lead on to the idea of the number continuum and the infinite decimals and so forth which you require to deal with that, and of course the calculus itself, which requires the notion of small change to be codified in some way.

So, I hope that that's shed a little light on the topic.  As I say, I was heartened by the fact that your talk had come to similar conclusions in some of these aspects, and I'd be interested to hear any comments that people have.  Thank you.

©Norman Biggs, Gresham College, 25 April 2008

When Computing Met Finance

Dietmar Maringer

Good afternoon.  I think my talk differs a bit from all the other presentations in at least two respects: for one, in computational finance, we usually don't think in centuries, we think in terms of years and decades, and this already is quite a long time.  One of my students came to see me the other day, and he asked me what do I think about this, his words, "ancient book", and he gave me a book from 1991!  So we have some sort of different idea of what is a long span in computing, in particular computing and finance.  Obviously, computing in itself has been around for quite some time, but in finance, it's a little bit tricky, because - this is probably the second thing which is different to some of the other topics - computing and finance, it's sort of a strange love affair, because it's one of the things where, initially, neither of them admits, yes, we do have common interests, and once you can no longer hide it, everyone says, oh obviously, what's your problem, it's always been joint interests!  This is exactly what's happened in computational finance.  So, for mathematical finance, for example, there are clear papers and clear dates where you can see, now, this is where Black Scholes came up with their idea, but in computational finance, sometimes it's difficult to pinpoint when something changed.  One of the few areas where you actually can pinpoint something is when you look at institutional aspects. 

So traditionally, stock markets worked in a rather market-type way, as you would expect of a market.  People gather, some of them want to buy, some of them want to sell, sometimes the roles change in between, and a traditional case for a stock market was something like an open outcry.  So people, like in the picture, meet on a marketplace, or on a trading floor, and they just shout what they want to do, shout the prices and the market maker or they themselves find out what the prices are.

Sometime in the 1970s and '80s, stock markets switched over to electronic markets, and this was something obvious when you look at different marketplaces.  The first one was in New York, the NASDAQ, which opened its doors in 1971, and it was, from day one onwards, an electronic market.  The London Stock Exchange closed its trading floor permanently in the early 1990s.  There had been a parallel system around for two or three years, but in 1992, they went electronic.  Strangely enough, in Switzerland, the revolution started slightly earlier.  In the 1960s, there was the Swiss...the stock exchange [?].  In the 1980s already, they had very strong computer support, and in 1996, which is later than London, they also switched to an electronic market.  New York followed only last year, where they currently have this hybrid market.  So you can have both things, and you still have the trading bell which opens and closes the market obviously, and where still you have your people meeting in the room.  So, when it comes to institutional aspects, there are some clear dates you can attach to certain things.

Another thing which probably made a difference to finance, as far as computing is concerned, is the advent of the internet, which had an impact on two levels.  For one, it provided information.  So in the early 1990s, an internet browser looked something like this, where you had your very basic structure.  You had your hyperlinks, and the nice thing was you, yourself, in particular if you work in the proper institution, could provide information which is open to everyone who has access to the internet.

So along came Bloomberg.  This is a screenshot from 1996.  Unfortunately, a couple of the pictures are missing, but providing information was crucial in those days and had a real impact on trading behaviour. 

Same for Reuters, and what I like about this screenshot, if you have a very close look - probably you can't read it, down here, it says "text version only".  Those were the days where you really struggled with your broadband connection because broadband didn't exist as such, so you had very slow connections and you only got text information and you actually could choose whether you want to have pictures on it or not, or if you just want to have the text literally.

But the internet also has another, or had another impact: people started trading over the net.  So it was not just professionals, but it was also the average man in the street who could trade him or herself using the internet.  So Ameritrade - again, this is a screenshot from 1996 - were amongst the first ones to provide these sort of services, and they prided themselves on this front page that they have over 300 million in assets.  Nowadays, no one would be really impressed with this sort of number, but in those days, it was quite a big deal.  Eventually, they also provided research tools.  They registered their slogan "Believe in yourself."  If you look at the date, this was late-1990s.  After the burst of the internet bubble, this slogan was nowhere visible, but they still provided this information, this cheap, relatively cheap access, so you could trade for $8 per trade, provided you trade more than 10,000 stocks, but still, it was reasonably cheap in those days, because in those days, you had large margins.  This is one of the aspects where computing really made a difference.

Just one last example, ICAP originated as a merger between companies in 1998, and now are London-based, next to Liverpool Street, and as far as I understand, are currently the largest internet brokers worldwide, and they're still based here in London.

So the technical revolution, and to some extent, the internet revolution, had an obvious impact on finance directly.  Where it was less obvious that there was actually an impact, and when it really got started, is with all the other aspects.  So one of the aspects currently being a big deal is automated trading.

Automated trading means you don't have a human trader giving a buy or sell order but you have a machine giving a buy or sell order.  Now, a couple of years ago, it was, allegedly, roughly 10% of the volume traded on the London Stock Exchange based on orders by algorithms or by computers.  Two years ago, it was 30%, last year it was 40%, and for 2008, they estimate 60% plus.  So automated trading has become a big deal, and behind most of these systems stand more or less sophisticated trading algorithms.  Some of them are more or less straightforward, some of them are less straightforward, but they do have a major impact on finance, on how stock prices behave, and on how markets behave obviously.  So the idea is, with all this automated trading generated by machines and generated by computers, that these buy and sell orders are given by machines, and these machines follow certain algorithms, follow certain rules. 

The reasons why people use this sort of automated trading are manifold.  The first one is arbitrage.  Arbitrage means you can make money for nothing.  We already had this example in a previous talk today.  As you might gather from my accent, I'm not British, I'm Austrian.  We have Euros, so if I come over, exchange my Euros to British Pounds, and immediately exchange them back to Euros in a different country without any temporal delay, and I'm left with more Euros than I started off with, then this would be a case of arbitrage, and this obviously must not exist.  There are very straightforward relationships, for example, between exchange rates and limits between exchange rates, give or take transaction costs, which must not be violated, and machines are very quick in spotting these disequilibria.  So machines can be used in automated trading to exploit arbitrage situations, but then, as a consequence, they have an impact on the price, they drive the price back into equilibrium, and the arbitrage opportunity vanishes.
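
The round trip described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names, the two exchange-rate quotes and the proportional transaction cost are all hypothetical and do not come from the talk.

```python
def round_trip_profit(amount_eur, eur_to_gbp, gbp_to_eur, cost_rate=0.0):
    """Euros gained (or lost) after converting EUR -> GBP -> EUR.

    cost_rate is a hypothetical proportional transaction cost
    charged on each of the two conversions.
    """
    gbp = amount_eur * eur_to_gbp * (1.0 - cost_rate)
    eur_back = gbp * gbp_to_eur * (1.0 - cost_rate)
    return eur_back - amount_eur


def is_arbitrage(eur_to_gbp, gbp_to_eur, cost_rate=0.0):
    """True if the round trip leaves more Euros than it started with."""
    return round_trip_profit(1.0, eur_to_gbp, gbp_to_eur, cost_rate) > 0.0


# Hypothetical quotes from two different places at the same instant.
print(is_arbitrage(0.80, 1.26))          # quotes are inconsistent
print(is_arbitrage(0.80, 1.26, 0.005))   # costs eat the small gap
print(is_arbitrage(0.80, 1.24))          # no gap to exploit
```

A machine running a check of this kind on live quotes would trade whenever the first case shows up, and that trading itself pushes the quotes back into line.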

The next thing where automated trading is used is risk management and hedging. Hedging means you want to reduce the risk in a portfolio or in a single asset, or in any sort of financial investment, because you want to limit it and you want to build, literally, a hedge around it, and this quite often is done with options.  Now, we already had a very interesting talk about options and how option prices evolved and what the underlying assumptions are, and we already had a very detailed discussion of the Black Scholes equation. 

Now, this Black Scholes equation is one model to price a put option.  The example in the morning was about a call option, meaning I'm allowed to buy something, I have the right to buy something.  The put option is the equivalent on the selling side, so you have the right to sell a certain underlying asset at a specific point in time for a pre-specified price - the strike price.  Black and Scholes came up with...or published this result in the early 1970s, and they made a couple of quite realistic, or reasonable, assumptions: that this geometric Brownian motion is a process good enough to describe what the process of the underlying stock is, and, just to keep things simple, that we don't have a dividend until maturity.

Now, actually, the usual thing is for stocks to pay at least one dividend a year, so a couple of years later, Black came along and suggested a slightly modified version of the Black Scholes equation, where he deals with the case that you have a European put and you have one dividend until maturity.  What does it do?  It simply corrects for it.  He assumes the dividend payment is safe, so he just discounts it, splits it off the stock price, and leaves the remainder as the new stock price.  Very clever idea...

But it's still a European put, meaning we are still only allowed to exercise at this one specific point in time.  Now, if you have a put option, then if you look closely at the price and price behaviour, there's actually some point in time where you would be quite happy if you could sell the underlying right away and didn't have to wait until maturity.  So for puts, unlike for calls, it is the case that sometimes you actually want to exercise prematurely.  The only trouble is things suddenly become a little bit more complicated, and MacMillan eventually solved the problem and suggested this equation to price an American put - again, assuming no dividend on the underlying.

In the previous talk, Schachermayer was quoted in one instance.  Schachermayer in those days was working in Vienna.  Vienna was an interesting place to live and to work in, particularly in those days, and I was fortunate enough to work at the same department as Schachermayer with someone called Fischer.  Fischer, in those days, also was working on option pricing, and he extended the model and introduced the case that you actually have one dividend until maturity, and this is the option price and these are the parameters that go in.  So I think you get the idea: you can easily grow and grow and grow the complexity of this product, and with it, you can easily also grow the complexity of computing the result.  If you have a closer look at this equation, you notice that, here, we already have a bivariate normal distribution.

We also derived a different pricing model for credit risk - tricky to tell nowadays, but those were the days! - credit risk where we wanted to price guarantees on loans.  The idea was, because, in those days, option pricing theory was the topic to look at, we used results from option pricing theory, so what we did is we had a model where we priced it as an option, on an option, on an option, on an option, and so on and so forth.  For every point in time where you have to pay your interest, or when your loan is due, you introduce one additional option, because that is one point in time where something could happen.  So if you have one interest payment and one point where you pay interest and pay back your loan, you have an option on an option.  If you had two interest payments plus redemption, you had option on option on option.  If you had...you get the idea!  The problem was, for every additional option, you have got one additional dimension in your normal distribution.

Now, solving this problem, again, the equation got longer and longer.  Solving it numerically, really number-crunching a problem like this, meant that if you had a four-dimensional normal distribution in those days, basically, you pushed a button.  If you had a five-dimensional one, you had time enough to get yourself a coffee; a six-dimensional one, you could wait over the weekend; a seven-dimensional normal distribution, it took substantially longer; eight-dimensional, we estimated roughly 10,000 years!  Because the computational complexity explodes, and this is already one of the crucial things about computing: it's not good enough to have faster machines, because what's the good of a machine that is 10 times as fast?  What's the difference between 10,000 years and 1,000 years?  If you do, in particular now, this high frequency finance, it's simply not working, so eventually you have to come up with more sophisticated algorithms which circumvent the problem in itself, or eventually, you just draw a line and say, now, that's the limit of complexity we can deal with.  So in actual fact, these sort of modelling approaches eventually came to a halt.

There were a couple of alternative option types.  There were Bermudan options, because Bermuda is right in between Europe and America, and if Europe is one point in time that you can exercise, and America is any point in time that you can exercise, then obviously Bermuda is a good name for a type where you have a mixture.  So you have some window in time where you can exercise.  There are other exotic options, with all sorts of fancy...things: when you can exercise, how the exercise price is actually computed or predetermined or found out, or where you have a situation that if you hit a barrier once during the time to maturity, then it's good enough and you don't have to hit it on the expiration day.  Many alternatives - also now we know that CDOs and CDO-squareds, which were one of the ingredients for the credit crunch and the whole crisis recently, they all gave us quite a sort of a headache.  Unfortunately, we can't use all the beautiful mathematics because we don't necessarily get to a closed form solution.  And the next thing we also have to keep in mind, following Black Scholes, in option pricing, quite often we make the assumption that we really do have this geometric Brownian motion, which ideally actually we should have.  Unfortunately, stock markets do not behave accordingly.

Now, this is a distribution of the daily returns of the Dow Jones over a quarter of a century.  Those of you who work in statistics might recognise that this one looks similar to a normal distribution, which is one of the ingredients for the geometric Brownian motion, but it's not really a normal distribution because it's too slim.  If you have a very close look, you'll find a couple of outliers, and these outliers should have happened about once in seven million years.  In actual fact, we had a dozen of them over 25 years.  So they happened with way too high a probability, and this is why, when it actually comes to solving these option pricing problems, Monte Carlo simulation is used.  So the idea is you use simulations of the underlying stock paths, you find out what the option would be worth if this really is the outcome, you do this over and over and over and over again, and then eventually you get an idea of the distribution of the terminal price, for example, of the option, and then you get an idea of what this thing should be worth today, because there you have much more...much more sort of flexibility in designing the underlying - you can have as many dividends as you want.  The problem is we never know how good we are with this sort of simulation, so it's always a good idea to...  And people in mathematical finance are still looking very hard into option pricing theory and into writing models for this, which in computational finance obviously are always like gold dust, and they are the...the margins we would like to hit.
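
The simulation idea can be sketched as follows.  This is a minimal illustration, not the speaker's own implementation: it assumes a European put with no dividends and an underlying that follows geometric Brownian motion, so that the simulated value can be compared with the standard Black Scholes closed form.  All function names and parameter values are hypothetical.

```python
import math
import random


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_put(s0, strike, rate, sigma, maturity):
    """Closed-form European put price, no dividends."""
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    d2 = d1 - sigma * math.sqrt(maturity)
    return strike * math.exp(-rate * maturity) * norm_cdf(-d2) - s0 * norm_cdf(-d1)


def monte_carlo_put(s0, strike, rate, sigma, maturity, n_paths=100_000, seed=0):
    """Average discounted put payoff over simulated terminal prices."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        # One simulated terminal stock price under geometric Brownian motion.
        terminal = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(strike - terminal, 0.0)
    return math.exp(-rate * maturity) * total / n_paths


# Hypothetical contract: spot 100, strike 100, 5% rate, 20% volatility, 1 year.
print(black_scholes_put(100.0, 100.0, 0.05, 0.2, 1.0))
print(monte_carlo_put(100.0, 100.0, 0.05, 0.2, 1.0))
```

The flexibility mentioned above comes from the line that draws the terminal price: it could be replaced by a fatter-tailed return model or by a path with dividend dates, at which point no closed form is available for comparison and the accuracy of the simulation has to be judged in other ways.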

Nonetheless, this whole theory can be used for automated trading, and this was actually one of the first applications in automated trading.  If you remember the previous slide, the first application was arbitrage, and the second one was hedging.  Now, one of the main things with options is that their price is really driven by the price of the underlying.  So if we have the right to buy something for a specific price, then obviously this right is more valuable if the underlying is more valuable.  So if the price of the underlying goes up, then this buying option increases in value.  At the same time, the right to sell the underlying decreases in value.  So the put has exactly the mirrored hockey stick we saw in the morning in the call pricing problem.  The nice thing about the approach by Black and Scholes was that they take this into account, and their model tells you exactly the change in the put option price given that the underlying changes, and this thing is called delta.  That's the first derivative of the put price with respect to the underlying's price.  This is actually a quite helpful thing because if you know that if your stock price drops by one pound, your put goes up by, say, 50p, then what do you do?  You buy two puts and one stock, and the price movements offset each other.  That's the idea of hedging, as simple as that.
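
The hedging arithmetic can be sketched like this.  It is an illustration only: the put delta uses the standard Black Scholes expression for a European put without dividends, and the function names and the sample contract are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def put_delta(spot, strike, rate, sigma, maturity):
    """Black Scholes delta of a European put (always between -1 and 0)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    return norm_cdf(d1) - 1.0


def puts_per_share(delta):
    """Number of puts needed to offset the price moves of one share."""
    return -1.0 / delta


def hedged_change(stock_move, delta, n_puts):
    """First-order change in value of one share plus n_puts puts."""
    return stock_move + n_puts * delta * stock_move


# The example from the talk: the stock drops by 1, each put gains 0.50.
n = puts_per_share(-0.5)
print(n, hedged_change(-1.0, -0.5, n))

# Delta of a hypothetical at-the-money put with one year to maturity.
print(put_delta(100.0, 100.0, 0.05, 0.2, 1.0))
```

Because delta changes as the price and the time to maturity change, the number of puts has to be recomputed again and again, which is why this kind of hedge was a natural candidate for automation.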

Obviously, if you look at the graphs, the delta, since it's the first derivative, is the slope of a tangent on any of these lines - they just differ in terms of time to maturity - so you get different deltas.  But nonetheless, that's the way it works, and that's the underlying idea of this whole thing, of this no arbitrage condition, so actually the circle closes.  Unfortunately, it does not always work as nicely as we would like to see it work.

This is 1987, and the main thing happened in mid-October, which really gave us all a headache, and what you can see is this big jump, a downward movement in the price, and then obviously one of the main assumptions in the Black Scholes model is violated, because we no longer have a continuous price process, we have a jump downwards, and even worse, thanks to the joint effort of mathematical finance and computational finance, this...downward movement was accelerated because, in those days, everyone believed in Black Scholes, and all these hedging strategies had hard-wired the delta in their automated trading systems.  So if they wanted to insure against drops in the underlying prices, they [justified] buying or selling signals of the corresponding options.  Eventually, the market runs out of liquidity, and eventually, it's just a vicious circle.  Now, I'm not suggesting that this is the only ingredient for this...event.  There were a couple of additional things going on because, at one point in time, trading was stopped, liquidity was an issue, but one of the ingredients really was the automated trading which was not done properly.

I think we had one question in the morning - what happens if everyone uses the same optimisation technique?  That's exactly the problem - was the problem in those days: they all had the same hedging strategy.  So the good thing is, by now, we have learned our lessons, and now these sort of things should happen no more because we know what drove these sort of events, or at least accelerated them, and I will come to that in a minute, how to actually overcome the problem.

The other aspects in computational finance and automated trading are you want to have superior predictions, and...yep, automated trading is a self-fulfilling prophecy in itself.

So the next thing we might be interested in is that we want to have superior predictions.  Again, this is one of the areas where finance, at some point in time, looked over to what computer science does, and one of the areas they looked into was artificial intelligence.  Artificial intelligence has been around for a couple of decades by now.  It's, again, difficult to officially mark the day in your calendar when its birthday is, but in the early 1900s, cybernetics was setting out - 1920s was one of the century's...  Artificial intelligence itself, the term was coined in 1958.  Now, this is sometimes quoted as the birth of artificial intelligence, but in those days, people had different ideas about what is an intelligent thing.  So for quite some time, having a computer programme that can play chess was the ultimate thing to achieve in computational intelligence.  Having a system that can do mathematical logic was the ultimate thing to do in artificial intelligence.

So one of the fathers of artificial intelligence, John McCarthy, who actually coined this phrase, was working in mathematical logic and on how to use computer systems to do mathematical logic.  Alan Turing, then working in Cambridge, was also interested in what actually is intelligence, so he came up with what's now called the Turing test, which, in those days, was a criterion on whether something is intelligent or not.  His suggestion was, if you "speak" (inverted commas) to this machine, which is sitting behind a curtain, and you cannot tell whether the answer is coming from a machine or from a person, then it must be intelligent.

A couple of years later, Joseph Weizenbaum, then at MIT if I'm not mistaken, wrote a nice little computer programme called ELIZA, which did exactly the same thing, and the story has it that he showed this programme to his secretary and asked her to play around with it, and she actually did, and eventually, he came back, wanted to ask her how she was getting along, and she stopped him and said, "Oh, don't interrupt me, this is personal!"  What the thing actually did is it had a couple of buzzwords, so if it didn't recognise any of the words, it just said, "Tell me more about it".  This was obviously very much encouraging for a person to keep on typing.  If it found certain words which resembled something in its list, like "holiday", then it made statements like, "Oh, that must be pleasant," or something like this.  So very simple rules, and it actually passed the Turing test, at least when applied by this one person.  So this idea of what is intelligence has been a problem ever since, and we still haven't found a clear definition of what is intelligence, because whenever you come up with a definition, eventually you reach this hurdle, and then, oh no, it's not really intelligent - we must make it tougher, because it's just bits and bytes inside the machine.

But one of the crucial points in artificial intelligence was yet another PhD thesis, because we had Bachelier today, we had Markowitz today, all PhD theses.  Marvin Minsky also wrote a PhD thesis, and he introduced something which is, by now, one of the standard methods, neural networks, and I'll talk about this in a minute.  In those days, obviously, he hadn't...a machine like this.  He used 3,000 vacuum tubes to simulate a net of 40 neurons.  Meanwhile, and similar to Bachelier, he also faced some sort of criticism, because in his examination - he was doing a PhD in Mathematics - people struggled to acknowledge that what he was doing actually is mathematics.

Nowadays, we know it actually can be regarded as some form of non-linear regression, so it is, in one way or another, it is mathematics, but nowadays, artificial intelligence has moved on.  We now speak more of soft computing, because we have given up this idea that as long as it has mathematical logic inside and it's based on clear, well-defined rules, it is intelligent, as long as these rules are really clever.  Now, we have something like soft computing. 

We also have a new term which is called computational intelligence, no longer artificial intelligence.  I remember I went to a conference in Cardiff, a couple of years ago by now, which was one of the first conferences which actually had computational intelligence in its name, and no one actually knew what makes the difference between artificial intelligence and computational intelligence, so they had a competition, and every participant was asked to write a suggestion on a piece of paper and submit it to a ballot, and the winning suggestion was it's Welsh for artificial intelligence!

It's really difficult to tell what's the difference.  The main or the core of the definition is it uses computational methods.  So it's not the cleverness of the idea but you have a computer computing something which looks and smells like an intelligent being, but it's no longer claimed that the thing itself is intelligent - it just mimics, it simulates intelligent behaviour.  Again, this is not a 100% spot-on definition, so don't quote me on this, but this is the main idea.

Neural networks are probably the strongest bit of artificial intelligence that has made it into finance.  Now, how do neural networks work?  The idea is, or the story at least is, it mimics the brain cells.  You have some input.  If the input is strong enough, the cell triggers a signal itself.  So, small input, obviously too small, no reaction.  A slightly larger input, send it into the cell, the cell is activated, and it also sends a signal.  The thing is, you can have several inputs into one neuron that are just added up, and again, if the sum is strong enough, it activates, but you also can have nets of neurons, so not just one neuron, but neurons which are interconnected.  One neuron sends its signal into many neurons, and every neuron receives its input from several input sources, and eventually, neurons are sending to neurons.  That's the basic idea.
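
The firing rule just described can be sketched in Python.  This is a toy illustration under stated assumptions: the threshold of 1.0, the input values and the wiring of the small net are hypothetical, and the function names are not from the talk.

```python
def neuron(inputs, threshold=1.0):
    """Add up the inputs and fire (1.0) only if the sum is strong enough."""
    total = sum(inputs)
    return 1.0 if total >= threshold else 0.0


def net(inputs, connections, threshold=1.0):
    """A small net: hidden neurons feed one output neuron.

    connections[j] lists which inputs are wired into hidden neuron j.
    """
    hidden = [
        neuron([inputs[i] for i in sources], threshold)
        for sources in connections
    ]
    return neuron(hidden, threshold)


print(neuron([0.2]))        # input too small, no reaction
print(neuron([0.6, 0.6]))   # inputs add up, the neuron fires

# Three inputs, three hidden neurons, each wired to two of the inputs.
wiring = [[0, 1], [1, 2], [0, 2]]
print(net([0.6, 0.6, 0.1], wiring))
print(net([0.1, 0.1, 0.1], wiring))
```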

So if we simulate a net like this, then this is what we get.

If we had other inputs, the outputs would be different.

Now, what you can do if you use this artificial net of neurons is you can increase or decrease some of the inputs.  How do you do that?  You introduce weights for these links, and here, symbolised by a thick line, you multiply it by a factor of, say, 10.  If it's a thin line, then you multiply it by a weight of, say, 0.5.  So you artificially increase or decrease the signals, and what you want to do is you want to get an output which is as close as possible to what the output actually should be.  So what can you do?  Since it is some sort of regression, you can use this thing here to model stock prices, for example.  So what could you do?  You input the past history.  You input what the market currently does, and out comes a prediction for today's stock return.

How do you train a network?  You use old data, you play around - well, hopefully not play around, but you find your weights such that it would have worked as well as possible in the past.  Then you know you have a working network, at least on historic data, and then you apply it for a couple of days by feeding in new information and new inputs.  You can also use it for other sorts of things because, inside these neurons, you have activation functions: step functions, sigmoid functions, all sorts of different functions are possible, typically between 0 and 1, or minus 1 and plus 1.  You can also use it for binary decision making.  You can also use it for probabilistic predictions.  So these sort of methods became very popular in the 1980s, and in particular, in the 1990s, because by then, people had the computational CPU resources to actually train the networks and do all the data mining required to get sound results.
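
Finding weights that would have worked on past data can be sketched for the simplest possible case, a single neuron with a sigmoid activation trained by gradient descent.  This is an assumption-laden toy, not a trading model: the "historic" data set is invented, and the function names, learning rate and number of epochs are hypothetical.

```python
import math


def sigmoid(z):
    """Activation function with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Weighted sum of the inputs passed through the activation."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))


def train(samples, targets, epochs=2000, learning_rate=0.5):
    """Adjust weights step by step so past outputs match past targets."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in zip(samples, targets):
            error = predict(weights, bias, inputs) - target
            weights = [
                w - learning_rate * error * x for w, x in zip(weights, inputs)
            ]
            bias -= learning_rate * error
    return weights, bias


# Invented history: two yes/no signals and whether the next move was up.
history = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
went_up = [0.0, 1.0, 1.0, 1.0]

weights, bias = train(history, went_up)
for signals in history:
    # The output can be read as a probability, or rounded to a yes/no call.
    print(signals, round(predict(weights, bias, signals), 3))
```

A network that fits its training history this well says nothing yet about new data, which is why the trained weights are then applied to fresh inputs for only a limited period.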

So, in the 1990s, you suddenly found literally hundreds and probably thousands of applications of neural networks to all different sorts of financial problems.  They used it to predict bank failures, by using balance sheet information, or information about the customers.  These sort of methods or models worked quite well.  Others used it for exchange rate forecasting.  Yet other networks were used to spot trading signals, so past data were fed into the network, and out came a buy or sell signal for stocks, for foreign exchanges, and so on and so forth.  So this artificial intelligence side became very popular because, for some miraculous reason, it seemed to work.  From a theoretical point of view, this should not have been able to make any money, because if we believe in a geometric Brownian motion, if we believe in a normal distribution, prices shouldn't have a memory, and this is one of the crucial assumptions in all the underlying theoretical models.  But apparently, they do have some patterns, and nowadays, in one of the leading journals - probably the leading journal in finance is the Journal of Finance.  They used neural networks for this in the Journal of Finance, but they also use it to detect - and there are not too many, but a couple at least, papers on technical trading rules, because, to some extent, it still is a mystery why it actually works, because we are back to the original question - now, if it works, why doesn't everybody use it, and why doesn't the effect in itself vanish?  To some extent, the effect actually does vanish, so with neural networks, nowadays probably, you have a little bit of a hard time to really make money.  So nowadays, you have to come up with something more sophisticated.  But nonetheless, they are quite popular, and again, from a mathematical point of view, a statistical point of view, they're just one sort of non-linear regression and that's the way they actually can be treated.

Another thing which we in finance now use and that comes from artificial intelligence is evolutionary computation, which ticks more or less all the boxes to qualify for soft computing, because the idea with evolutionary computation is you don't pre-specify an awful lot of rules.  You just set up a rather vague system and let it evolve by itself over generations.  And what this does is it uses the principles of natural evolution, and one of the pioneering methods was the one suggested by John Holland, in the 1970s - if I'm not mistaken, yet another PhD thesis, or at least linked to it - where the idea was pretty similar to what we see in biology.  We have two parents, they mate, produce offspring, and the offspring inherits part of the properties of one parent and part of the properties of the second parent, and there's also mutation going on.  The main thing here is we don't have something like the DNA.  We have something even simpler - we only have a binary code.  If we have two parents, if parent one is 0011, and the next one is 1001, then what is done is you pick one random point, you cut the two genes into bits, and rearrange them.  If you want to have mutation on top of it, you pick one of the genes and randomly change it - or rather, not randomly change it, because in a binary world, changing means go from a 0 to a 1 and from a 1 to a 0, so it's pretty simple actually.  The good thing is, as simple as this might be, it works, because what you do is you start off with a so-called population of these strings.  You generate offspring, and you just check whether the offspring is better than one or two of the parents or one of the existing solutions.  If so, the chances are it will replace it; otherwise, it will not replace it.
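
As a rough illustration of the recombination and mutation steps just described, the following Python sketch applies one-point crossover to the two parent strings from the example and then flips a single bit.  The function names and the hand-picked cut point are assumptions made for this illustration; they are not taken from Holland's work or from the talk.

```python
import random


def crossover(parent_a, parent_b, cut):
    """One-point crossover: cut both bit strings at `cut` and swap the tails."""
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


def mutate(bits, position):
    """Flip one bit: a 0 becomes a 1 and a 1 becomes a 0."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]


# The two parents from the talk, cut after the second bit (cut chosen by hand here).
child_1, child_2 = crossover("0011", "1001", cut=2)

# In an actual run, both the cut point and the mutated position are drawn at random.
rng = random.Random(0)
mutant = mutate(child_1, rng.randrange(len(child_1)))
```

A full genetic algorithm would wrap these two operators in a loop that scores each offspring and lets better strings replace worse ones in the population.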

This is actually one of the...or referring to one of the publications in this Journal of Finance.  Blake LeBaron is one of the leading figures in this sort of application and also in artificial stock markets, yet another application of computational finance, where he provides a set of technical trading rules - moving average, head and shoulders, you name it - and the binary string represents whether one market participant uses a certain rule or does not use it.  So if the first rule is, for example, moving average, then this 0 indicates that this trader does not follow this rule, but follows the second rule, and not the third and fourth, but the fifth rule, and so on.

So we have one trader who might look like this.  We have another trader who has a different [chain], another one, and another one, and another one, and then they combine their rules, and then their performance is tested against their offspring's performance.  If the offspring generates a higher profit, then chances are the original ones are eliminated and the new ones survive, or the other way round.  The funny thing is, it works.  Not really much guidance, but it works.

Another thing which is based on this idea is genetic programming, which is the next step, introduced - and here we're already coming to what, again, my student calls ancient - we're coming to the 1990s, and John Koza, mainly, suggested an approach which is called genetic programming, because his idea was, now, this shouldn't just work for bit strings, it actually should work also to generate computer code.

So if you represent an arbitrary equation or an arbitrary formula as a tree - the left one is the sine of x plus x divided by 4; the second one is 3 times the sum of 2 and x.  If we have this same idea, and just recombine it, we might get a new equation, and a new one, and a new one, and a new one, and if we also have mutations and we randomly substitute this plus sign with a minus sign, for example, then we'd probably get yet another rule.  This idea of genetic programming also got very popular in finance, because what can you do?  You can develop trading rules, and this brings us back to our previous idea of automated trading.  So this is another highly important bit in computational finance, where people try and generate trading rules.
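
To make the tree picture concrete, here is a small Python sketch that stores the two formulas from the slide as nested tuples, evaluates them, and then builds one recombined and one mutated offspring by hand.  The tuple encoding, the `evaluate` helper and the particular subtrees that are swapped are assumptions chosen for illustration; a genetic programming system would pick the crossover and mutation points at random.

```python
import math

# A tree is the variable "x", a number, or a tuple (operator, child, ...).
TREE_A = ("+", ("sin", "x"), ("/", "x", 4))   # sin(x) + x / 4
TREE_B = ("*", 3, ("+", 2, "x"))              # 3 * (2 + x)


def evaluate(tree, x):
    """Recursively evaluate an expression tree at the point x."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    op, *args = tree
    values = [evaluate(arg, x) for arg in args]
    if op == "sin":
        return math.sin(values[0])
    if op == "+":
        return values[0] + values[1]
    if op == "-":
        return values[0] - values[1]
    if op == "*":
        return values[0] * values[1]
    if op == "/":
        return values[0] / values[1]
    raise ValueError(f"unknown operator: {op}")


# Recombination: keep the left branch of A, graft in the right branch of B.
offspring = (TREE_A[0], TREE_A[1], TREE_B[2])   # sin(x) + (2 + x)

# Mutation: replace the plus sign at the root with a minus sign.
mutant = ("-",) + offspring[1:]                 # sin(x) - (2 + x)
```

In a trading application the leaves would be market inputs rather than a single x, and each tree would be scored by the profit its rule would have generated on past data.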

This is one example, which definitely is not a historic example, because it's current work of a PhD student of mine, but this is a good indication that this is what the industry is currently using and actually has been using for quite some time, but it's also a good example that what the finance industry actually does is not always quite visible.  It's sort of a secretive love affair still, in this respect, because just a simple example...again, this is work with a PhD student of mine.  One of the major investment and broker companies worldwide offers one quantitative position per year worldwide, and it was my PhD student who got the job because he's working on these sorts of topics, but he had to sign that he would not talk about his work with them, and he was not allowed to use any of his results, any of the data he worked on during the summer, because they wanted to have the exclusive rights.  So this is what, at the moment, makes it difficult to pinpoint what are the issues in computational finance, because we know what we do in academia, but we do not quite know what the industry actually does.  We have a rough idea, we know neural networks, hot issue, genetic programming, hot issue, but again, we don't have many dates where we can say "It started in 2003, because this is the first paper," for example.  So papers on this, for example, have been around for 5 to 10 years by now, but many of the applications for GPs are in different areas, they are not in finance, but it is pretty obvious that many people in finance use them.

Another area where computing and finance met is optimisation.  We had this brilliant talk in the morning about how optimisation changed the face of finance because, let's face it, without quadratic programming, Markowitz's problem could not have been solved.  It required the idea of quadratic programming.  So if you have Markowitz's problem, you can treat it with a quadratic programming approach.  If you give up the idea of Markowitz that short selling is not allowed, and introduce short selling so that you actually can have negative holdings, then it actually becomes [an alternative to a closed form] solution.  The problem is the world is not always normally distributed.  So it's not something like this, or if this - if I can show this just for a second...

This comes from real assets.  We are again in the volatility and returns space, and we get this hyperbola or parabola, depending on whether you have variance or standard deviation as your risk measure.  However, if you're looking at what these portfolios do in terms of skewness and introduce this as your third dimension, then things suddenly become very messy, because suddenly it's no longer clear what you actually want to do, and what is really good, because suddenly, you have these outliers and you do not know how much of a positive outlier compensates me for many, many small losses, for example.  So it's very tricky to come up with a good utility function.  One of the beauties, real beauties, about Markowitz is you don't have to make any assumptions about your investors apart from that they are rational, but you don't have to assume that they are very risk-averse, or not risk-averse.  You get the basic result - this curve, regardless of the risk aversion, because this is something for a person with low risk aversion, this one is for a person with high risk aversion, but you don't need to know it when you optimise it.  If you look at an element like this, you need to know what to pick, and the same is true in particular then if you start changing your weights.  Then, suddenly, the thing might look completely different.

Again, the only thing you can do is you play around with the weights of your assets, and then, suddenly, you have no longer functions, because this thing turns into a curly-wurly, and this is not what you want to see in optimisation.

Another thing that happens is new risk measures have come along.  Value-at-Risk, for example: Value-at-Risk is not the standard deviation of what you expect, but it is the lower quantile.  This is actually quite close to what is the everyday notion of risk, because the everyday notion of risk is that things go wrong, not by how much I deviate from my expected value, and this is what standard deviation measures - both upside and downside risk.  If you use a normal distribution, it looks like this.  If you use an empirical distribution, it looks like this.  Now this, the thing is it's slightly difficult to optimise. 
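
The contrast between the two risk measures can be sketched in a few lines of Python.  The example below computes an empirical ("historical") Value-at-Risk as the loss at a lower quantile of the return distribution and sets it next to the standard deviation.  The return figures are invented purely for illustration, and the simple order-statistic rule without interpolation is an assumption; practitioners use several slightly different quantile conventions.

```python
import statistics


def historical_var(returns, level=0.05):
    """Loss at the lower `level` quantile of the empirical return distribution."""
    ordered = sorted(returns)
    index = int(level * len(ordered))   # simple order statistic, no interpolation
    return -ordered[index]


# Invented daily returns, for illustration only (not market data).
returns = [-0.08, -0.03, -0.02, -0.01, 0.00, 0.01, 0.01, 0.02, 0.02, 0.03,
           0.01, 0.00, -0.01, 0.02, 0.01, 0.00, 0.01, -0.02, 0.03, 0.02]

var_95 = historical_var(returns, level=0.05)   # looks only at the downside tail
volatility = statistics.pstdev(returns)        # reacts to upside and downside alike
```

Because the quantile depends on which observations happen to sit in the tail, this measure changes in jumps as portfolio weights change, which is one way to see why it is awkward for gradient-based optimisers.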

How can you solve it?  You use, again, evolutionary methods, or other methods inspired by nature.  Simulated annealing is one of these methods, where you mimic how, when liquids solidify, how crystals emerge, because they want, the particles want to arrange themselves so that energy is minimised, required to keep this state stable. 
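
A minimal sketch of simulated annealing in Python follows, assuming a one-dimensional search, a geometric cooling schedule and the usual Metropolis acceptance rule.  The objective function `bumpy`, the step size and all parameter values are hypothetical choices for illustration and are not the risk surface from the slides.

```python
import math
import random


def simulated_annealing(objective, start, step=0.5, temperature=1.0,
                        cooling=0.995, iterations=2000, seed=0):
    """Minimise `objective` over one variable by annealing from `start`."""
    rng = random.Random(seed)
    current, current_value = start, objective(start)
    best, best_value = current, current_value
    for _ in range(iterations):
        candidate = current + rng.uniform(-step, step)
        candidate_value = objective(candidate)
        delta = candidate_value - current_value
        # Always accept improvements; accept a worse point with
        # probability exp(-delta / temperature), which shrinks as we cool.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_value = candidate, candidate_value
            if current_value < best_value:
                best, best_value = current, current_value
        temperature *= cooling
    return best, best_value


def bumpy(x):
    """Hypothetical objective with several local minima."""
    return x * x + 3 * math.sin(5 * x)


best_x, best_f = simulated_annealing(bumpy, start=4.0)
```

Early on, the high temperature lets the search climb out of local dips; as the temperature falls, it settles, much as the particles in a cooling material settle into a low-energy arrangement.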

Another example is ants.  A pain in your kitchen, but actually quite clever when it comes to finding shortest routes - the travelling salesman problem was mentioned - they lay pheromone trails, based on a reinforcement principle, and very quickly find the shortest routes between their nest and your sugar and candy box in the living room, so they are quite efficient at this, and we can use this for optimisation.

Another method is differential evolution.  So if we look into this problem again, this is the problem we just had on the slide.  Obviously, a traditional gradient-based search wouldn't get us anywhere, because gradient is like you drop a ball and gravity directs it down, but it very quickly will get stuck in a local optimum.  What we use nowadays, or what people use nowadays, are evolutionary methods, where, again, these principles from evolution are used, where current solutions are combined and recombined and not so good ones are eliminated at an early stage.  If you have a very close look - if we want to minimise our risk, then it's probably not a good idea to be in these high risk regions here.  I'm not too sure whether this is not visible at the time, but in this case, the deeper, or the further down, the better it is, and if you have a very close look at the graph, then you recognise that this evolution drags the solutions very quickly to the purple areas and very quickly away from the high areas.  So again, not much intelligence, actual intelligence, because it's not a clever [?], it's computational intelligence.  It looks as if they move in the right direction because they know what to do.
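
The following Python sketch shows one common variant of differential evolution (often written DE/rand/1/bin), assuming greedy one-to-one replacement and simple clipping to the search bounds.  The Rastrigin test function stands in for the bumpy risk surface on the slide, and the population size, scaling weight and crossover probability are illustrative assumptions rather than values from the talk.

```python
import math
import random


def rastrigin(point):
    """Standard test function with many local minima; its minimum is 0 at the origin."""
    return 10 * len(point) + sum(x * x - 10 * math.cos(2 * math.pi * x) for x in point)


def differential_evolution(objective, bounds, pop_size=20, weight=0.8,
                           cross_prob=0.9, generations=200, seed=0):
    """Minimise `objective` inside `bounds`, a list of (low, high) pairs."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(p) for p in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct other members and combine them.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)   # at least one coordinate comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cross_prob:
                    value = population[a][d] + weight * (population[b][d] - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)   # clip to the search box
                else:
                    value = population[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: the trial replaces its parent only if it is no worse.
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]


search_box = [(-5.12, 5.12)] * 2
best_point, best_value = differential_evolution(rastrigin, search_box)
```

No gradient is computed anywhere: the population drifts towards the low regions only because worse candidates are discarded generation by generation.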

So, I think I need or I should finish eventually.  The one thing I definitely haven't achieved is to answer the question when did computing and finance meet, but probably I managed to shed a little bit of light onto the question of where they met.  So they met in institutional aspects, they met in terms of pricing, they met in terms of financial management, automated trading, and - I didn't address this - they also met in terms of simulators in artificial stock markets, which you can use for policy design.  So again, you build your little world, which behaves, hopefully, close to the real world. 

What sort of methods have made their way from computing into finance?  It's basically - the first thing was, obviously, hardware, and hardware-related things - information systems, databases, actual electronic trading systems.  The next thing are efficient methods, so very similar to the presentation this morning, having efficient methods that can solve quadratic optimisation problems or complex optimisation problems - not necessarily quadratic ones - are extremely helpful in finance and are widely used in terms of toolboxes or tailor-made software.  Optimisation is a hot issue, but also artificial intelligence is a hot issue.  But, the further down we go, on this line, the more difficult it is to say what actually is going on in the industry.  It's a little bit easier to say what's going on in the literature, so if you have a look at the literature, you get an idea that this really is what people do, but once again, it is sort of a secretive love affair.

©Dietmar Maringer, Gresham College, 25 April 2008

Part Four: 'Physics vs Mathematics'

This is the final part of a study day. It includes the following talks:
   Introduction by Michael Mainelli
   Physics vs. Mathematics: Rigor(Mortis) and other impediments to understanding financial markets
   by Professor Doyne Farmer
   Closing Remarks by Michael Mainelli

Transcript of the lecture

PHYSICS V. MATHEMATICS:
RIGOR (MORTIS) AND OTHER IMPEDIMENTS TO UNDERSTANDING FINANCIAL MARKETS

Professor Doyne Farmer

Thank you.  Well, knowing nothing about history, I thought I should at least talk about something that nobody knew more about than me, so I decided to talk about the future, since none of us really knows very much about it, so I can freely speculate.

I'm going to begin by sort of asking, at a very high level, I mean, why do we have this whole system of markets and prices - what are they about, what are prices for?  I'm actually curious what people say a little bit - I mean, somebody tell me, what are prices for?  What's the purpose?  I mean, if you think about it in the same kind of vein as...you could say, well, why do we have an immune system?  We have an immune system to keep out invading things that might take over.  So it's like, you know, why do we have an army?  Well, we have an army, at least we hope, maybe not - maybe in Europe you have one, in the United States, it's a little harder to say - to keep out other people from entering.  Well, why do we have a financial...in that same kind of mode, what would be the analogy for prices in markets?  Any thoughts?

Audience Member:  The efficient allocation of resources.

Allocation of resources - I would agree with that.  I just want to...I want to say it a little bit differently, in that when we're allocating resources, what are we really doing, and I would argue we're setting our goals as a society, because of course, if the price of pork bellies goes up, then people go out and raise more hogs, and so it's a kind of a self-organised way of allocating resources, but of deciding, as a society, what we're going to do, without anybody actually making the decision.  It's not the only way we do that at all, but it's at least one powerful way.  As I said, it's a self-organised method for directing the activities of individuals.  It's, as emphasised by Hayek and others, an efficient method for processing information and making a distributed set of decisions.  Many have argued this is why the Soviet Union and socialist economies in general have failed, because of the inability to do this kind of thing correctly.

I think it's remarkable how it's highly specialised and geographically concentrated, and particularly now, increasingly automated.  We heard a lot about that in the last talk. 

I would argue that the most entertaining thing to do in any of the major cities in the world - well, at least London or Chicago or New York - is to actually go to the market.  If you've never been, for example, to the Chicago Board of Exchange, sort of the best of them all, it's a complete zoo!  You have people running from one place to another, you have people yelling and screaming, all in close proximity, making bizarre hand signals, and it's particularly interesting to be there when there's some real new information.  I noticed this because the instant something's really happening, you immediately hear it.  You know that you have to look around - you hear a change in the background noise, and immediately, everybody looks up at the boards to see what's going on, because you can literally feel the waves go across.  In fact, we once thought about having a microphone on the floor just to measure the background noise level so we'd know as early as possible when something's happening. 

Actually, here in London, you have the London Metal Exchange, which is the last hold-out in London of this kind of thing, which, if you haven't been there, I highly recommend trying to talk somebody into giving you a tour.  I found it amazing.

I think the other thing I'm going to try and address today is how well does it work, because I think the story that's told in the academic literature is at variance with what I would say is really going on.  So, on that note, I'm going to jump to another topic, which is market efficiency.

In the standard literature, there's three kinds of market efficiency: one is informational efficiency - are prices predictable; arbitrage efficiency - can you make profits without taking risk, or let's say, can one kind of strategy make better profits than another strategy holding some variable like risk constant; and allocative efficiency - are we making sensible allocations in the sense that, say, Pareto efficiency would say you can't make somebody else better off without making somebody else worse off, and do markets actually achieve a state of high allocative efficiency, most important of all.

Now, at a conference we had in Santa Fe in 2000, where we gathered practitioners, academics, physicists and biologists, it was kind of striking how many of the famous practitioners said - we would say, we addressed "How efficient are markets?"  They said, "Well, about 98% efficient," but when pushed to explain that, nobody had a clear view of it.  I personally think the figure is probably closer to 20 or 30%, but...you know, until I can present a hard way to do that, I can't really say I'm right and they're wrong, but nor can they.

The still dominant theory of economics, and this is according to a poll taken a few years ago - this was brought to my attention by colleague Mauro Gallegati - is rational choice in a neoclassical form, namely, you know, the idea that all agents are omniscient.  Why do I say it that way?  Well, because in a rational model, it's not just that the agents are really smart; it's that they have access to the correct models of the world, and they, in a sense, know what everybody else is doing, so they're, in that sense, omniscient.  They're selfish, they maximise their utility, under what I would argue are highly unrealistic utility functions based on psychological surveys of human behaviour.  They assume that markets clear, that people are price-takers, that is, they accept the prices that are offered without affecting those prices, and they...that the result is a Nash equilibrium where, in some sense, you have...the strategies that the agents are using...it's not possible to modify those strategies and around that point do better.  Now, the reason I mentioned that, okay, in this poll, 92.2% of economists support this, Mauro Gallegati pointed out that actually there's another poll taken of how many people believe that aliens have landed on the Earth, and that's 7.8%...so the economists that don't support rational choice as the main tool are on a par with people who believe that aliens have landed on the Earth.

Now, in finance, what this means is that all information is properly incorporated into prices, that new information is therefore, by definition, random, and that prices are perfectly efficient, and that changes in future prices are random, and it implies both informational and arbitrage efficiency.  Now, it's not that I think that efficiency is a bad approximation for a lot of purposes.  It's had a brilliant success in option pricing, as, you know, we heard about in Mark Davis' talk today. I think in some domains, it works really well. 

There is a paradox that was pointed out actually originally, as far as I know, by Milton Friedman, which is that for this theory to work, the story behind it is that you have to have arbitrageurs to incorporate the information into prices; if the market is really efficient though, the arbitrageurs shouldn't be able to make better profits than anybody else, in which case, if the arbitrageurs are rational, they'll leave the market, in which case, the market can't really be efficient.  So this paradox has been sitting around now for more than 50 years, and I would say it's not really well resolved in the theory, because somehow, as we'd say in physics, that it may - I think at first order, markets are efficient, at least in certain situations, but at second order, there has to be a violation of the principle and I think it's probably very important to really understand the way in which this second order violation occurs, because it's essential for the way the market functions.

As an example, Michael mentioned I co-founded in 1991 something called Prediction Company, with Norman Packard.  We did proprietary trading.  We actually weren't a hedge fund; we were proprietary trade advisors, although we actually did all the trading ourselves, just under their ticket on the stock exchange.

In line with the last talk we heard, I think of what we did as a cerebellar approach to market forecasting, that is, the models we built didn't have a real rational model of what was going on in the market - they were stimulus response boxes.  We looked through all the data, we found situations where, when unusual conditions occurred, or maybe when usual conditions occurred, with high statistical probability, prices would move in a direction that we could then predict.  The key to what we did was feature extraction, that is, the key was knowing what to ignore and what features of the market actually seemed to be important and cause movements.

I think one can make an analogy to work that Hubel and Wiesel did.  They were two neurophysiologists who were trying to understand the visual cortex.  They did experiments on spider monkeys.  They would hook a spider monkey up with a little, you know, a brain helmet, and looking at it with probes in the neurons in the skull of the spider monkey, and they would then show them patterns, like moving bars or spots or things like that, and they would try and figure out which part of the spider monkey's brain was responding to this and how was this organised.  The key principle they came up with is that the spider monkey doesn't just take, you know, the pixels of the visual image and process things pixel by pixel, but rather breaks, in a sort of cascading process, breaks the image down into features and sends these high level features back into deeper parts of the brain where, from there, we don't really know what's going on, but it's clear that the feature extraction, the pre-processing that spider monkeys do, is key to understanding what's going on. 

That's what we did.  We found the right features, we pre-processed them, and then we did actually relatively simple regressions to interpret what those things really meant.  We didn't really understand the origin of most of these patterns. We could really only make this work in situations where we had abundant data, where we traded at reasonably high frequency, so that we could get a lot of examples, and where we had reasonably stationary conditions.  But I think of it as cerebellar in the sense that, you know, when a baseball player - sorry, I'm using an American analogy - when a soccer player responds to somebody kicking the ball all the way down the field, they're anticipating where the ball is going to go.  They aren't using the laws of physics directly to do that.  They're using some stimulus response.  They've seen thousands upon thousands of soccer balls getting kicked, they have a little look-up table in their brain that says, well, it looks about like that, I see the ball has got some spin on it, I know the wind is blowing about like this, so they may do a little correction, but they know roughly where to go to be in the right place when the ball comes down.  That's about what our machine was doing.  Now, our machine wasn't thinking deeply about the market, but was processing much more data than a person could ever process, and so it was doing something that a human genuinely can't do, and I have to say was fully automated.  It wasn't at first, but we, at my insistence, began taking statistics on how we would have done without a trader overriding or changing the decision in some way, and what we actually did with those overrides, and once we were two standard deviations down with the overrides, I convinced everybody to shut off the overrides entirely.  So it was a completely automated system, and, as we heard in the last talk, this is becoming more and more common.  In the future, I think it's going to be even more common.

We increasingly see machines trading with other machines, not just for mechanical trade execution, but for information processing and decision making, and I think that's a trend that's only going to increase in the future, which then, coming back to the note I opened the talk on, it's interesting to think that we're leaving in the hands of these markets the control over something that's really pretty essential to human wellbeing.

Now, I also wanted to just mention a little bit about this point about first order, second order nature of market efficiency.  This is one of the few slides we convinced the Swiss to let us release.  I'm actually not quite sure how we did it, because normally they won't - they're paranoid about the silliest things imaginable.  What I'm showing in this plot is the correlation, that is, between a signal that we would generate, and the signals, we think of a signal as something that relates to a cluster of inputs of a particular type, and our trading systems were built out of several signals that we then combined.  The signals in and of themselves though should have predictive power.

So signal one here, and by the way, we're looking at data from 1975 to 1998, and in fact the model was built just on the latter part of that data, from about 1990 onward, and then later, only later, tested on the earlier part.  The correlation that we're seeing up here is indicating how well the signal correlates with the movement of the stock about two weeks in the future.  So if this said 100, it would be a perfect prediction; if it says 0, it's a random prediction; if it's minus, it's actually predicting backwards.  So you see that it starts around 12, 13%, and there's a slow decline during the course of this 23 year period to something more in the vicinity of 3 or 4 or 5%.

Now, on one hand, it agrees with what is predicted by efficient markets.  You could say Friedman was right, because in fact the market is getting more efficient through time.  On the other hand, it's taken 23 years to do that, and whereas when we started in 1991, I would have guessed there maybe were 10 firms doing the kind of statistical arbitrage that we were doing, there's probably 1,000 of them now, and yet, these signals still get traded on, even if they occasionally take large losses, as they did in August of last year. 

Even more surprising is this one down here, where, for reasons I'm not going to explain in detail, but if you know the dates, you might guess why, we have a different signal.  The signal actually doesn't exist prior to that date - it's impossible to formulate that signal.  There was a change in the market structure, and what we see is, to the surprise of efficient markets, the signal actually builds through time.  Now, it's not that the market isn't pretty efficient.  Neither of these signals are really, really strong.  We're not talking about 100 or even 50% correlations, but nonetheless, they are sufficient to make a fair amount of money.

Now, the other argument that people have made against efficient markets and rationality is, well, this is something I actually got in junk mail, so to speak, because they got it on some list, which is a kind of entertaining list.  You notice what they're doing here is they're actually giving you advice about which stocks to trade based on astrology.  What I haven't figured out is do they cast the horoscope based on when the company was born or on when you were born, I don't understand, but they do that.

This is a guy named Robert Prechter, who actually, interestingly, won a trading contest.  Many people follow him.  He's developed this theory of Elliott waves, which is based on Fibonacci numbers.  We actually heard something about Fibonacci earlier in the day.  They have cycles and super-cycles, and you can even use this to predict, you know, when we're going to have horror movies versus Mary Poppins, according to them, but okay...so people aren't rational - that's not too surprising to anybody. I know I'm not rational, and I suspect most of you aren't either.

But even within mainstream economics, there's been a widespread debate over whether prices - how well do prices actually match fundamental values, how well are these allocations being made.  This is from Campbell & Shiller, and the two plots I'm showing are plotting prices against fundamental values based on historical dividends over more than a century, and what you see is there are periods of decades at a time where prices - this is a logarithmic scale - so there are periods of decades where prices and values are out of line by factors of two.

This is a slide actually due to Cutler, Poterba and Summers comparing - what they did was they took a 40 year period in the S&P, they looked at the 100 largest moves in the US stock market, as measured by the S&P index.  I've shown the top 12 here, I've shown the dates, I've shown the size of the moves in percent.  Then they went to the library, and they looked at the New York Times on that day, and they picked out a sentence or two corresponding to the New York Times' explanation of what went on.  I've shown in black the things that they didn't label as genuine news, or you might call it market-generated news, like - look at the top one, worry over dollar decline, fear of US not supporting dollar.  As a market practitioner, I experienced fear and worry every day.  That's one reason I was very happy to sell our company to UBS finally.  So I would say if the people who manage your money aren't experiencing fear and worry, you should have somebody else manage your money.  So in no way should fear and worry be viewed as news.  In contrast, you know, the outbreak of the Korean War, that seems like news.  You can see the news items are in a minority.  You can also see the decline in news reporting.  On the fourth item, September 3rd 1946, the New York Times actually had the courage to say no basic reason for the assault on prices - I don't think they've ever been that honest since!

Another slide here shows, again, over about a 100 year span, the volatility of the US stock market measured based on the monthly standard deviation of daily price moves.  So every month, you take the daily price moves for that month, you take the standard deviation, and you make a dot, and you do that for every month since 1885.  The striking thing that hits the eye when you look at this, and let's remember - let me go back a few slides here - to the standard view of market efficiency, and that is that - the rational choice, is that all information is properly incorporated into prices, new information is, by definition, random, and so if you have a large price move, it's because you must have more information on that day.  Now, go back and look at this plot.  What you see is there's a period, under that interpretation, corresponding roughly to the Great Depression, where for some reason, they were getting a lot more information than we're getting now.  It seems strange that they should have had so much more information in the Depression.  I would argue that something else has to be driving these large scale and persistent changes in volatility.
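
The construction of each dot on that plot can be sketched as follows: group the daily price moves by calendar month and take the standard deviation within each month.  The Python below is a minimal version of that calculation; the function name is an assumption, and the handful of dated returns are invented to show one turbulent and one calm month, not taken from the data behind the slide.

```python
from collections import defaultdict
from datetime import date
import statistics


def monthly_volatility(dated_returns):
    """Map (year, month) to the standard deviation of that month's daily returns."""
    by_month = defaultdict(list)
    for day, daily_return in dated_returns:
        by_month[(day.year, day.month)].append(daily_return)
    # A standard deviation needs at least two observations in the month.
    return {month: statistics.stdev(values)
            for month, values in sorted(by_month.items()) if len(values) > 1}


# Invented daily returns: a turbulent month followed by a calm one.
sample = [
    (date(2001, 1, 2), 0.04), (date(2001, 1, 3), -0.04),
    (date(2001, 1, 4), 0.04), (date(2001, 1, 5), -0.04),
    (date(2001, 2, 1), 0.004), (date(2001, 2, 2), -0.004),
    (date(2001, 2, 5), 0.004), (date(2001, 2, 6), -0.004),
]

vol = monthly_volatility(sample)   # one number, one dot on the plot, per month
```

Plotting these monthly values over a century is what makes the persistent high-volatility stretches visible.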

I mean, we know in economic theory, there was a Nobel prize in econometrics for this clustering of volatility.  Other than the clearly wrong theory that I mentioned, I think there's essentially no understanding of why we have periods of more and less volatility.  I will say what I believe, and that is that it's related to liquidity.  There are periods where, if somebody wants to make a trade, it causes a large price change; there are other periods where, if somebody wants to make a trade, that is, if you're a buyer, all else being equal, you enter and you say, "I want to buy," and you initiate a trade with somebody, when you do that, you're going to push the price up a little bit.  There are periods when you're going to push the price up a lot, and other periods where you're not going to push the price up very much, and the reason for that change, there may be many reasons - I mean it may be that, for example, during the Depression, that people were just more nervous; it may be that there were, for whatever reasons, more instabilities in financial markets - but anyway, I believe there are reasons there that we can understand, and it's not just that they had more information.  As a result, we have significant changes in liquidity and, being in the middle of a liquidity crisis, it's a very topical thing at this point in time.

Liquidity is highly variable, as I already said.  It's persistent, meaning if we have a lot of liquidity today, we're likely to have a lot tomorrow, and if we don't have much today, we're likely not to have much tomorrow.  It's the main driver, as we've shown in some of our papers, of volatility and of changes like the ones I showed you on that last slide.

I'm going to skip this slide because I don't want to run over time.

Let me just say that, once you realise that liquidity is the main driver of volatility, it presents an interesting opportunity, because it's something we have control over, at least partial control - that is, if we can make it easier for counterparties to find each other, if we can bring all the right people together in one place, then liquidity gets better.  It's been a constant battle, for example, in the New York stock exchanges, where there's been a tendency for liquidity to fragment, in part for good reasons I think.  There was essentially a scandal in the NASDAQ over collusion between market makers.  The specialist system in the New York Stock Exchange has been a scandal since it was instituted, in my opinion.  So that's driven people to be constantly looking for more efficient ways to trade.  So we can change the way the market is structured.  We can change the fees for liquidity providers versus liquidity takers.  In the London Stock Exchange, for instance, you provide liquidity by posting orders that sit on the book, and if there are a lot of orders sitting on the book, somebody can then enter, take liquidity off the book, initiate a trade and generate a smaller price change.  People are actually compensated in terms of their fees for providing that liquidity.

You can change the way that information revelation happens to make people feel more comfortable or to change human behaviour.  In London, for example, all the orders in the book are completely transparent and visible to everybody - everybody who can pay for the feed.  That's not a trivial thing actually.  But you don't even know after the fact who you traded with, so your anonymity is protected very strongly.  The rules in New York are quite different, and the rules differ on virtually every exchange, and what we're seeing, I think, is a kind of Darwinian experiment in which methods of trading people prefer and which methods perhaps result in more social welfare, although the utility of the exchanges can differ from the utility of the clients of the exchanges.  I believe, frankly, that this can actually make a difference in the long term, not just in liquidity, but in long-term volatility.

Now, as a physicist looking at markets, I debated whether to even put this slide in, because it's of course easy to walk in and criticise the other guys.  As you start working in economics, you really are struck by how hard these problems are.  But nonetheless, just to be very blunt in my criticism - and this was the original title of my talk, about rigor mortis - you're struck, when you come in from physics, by how much theorising there is in economics.  Papers are commonly written in theorem-proof format, which per se could be okay if you thought the hypotheses the theorems were based on had any real correspondence with reality, which I think often they don't.

There was a kind of a change in about the 1950s, where economics became very mathematical.  It was mainly a good thing, but I think in some cases, common sense got tossed out with it.  For my taste, there's a lack of ambition in data gathering, because the incentives in economics departments don't favour really ambitious data gathering.  You know, about 80% of physics is actually data gathering.  There's a lot of data gathering in economics too, there are many rich data sets, and, let me also say, some people are really beginning to do this, but data gathering is a pain in the ass, and there need to be better incentives for people to really do that and get tenure from doing it, because I think the data sets that are typically being used are just a minor hint of what we could do if we had better data sets.  You know, theory and data are not well connected.  That's changing in economics, it's getting much better.  There's much more of a push for economists to really try and make theories connect to data, but I think it still tends to be awfully qualitative.  Then there's the slavish adherence to one paradigm - it's not that that paradigm is wrong, it's just that it's not the only way to look at the world.

Finally, I think it's about asking: what is the right set of questions?  What is the appropriate set of goals for the theory?  How should one go about it?  Physicists have a blind belief that there are regularities in the world and one should find those regularities and try and understand them in the most mathematical way possible.  Whereas, you know, in economics, you really can't use the word "law" in a paper.  I always have to go through my papers, because I tend to put it in - you know, that we're trying to find a law - and my economist friends say, no, no, you can't do that - take it out.  It's okay - I'll use it here!

Looking to the future again, I believe that if we do find another civilisation out there in the universe, we'll discover that they do trade - I'd be very surprised if they didn't trade - and that their markets probably will have gone through some evolutions that are somewhat similar to ours.  I mean, we had a wonderful perspective today on the way that markets have changed, on the way that markets have actually affected the way we do something as basic as arithmetic, and the interplay back and forth.  I think that we would discover that they've gone through a lot of the same things.  The details will all be different, but I think there will be some common principles.  We might discover, for example, that they have options.  I would argue that the Black Scholes pricing formula, which you can view as an algorithm for pricing an option, has actually, in a certain sense, become a law, because in fact option prices pretty well follow the Black Scholes pricing formula and, through time, have come to follow it better than they did before.  That's actually one of the remarkable things with equilibrium theories: by creating a theory about how something should be done, you can change the way it is done, and then it becomes a kind of a law.  Some of the other laws may actually be more derived from psychology, they might be a bit slipperier, but I nonetheless think we will see more and more examples of such things existing, and not just in derivative pricing.  Derivative pricing is the realm where it's been very successful, but I think we can really begin to think about other topics, like the underlying.

Now, in my last few minutes, I'm just going to throw out a few things that have been found in the last 5 years or so, 5 to 10 years, some of them maybe a little longer.

Well, what is volatility?  I showed you this remarkable picture, showing the persistence of volatility.  I could show you a picture on a 15-minute timescale that would look essentially identical to the picture I showed you on a century timescale.  The thing the pictures have in common is these bursts: periods of high volatility and then periods of low volatility.  Prices change a lot for a while, and then they don't change so much for a while.  There are common features across vastly different timescales and, more technically, we would say this means there's a long memory.  You can make that precise in terms of the auto-correlation function - I'll do that in a minute.

There's a very nice recent paper showing that there's an equivalence between the bid/ask spread, the market impact, and the volatility in transaction time.  They're literally about the same size.  They can't differ by more than about a factor of 2, from some very simple efficiency arguments.  Then there's the power law behaviour of volume, and the long memory of order flow.  I probably won't even get to the first one, but I'll show you these other examples in more detail.

One of the things that we see pretty consistently - and let me just say, there's been a lot of debate about which of these kinds of things are really robust.  There have been some other claims that I don't support because we haven't found they're robust, but this one seems to be fairly robust.  That is, you look at trading volume in markets where people can freely trade large sizes, so not in the order book of the LSE, but in the off-book market where people negotiate trades over the phone, which are the ones off there to the left.  I should explain this picture, since everybody is staring at it.  What we're plotting on the x axis is the volume of a trade measured in some arbitrary units, set so that we've centred one in the middle - we're dividing things by the standard deviation.  You see that we're looking at a range of variation of about 9 orders of magnitude, so it's a very large range.  Then, on this axis, what we plot is the probability that the volume of a trade exceeds some threshold x, so we're plotting, as a function of that threshold, the probability that we see trades that are above that threshold in size.  So in other words, as we go off to the right, we're looking at increasingly large trades, which are getting increasingly rare, and we're plotting this on a double logarithmic scale, so that if there's what's called a power law relation - if you don't know what that is, don't worry about it - you see a straight line, and in many markets we see a good approximation of a straight line when we look at that upper curve.  So, we feel that this may be something a bit like a law.  We're trying to explain it, and we don't have a good explanation yet.
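The construction of a plot like this can be sketched in a few lines of Python: sort the trade sizes, compute for each size the fraction of trades at least that large, and fit a straight line to the logarithms.  This is only an illustration under stated assumptions: the names `empirical_ccdf` and `loglog_slope` are hypothetical, the trade sizes are synthetic draws from a Pareto distribution with an arbitrarily chosen exponent of 1.5 rather than real off-book data, and an ordinary least-squares fit is a crude way to estimate a tail exponent.

```python
import math
import random


def empirical_ccdf(sizes):
    """Return (x, fraction of observations >= x) for each observed size x."""
    ordered = sorted(sizes)
    n = len(ordered)
    # The i-th smallest value (0-based) has n - i observations at or above it.
    return [(x, (n - i) / n) for i, x in enumerate(ordered)]


def loglog_slope(points):
    """Least-squares slope of log10(probability) against log10(size)."""
    xs = [math.log10(x) for x, _ in points]
    ys = [math.log10(p) for _, p in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic trade sizes with a power law tail (assumed exponent 1.5).
rng = random.Random(0)
sizes = [rng.paretovariate(1.5) for _ in range(20_000)]
slope = loglog_slope(empirical_ccdf(sizes))
print(f"fitted log-log slope: {slope:.2f}")
```

For a pure power law the points fall on a straight line whose slope is minus the tail exponent, so the fitted slope here should come out close to -1.5.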

On the other hand, we do know some of the things this affects.  I mentioned auto-correlation a moment ago.  This is just a way of measuring the relation between the same variable x at two different times, t and some time in the future from there.  It depends on the product of the two, and all you have to know is that it's a number that's one if they are really exactly the same, it's minus one if they're exactly the opposite, it's zero if they're randomly related, and it's somewhere in between if they're somewhere in between.  So you look at the auto-correlation of the signs of trades in the London Stock Exchange - and here, the sign is plus one if a buyer initiates the trade.  If the buyer is the one that takes the order out of the order book and actually causes that trade to happen, we'll call it plus one.  It's minus one if it's initiated by a seller.  So we take a couple of years of data.

This happens to be the stock of AstraZeneca, but they all look the same.  They look the same in the Paris market, they look the same in the New York market, they look the same in the Spanish market.  Every market we looked at, every stock, it always looks the same.  So you take this sequence of signs, say of a million trades, which is about what this corresponds to - plus one, minus one, plus one - and you take this auto-correlation function.  Now, what you see - again, we're plotting this in this funny way - is that for a lag of one, that is, if I look at one trade and I look at the next trade, we see an auto-correlation of about 15% between those trades.  So it's telling you that it's not exactly predictive, but there's a pretty good relation from one to the next.  As we go out to longer lags, now we go 10 trades later, or 100 trades later, or 1,000 trades later, or 10,000 trades later, and at 10,000 trades later, we're talking about a time span of two weeks.  So I walk in the market - I can't walk in the LSE, okay, I look on the screen, because there is no there there - I look at one trade on the screen and I look at its sign.  I can then look, two weeks later, without knowing anything else at all, and I can predict the sign of that later trade, and I can do it sufficiently accurately that if I collect data over the course of a year, that's actually going to be a statistically significant prediction, because we still have these values statistically significantly above zero, two weeks out.
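The auto-correlation described here can be sketched directly from its definition.  The function below is a plain sample auto-correlation, written for illustration; the name `autocorrelation` and the tiny alternating sequence are assumptions for the example, not the AstraZeneca data.

```python
def autocorrelation(series, lag):
    """Sample auto-correlation of a sequence at a given lag (in trades).

    Returns a value near +1 if values `lag` steps apart tend to agree,
    near -1 if they tend to be opposite, and near 0 if they are unrelated.
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    if variance == 0:
        raise ValueError("auto-correlation is undefined for a constant series")
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    ) / (n - lag)
    return covariance / variance


# Hypothetical trade signs: +1 for buyer-initiated, -1 for seller-initiated.
alternating = [1, -1] * 50
print(autocorrelation(alternating, 1))  # neighbouring trades always disagree
print(autocorrelation(alternating, 2))  # trades two apart always agree
```

If the signs were independent, the sample auto-correlation of n trades would fluctuate around zero with a standard error of roughly one over the square root of n, so with about a million trades even values of a fraction of a percent can be distinguished from zero.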

Actually, I was thinking of this in connection with a remark Mark made earlier about, well, if we don't have equilibrium, then - and I'm going to misquote you, Mark, so I apologise - we'll have everybody piling in, I think you said, on one side or the other.  In fact, people are piling in on one side or the other, because that's what this is about.  The supply and demand is sloshing in and out of the market, like, you know, if you get in the bathtub and you put your hands in and you start sloshing the water around, it's sloshing on lots of timescales.  It's maybe more like climate or the ocean or something, which also show this kind of long memory, but the remarkable thing is that the prices do stay pretty efficient, and what we're seeing is that the market has to go through all kinds of gyrations and adjustments to maintain that efficiency, and they have side consequences.  We believe that clustered volatility is one of them.

So, the point being, as we go into the future, I think there are things like laws, and we're seeing some examples of them.  I've given you two possible laws here.  One is this relation about volume and the heavy tails of large trades.  Another is this auto-correlation, and in both cases, there's a slope.  There's a rate at which this curve is dropping, and it's roughly linear as you go from left to right.  Actually, there are better ways of showing that this really is, in some sense, linear, but when you measure these things, those slopes are simply related.  We can predict that relation: one plus the slope of the auto-correlation is equal to the slope of the volume curve, and we actually have a theory for why that's true.  We think it's because people trade incrementally.  If Warren Buffett wants to buy 10% of Coca-Cola, he doesn't just place an order for 10 million shares in the order book of the Coca-Cola Company.  He talks to his brokers, and they work out a strategy, and over the course of months, they incrementally buy up little bits of Coca-Cola.  That behaviour is what's causing this kind of thing.
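A toy simulation can illustrate how incremental trading produces long memory in trade signs.  It is a deliberately simplified sketch, not the theory referred to in the talk: it assumes a single hidden order is active at a time, that each hidden order is worked off one trade at a time, that hidden order sizes follow a Pareto distribution with an arbitrarily chosen exponent of 1.5, and that buy and sell orders are equally likely.  The names `split_order_signs` and `sign_autocorrelation` are hypothetical.

```python
import random


def split_order_signs(n_trades, alpha, rng):
    """Trade signs when large hidden orders are executed one trade at a time."""
    signs = []
    while len(signs) < n_trades:
        sign = rng.choice((1, -1))            # buy or sell, equally likely
        size = int(rng.paretovariate(alpha))  # heavy-tailed order size, >= 1
        signs.extend([sign] * min(size, n_trades - len(signs)))
    return signs


def sign_autocorrelation(signs, lag):
    """Average product of signs `lag` trades apart.

    For +1/-1 signs with mean close to zero this approximates the
    auto-correlation at that lag.
    """
    products = [signs[t] * signs[t + lag] for t in range(len(signs) - lag)]
    return sum(products) / len(products)


rng = random.Random(1)
split = split_order_signs(200_000, 1.5, rng)
shuffled = split[:]
rng.shuffle(shuffled)  # same signs, but with the order splitting destroyed

for lag in (1, 10, 100):
    print(lag,
          round(sign_autocorrelation(split, lag), 3),
          round(sign_autocorrelation(shuffled, lag), 3))
```

In this toy model the split-order signs stay positively correlated many trades apart, while the shuffled signs do not, which is the qualitative effect attributed here to incremental trading.  Checking the quantitative relation between the two slopes would need a more careful model and far more data than this sketch uses.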

I'm running out of time, so I'm going to...I'm going to just do one last slide and then give my conclusions.

For me, the big fascination with financial markets is that they provide a perfect laboratory in which to study social evolution, something that's been talked about since the time of Herbert Spencer, but about which I would maintain we know very little, in part because I think we have managed to gather much less data about it, in a quantitative way, than biologists have.  But if evolution means descent, variation and selection - that is, you transmit information through generations, that information has some errors or variations in it, and then you select, based on some principle, one thing or another - we see that strongly in financial markets.  Because we're talking about strategies: there's a certain kind of trading strategy, it gets transmitted across generations, people pick new trading strategies, and the strategies are competing with each other.  And we have data sets.  We managed to gather, together with Terry Odean and Brad Barber and some collaborators, about 12 years of data from the Taiwan stock exchange, in which we can see not just every order that was placed in the order book, but we know the identity of the broker, the individual who placed the trade, and the account for which that trade was made.  So we can actually really study the heterogeneity of markets, markets as an ecology of human decision making.  The obvious difference with biology is that people can think.  Economists have worked very hard on that - the theory of rational expectations is centred around that idea, which we don't discount - but markets provide an interesting way to see how people actually think and how they actually make decisions.

So just to conclude, I think mathematics is going to continue to play an ever-increasing role in markets.  Markets have the great advantage that we can record what people do in great detail and study it, and we've only just begun to do that.  I think we'll be able to go to a deeper level.  We'll have laws, eventually, that will be more like physics.  I think that we are going to be in a situation where the control of markets and the participation in markets will be increasingly non-human, simply because machines can process more information and process it faster.  And as we begin to get a better understanding of how efficient markets really are, if I'm right and they're really not very efficient now, I actually think that's a good thing, because it means we can maybe actually improve how efficient they are and make them work better in the future.

Thanks.

 

©Professor Doyne Farmer, Gresham College, 25 April 2008