Part one describes situations where people do not understand the rare event, and do not seem to accept either the possibility of its occurrence or the dire consequences of such occurrence.
Part two provides a synthesis of the biases of randomness as discussed in the now abundant literature on the subject.
This I call the skewness issue: it does not matter how frequently something succeeds if failure is too costly to bear.
Psychologists have shown that most people prefer to make $ 70,000 when others around them are making $ 60,000 than to make $ 80,000 when others around them are making $ 90,000.
The biology of the phenomenon is now well studied under the subject heading “social emotions.” Meanwhile some historian will “explain” the success in terms of, perhaps, tactical skills, the right education, or some other theoretical reason seen in hindsight. In addition, there seems to be curious evidence of a link between leadership and a form of psychopathology (the sociopath) that encourages the nonblinking, self-confident, insensitive person to rally followers.
I start with the platitude that one cannot judge a performance in any given field (war, politics, medicine, investments) by the results, but by the costs of the alternative (i.e., if history played out in a different way). Such substitute courses of events are called alternative histories.
One can illustrate the strange concept of alternative histories as follows. Imagine an eccentric (and bored) tycoon offering you $ 10 million to play Russian roulette, i.e., to put a revolver containing one bullet in the six available chambers to your head and pull the trigger. Each realization would count as one history, for a total of six possible histories of equal probabilities. Five out of these six histories would lead to enrichment; one would lead to a statistic, that is, an obituary with an embarrassing (but certainly original) cause of death.
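The arithmetic of these alternative histories can be sketched with a small simulation (the payoff and chamber count are those of the example; the trial count and seed are illustrative):

```python
import random

rng = random.Random(0)  # fixed seed for reproducibility

def play_once(chambers=6, payoff=10_000_000):
    """One realized history: survive and collect the payoff, or hit the bullet (None)."""
    return payoff if rng.randrange(chambers) != 0 else None  # chamber 0 holds the bullet

results = [play_once() for _ in range(60_000)]
survival_rate = sum(r is not None for r in results) / len(results)
print(survival_rate)  # close to 5/6 ≈ 0.833
```

The observed outcome of any single trigger pull reveals little; only the full set of sample paths shows the one history that ends in an obituary.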
Reality is far more vicious than Russian roulette. First, it delivers the fatal bullet rather infrequently, like a revolver that would have hundreds, even thousands, of chambers instead of six. After a few dozen tries, one forgets about the existence of a bullet, under a numbing false sense of security.
Second, unlike a well-defined, precise game like Russian roulette, where the risks are visible to anyone capable of multiplying and dividing by six, one does not observe the barrel of reality. Very rarely is the generator visible to the naked eye. One is thus capable of unwittingly playing Russian roulette—and calling it by some alternative “low risk” name.
I do not dispute that arguments should be simplified to their maximum potential; but people often take the fact that a complex idea cannot be reduced to a media-friendly statement as symptomatic of a confused mind. MBAs learn the concept of clarity and simplicity—the five-minute-manager take on things. The concept may apply to the business plan for a fertilizer plant, but not to highly probabilistic arguments—which is why I have anecdotal evidence in my business that MBAs tend to blow up in financial markets, as they are trained to simplify matters a couple of steps beyond their requirement.
As a derivatives trader I noticed that people do not like to insure against something abstract; the risk that merits their attention is always something vivid.
It is also a scientific fact, and a shocking one, that both risk detection and risk avoidance are not mediated in the “thinking” part of the brain but largely in the emotional one (the “risk as feelings” theory). The consequences are not trivial: It means that rational thinking has little, very little, to do with risk avoidance. Much of what rational thinking seems to do is rationalize one’s actions by fitting some logic to them.
I reckon that I outgrew the desire to generate random runs every time I want to explore an idea—but by dint of playing with a Monte Carlo engine for years I can no longer visualize a realized outcome without reference to the nonrealized ones.
I have two ways of learning from history: from the past, by reading the elders; and from the future, thanks to my Monte Carlo toy.
As I mentioned above, it is not natural for us to learn from history…It is a platitude that children learn only from their own mistakes; they will cease to touch a burning stove only when they are themselves burned; no possible warning by others can lead to developing the smallest form of cautiousness.
Actually, things can be worse than that: In some respects we do not learn from our own history. Several branches of research have been examining our inability to learn from our own reactions to past events: For example, people fail to learn that their emotional reactions to past experiences (positive or negative) were short-lived—yet they continuously retain the bias of thinking that the purchase of an object will bring long-lasting, possibly permanent, happiness or that a setback will cause severe and prolonged distress (when in the past similar setbacks did not affect them for very long and the joy of the purchase was short-lived).
It has to do with the way our mind handles historical information. When you look at the past, the past will always be deterministic, since only one single observation took place. Our mind will interpret most events not with the preceding ones in mind, but the following ones.
Our minds are not quite designed to understand how the world works, but, rather, to get out of trouble rapidly and have progeny.
I will repeat this point until I get hoarse: A mistake is not something to be determined after the fact, but in the light of the information until that point.
But somehow, overall, history is potent enough to deliver, on time, in the medium to long run, most of the possible scenarios, and to eventually bury the bad guy. Bad trades catch up with you, it is frequently said in the markets. Mathematicians of probability give that a fancy name: ergodicity. It means, roughly, that (under certain conditions) very long sample paths would end up resembling each other. The properties of a very, very long sample path would be similar to the Monte Carlo properties of an average of shorter ones.
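A crude sketch of ergodicity: under an assumed i.i.d. process (Gaussian increments with a drift of 0.1, numbers chosen purely for illustration), the time average of one very long sample path lands close to the Monte Carlo average of many shorter ones:

```python
import random

rng = random.Random(3)  # fixed seed for reproducibility

def sample_mean(n_increments):
    """Time average of one sample path with i.i.d. Gaussian increments."""
    return sum(rng.gauss(0.1, 1.0) for _ in range(n_increments)) / n_increments

one_long_path = sample_mean(200_000)                      # a single very long path
shorter_paths = [sample_mean(2_000) for _ in range(100)]  # Monte Carlo: many shorter paths
ensemble_average = sum(shorter_paths) / len(shorter_paths)

print(one_long_path, ensemble_average)  # both close to the true drift of 0.1
```

This is only the benign, well-behaved case the "certain conditions" refer to; the point of the passage is that given enough time, the long path delivers most of what the ensemble contains.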
The difference between noise and information, the topic of this book (noise has more randomness), has an analog: the difference between journalism and history. To be competent, a journalist should view matters like a historian, and play down the value of the information he is providing, such as by saying: “Today the market went up, but this information is not too relevant as it emanates mostly from noise.”
The opportunity cost of missing a “new new thing” like the airplane and the automobile is minuscule compared to the toxicity of all the garbage one has to go through to get to these jewels (assuming these have brought some improvement to our lives, which I frequently doubt).
But while early on in my career such focus on noise would have offended me intellectually, as I would have deemed such information too statistically insignificant for the derivation of any meaningful conclusion, I currently look at it with delight. I am happy to see such mass-scale idiotic decision making, prone to overreaction in post-perusal investment orders—in other words, I currently see in the fact that people read such material insurance for my continuing in the entertaining business of option trading against the fools of randomness.
Let us manufacture a happily retired dentist, living in a pleasant, sunny town. We know a priori that he is an excellent investor, and that he will be expected to earn a return of 15% in excess of Treasury bills, with a 10% error rate per annum (what we call volatility). It means that out of 100 sample paths, we expect close to 68 of them to fall within a band of plus and minus 10% around the 15% excess return…
A 15% return with a 10% volatility (or uncertainty) per annum translates into a 93% probability of success in any given year. But seen at a narrow time scale, this translates into a mere 50.02% probability of success over any given second.
Over the very narrow time increment, the observation will reveal close to nothing. Yet the dentist’s heart will not tell him that. Being emotional, he feels a pang with every loss, as it shows in red on his screen. He feels some pleasure when the performance is positive, but not in an amount equivalent to the pain experienced when the performance is negative.
A minute-by-minute examination of his performance means that each day (assuming eight hours per day) he will have 241 pleasurable minutes against 239 unpleasurable ones…Now realize that if the unpleasurable minute is worse in reverse pleasure than the pleasurable minute is in pleasure terms, then the dentist incurs a large deficit when examining his performance at a high frequency.
Consider the situation where the dentist examines his portfolio only upon receiving the monthly account from the brokerage house. As 67% of his months will be positive, he incurs only four pangs of pain per annum and eight uplifting experiences. This is the same dentist following the same strategy. Now consider the dentist looking at his performance only every year. Over the next 20 years that he is expected to live, he will experience 19 pleasant surprises for every unpleasant one!
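The dentist's probabilities at every time scale follow from the normal distribution; the sketch below assumes a trading calendar of roughly 240 eight-hour days (an assumption, since the text does not fix the calendar):

```python
from math import erf, sqrt

MU, SIGMA = 0.15, 0.10  # 15% excess return, 10% volatility, both per annum

def p_positive(mu, sigma, t):
    """Probability that Gaussian performance over a horizon of t years is positive."""
    z = (mu / sigma) * sqrt(t)  # drift scales with t, volatility with sqrt(t)
    return 0.5 * (1 + erf(z / sqrt(2)))

horizons = {
    "year":   1.0,
    "month":  1 / 12,
    "day":    1 / 240,              # assumed trading calendar
    "minute": 1 / (240 * 8 * 60),
    "second": 1 / (240 * 8 * 3600),
}
for name, t in horizons.items():
    print(f"{name:>7}: {p_positive(MU, SIGMA, t):.2%}")
```

The yearly figure comes out near 93% and the per-second figure near 50.02%, matching the text; nothing changes between the rows except the frequency of observation.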
Over a short time increment, one observes the variability of the portfolio, not the returns. In other words, one sees the variance, little else. I always remind myself that what one observes is at best a combination of variance and returns, not just returns (but my emotions do not care about what I tell myself).
Finally, I reckon that I am not immune to such an emotional defect. But I deal with it by having no access to information, except in rare circumstances. Again, I prefer to read poetry.
The same methodology can explain why the news (the high scale) is full of noise and why history (the low scale) is largely stripped of it (though fraught with interpretation problems). This explains why I prefer not to read the newspaper (outside of the obituary), why I never chitchat about markets, and, when in a trading room, I frequent the mathematicians and the secretaries, not the traders. It explains why it is better to read The New Yorker on Mondays than The Wall Street Journal every morning (from the standpoint of frequency, aside from the massive gap in intellectual class between the two publications).
Some so-called wise and rational persons often blame me for “ignoring” possible valuable information in the daily newspaper and refusing to discount the details of the noise as “short-term events.” Some of my employers have blamed me for living on a different planet. My problem is that I am not rational and I am extremely prone to drown in randomness and to incur emotional torture. I am aware of my need to ruminate on park benches and in cafés away from information, but I can only do so if I am somewhat deprived of it. My sole advantage in life is that I know some of my weaknesses, mostly that I am incapable of taming my emotions facing news and incapable of seeing a performance with a clear head. Silence is far better.
At any point in time, the richest traders are often the worst traders. This I will call the cross-sectional problem: At a given time in the market, the most successful traders are likely to be those that are best fit to the latest cycle. This does not happen too often with dentists or pianists—because these professions are more immune to randomness.
How could traders who made every single mistake in the book become so successful? Because of a simple principle concerning randomness. This is one manifestation of the survivorship bias. We tend to think that traders were successful because they are good. Perhaps we have turned the causality on its head; we consider them good just because they make money. One can make money in the financial markets totally out of randomness.
Translating the idea in social terms, they believe that companies and organizations are, thanks to competition (and the discipline of the quarterly report), irreversibly heading toward betterment. The strongest will survive; the weakest will become extinct. As to investors and traders, they believe that by letting them compete, the best will prosper and the worst will go learn a new craft (like pumping gas or, sometimes, dentistry). Things are not as simple as that. We will ignore the basic misuse of Darwinian ideas in the fact that organizations do not reproduce like living members of nature—Darwinian ideas are about reproductive fitness, not about survival. The problem comes, as everything else in this book, from randomness.
Owing to the abrupt rare events, we do not live in a world where things “converge” continuously toward betterment. Nor do things in life move continuously at all.
Assume I engage in a gambling strategy that has 999 chances in 1,000 of making $ 1 (event A) and 1 chance in 1,000 of losing $ 10,000 (event B), as in Table 6.1. My expectation is a loss of close to $ 9 (obtained by multiplying the probabilities by the corresponding outcomes). The frequency or probability of the loss, in and by itself, is totally irrelevant; it needs to be judged in connection with the magnitude of the outcome. Here A is far more likely than B. Odds are that we would make money by betting for event A, but it is not a good idea to do so. This point is rather common and simple; it is understood by anyone making a simple bet. Yet I had to struggle all my life with people in the financial markets who do not seem to internalize it…How could people miss such a point? Mainly because much of people’s schooling comes from examples in symmetric environments, like a coin toss, where such a difference does not matter.
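The expectation in Table 6.1 is one line of arithmetic; a minimal sketch:

```python
# Event A: win $1 with probability 999/1000; event B: lose $10,000 with probability 1/1000
p_a, payoff_a = 999 / 1000, 1.0
p_b, payoff_b = 1 / 1000, -10_000.0

expectation = p_a * payoff_a + p_b * payoff_b
print(round(expectation, 3))  # -9.001: a loss of close to $9, despite winning 99.9% of the time
```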
Accordingly, bullish or bearish are terms used by people who do not engage in practicing uncertainty, like the television commentators, or those who have no experience in handling risk. Alas, investors and businesses are not paid in probabilities; they are paid in dollars. Accordingly, it is not how likely an event is to happen that matters, it is how much is made when it happens that should be the consideration. The frequency of the profit is irrelevant; it is the magnitude of the outcome that counts. It is a pure accounting fact that, aside from the commentators, very few people take home a check linked to how often they are right or wrong. What they get is a profit or loss. As to the commentators, their success is linked to how often they are right or wrong. This category includes the “chief strategists” of major investment banks the public can see on TV, who are nothing better than entertainers.
The best description of my lifelong business in the market is “skewed bets,” that is, I try to benefit from rare events, events that do not tend to repeat themselves frequently, but, accordingly, present a large payoff when they occur. I try to make money infrequently, as infrequently as possible, simply because I believe that rare events are not fairly valued, and that the rarer the event, the more undervalued it will be in price.
People in most fields outside of finance do not have problems eliminating extreme values from their sample, when the difference in payoff between different outcomes is not significant, which is generally the case in education and medicine…So people in finance borrow the technique and ignore infrequent events, not noticing that the effect of a rare event can bankrupt a company.
Where statistics becomes complicated, and fails us, is when we have distributions that are not symmetric…If there is a very small probability of finding a red ball in an urn dominated by black ones, then our knowledge about the absence of red balls will increase very slowly—more slowly than at the expected square root of n rate. On the other hand, our knowledge of the presence of red balls will dramatically improve once one of them is found. This asymmetry in knowledge is not trivial; it is central in this book…To assess an investor’s performance, we either need more astute, and less intuitive, techniques or we may have to limit our assessments to situations where our judgment is independent of the frequency of these events.
But there is even worse news. In some cases, if the incidence of red balls is itself randomly distributed, we will never get to know the composition of the urn. This is called “the problem of stationarity.” Think of an urn that is hollow at the bottom. As I am sampling from it, and without my being aware of it, some mischievous child is adding balls of one color or another. My inference thus becomes insignificant. I may infer that the red balls represent 50% of the urn while the mischievous child, hearing me, would swiftly replace all the red balls with black ones. This makes much of our knowledge derived through statistics quite shaky. The very same effect takes place in the market. We take past history as a single homogeneous sample and believe that we have considerably increased our knowledge of the future from the observation of the sample of the past. What if vicious children were changing the composition of the urn? In other words, what if things have changed?
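The slow accumulation of knowledge about the absence of red balls can be illustrated by simulating how long it takes to see the first one (the 1% incidence and trial count are illustrative assumptions, not figures from the text):

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

def draws_until_first_red(p_red, rng):
    """Number of draws (with replacement) until the first red ball appears."""
    n = 0
    while True:
        n += 1
        if rng.random() < p_red:
            return n

trials = [draws_until_first_red(0.01, rng) for _ in range(10_000)]
mean_draws = sum(trials) / len(trials)
print(mean_draws)  # on average about 100 draws before one red ball shows up
```

Until that first red ball, every sample looks identical to one drawn from an all-black urn; one observation later, the hypothesis that there are no red balls is dead. And if the child keeps swapping balls underneath, even this asymmetric learning breaks down.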
In his Treatise on Human Nature, the Scots philosopher David Hume posed the issue in the following way (as rephrased in the now famous black swan problem by John Stuart Mill): No amount of observations of white swans can allow the inference that all swans are white, but the observation of a single black swan is sufficient to refute that conclusion.
Whenever I hear “work ethic” I interpret inefficient mediocrity.
There are only two types of theories: theories that are known to be wrong, as they were tested and adequately rejected (he calls them falsified); and theories that have not yet been known to be wrong, not falsified yet, but are exposed to being proved wrong.
Why is a theory never right? Because we will never know if all the swans are white (Popper borrowed the Kantian idea of the flaws in our mechanisms of perception). The testing mechanism may be faulty. However, the statement that there is a black swan is possible to make. A theory cannot be verified. To paraphrase baseball coach Yogi Berra again, past data has a lot of good in it, but it is the bad side that is bad.
The philosopher Pascal proclaimed that the optimal strategy for humans is to believe in the existence of God. For if God exists, then the believer would be rewarded. If he does not exist, the believer would have nothing to lose. Accordingly, we need to accept the asymmetry in knowledge; there are situations in which using statistics and econometrics can be useful. But I do not want my life to depend on it.
If the science of statistics can benefit me in anything, I will use it. If it poses a threat, then I will not. I want to take the best of what the past can give me without its dangers. Accordingly, I will use statistics and inductive methods to make aggressive bets, but I will not use them to manage my risks and exposure. Surprisingly, all the surviving traders I know seem to have done the same. They trade on ideas based on some observation (that includes past history) but, like the Popperian scientists, they make sure that the costs of being wrong are limited (and their probability is not derived from past data).
This problem enters the business world more viciously than it does other walks of life, owing to the high dependence on randomness...The greater the number of businessmen, the greater the likelihood of one of them performing in a stellar manner just by luck. I have rarely seen anyone count the monkeys. In the same vein, few count the investors in the market in order to calculate, instead of the probability of success, the conditional probability of successful runs given the number of investors in operation over a given market history.
Aside from the misperception of one’s performance, there is a social treadmill effect: You get rich, move to rich neighborhoods, then become poor again. To that add the psychological treadmill effect; you get used to wealth and revert to a set point of satisfaction.
The mistake of ignoring the survivorship bias is chronic, even (or perhaps especially) among professionals. How? Because we are trained to take advantage of the information that is lying in front of our eyes, ignoring the information that we do not see.
The first counterintuitive point is that a population entirely composed of bad managers will produce a small number of great track records. As a matter of fact, assuming the manager shows up unsolicited at your door, it will be practically impossible to figure out whether he is good or bad. The results would not markedly change even if the population were composed entirely of managers who are expected in the long run to lose money. Why? Because owing to volatility, some of them will make money. We can see here that volatility actually helps bad investment decisions.
The second counterintuitive point is that the expectation of the maximum of track records, with which we are concerned, depends more on the size of the initial sample than on the individual odds per manager. In other words, the number of managers with great track records in a given market depends far more on the number of people who started in the investment business (in place of going to dental school) than on their ability to produce profits. It also depends on the volatility.
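Both counterintuitive points can be checked with a toy population of managers who all lose money in expectation (the 45% win probability, population size, and horizon are illustrative assumptions):

```python
import random

rng = random.Random(7)  # fixed seed for reproducibility

N_MANAGERS, YEARS, P_WIN = 10_000, 5, 0.45  # every manager has a losing edge

# A "star" is a manager with five winning years in a row
stars = sum(
    all(rng.random() < P_WIN for _ in range(YEARS))
    for _ in range(N_MANAGERS)
)
print(stars)  # roughly N_MANAGERS * P_WIN**YEARS, i.e., on the order of 184 flawless records
```

Halve the initial population and the number of stars roughly halves with it, regardless of anyone's skill: the maximum of the track records is a property of the sample size, not of the managers.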
In real life, the larger the deviation from the norm, the larger the probability of it coming from luck rather than skills.
We can apply the reasoning to the selection of investment categories…Assume you are standing in 1900 with hundreds of investments to look at…A rational person would have bought not just the emerging country of the United States, but those of Russia and Argentina as well. The rest of the story is well-known; while many of the stock markets like those of the United Kingdom and the United States fared extremely well, the investor in Imperial Russia would have had nothing better than medium-quality wallpaper in his hands. The countries that fared well are not a large segment of the initial cohort; randomness would be expected to allow a few investment classes to fare extremely well. I wonder if those “experts” who make foolish (and self-serving) statements like “markets will always go up in any twenty-year period” are aware of this problem.
Researchers frequently use the example of QWERTY to describe the vicious dynamics of winning and losing in an economy, and to illustrate how the final outcome is more than frequently the undeserved one. The arrangement of the letters on a typewriter is an example of the success of the least deserving method. For our typewriters have the order of the letters on their keyboard arranged in a nonoptimal manner, as a matter of fact in such a nonoptimal manner as to slow down the typing rather than make the job easy, in order to avoid jamming the type bars of machines designed in less electronic days. Therefore, as we started building better typewriters and computerized word processors, several attempts were made to rationalize the computer keyboard, to no avail. People were trained on a QWERTY keyboard and their habits were too sticky for change. Just like the helical propulsion of an actor into stardom, people patronize what other people like to do. Forcing rational dynamics on the process would be superfluous, nay, impossible. This is called a path-dependent outcome, and has thwarted many mathematical attempts at modeling behavior.
Studies of the dynamics of networks have mushroomed recently. They became popular with Malcolm Gladwell’s book The Tipping Point, in which he shows how the behavior of some variables, such as epidemics, spreads extremely fast once some unspecified critical level is passed.
While it is clear that the world produces clusters, it is also sad that these may be too difficult to predict (outside of physics) for us to take their models seriously. Once again the important fact is knowing the existence of these nonlinearities, not trying to model them. The value of the great Benoit Mandelbrot’s work lies more in telling us that there is a “wild” type of randomness of which we will never know much (owing to its unstable properties).
Consider that your brain reacts differently to the same situation depending on which chapter you open to. The absence of a central processing system makes us engage in decisions that can be in conflict with each other.
You may prefer apples to oranges, oranges to pears, but pears to apples—it depends on how the choices are presented to you. The fact that your mind cannot retain and use everything you know at once is the cause of such biases. One central aspect of a heuristic is that it is blind to reasoning.
This dependence on the local rather than the global status (coupled with the effect of the losses hitting harder than the gains) has an impact on your perception of well-being. Say you get a windfall profit of $ 1 million. The next month you lose $ 300,000. You adjust to a given wealth (unless of course you are very poor) so the following loss would hurt you emotionally, something that would not have taken place if you received the net amount of $ 700,000 in one block, or, better, two sums of $ 350,000 each.
In addition, it is easier for your brain to detect differences rather than absolutes, hence rich or poor will be (above the minimum level) in relation to something else. Now, when something is in relation to something else, that something else can be manipulated. Psychologists call this effect of comparing to a given reference anchoring. If we take it to its logical limit we would realize that, because of this resetting, wealth itself does not really make one happy (above, of course, some subsistence level); but positive changes in wealth may, especially if they come as “steady” increases.
There is another type of satisfaction provided by the option seller. It is the steady return and the steady feeling of reward—what psychologists call flow. It is very pleasant to go to work in the morning with the expectation of being up some small money. It requires some strength of character to accept the expectation of bleeding a little, losing pennies on a steady basis even if the strategy is bound to be profitable over longer periods. I noticed that very few option traders can maintain what I call a “long volatility” position, namely a position that will most likely lose a small quantity of money at expiration, but is expected to make money in the long run because of occasional spurts. I discovered very few people who accepted losing $ 1 for most expirations and making $ 10 once in a while, even if the game were fair (i.e., they made the $ 10 more than 9.1% of the time).
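The 9.1% figure is simply the break-even frequency of the big payoff; a one-line sketch:

```python
# Lose $1 at most expirations, make $10 occasionally; the game is fair when the
# expected value is zero:  p * 10 - (1 - p) * 1 = 0  =>  p = 1 / 11
small_loss, big_gain = 1.0, 10.0
p_fair = small_loss / (big_gain + small_loss)
print(f"{p_fair:.1%}")  # 9.1%: any higher frequency of the $10 payoff makes the bet favorable
```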
How could professionals seemingly aware of the (simple) mathematics be put in such a position? Our actions are not quite guided by the parts of our brain that dictate rationality. We think with our emotions and there is no way around it.
The most common one concerns the interpretation of evidence: people mix up absence of evidence with evidence of absence. How? Say I test some chemotherapy, for instance Fluorouracil, for upper respiratory tract cancer, and find that it is better than a placebo, but only marginally so; that (in addition to other modalities) it improves survival from 21 per 100 to 24 per 100. Given my sample size, I may not be confident that the additional 3% survival points come from the medicine; it could be merely attributable to randomness. I would write a paper outlining my results and saying that there is no evidence of improved survival (as yet) from such medicine, and that further research would be needed.
A medical journalist would pick it up and claim that one Professor N. N. Taleb found evidence that Fluorouracil does not help, which is entirely opposite to my intentions. Some naive doctor in Smalltown, even more uncomfortable with probabilities than the most untrained journalist, would pick it up and build a mental block against the medication, even when some researcher finally finds fresh evidence that such medicine confers a clear survival advantage.
I often heard statements such as “the market is only 10% off its highs while the average stock is close to 40% off its highs,” which is intended to be indicative of deep troubles or anomalies—some harbinger of bear markets. There is no incompatibility between the fact that the average stock is down 40% from its highs while the average of all stocks (that is, the market) is down 10% from its own highs. One must consider that the stocks did not all reach their highs at the same time. Given that stocks are not 100% correlated, stock A might reach its maximum in January, stock B might reach its maximum in April, but the average of the two stocks A and B might reach its maximum at some time in February. Furthermore, in the event of negatively correlated stocks, if stock A is at its maximum when stock B is at its minimum, then they could both be down 40% from their maximum when the stock market is at its highs! By a law of probability called distribution of the maximum of random variables, the maximum of an average is necessarily less volatile than the average maximum.
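The closing inequality, that the maximum of an average sits below the average of the maxima, holds pointwise and can be seen on simulated random-walk "stocks" (the walk parameters are purely illustrative):

```python
import random

rng = random.Random(1)  # fixed seed for reproducibility

def random_walk(n_steps, rng):
    """A toy stock path: cumulative sum of Gaussian daily moves."""
    level, path = 0.0, []
    for _ in range(n_steps):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

n_steps, n_stocks, n_trials = 250, 10, 200
total_gap = 0.0
for _ in range(n_trials):
    walks = [random_walk(n_steps, rng) for _ in range(n_stocks)]
    index = [sum(w[t] for w in walks) / n_stocks for t in range(n_steps)]  # the "market"
    avg_of_maxima = sum(max(w) for w in walks) / n_stocks  # average stock at its own high
    max_of_average = max(index)                            # the market at its own high
    total_gap += avg_of_maxima - max_of_average

avg_gap = total_gap / n_trials
print(avg_gap)  # positive: each stock's own high exceeds the market's high on average
```

The reason is purely structural: the index at any date is the average of the stocks on that same date, so it can never exceed the average of their separate highs; the highs simply do not line up in time.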