Econometrics

Exploiting Patterns in the
S&P 500 and DJI Indices:
How to Beat the Market
by: Marco Folpmers

Figure 1. S&P index showing local peaks (star) and local troughs (box). S&P 500 time series from August 2007 to July 2009 (horizontal axis: day; vertical axis: index level).
Whether stock prices follow a random walk has been the central question in finance in recent decades. The question is relevant for two reasons. First, the random walk assumption is inherent in many valuation formulas (especially for derivatives); second, an unpredictable path is impossible for smart traders to exploit. Recently, the consensus between supporters and opponents of the random walk seems to be that the walk is not random and that some patterns are present, but that it is very hard to exploit these regularities. In one of his famous articles about market anomalies, professor Richard H. Thaler concludes (Thaler, 1987, 200): ‘A natural question to ask is whether these anomalies imply profitable trading strategies. The question turns out to be difficult to answer. […] None of the anomalies seem to offer enormous opportunities for private investors (with normal transaction costs).’ More recently, in 2002, Thaler again indicated that it is hard to take advantage of mispricings because it might take too long for prices to return to a more sensible level.¹
Introduction
In this article, we will show how it is possible to identify
local peaks and local troughs in the Standard & Poor’s 500
index, how to predict these peaks and troughs and how
to exploit them with the help of a fairly straightforward
algorithm. We will compare the performance of the
algorithm with a buy-and-hold strategy and demonstrate
that the algorithm outperforms the buy-and-hold strategy
dramatically.
The great debate: do stock prices follow a random walk?
Proponents of the Efficient Market Hypothesis claim that stock prices follow a random walk and that it should be impossible to predict future movements based on publicly available information. The idea that stock prices are unpredictable and follow a random walk (Geometric Brownian Motion) around their intrinsic values is a fundamental element of the Black-Scholes formula for call and put option valuation and of a series of formulas (collectively referred to as Black’s formula) for other types of derivatives such as interest rate derivatives.

Marco Folpmers
Dr. Marco Folpmers FRM works for Capgemini Consulting and leads the Financial Risk Management service line of Capgemini Consulting NL. He can be reached at marco.folpmers@capgemini.com.
Some evidence has been found that stock prices may depart from the random walk (e.g. the ‘January effect’: stock returns in January tend to exceed those in the other months, see Thaler, 1987), but the departures found are difficult to explain. On top of that, they are often dismissed as accidental patterns without any meaning, of the kind that can easily be found when data are abundant. Proponents of the Efficient Market Hypothesis (especially University of Chicago professor Eugene Fama) have not tired of explaining away apparent departures from the unpredictable random walk. In 1994, Merton Miller ascribed apparent mean reversion in the Standard & Poor’s 500 index to ‘a statistical illusion’ (Miller et al., 1994).
On the other hand, followers of the Behavioral Finance school highlight certain inefficiencies, among them overreactions to information after which an adjustment takes place.² The phenomenon that new information leads to an extreme reaction that is adjusted later on is consistent with short-term mean-reverting behavior (De Bondt & Thaler, 1987). However, as cited above, Richard Thaler, one of the founders of the Behavioral Finance school, concedes that, even though inefficiencies may be pointed out, it is nearly impossible to profit from them.

¹ Richard Thaler and Burton Malkiel debated in 2002 at Wharton; see http://knowledge.wharton.upenn.edu/article.cfm?articleid=651.

AENORM vol. 17 (66) December 2009

Table 1. Maximum model
Dependent variable: L_MAX

Predictor      Beta      Std Error   T         P
Constant       (6.43)    1.24        (5.18)    0.00
GR             37.94     18.92       2.01      0.04
UP             1.07      0.30        3.61      0.00
DST_LST_MIN    0.06      0.04        1.48      0.14
DST_LST_MAX    0.08      0.05        1.54      0.12
In this article we test whether mean reversion is apparent in stock index data and, if so, how it can be exploited. Our analysis departs from previous research in that we are not interested in predicting the level of the index, but only in predicting the binary attribute of whether or not the index is at a local extreme.
The algorithm: the minimum and maximum model
The underlying idea behind the algorithm is that the
index time series can be modeled as an oscillation with
unpredictable amplitude but with predictable frequency.
Our aim is to identify local peaks and troughs and not the
level of these local extremes. The algorithm is applied
to index data since index series are less influenced by
idiosyncratic factors.
First we define a local peak (L_MAX_t) in the daily opening prices of the Standard & Poor’s 500 Index (I_t) as an observation that is the maximum of the d preceding and the d following observations, so:

$$
\text{L\_MAX}_t =
\begin{cases}
1, & \text{if } I_t = \max(I_{t-d}, I_{t-d+1}, \ldots, I_{t+d}) \\
0, & \text{otherwise}
\end{cases}
$$

The local trough is defined analogously, with the minimum in place of the maximum.
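The definition above can be sketched in Python as follows. This is a minimal illustration, not the author's code: the function name `local_extremes` and the toy price series are hypothetical, and leaving the first and last d observations unflagged is an assumption, since the article does not say how the edges of the series are treated.

```python
import numpy as np


def local_extremes(prices, d=6):
    """Flag local peaks and troughs with a window of d observations on each side.

    Returns two 0/1 integer arrays (l_max, l_min). Observations closer than d
    to either end of the series stay 0 because the full window is not
    available there (an assumption, not stated in the article).
    """
    prices = np.asarray(prices, dtype=float)
    n = len(prices)
    l_max = np.zeros(n, dtype=int)
    l_min = np.zeros(n, dtype=int)
    for t in range(d, n - d):
        window = prices[t - d : t + d + 1]  # I_{t-d}, ..., I_{t+d}
        if prices[t] == window.max():
            l_max[t] = 1
        if prices[t] == window.min():
            l_min[t] = 1
    return l_max, l_min


# Toy series, evaluated with d=2 instead of the article's default d=6
toy_prices = [1, 2, 5, 3, 2, 1, 0, 1, 2, 4, 3, 2]
l_max, l_min = local_extremes(toy_prices, d=2)
print(l_max.tolist())  # peaks at positions 2 and 9
print(l_min.tolist())  # trough at position 6
```

Because the window looks d observations ahead, the label for day t is only known d days later, which is why a predictive model for the labels is needed before they can be traded on.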
We have initialized d to a default value of 6 and applied the definitions of the local peak and trough to a 2-year time series of the S&P 500, running from August 1, 2007 to July 31, 2009. The result is shown in Figure 1, where local peaks are marked with a star and local troughs with a box.

Within the period shown, the index reached its (global) maximum value of 1564.98 on October 10, 2007, and its (global) minimum value of 679.28 on March 10, 2009, a decline of 57% in 17 months. We have also split the sample into a development set (the first 250 observations) and a test set (the last 250 observations). The split is depicted with the help of a vertical line.

Table 1 (continued). Maximum model: model performance

Pairs         Nr          %
Concordant    2,947.00    91.5%
Discordant    267.00      8.3%
Ties          6.00        0.2%
Total         3,220.00    100.0%
In order to predict a local extreme, we estimate two
models, a maximum model and a minimum model, with
the help of the development set.
We define the explanatory variables for the maximum model as follows:
• GR_t: growth rate of the index, determined as GR_t = I_t / I_{t-d} - 1;
• UP_t: number of successive upward movements at time t;
• DST_LST_MIN_t: distance of observation t to the most recent minimum before time t;
• DST_LST_MAX_t: distance of observation t to the most recent maximum before time t.
Note that we use only data that is contained in the series itself.
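As a rough illustration, the explanatory variables listed above could be computed as in the sketch below. The function names are hypothetical. The growth rate is read as I_t / I_{t-d} - 1, and "before time t" is read as strictly earlier than t; both readings are assumptions about the article's notation. The 0/1 flags passed to `distance_to_last` are assumed to come from the local peak and trough definition.

```python
import numpy as np


def growth_rate(prices, d=6):
    """GR_t = I_t / I_{t-d} - 1; NaN for the first d observations."""
    prices = np.asarray(prices, dtype=float)
    gr = np.full(len(prices), np.nan)
    gr[d:] = prices[d:] / prices[:-d] - 1.0
    return gr


def run_length_up(prices):
    """UP_t: number of consecutive day-on-day increases ending at t."""
    prices = np.asarray(prices, dtype=float)
    up = np.zeros(len(prices), dtype=int)
    for t in range(1, len(prices)):
        up[t] = up[t - 1] + 1 if prices[t] > prices[t - 1] else 0
    return up


def distance_to_last(flags):
    """Distance from t to the most recent flagged observation strictly before t.

    NaN where no flagged observation has occurred yet.
    """
    dist = np.full(len(flags), np.nan)
    last = None
    for t, flag in enumerate(flags):
        if last is not None:
            dist[t] = t - last
        if flag:
            last = t
    return dist


# Toy inputs (hypothetical), with d=2 for brevity
toy = [10, 11, 12, 11, 12, 13, 14]
gr = growth_rate(toy, d=2)
up = run_length_up(toy)
dst = distance_to_last([0, 0, 1, 0, 0, 1, 0])
print(up.tolist())  # [0, 1, 2, 0, 1, 2, 3]
```

One design point to keep in mind: a local extreme at time s is only confirmed at s + d, so a live implementation would have to decide whether the distance variables may use extremes that are not yet confirmed. The article does not address this.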
For the minimum model we use the same explanatory variables with one exception: DOWN_t (number of successive downward movements at time t) is used instead of UP_t.

We now estimate a logit maximum model with L_MAX_t as the dependent variable and the explanatory variables mentioned above as independent variables. The estimation is performed on the development set. The results of the estimation are reported in Table 1.

The independent variables GR_t and UP_t are both significant at the 5% level. The model concordance is high, 91.5%. We can also illustrate this concordance as follows: with the help of the estimated betas we calculate the logit scores as:

$$\operatorname{logit}(p\_max_t) = \log\left(\frac{p\_max_t}{1 - p\_max_t}\right) = (X\_max \cdot b\_max)_t$$

In this equation, p_max_t is the estimated probability that observation t is a maximum according to the maximum model, X_max is the matrix containing the explanatory variables for the maximum model, preceded by a column of ones, and b_max is the vector of coefficients of the maximum model reported in Table 1.

The estimated probabilities are:

$$p\_max_t = \frac{\exp\left((X\_max \cdot b\_max)_t\right)}{1 + \exp\left((X\_max \cdot b\_max)_t\right)}$$

With the help of a cut-off value equal to 0.1, we identify the observations that are flagged as a local peak (p_max_t above 0.1). Of course, there is a trade-off in determining the cut-off value. If it is too high, the model will more often fail to identify a local peak, while, on the other hand, if it is too low, it will generate many ‘false alarms’, observations wrongly flagged as a local peak.

A minimum model has been estimated using the development set in an analogous way (only using DOWN_t instead of UP_t).

The trading strategy

The trading strategy works as follows:
• The initial liquidity balance equals € 10,000. For each trading day, a liquidity balance is maintained as well as the number of index shares in the portfolio and their value at current prices.
• When a local minimum has been identified, the algorithm buys stocks at the current prices for a monetary amount of 10% of the initial liquidity balance, i.e. € 1,000. The liquidity balance decreases by € 1,000 and the stocks bought are added to the portfolio.
• When a local maximum has been identified, the algorithm sells stocks at the current prices for a monetary amount of 10% of the initial liquidity balance, i.e. € 1,000. The liquidity balance increases by € 1,000 and the stocks sold are subtracted from the portfolio.
• The entire portfolio is liquidated at the end of the period contained in the test set. The performance is assessed in terms of one-year outperformance relative to a buy-and-hold strategy.

We have first calibrated the parameters for application of the algorithm within the development set. Thus, we arrived at d = 6 (as stated above) and a conversion rate of 10% of the initial balance at suspected peaks (conversion from stocks to liquidity) and troughs (conversion from liquidity to stocks). Generally, a higher conversion rate leads to a more volatile performance of the algorithm. Lower values of d lead to a more active algorithm, i.e. more suspected peaks and troughs and, hence, more trading (conversion of liquidity to stocks or vice versa).

Whether a profit can be made from the algorithm can only be illustrated by applying the algorithm, i.e. the minimum and maximum models estimated with the help of the development set and the parameter settings for d, the cut-off (0.1) and the conversion rate, to the subsequent test set. In Figure 2 we show the relative performance of the algorithm when applied to the test set.

Figure 2. Performance of the algorithm for the S&P 500 versus buy-and-hold for the test set; left panel: value development ("Value algo and buy & hold (dotted line)", value against day), right panel: number of index stocks in portfolio ("Nr shares algo and Buy & Hold (dotted line)", number of shares against day).

From the figure we conclude that the algorithm starts at a loss, but that its value is almost always above the value of the buy-and-hold portfolio. The one-year return of the buy-and-hold strategy is -20.2%; the one-year return of the algorithm is 21.0%. The outperformance equals 51.7% (consistent, up to rounding, with the ratio of the two final values minus one: 1.210 / 0.798 - 1).

To test robustness, we have also applied the same approach to the Dow Jones Industrial Average index (DJI): the minimum and maximum models have been estimated on the same development period as for the S&P 500 index and subsequently applied to the same test period. The results are very similar to those shown for the S&P 500 index. The outperformance of the algorithm applied to the test set of the DJI equals 59.3%.

An objection that could be made is that we only tested the algo with the help of one test set. In order to counter this objection, we have performed additional tests: we have split the development and test set not only at the 250th observation, but at all observations on the domain [250, 350]. We have plotted the outperformance for all these 101 test sets in Figure 3.

Figure 3. 101 test sets ("Split between development set and test set"; outperformance against the observation used for the split).

We conclude that the algo consistently outperforms the buy-and-hold strategy. The average outperformance equals 29.8% and there are only 4 cases in which the buy-and-hold strategy outperforms the algo.

In none of the applications have we quantified the transaction costs of the trading activity of the algorithm. However, we believe that the outperformance is so large that transaction costs would have no significant impact on the results.

Discussion

In this paper we have referred to the claim that, although underlying patterns may be present in stock price development, it is impossible to profit from these patterns. We have shown with the help of a straightforward algorithm that this claim is untenable.

² For the information overreaction hypothesis tested as mean reversion, see also De Bondt & Thaler, 1989. De Bondt & Thaler test mean reversion in the long run (3-7 years; see also Cutler, 1991, for long-term mean reversion). See also Balvers et al. (2000), where it is concluded that there is strong evidence of mean reversion in stock index prices of 18 countries (16 OECD countries plus Hong Kong and Singapore) over several years. Our purpose is to describe an algorithm that exploits mean reversion within days. Short-term mean reversion for individual stocks has mainly been tested after an extreme performance.
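The scoring and trading rules described in the sections above can be sketched in Python as follows. This is a hedged illustration under several assumptions, not the author's implementation. The parenthesised constant "(6.43)" in Table 1 is read as -6.43. The function names `peak_probability` and `simulate` are hypothetical. A trade is skipped when cash or holdings are insufficient, and the buy-and-hold benchmark invests the full initial balance on the first day; the article does not spell out either point.

```python
import numpy as np

# Maximum-model coefficients as printed in Table 1, in the order
# Constant, GR, UP, DST_LST_MIN, DST_LST_MAX. Reading the parenthesised
# constant "(6.43)" as -6.43 is an assumption.
B_MAX = np.array([-6.43, 37.94, 1.07, 0.06, 0.08])
CUT_OFF = 0.1  # cut-off value stated in the article


def peak_probability(features, beta=B_MAX):
    """Return p_max for rows of [GR, UP, DST_LST_MIN, DST_LST_MAX]."""
    features = np.atleast_2d(np.asarray(features, dtype=float))
    x = np.column_stack([np.ones(len(features)), features])  # prepend column of ones
    score = x @ beta                     # logit score, one value per row
    return 1.0 / (1.0 + np.exp(-score))  # equals exp(score) / (1 + exp(score))


def simulate(prices, buy_flags, sell_flags, initial_cash=10_000.0, rate=0.10):
    """Apply the fixed-amount rule; return final values (algorithm, buy-and-hold)."""
    prices = np.asarray(prices, dtype=float)
    amount = rate * initial_cash  # 1,000 per trade for the article's settings
    cash, shares = initial_cash, 0.0
    for price, buy, sell in zip(prices, buy_flags, sell_flags):
        if buy and cash >= amount:                # suspected trough: cash -> stocks
            cash -= amount
            shares += amount / price
        elif sell and shares * price >= amount:   # suspected peak: stocks -> cash
            cash += amount
            shares -= amount / price
    algo_value = cash + shares * prices[-1]       # liquidate at the end of the period
    hold_value = initial_cash * prices[-1] / prices[0]
    return algo_value, hold_value


# Hypothetical feature rows: a strong run-up versus a quiet day
p = peak_probability([[0.05, 3, 6, 12], [0.0, 0, 1, 1]])
flagged = p > CUT_OFF

# Hypothetical price path with one flagged trough (day 1) and one flagged peak (day 3)
algo, hold = simulate([100, 80, 100, 120, 90], [0, 1, 0, 0, 0], [0, 0, 0, 1, 0])
outperformance = algo / hold - 1.0
print(flagged.tolist(), round(algo, 2), round(hold, 2))
```

In a full pipeline, `buy_flags` would come from the minimum model and `sell_flags` from the maximum model, each thresholded at the cut-off. The sketch ignores transaction costs, as the article does.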
References
Balvers, R., Y. Wu, E. Gilliland. “Mean reversion across national stock markets and parametric contrarian investment strategies.” Journal of Finance, 55.2 (2000)
Bondt, W.F.M. de, R.H. Thaler. “Further evidence on investor overreaction and stock market seasonality.” Journal of Finance, 42.3 (1987):557-581
Bondt, W.F.M. de, R.H. Thaler. “Anomalies: a mean-reverting walk down Wall Street.” Journal of Economic Perspectives, 3.1 (1989):189-202
Cutler, D.M., J.M. Poterba, L.H. Summers. “Speculative dynamics.” Review of Economic Studies, 58 (1991):529-546
Fama, E. “Random Walks In Stock Market Prices.” Financial Analysts Journal, 21.5 (1965):55-59
Folpmers, M. “Making money in a downturn economy: using the overshooting mechanism of stock prices for an investment strategy.” Journal of Asset Management, 10.1 (2009):1-8
Miller, M.H., J. Muthuswamy, R.E. Whaley. “Mean reversion of Standard & Poor’s 500 Index basis changes: arbitrage-induced or statistical illusion?” Journal of Finance, 49.2 (1994):479-513
Poterba, J.M., L.H. Summers. “Mean reversion in stock prices: evidence and implications.” NBER Working Paper Series, w2343 (1989). Available at SSRN: http://ssrn.com/abstract=227278
Thaler, R.H. “Anomalies: the January effect.” Journal of Economic Perspectives, 1.1 (1987):197-201
Zeira, J. “Informational overshooting, booms and crashes.” Journal of Monetary Economics, 43.1 (1999):237-257