How to Identify and Predict Bull and Bear Markets?∗ Erik Kole† Dick J.C. van Dijk Econometric Institute, Erasmus School of Economics, Erasmus University Rotterdam February 1, 2012 Abstract This paper compares fundamentally different methods to identify and predict the state of the equity market. Because this state is a comprehensive economic indicator, good identification and prediction are important for financial decisions and economic analyses. We consider rules-based methods that purely reflect the direction of the market, and regime-switching models that take both average returns and volatility into account. Because the state of the equity market is latent, we develop a novel framework for statistical and economic comparisons of the different methods. Rules-based methods are preferable for identification, because ex post only the direction of the market matters. Regime-switching models perform significantly better in making predictions. Focusing on average returns and volatilities leads to more prudent forecasts and a higher utility from the perspective of risk-averse investor who engages in market timing. Both the statistical and economic distance measures indicate that these differences are significant. ∗ We thank Christophe Boucher, Thijs Markwat and seminar participants at the 8th International Paris Finance Meeting, Inquire’s UK Autumn Seminar 2010 and Erasmus University for helpful comments and discussions. We thank Anne Opschoor for skillful research assistance and Inquire UK for financial support. Kole thanks the Netherlands Scientific Organisation (NWO) for financial support. † Corresponding author. Address: Burg. Oudlaan 50, Room H11-13, P.O. Box 1738, 3000DR Rotterdam, The Netherlands, Tel. +31 10 408 12 58. E-mail addresses kole@ese.eur.nl (Kole) and djvandijk@ese.eur.nl (Van Dijk). 1 1 Introduction The state of the equity market, often referred to as bullish or bearish, is an important variable in finance and more generally in economic analysis. Despite its importance, the literature does not offer a single preferred method to identify and predict the state of the equity markets. In this paper, we compare several existing methods that can identify and predict bull and bear markets. These methods differ in their use of the characteristics of bull and bear markets. During bull markets prices gradually rise and volatility is low, while during bear markets prices can fall dramatically and volatility is high. Our comparison offer the main insight that for ex post identification, methods that focus explicitly on price increases and decreases work best, whereas for ex ante predictions, methods that combine the trend in prices with volatility are preferable. Information on the state of the equity market is obviously relevant for agents who are closely involved in financial markets. Investors may follow a market timing strategy, with a long position in the equity market when it is bullish, and a neutral or short position position when it is bearish. Investors that do not engage in market timing strategies can incorporate the different behavior of stock returns dependent on market sentiment (see Perez-Quiros and Timmermann, 2000) in their risk management. Firms prefer to issue new equity during bull markets. For regulators, the state of the equity market is important, because it can affect the credit supply with destabilizing effects on the real economy. When financial assets are used as collateral, bull markets extend the credit supply, while bear markets reduce it (see Rigobon and Sack, 2003; Bohl et al., 2007). Finally, bull and bear markets impact asset pricing, as they are an important source for time variation in risk premia (see, for example, Veronesi, 1999; Gordon and St-Amour, 2000; Ang et al., 2006). The importance of bull and bear markets concerns economic analysis in general, and not just investments. As argued by Stock and Watson (2003a), stock prices can predict macroeconomic variables, as they are discounted future dividends. As far back as Mitchell and Burns (1938), the state of the stock market is considered as an ingredient for leading indicators of the business cycle (see Marcellino, 2006, for an overview). Harvey (1989); Estrella and Mishkin (1998); Chauvet (1999) and Stock and Watson (2003b) report evidence that the state of stock market helps predicting the business cycle. 2 The literature offers two fundamentally different types of methods to identify and predict the state of equity markets: non-parametric ones based on rules and fully parametric ones based on models. Rules-based methods are more transparent than model-based methods that need statistical inferences, and more robust to misspecification. On the other hand, a completely specified model for the price process on the equity market offers more insight and its quality can be evaluated by statistical techniques, whereas rules-based methods typically require some arbitrary, subjective settings. As a final difference, model-based techniques are statistically more efficient, because they treat identification and prediction in one step, whereas rules-based methods are two-step approaches. We develop new techniques to compare the identifications and predictions of the different methods. Existing techniques, for example to test for predictive ability, do not suffice, because the state of the equity market is latent. There is no generally accepted chronology of bull and bear markets like the NBER chronology of recessions. To determine the difference between identifications and predictions of the methods, we propose a general statistical distance measure and a more specific economic measure. As a statistical measure we propose the Integrated Absolute Difference (IAD) between identifications or predictions. It is closely related to the Integrated Square Difference of Pagan and Ullah (1999) and Sarno and Valente (2004), though easier to interpret as a difference in probability. The economic measure quantifies the preference for an method over another method. We determine this measure as the fee that a risk-averse investor would maximally pay to switch from one method to another to engage in marketing timing. In the category of rules-based methods, we consider Pagan and Sossounov (2003) and Lunde and Timmermann (2004). These non-parametric methods first determine local peaks and troughs in a time series of asset prices, and then use rules to select those peaks and troughs that constitute genuine turning points between bull and bear markets. They are based on the algorithms used to date recessions and expansions in business cycle research (see Bry and Boschan, 1971, among others), and have been adapted in different ways for application in financial markets. The main rule in Pagan and Sossounov (2003) is the requirement of a minimum length of bull and bear cycles and phases.1 By contrast, 1 See Edwards et al. (2003); G´ omez Biscarri and P´erez de Gracia (2004); Candelon et al. (2008); Chen (2009) and Kaminsky and Schmukler (2008) for applications. 3 Lunde and Timmermann (2004) impose a minimum on the price change since the last peak or trough.2 In the second category, we analyze Markov regime-switching models pioneered by Hamilton (1989, 1990). In these models returns behave differently, depending on a discrete latent state process that follows a Markov chain. We consider models with two and with three distinct states. Conditional on the state, returns follow a normal distribution. Empirical applications typically distinguish two regimes with different means and variances and normally distributed innovations.3 The bull (bear) market regime exhibits a high (low or negative) average return and low (high) volatility. The number of regimes can easily be increased to improve the fit of the model (see Guidolin and Timmermann, 2006a,b, 2007) or to model specific features of financial markets such as crashes (see Kole et al., 2006) or bull market rallies (see Maheu et al., 2009). We follow the literature and consider models with two and with three regimes. A comparison of the identification of the state of the US stock market, proxied by the S&P 500, over the period 1955-2010 shows a preference for the rules-based methods. The rules-based methods simply separate periods with price increases from periods with price decreases. The regime-switching models identify periods where the risk-return trade-off is attractive, a positive mean and low volatility, or unattractive, a negative mean and high volatility. However, periods with high volatility are sometimes labeled bearish, even if they exhibit positive average returns. Periods with negative average returns may be labeled bullish as long as the volatility is low. The IADs indicate that the rules-based and model based methods produce identifications that are significantly different. A risk-averse investor would pay up to 20% per year to time the market based on a rules-based method. This fee is significant, but should be interpreted with care as it corresponds with perfect foresight. Because the state of the equity market is a comprehensive economic indicator, predicting it well may be more important than identifying it. We set up a forecasting experiment, where predictions are made from July 1983 onwards. From a utility perspective, regime2 3 Chiang et al. (2009) adopt this method. See for instance Hamilton and Lin (1996); Maheu and McCurdy (2000); Chauvet and Potter (2000); Ang and Bekaert (2002); Guidolin and Timmermann (2008a) and Chen (2009) for applications. 4 switching models are preferable. They make prudent forecasts and yield highest utility. A market timing strategy based on regime switching models produces lower volatility than based on rules. The maximum fee to switch to a regime switching model from a rules-based methods is around 16% and significant. However, looking at Sharpe ratios, the method of Lunde and Timmermann (2004) shows the best performance, with an average return of 8.7% in excess of the risk-free rate and a volatility 32.4%. The resulting Sharpe ratio of 0.27 compares favorably to the Sharpe ratio of a long position in the market at 0.21. The best performing regime-switching model yields a Sharpe ratio of 0.15. The IADs indicate that the predictions by the methods of Lunde and Timmermann (2004), of Pagan and Sossounov (2003), and by the regime switching models are all significantly different. The different results for identifications and predictions show that quickly picking up bull-bear changes is crucial. The sooner a switch is identified, the larger the gains. With perfect foresight, switches are identified immediately. When predictions have to be made, the methods detects switches with some delay, and the performance is less. Regime switching models are fastest in this respect, which partly explains their good performance. The methods of Pagan and Sossounov (2003) rapidly picks up switches, but produces many costly false alarms. We investigate whether inclusion of macro-financial variables improves the predictions of the different methods, but find that the effects are at best marginal. We use a specific-togeneral selection procedure to include those predictive variables that work best in-sample. For the rules-based approaches their use consistently lowers performance, whereas performance improves when predictive variables are included in the transition probabilities of the regime-switching models (see Diebold et al., 1994). While the IAD indicate significant statistical differences, the fees indicate that the value of these differences are small and insignificant. Our research relates directly to the debate between Harding and Pagan (2003a,b) and Hamilton (2003) on the best method to date business cycle regimes. Harding and Pagan advocate simple dating rules to classify months as a recession or expansion, while Hamilton proposes regime switching models. In the dating of recessions and expansions, both methods base their identification mainly on the sign of GDP growth and produce comparable results. For dating bull and bear periods in the stock market by regime switching 5 models, the volatility of recent returns seems at least as important (if not more) than their sign. Consequently, their identifications and predictions differs substantially from the rules-based approaches. We also add to the discussion on predictability in financial markets. We extend the analysis of Chen (2009) in several ways. First, we consider the dynamic combination of more predictive variables. Second, we include predictive variables directly in the regime switching models and do not need Chen (2009)’s two-step procedure. He treats the smoothed inference probabilities as observed dependent variables in a linear regression, which does not take their probabilistic nature into account. Our results for the rules-based approaches show that the in-sample added value of the predictive variables is not met with out-ofsample quality. Strategies with predictive variables perform worse than those without. For the regime switching models, we find quite some variation in the selected variables and their coefficients. Taken together, these results are in line with those documented by Welch and Goyal (2008) for direct predictions of stock returns. 2 Rules or regime switching: theory The state of the equity market is an important variable when making investments and for economic decision making in more general. Unfortunately, this state and its process is latent. An economic agent who wants to infer it, can typically choose between nonparametric and parametric methods. The non-parametric methods consist of sets of rules. The parametric techniques are based on models that specify the distribution of stock returns conditional on the state of the stock market and the dynamics of the state. Specifying a model implies the risk of misspecification, to which non-parametric techniques are more robust. On the other hand, a parametric setting offers statistical techniques to assess the quality of the model. Harding and Pagan (2002a) discuss similar issues with respect to business cycle dating. As rules-based methods, we consider the algorithms proposed by Lunde and Timmermann (2004) and by Pagan and Sossounov (2003). Both methods identify bull and bear markets by peaks and troughs in a price series. They differ in their selection of peaks and troughs that constitute the actual switch points between bull and bear markets. The 6 model-based methods we consider are Markov regime-switching models as pioneered by Hamilton (1989, 1990). In this approach, the state of the stock market follows a first order Markov process with a specified number of regimes. The number of regimes can be set equal to two (a bullish and bearish regime) or larger. More states can be introduced, for example to capture sudden booms and crashes as in Guidolin and Timmermann (2006b, 2007) and Kole et al. (2006) or bear market rallies and bull market corrections as in Maheu et al. (2009). In this paper, we write Stm to denote the state of the equity market at time t for method m. 2.1 Identification and prediction based on rules In the algorithm of Lunde and Timmermann (2004, LT henceforward), peaks and troughs have to meet minimum requirements on their magnitude to qualify as switch points between bull (between a trough and the subsequent peak) and bear markets (between a peak and the subsequent trough). A bull (bear) market occurred if the index has increased (decreased) by at least a fraction λ1 (λ2 ) since the last trough. The identification of peaks and troughs in a price series {Pt }Tt=1 uses an iterative search procedure that starts with a peak or trough. The identification rules can be summarized as follows: 1. The last observed extreme value was a peak with index value P max . The agent considers the subsequent period. (a) If the index has exceeded the last maximum, the maximum is updated. (b) If the index has dropped by a fraction λ2 , a trough has been found. (c) If neither of the conditions is satisfied, no update takes place. 2. The last observed extreme value was a trough with index value P min. The agent considers the subsequent period. (a) If the index has dropped below the last minimum, the minimum is updated. (b) If the index has increased with a fraction λ1 , a peak has been found. (c) If neither of the conditions in satisfied, no update takes place. 7 After these decision rules the agent considers the next period. We follow LT by setting λ1 = 0.20 and λ2 = 0.15. This implies that an increase of 20% over the last trough signifies a bull market, and that a decrease of 15% since the last peak indicates a bear market. To commence the search procedure we determine whether the market is initially bullish of bearish. We count the number of times the maximum and minimum of the index have to be adjusted since the first observation. If the maximum has to be adjusted three times first, the market starts bullish, otherwise it starts bearish. The second approach has been put forward by Pagan and Sossounov (2003, PS henceforward). Their approach is based on the identification of business cycles in macroeconomic data (see also Harding and Pagan, 2002b). They also use peaks and troughs to mark the switches between bull and bear markets. However, their identification is quite different from LT. PS do not impose requirements on the magnitude of the change of the index, but instead put restrictions on the minimum duration of phases and cycles. Their algorithm consists of five steps 1. Locate all local maxima and minima in a price series. A local maximum (minimum) is higher (lower) than all prices in the past and future τwindow periods. 2. Construct an alternating sequence of peaks and troughs by selecting the highest maxima and lowest minima. 3. Censor peaks and troughs in the first and last τcensor periods. 4. Eliminate cycles of bull and bear markets that last less than τcycle periods. 5. Eliminate bull market or bear markets that lasts less than τphase periods, unless the absolute price change exceeds a fraction ζ. We mostly follow PS for the values of these parameters, adjusted for the weekly frequency of our data. We have τwindow = 32, τcycle = 70, τphase = 16 and ζ = 0.20 (see also PS, Appendix B). We censor switches in the first and last 13 weeks, opposite to the 26 weeks taken by PS. Censoring for 26 weeks would mean that only after half a year an investor can be sure whether a bear or a bull market prevails, which we consider a very 8 long time. Since we will use this information in making predictions, we use a shorter period of 13 weeks to establish the initial and the ultimate state of the market. In both methods, the next step is to relate the resulting series of bull and bear states to a set of explanatory variables, zt−1 . We code bull markets as Stm = u and bear markets as Stm = d. Since the dependent variable is binary, a logit or probit model can be used. We opt for a logit model, as this model can be easily extended to a multinomial logit model when more states are present. We adjust the standard logit model such that the effect of an explanatory variable on the probability of a future state can depend on the current state. Some macro-finance variables may have a different (or no) effect on the probability of a switch from a bull market than from a bear market. The probability for a bull state to occur at time t is modeled as m m πqt ≡ Pr[Stm = u|St−1 = q, zt−1 ] = Λ(βqm ′ zt−1 ), m = LT, PS, q = u, d (1) where Λ(x) ≡ 1/(1 + e−x ) denotes the logistic function, and βqm is the coefficient vector on the zt−1 variables, which depends on the previous state of the market q. For notational convenience, we assume the first variable in zt−1 is a constant to capture the intercept term. We call this model a Markovian logit model, as it combines a logit model with the m Markovian property that the probability distribution of the future state St+1 is (partly) determined by Stm . If the coefficient βqm does not depend on q, a normal logit model results. If only a constant is used, the market state process is a standard stochastic process with the Markov property. To form the one-period ahead prediction for πTm+1 , the prevailing state at time T is needed. For the rules-based approaches, this information may not be available. In the LT-approach, only if PT equals the last observed maximum (minimum), and is a fraction λ1 above (λ2 below) the prior minimum is the market surely in a bull (bear) state. The PSalogirthm suffers from this problem too, since only the state up to the last τcensor periods is known. So, the market may already have switched, but this will only become obvious later. In that case, the state of the market is known until the period of the last extreme value, 9 which we denote by T ∗ < T . We construct the one-period ahead prediction recursively m m Pr[St+1 = s|zt ] = Pr[St+1 = s|Stm = u, zt ] Pr[Stm = u|zt−1 ]+ m Pr[St+1 = s|Stm = d, zt ] Pr[Stm = d|zt−1 ], T ∗ < t ≤ T + 1. (2) Starting with the known state at T ∗ , we construct predictions for T ∗ + 1, which we use for the predictions of T ∗ + 2 and so on. This iteration stops at T + 1. 2.2 Identification and prediction by regime-switching models We also consider a method for identifying and predicting bull and bear markets that is fundamentally different from the algorithms considered in the previous section. Instead of applying a set of rules to a given series, we now first write down a model that can be the data generating process of a stock market index that allows for prolonged bullish and bearish periods. Estimating such a model produces probabilistic inferences on periods of bull and bear markets in a certain index. Using such a model-based approach has several advantages. First, it offers more insight into the process under study. We can derive theoretical properties of the model and see whether it yields desirable features. Second, we can easily extend the number of states in the model. We can test whether such extensions imply significant improvements. A third advantage is the ease with which we can compare results for different markets and different time periods. Models can typically be summarized by their coefficients, whereas a simple characterization of the rule-based results may not be straightforward. The advantages come at the cost of misspecification risk. In particular (missed or misspecified) changes in the data generating process can have severe impact on the results. As rule-based approaches do not make strict assumptions on distributions or on the absence or presence of variation over time, they may be more robust. We consider several Markov chain regime switching models for the stock market, having k regimes (used as first suffix) and having either constant or time-varying probabilities (suffix C or L). For example, the label RS3L means a Markov regime switching model with three states and time-varying transition probabilities. Based on the results of Guidolin and Timmermann (2006a, 2007, 2008b), we consider k = {2, 3}, and use S m to denote the 10 set of regimes. We assume that the excess index return rt follows a normal distribution in each regime s with regime specific means and variances, m m rt ∼ N(µm s , ωs ), s ∈ S (3) We order the regimes based on the estimated means. While asymmetric or fat-tailed distributions can be used for the regimes, Timmermann (2000) shows that the mixtures implied by regime switching models constituted by normal distributions can flexibly accommodate these features. Our approach differs from Maheu et al. (2009). These authors allow for bear markets that can exhibit short rallies and bull markets that can show brief corrections. They enable identification by imposing that the expected return during bear markets including rallies is negative, while it is positive during bull markets including corrections. This setup can improve identification, though the added value for prediction is less obvious. The difference in the predicted return distributions between a bull market and a bear market rally is likely to be small. Since the actual state of the market is not directly observable, we treat it as a latent variable that follows a first order Markov chain with transition matrix Ptm that can vary over time. This matrix contains parameters m m πqst ≡ Pr[Stm = s|St−1 = q, zt−1 ], s, q ∈ S m . (4) P m πsqt = 1 applies. When the transition probabili˙ − 1) free parameters to be estimated. If ties are kept constant, this restriction leaves k (k Of course, the restrictions ∀q, ∀t : s∈S m the probabilities are time-varying, we use a multinomial logit specification m′ m πqst eβqs zt−1 , s, q ∈ S m , =P m ′z βqς t−1 e ς∈S (5) with ∀q ∃s ∈ S : βqs = 0 to ensure identification. If the number of regimes equals 2, this multinomial specification reduces to the standard logit specification m m πqt ≡ Pr[Stm = u|St−1 = q, zt−1 ] = Λ(βqm ′ zt−1 ), m = RS2L. 11 (6) This specification is mathematically similar to the logit models for the rules based approaches in Eq. (1), though it is an integrated part of the regime switching model. We finish by introducing parameters for the probability that the process starts in a P specific state, ξsm ≡ Pr[S1m = s]. Again the restriction s∈S m ξsm = 1 should be satisfied. We treat the remaining parameters as free, and estimate them. We estimate the parameters of the resulting regime switching model by means of the EM-algorithm of Dempster et al. (1977). To determine the optimal parameters describing the distribution per state, we follow the standard textbook treatments (e.g., Hamilton, 1994, Ch. 24). In appendix A we extend the method of Diebold et al. (1994) to estimate the parameters of the multinomial logit specification. 3 Comparing two filters We want to establish the difference of the results of the various approaches and test for their significance. This comparison is complicated by the latent nature of the true regime. Neither does an accepted reference list of bull and bear markets exist, contrary to for example the NBER list of recessions and expansions, which is used in the business cycle literature. As a consequence, standard ways to compare the different identifications do not avail. The evaluation of predictions suffers from the same problem. Because the true chronology of bull and bear markets is latent a prediction can never be classified as true or false. We cannot apply existing tests for predictive ability, which define a loss function over the realization and the prediction of a certain variable.4 Instead, we propose a new framework that builds on the probabilities for the different states of the market. For a purely statistical comparison, we propose a statistic that is based on the absolute difference between the probability vectors that result from different methods. While this statistic indicates how different two methods are, it does not point out which one is better. Therefore, we develop a second statistic that is based on economic decision-making. We consider an investor who wants to time the market, and derive the maximal fee that would make her ex post indifferent towards two methods. Both statistics 4 See for example Diebold and Mariano (1995); West (1996); White (2000); Corradi and Swanson (2007). 12 can be used for the identification and the predictions that the methods produce. The basis for both statistics is the interpretation of the different approaches as filters. Each algorithm m applies a filter F m : (t1 , Ωt2 ; βˆm , θ m ) → p. (7) to an information set Ωt2 to determine the likelihood ps of each state s at a point in time t1 . The information set contains a return series and a set of explanatory variables, Ωt = {(rτ , zτ )}tτ =1 .5 If t1 > t2 , this probability vector can be interpreted as a forecast probability. If the information set comprises all available information, denoted by ΩT , the likelihood corresponds with identification, and we call it an inference probability. The filter may use parameters βˆm ≡ βˆm (Ωt ) that are estimated using the information set Ωt with 3 3 m t3 ≤ t2 , or exogenously specified parameters θ , for example the boundaries λ1 and λ2 in the LT-algorithm. In case of the rules based approaches, the state at time t is identified as either bullish or bearish, so F m : (t, ΩT ; θ m ) = (1, 0)′ or (0, 1)′ for m = LT, PS. If regime switching models are used, the identification comes from the smoothed inference probabilities (see Hamilton, 1994, Ch. 22). Comparing the results of two different filters F m (t1 , Ωtm ; βˆn , θ n ) and F n (t1 , Ωtn2 ; βˆm , θ m ) 2 is equivalent to comparing the two resulting probability vectors pm and pn . The difference between pm and pn can come from a different filter algorithms F or from information sets of different length Ωtm or Ωtn2 . There should be a function g : (pm , pn ) → R, whose outcome 2 can be interpreted as a difference. So, a larger absolute value for g(pm, pn ) should indicate a larger difference. When a statistical measure is used, the function g is closely related to pm and pn . For an economic comparison, the function g is defined in the context of the economic decision-making that is considered. As pointed out by Pesaran and Skouras (2002), statistically based and economically based comparisons both have their merits. The statistical comparison is more general, and 5 The rules-based approaches actually work with prices series to define peaks and troughs. To determine turning points, subsequent rules based on returns are applied. The regime-switching method are completely based on returns. From a statistical point of view, a stationary information set is advantageous. Therefore, we assume that a set of returns is included in Ωt . Since the initial index value P˜0 does not influence the results of the rules-based approaches, it does not matter whether a prices series or a return series is taken as input. 13 can be relevant in various circumstances. In contrast, an economic comparison relates the comparison to a specific decision problem, and can take into account that the consequences of decisions can be asymmetric. Wrong decisions may be more harmful than good decisions are beneficial. By using cost functions or utility functions (as in Granger and Pesaran, 2000; West et al., 1993), such a measure can be easily linked to economic theory. A novelty in our use of these concepts is their application to compare identification, whereas the existing literature relates them to evaluate forecasting accuracy. 3.1 The statistical difference We base our statistical distance measure on the L1-norm, corresponding with the absolute difference. Calculating the difference between two methods of identification or prediction would be easy, when the realization is known. Conditional on the realized state s, we would calculate n d(pm , pn |S = s) ≡ |pm s − ps |. (8) We can only compare two filters if their set of states coincide, S m = S n . If the sets do not coincide, we can reduce the larger set by aggregating two or more states. We cannot n measure the difference between pm s and ps by the ratio of their logarithms as proposed by Kullback and Leibler (1951) since either probability can equal zero or one, when we consider identification in the rules-based approaches. Since S is latent, we next integrate over the different k states, d(pn , pm ) ≡ X n φs |pm s − ps |, (9) s∈S where φs ≡ Pr[S = s] is the probability that state s prevails under the true probability n measure. The weight of the difference between pm s and ps increases when state s is more likely to occur. The expression can be interpreted as an integrated absolute difference, similar to the integrated square difference in Pagan and Ullah (1999) and Sarno and Valente (2004). For the binomial case S m = S n = {u, d}, the above expression simplifies to n m n m m m n m n d(pn , pm ) = φu |pm u −pu |+φd |pd −pd | = φu |pu −pu |+(1−φu )|1−pu −(1−pd )| = |pu −pu |. 14 We estimate the expected value E[d(pn , pm )] by its sample equivalent XX 1 n φs,t |pm s,t − ps,t |, T − R + 1 t=R s∈S T dbm,n ≡ (10) where R is the first period for which we compare the probabilities. When we compare identifications, we can use the full sample and typically have R = 1. For predictions, R > 1 and the observations before R are used as in-sample period to estimate model parameters. In the binomial case, φ is irrelevant. In the multinomial case we need to make an assumption on φt . We can, for example, assume that φt = pm or φt = pn . dbm,n t t will depend on the choice for φt in a similar way as the Kullback-Leibler divergence (see Kullback and Leibler, 1951). We propose a bootstrap to derive the variance of of dd m,n , which we discuss in Section 3.3. n We propose a slightly adjusted distance measure when pm s ∈ {0, 1} and ps ∈ [0, 1]. This situation arises when pm s corresponds with full-sample identification by the rules-based methods and pns with predictions from the rules-based methods or identification as well n as predictions from the regime-switching models. When pm s = 1 and ps > 1/2, the two approaches lead to the same rounded inference or prediction. In that case we would like n to have a zero distance, which means replacing pm s by ps . By a similar logic, we replace n m n pm s by 1 − ps when ps = 1 and ps ≤ 1/2. Together, it means we replace the conditional distance in Eq. (8) by the function n 0 if pm s = I(ps > 1/2) m n ˜ d(p , p |S = s) = |1 − 2pns | otherwise, (11) where I() is the indicator function. 3.2 The economic difference To construct an economic measure for the difference between two predictions, we need a framework that links the state probabilities to a decision, and a method to evaluate this decision and compare it to another state probability. We take the perspective of an investor who uses the predictions to speculate on the occurrence of bull or bear markets. She speculates by taking long or short position in one-period futures contracts. While 15 other frameworks are possible, we think that this approach is direct and easy to interpret. However, other approaches, for example using bull- and bear markets for macroeconomic predictions, are also possible. We assume the investor maximizes the expected value of her utility function U(W ). We express her position as a fraction w of her initial wealth W0 . Her next-period wealth equals W0 (1 + wr). We approximate the utility function to the second order around her initial wealth W0 : 1 U(W0 (1 + wr)) ≈ U(W0 ) + U ′ (W0 )W0 wr + U ′′ (W0 )W02 w 2 r 2 . 2 (12) Since the current utility level does not influence the optimization, we can ignore the first term. Dividing by U ′ (W0 )W0 produces a standardized utility function 1 1 U ′′ (W0 )W0 2 2 w r = wr − γw 2 r 2 , U˜ (W0 (1 + wr)) = wr + ′ 2 U (W0 ) 2 (13) where γ is the coefficient of relative risk aversion. We choose a quadratic approximation instead of other fully specified utility functions for two reasons. The rules-based approaches do not provide predictions on the complete distribution of r, but just predict its sign. The quadratic approximation only requires an estimate for the mean and the variance of the distribution. Second, this approximation fits in nicely with the regime-switching models that reflect both the mean and variance in its identification and predictions. Nonetheless, a similar approach with higher order moments would be straightforward, see e.g. Harvey and Siddique (2000), Jondeau and Rockinger (2006) or Guidolin and Timmermann (2008a). First, we derive the optimal decision, given a vector with state probabilities pm t . Maximizing the expected value of Eq. (13) produces the optimal portfolio m wtm = µm t /(γψt ), (14) m where µm t is the predicted mean and ψt the predicted raw second moment of rt+1 . Both depend on the forecast probability for each state, X m pm µm s,t µs t ≡ s∈S m ψtm ≡ X s∈S m m pm s,t ψs = X m 2 m pm s,t ((µs ) + ωs ), s∈S m 16 (15) (16) m m where µm s , ψs and ωs are the state-specific mean, raw second moment and variance for model m. Next, we evaluate the optimal portfolio produced by method m, by calculating the unconditional expected utility with respect to the unconditional distribution Gr of rt+1 , 1 m 2 m m (17) V (w ) = EGr wt rt+1 − γ(wt rt+1 ) . 2 We determine the economic difference between two methods m and n by calculating the fee that an investor would be willing to pay to use method m instead of n, and denote it ηm,n . The fee is expressed relative to the investor’s wealth. Including the fee, wealth at time t + 1 equals W0 (1 + wtm rt+1 − ηm,n ). The utility resulting from method m should, after paying the fee, equal the utility resulting from method n, 1 1 2 2 m n m n EGr wt rt+1 − ηm,n − γ (wt rt+1 − ηm,n ) = EGr wt rt+1 − γ (wt rt+1 ) , 2 2 (18) 1 2 m m n ⇔V (w ) − V (w ) − 1 − γ EGr [wt rt+1 ] ηm,n − γηm,n = 0 2 which can be solved analytically for ηm,n . If ηm,n > 0, the investor is willing to pay a fee for adopting m instead of n, so she prefers method m over n. If λm,n is negative, it can be interpreted as a compensation that the investor wants to receive for adopting an apparent inferior method m instead of n. Consequently, ηm,n is not only a measure for how different method m is from n, but also whether method m is more attractive to riskaverse investors. The measure is not symmetric, so exchanging methods m and n does not produce the negative of the original fee, ηn,m 6= −ηm,n , unless m = n. Based on series for −1 −1 {wtm }Tt=R−1 , {wtn }Tt=R−1 and {rt }Tt=R , we can estimate ηd m,n . As discussed in the previous section, the bootstrap is best used to construct the distribution of ηd m,n . While a fee and the closely related concept of a certainty equivalent return have been used before to determine the economic value of investment strategies, we are the first to adopt it as a test statistic. Fleming et al. (2001) and Marquering and Verbeek (2004) calculate fees to compare dynamic investment strategies with a fixed benchmark of a static buy-and-hold strategy. West et al. (1993) interpret the relative increase in initial utility that would be needed to equate to expected utility levels of two strategies also as a fee, while other papers, for example Das and Uppal (2003) and Ang and Bekaert (2002), refer 17 to it as a certainty equivalent return. Their approaches imply that the fee is paid upfront and reduces the amount available for investment. Since the investment strategies we consider are zero-cost, we deduct the fee from the investment return. 3.3 Bootstrap Conventional asymptotic techniques may not avail to determine the distribution of dd m,n or d ηd m,n for two reasons. First, the distribution of dm,n has a lower bound at 0. Hence, the n limiting distribution will be non-normal. Second, series of pm t and pt may exhibit a high degree of autocorrelation that is not properly addressed by considering a limited number of autocovariances as proposed by Diebold and Mariano (1995). Therefore, we propose to use the bootstrap. We implement the bootstrap in two ways, depending on whether we compare the identification or predictions of different methods. In both cases, we construct bootstrapped samples from the original information set ΩT , so including returns and predicting variables. To account for autocorrelation in these series, we apply the stationary bootstrap of Politis and Romano (1994). When we compare the different methods for identification, we use the bootstrapped sample ΩjT to calculate new ˆ j ) and to construct new series of inference probabilities. In turn, these lead estimates β(Ω T j j to bootstrapped estimates dd m,n and ηd m,n . These set of bootstrapped estimates converges to the true distributions and can be used for testing and the construction of confidence intervals. In the case of comparing predictions, we follow the approach of White (2000). We resample from the second part of the information, starting from the first prediction R. So, in this case the first part of the information set ΩjR−1 = ΩR−1 for each bootstrap j. We n,j T T create new series {pm,j t }t=R and {pt }t=R , but use the estimates from the original series. j j With these series, we calculate bootstrap estimates dd m,n and ηd m,n and construct their distribution. 18 4 Data and implementation 4.1 Stock market data The state of the stock market should be determined against the benchmark of a riskless investment. A riskless bank account Bt earns the risk-free interest rate rτf over period τ . Starting with B0 = 1, the value of this bank account obeys Bt ≡ t−1 Y τ =0 1 + rτf . (19) From a stock market index Pt the relevant series to determine the state follows P˜t = Pt /Bt . (20) The return on this index is the market return in excess of the risk-free rate. It also corresponds with the return on a long position in a one-period futures contract on the stock market index. Futures contracts are the natural asset to speculate on the direction of the stock market, as they are cheap and easily available. Studying the excess market index P˜t thus corresponds directly with the return on an investment opportunity. Our analysis considers the US stock market, proxied by the S&P500 price index on a weekly frequency. We splice together a time-series for the S&P500 by combining the data of Schwert (1990) with the S&P500 series that the Federal Reserve Bank of St. Louis has made available on FRED.6 Schwert’s data set runs from February 17, 1885 until July 2, 1962, whereas the FRED series starts on January 4, 1957 and is kept up-to-date. For the risk-free rate we use the three-month T-Bill rate, also from FRED. This series starts on January 8, 1954. Because of the availability of the predicting variables, the data sample that we analyze starts on January 7, 1955. We use weekly observations because of their good trade-off between precision and data availability. Higher frequencies lead to more precise estimates of the switches between bull and bear markets. On the other hand, data of predicting variables at a lower frequency is available for a longer time-span. Weekly data does not cut back too much on the time span, and gives a satisfactory precision. 6 We kindly thank Bill Schwert for sharing his data with us. 19 Figure 1 shows the excess stock price index for the US. The index has been set to 100 on 1/7/1955. The graph exhibits the familiar financial landmarks of the last sixty years. We observe a clear alternation of periods of rise and decline in the 1950s and 1960s, the prolonged slump during the late 1970s and early 1980s, the crash of 1987, the dramatic rise during the IT-bubble of the late 1990s and the subsequent bust in 2000-2002, and finally the fall during the recent credit crisis. [Figure 1 about here.] 4.2 Predicting variables We consider macro-economic and financial variables to predict whether the next period will be bullish or bearish. Our choice of variables is motivated by prior studies that have reported the success of several variables for predicting the direction of the stock market. Hamilton and Lin (1996), Avramov and Wermers (2006) and Beltratti and Morana (2006) use business cycle variables like industrial production. Ang and Bekaert (2002) show the added value of the short term interest rate. Avramov and Chordia (2006) provide evidence favoring the term spread and the dividend yield. Chen (2009) considers a wide range of variables with the term spread, the inflation rate, industrial production and change in unemployment being the most successful. We join this literature and gather data accordingly for inflation, industrial production, unemployment, the T-Bill rate, the term and credit spread, and the dividend-to-price ratio. To ensure stationarity, we transform some of the predictive variables. The T-Bill rate and the D/P-ratio exhibit a unit root. We construct a stationary series by subtracting the prior one-year average from each observation, used more often in forecasting (see e.g., Campbell, 1991; Rapach et al., 2005). For the unemployment rate we construct yearly differences. We transform the industrial production series to yearly growth rates. We do not transform the inflation, the term spread or the credit spread series. To ease the interpretation of coefficients on these variables, we standardize each series. As a consequence, coefficients all relate to a one-standard deviation change and the economic impact of the different variables can be compared directly. In Appendix B we provide more information on the predictive variables. 20 The data for inflation, industrial production and unemployment is available at a monthly frequency, dating back to 1950 or earlier. We lag this data by one month, and assume that the series are constant within a month. For the T-Bill rate, weekly observations are available from January 8, 1954. Since we consider the T-Bill rate as a difference to it’s yearly moving average, this sets the starting date one year later, which also determines the starting date for our analysis. Weekly observations of the term and credit spreads become available from January, 1962. Before that date, we use monthly observations. The D/P-ratio is available at a weekly frequency for all of our sample period. We lag weekly observations by one week. Because the predicting variables have a mixed frequency, we combine the stationary bootstrap of Politis and Romano (1994) discussed in Section 3.3 with a block-bootstrap. We apply the stationary bootstrap on a monthly frequency. If a certain month is drawn, we draw all four or five corresponding weekly observations. 4.3 Variable selection We consider in total seven variables that can help predicting the future state of the stock market. Not all these variables might be helpful in predicting specific transitions. Therefore, we propose a specific-to-general procedure for variable selection. In both the rulesbased and the regime switching approaches we start with a model with only constants included. Next, we calculate for each variable and transition combination the improvement its inclusion would yield in the likelihood function. We select the variable-transition combination with the largest improvement and test whether this is significant with a likelihood ratio test. If the improvement is significant, we add the variable to our specification for that specific transition and repeat the search procedure with the remaining variablestransition combinations. The procedure stops when no further significant improvement is found. This approach differs from the general-to-specific approach, which would include all variables first and then consider removing the variables with insignificant coefficients. For the RS3L-model, we would need to estimate a model with 3 · 2 · 7 = 42 transition coefficients, which is typically infeasible. For the same reason, we do not follow Pesaran and 21 Timmermann (1995), who compare all different variable combinations based on general model selection criteria such as AIC, BIC and R2 . 5 Full Sample Identification We first compare the identification of the different methods for the full sample. This comparison shows in detail for the largest available information set how and why the outcomes of the various methods differ. We consider the actual dating of bull and bear markets, but also their durations and the return distributions. The IAD and switching fee allow us to summarize these differences in a single number. Figure 2 shows which periods are qualified as bullish (white area) or bearish (pink area) by the different methods. We report summary statistics on the duration of bull and bear markets in Table 1. The LT-method produces 16 cycles that are spread over our sample period. Since the LT-algorithm is based on peaks and troughs, switches all take place at maxima and minima. In the periods 1955–1970 and 1985–2000 long bull markets and short bear markets alternate. In the period 1970–1985 bear markets dominate and prices decline over the years. After 2000, we see bear market periods with pronounced declines in prices. Bull markets last slightly over two years on average; bear markets slightly longer than one year. However, the variation in duration is large. [Figure 2 about here.] [Table 1 about here.] The identification by the PS-method in Figure 2b closely resembles the identification by the LT-method, but is not identical. The PS-method shows more bull and bear markets (19 and 18). The PS-method imposes minimal duration on bull and bear markets instead of minimal changes, so it identifies some bull and bear markets that do not pass the hurdles of the LT-model. On the other hand, the LT-method identifies some bull-bear market cycles that last too short to be picked up by the PS-method. The shortest bull (bear) market from the LT-method lasts 15 (7) weeks, compared to 27 (15) from the PS-method. 22 However, average duration is lower for the PS-method, since it identifies more cycles. The variation in duration remains large. In Figures 2(c–d) we show the identification by the RS2-models. We estimate the model parameters, and use them to calculate smoothed inference probabilities at each point in time. The smoothed inference probability for a bull market at time t gives the probability that a bullish regime prevails, based on the full sample. We plot the series of bull probabilities by a thin black line. We see that the probability for a bull market is either close to one or close to zero, and rarely equal to values around 0.5. This indicates that the two regimes are quite distinct, and that the approach gives a clear indication which regime prevails. To compare the identification with that of rules-based methods, we also plot the series of rounded probabilities. If a bull probability exceeds 0.5, we categorize the observation as bullish, otherwise as bearish. The RS2-models differ substantially from the LT and PS models. We do not see a comparable alternation of bull and bear markets, but instead periods of bull markets that are interrupted by brief bear markets (e.g., 1955–1975 and 1983–1995) and periods of bear markets that are interrupted by brief bull markets (e.g., 1979–1983 and 1997–2003). The number of cycles is much larger at 34 (RS2C) and 43 (RS2L), and consequently the duration is considerably shorter. Bull markets last on average 52 to 64 weeks; bear markets only 16 to 21 months. Still, the longest bull market still lasts 336 or 337 weeks, and the longest bear market 78 weeks. The impact of time-variation in the transition probabilities is small, since the RS2C and RS2L-model produce a highly similar identification. To see why the rules-based methods and the RS2-models produce such different identifications, we report the means and volatilities of the bullish and bearish regimes in Table 2. Bull markets have a positive average return and low volatility, whereas bear markets exhibit negative returns and high volatility. When using rules-based methods, the difference between bull and bear markets is concentrated in the average return, which is about a full 1% lower when a bear market occurs. The volatility of bear markets is higher, but the ratio of bear to bull market volatility is around 1.30. Regime switching models pay more attention to volatility. Consequently we observe the largest difference there. In the RS2-models, the volatility ratio is around 2.25, while the difference in average returns is 23 only around 0.44%. As a consequence, the RS2-models identify high volatility periods as bearish, even if prices eventually increase, and low volatility periods as bullish, even if prices fall. That’s why the periods with gradually falling prices in the 1950s are seen as bull markets, and why the volatile price increase from 1997 to 2000 is qualified as bearish. Of course, a risk-averse investor may indeed judge volatile price increases as unattractive. We consider such a comparison when we calculate the switching fees. [Table 2 about here.] When we allow three regimes in Figures 2(e–f), the identification is between the rulesbased methods and the RS2-models. Compared to the RS2-models, we see more and longer bear markets in the period 1955-1990, which puts the RS3-models more in line with the rules-based approaches. Table 2 shows that the bullish regime of the RS3-models has the same characteristics as the bullish regime in the RS2-models. Instead of one bearish regime, we now see two: a mild one with an average return just below zero and volatility similar to the rules-based methods; and a strongly bearish regime with a large negative average return and very high volatility. The presence of the strongly bearish regime is limited to a few periods, notably the big declines by the end of 1974, the crash of October 1987, and the big drops during the credit crisis in 2008. Table 1 also shows that the RS3-models are closer to the LT and PS-methods than the RS2-models. They identify 17 bull markets, that last on average 2 years. Again, the duration shows quite some variation. The RS3C (RS3L) model identifies 24 (25) mild bear markets and 7 (8) strong bear markets. Mild bear markets last around 45 weeks, whereas strong bear markets last on average 9–11 weeks, though their maximum is still half a year. When we aggregate the identification of mildly and strongly bearish regimes, we end up with 17 bearish periods that last on average 69-72 weeks, which is again quite in line with the rules-based results. So far, we compared the identification of the different methods by figures or summary statistics. Such comparisons cannot be summarized in one number and may be misleading. Therefore, we calculate the integrated absolute difference (IAD) of Section 3.1, which compares the identification by two methods on a week-to-week basis. For the models with three regimes, we aggregate the mildly and strongly bearish regimes, and concentrate on 24 the bullish versus the two bearish regimes. The average IAD for all combinations of two methods in Table 3 can be interpreted as a probability, so they are bounded between 0 and 1. Of course, a difference of 1 simply means that two method produce completely the opposite identification. The average difference between the LT and PS-method is small, only 0.068. We can say that with a probability of only 6.8% the identification by the LTand PS-methods differs. [Table 3 about here.] The probability that the rules-based methods produce a different identification than the regime switching models is much higher, as the IAD’s range from 0.247 to 0.313. Even though the summary statistics for the RS3-models were close to those for the rules-based methods, the IADs show that the differences between these models are comparable to the differences between the RS2-models and the rules-based models. The IADs between the regime switching models with two or three regimes are smaller, but still indicate a probability of around 17% of a different identification. Time-varying transition probabilities lead only to minor changes in the identification, with IADs of 0.032. Because the differences between constant and time-varying transition probabilities are so small, we postpone a more detailed comparison to Appendix C. There we also consider conventional techniques for model comparison. We use the stationary bootstrap of Politis and Romano (1994) to construct confidence intervals around the estimated IADs. In this bootstrap, for a given drawing the next drawing is random with probability p or the next observation with probability 1 − p. Since bull and bear markets are highly persistent (see Table C.1), we put this probability p at a low value of 0.05. The confidence intervals are quite wide, which indicates that the methods can produce quite varying results. They also indicate that the distribution of the IADs is skewed to the right. No confidence interval includes zero, which implies that all methods produce significantly different identifications. For the LT and PS-methods, their IAD can be as large as 0.141. The 90%-confidence intervals also show the distinction between the rules-based and regime-switching models, with a 5% lower bound on the IADs of around 0.17. Within the class of regime-switching models IADs are simply not very precise. 25 The IAD is a statistical measure of the difference between two methods, but it does not tell how important this difference is. Therefore, we also look at the fee that an investor would be willing to pay to switch from one method to another. In Section 3.2 we derive this fee in an investment setting. Since we consider identification here, this fee corresponds with a situation of perfect foresight. The investor knows with certainty whether a method labels the next period as bullish or bearish, instead of having to predict it. So, the results of this investment setting give an upper bound to what could possibly be reached in real-life. We report the performance measures and fees in Table 4. The rules-based methods lead to a stunning average return of 48% per year with a volatility of 30%. So, if an investor would be able to correctly predict bull and bear markets, this is her expected performance. The regime-switching models lead to smaller returns, ranging from 7.3% to 9.9% per year, but also to considerably less volatility of around 11%. The difference in magnitude comes to some extent from the fact that the rules-based identification is binary (either a bull market or a bear market prevails), while the various regimes switching models always attribute a non-zero probability to every regime. Still, the Sharpe ratio is twice as high for the rules-based methods. Utility is four to five times higher. Here we see the influence of qualifying volatile periods with price increases as bearish and tranquil periods with price decreases as bullish. [Table 4 about here.] The fees in Table 4b indicate that a risk-averse investor, with a coefficient of relative risk aversion equal to five, would be willing to pay a considerable fee to switch from the regime-switching models (columns) to the rules-based methods (rows). For example, she would pay up to 18.63% per year to switch from the RS2C model to the LT-method, and 18.77% to switch to the PS-model. We use the same bootstrap as for the IADs to construct confidence intervals. We find that the fees to switch from the RS2-models to the rules-based methods differ significantly from zero, so the RS2-models lead to a significantly inferior identification. Confidence intervals for the fees to switch from the RS3-models to the rules-based methods are wider, and may even include zero. They indicate that the outperformance of the RS3-models by the rules-based methods is less precise. 26 An investor would pay a small fee of 1.37–2.14% to switch from the RS3-models to an RS2-model. However, the 90% confidence intervals are wide, include zero and indicate that this fee is also often negative. So though we actually observe an underperformance by the RS3-models, we can just as easily find outperformance. The investor would also pay a small fee to switch from models with constant transition probabilities to models where they are time-varying. In case of the RS2-models this fee has small confidence intervals, which include zero. In case of the RS3-models, the fee is less precise and can vary from -1.29% to 11.85%. Here the added value of the predicting variables can be a bit larger. We conclude that the rules-based approaches produce substantially different bull and bear markets than the regime-switching models. While the rules-based approach tend to identify relatively long periods of bull and bear markets, the regime-switching models exhibit periods when bull (bear) markets dominate with short interruptions of bear (bull) markets. Maheu et al. (2009) explicitly accommodate rallies during bear markets and corrections during bull markets in their regime switching models. The driving force behind these differences is volatility, which is important for identification by regime switching but completely ignored by the rules-based methods. From an investor’s perspective, identification by rules is definitely preferable. Also in other economic decisions and analyses where bull and bear market periods are needed, rules-based methods are best to determine these periods. However, the fees that we calculate correspond with perfect foresight. In the next section, we see which method performs best, when the label of the next period has to be predicted. 6 Predictions The state of the stock market being a comprehensive economic indicator, we are more interested in predicting it than just identifying it. Good predictions can tell investors whether to expect a good or bad performance of investments. More generally, it signals whether the economic outlook is good or bad, and whether risk premia are low or high. This means that we should not only compare the different methods by their ex-post identification, but mainly by the predictions that they yield. For identification, rules-based methods are preferable, because they excel in ex-post separating profitable from losing periods. It is 27 not obvious whether this advantage carries over to forecasting. The rules-based methods indicate only after some time what the sentiment of the stock market is. For example, the LT-method needs a price increase of 20% to categorize a period as bullish. The PS-method excludes the last 13 weeks from identification. If it is unknown which state currently prevails, these methods may fail to correctly predict the future state, in particular when a switch has just occurred. Regime switching models may respond quicker to switches. To see which method is best in predicting the future state of the stock market, we set up a prediction experiment. An investor uses the first part of the sample period until June 24, 1983 for identification and the estimation of model parameters, and uses these to make one-step-ahead predictions for the second part of the sample period. At every point in time, she updates her information set to construct a prediction for the next week. When 52 weeks have passed, the investor expands the estimation window and reestimates the parameters of the different models. Given the duration of bull and bear markets, 52 weeks offers an acceptable trade-off between estimation speed and accuracy. We compare these predictions in a statistical way, using standard techniques for predictive accuracy and the Intergrated Absolute Differences. To assess the economic importance of these differences we compare the performance of investment strategies based on the different methods, and calculate the maximum fees to exchange one method for another. The investor applies the methods in the same way as in the previous section. When she uses the LT or PS-method, she first identifies bull and bear markets in the in-sample period, and next estimates a Markovian logit model with or without predictive variables. For the regime-switching models, the investor estimates the parameters over the in-sample period. When transition probabilities (either in the Markovian logit or the regime switching models) can be time-varying, the specific-to-general approach is used to select the variables. This approach implies that the set of selected variables varies over time. At each point in time t, the investor combines the most recent parameters estimates with the current information set to predict the state of the market at t + 1. In the LT-method, she knows the sentiment of the market until the last extremum. If the last extreme price was at t∗ < t, she will make her first prediction for t∗ +1, and use the recursion in Eq. (2) to arrive at the prediction for t + 1. In the PS-method, predictions start at t − 13, as switches in the last 13 weeks are ignored. For the regime-switching models, she applies the filter 28 to infer the state of the market at time t. Multiplying these inference probabilities with the transition matrix produces the forecast probabilities. Regarding the weekly predicting variables, the values of time t are used, whereas for monthly variables the values of the month ending before week t are taken. We present the predictions of the different methods with and without predictive variables in Figure 3.7 The black lines shows the probability for a bull market at each point in time. We compare the predictions with the full-sample identification of the different methods. In Figure 3a, we see that the predictions of the LT-method without predictive variables (LTC-method) lag the ex-post identification. It can take quite some weeks before the LT-method predicts the correct state of the market after a switch. We also see that the during a bull (bear) market the probability of a continuation gradually declines, until a new maximum (minimum) is reached and the probability jumps back to one (zero). [Figure 3 about here.] [Figure 3 (continued) about here.] Table 5 reports statistics on the quality of the predictions. The LTC-method predicts 79.9% of the bull market weeks correctly, but only 43.6% of the bear market weeks. In total, the hit rate is 71.1%, which is lower than the hit rate that results from always predicting a bull market. The Kuipers score, which equals percentage of correctly predicted bull markets minus the wrongly predicted bear markets, is positive, but not very large. It balances the percentage of hits with the percentage of false alarms, giving both an equal weight (see Granger and Pesaran, 2000, for a discussion). We also calculate the IAD between the predictions and the realizations of a specific method. The IAD from predictions by LTCmethod differ with a probability of 0.194 from the realizations of the LT-method. [Table 5 about here.] Figure 3b shows that the use of predictive variables in the LT-method leads to stronger swings in the forecast probabilities. Sometimes, switches are sooner predicted, but at other points in time it takes longer before the LTL-method predicts the correct state of the 7 We present and discuss the evolution of the parameters in Appendix D. 29 market after a switch. As indicated by Table 5, this holds in particular for bear markets, where the hit rate decreases to 38.4%. The overall hit rate and the Kuipers score are also lower, compared to the LTC-method. The IAD for the LTL-method is higher than for the LTC-method, pointing at a larger difference between prediction and realization. The predictions of the PS-method with constant transition probabilities (PSC) in Figure 3c look quite different from the LT-predictions. In the PS-method, switch points are determined by lower bounds on the duration of cycles and phases. A low peak or shallow trough can be (mis)taken for a switch point as long as the restrictions on duration are met. Compared to the LT-methods, the PSC-method predicts true switches sooner. The downside of this method are the frequent false alarms that last a couple of weeks. The PSC-method scores better at predicting bear markets with a hit rate of 62.6%. For bullmarkets the performance is comparable. The overall performance compares positively with the LT-methods, as the overall hit rates, the improvement over the default bull strategy and the Kuipers score are higher, while the IAD is lower. Using predictive variables leads to better predictions of bull markets, but worse of bear markets. Overall, the accuracy is slightly less than with constant transition probabilities. Figure 3e corresponds with the two-state regime switching models with constant transition probabilities. Here we compare the predictions with the rounded full-sample smoothed inference probabilities. The RS2C-model is quicker than the rules-based methods in picking up switches in the state of the market. The hit rate for the bullish state is 92.9%. However, we also see some false alarms, where the RS2C-model indicate a switch, that is revised one or two weeks later. This effect is stronger during bear markets than during bull markets. So, also for the regime switching model, bear markets are more difficult to predict, with a hit rate of 71.4%. Overall, the hit rate is impressive with 85.7%, an improvement of 19.4% over the “default bull”-strategy. The IAD can be calculated directly from the forecast and the smoothed inference probabilities, without rounding them first to integers. It indicates an average difference between prediction and realization of 0.158. The addition of predictive variables only leads to marginal differences. To compare the forecast probabilities of the RS3-models with the other models, we first aggregate the probabilities to a bull and a bear probability. The full-sample results in the previous section showed a single bullish regime (positive mean) and two bearish regimes 30 (negative means), one being mildly and the other strong. Figure D.1(e–f) in Appendix D actually shows two bullish regimes (mild and strong) before 2009. Therefore, we aggregate the regimes depending on the sign of their means. The predictions of the RS3-models in Figure 3(g–h) are good for bull markets (hit rates are 99.6% and 100%), but less accurate for bear markets (hit rates of 42.2% and 36.8%). During bear markets, predictions oscillate frequently between bullish and bearish. This is partly caused by the aggregation that we choose, since regime 2 is only mildly bullish or bearish. Overall, predictions are reasonable, with hit rates around 70%, which exceed the hit rates of “default bullish” predictions. The IADs for the RS3-models are considerably higher than those for the RS2 models. We conclude that the predictions of the RS3 models are less accurate than the predictions of the RS2-models, because the evolution of the regimes is more stable for the RS2-models. It is difficult to draw conclusions from the accuracy statistics in Table 5, because the accuracy for each method is determined with respect to the ex-post identification of that method. So, while the forecast probabilities of regime switching models may be close to the full-sample smoothed inference probabilities, this does not necessarily imply that these are useful predictions of bull and bear markets. Neither can we say whether predictions of, say, the LT-method with predictive variables are substantially different than the predictions of the PS-method without predictive variables. Calculating IADs quantifies the difference between the various predictions. The fees indicate which predictions are more valuable to a risk-averse investor, balancing the profits of hits of bull and bear markets with the costs of false alarms. [Table 6 about here.] We report the IAD between the different predictions in Table 6. The average difference between the predictions of the LT-method and the PS-method is approximately 0.27. This is a lot larger than the difference between the identification, which was only 0.068 according to Table 3. Furthermore, the confidence intervals do not overlap. So while using the LTmethod or the PS-method produces largely the same identification, these methods yield quite different predictions. These differences are due to different ways in which both methods identify the current regime. 31 The differences between the predictions by rules-based methods and those from the regime-switching models vary from 0.231 to 0.331. They have the same magnitude as the differences between their identifications. So, also in terms of predictions, these methods produce substantially different results. The lower bounds of the confidence intervals also show that these differences are substantial. Taking both means and volatilities into account yields not only different identifications but also different predictions. Using two or three states in regime switching models produces largely similar predictions. When transition probabilities are constant (time-varying), the IAD is 0.090 (0.095). The confidence intervals are quite narrow. When we compare that to the results in Table 3, we conclude that the two and three-state regime switching models differ more in their ex post identification than in their predictions. Using predictive variables does not change the predictions by much. The largest IAD with a value of 0.082 is for the difference between the predictions of the PSC and the PSL-method. The upper bounds on these probabilities confirm that the IADs here are typically small. We conclude that predictive variables do not help much when predicting the state of the stock market. Given that the state of the stock market is an important predictive variable of other economic variables itself, this result is not surprising. Finally, we evaluate the predictions of the different methods in an investment setting. In this setting, the investor combines the forecast probabilities with the state-dependent means and volatilities to construct the optimal portfolio in Eq. (14). The state-dependent means and volatilities are updated together with all other parameters every 52 weeks. We plot the cumulative result of this portfolio for each method in Figure 3 (blue lines) and report the corresponding summary statistics in Table 7a. As a reference, we add the performance of a strategy that always takes a long position in the stock market, w = 1. The LTC-method yields the highest return of all methods. It ends up with a cumulative return of 237% over 27 years, which corresponds 8.73% per year. This yearly return is considerably less than the 48.1% that would result from 100% correct forecasts. Figure 3a shows that this difference comes from the lag in identifying switches. Money is lost at the beginning of bull and bear markets. The performance of the method is similar during bull and bear markets. The volatility of the strategy remains high at 30.3% per year. The resulting Sharpe-ratio of 0.27 compares positively to the Sharpe ratio of an investment in 32 the market, which has a return of 3.43% per year, a volatility of 16.7% and a Sharpe ratio of 0.21. The utility is actually lower than that of the long position of the market, because of the higher volatility that it yields. [Table 7 about here.] Introducing predictive variables worsens the performance of the LT-method. The predictive variables lead to a higher return during periods that are ex post identified as bullish, but produce a loss of 2.69% during bearish periods. As a result, the overall return decreases from 8.73% to 7.13% per year, with the Sharpe ratio and utility decreasing accordingly. The Sharpe ratio still exceeds the ratio of the market. The PS-methods perform considerably worse than the LT-methods. The cumulative returns are negative for a prolonged period. The lagged response to actual switches and the many false alarms have a disastrous result on the performance, which is meagre at 1.48% to 2.26% per year. In the PSC-method, the many false alarms particularly hurt the performance during bull markets which is lowest among all methods. It it the only strategy that performs better during bear markets than during bull markets. Including time-variation improves the performance during bull markets, but performance during bear markets deteriorates. Volatility is comparable to the LT-methods, which results in a low Sharpe ratio of 0.053–0.075 and lower values for utility, compared to the LT-methods and the market. The false alarms are less of an issue when predictive variables are used, as the average return increases from 1.48% to 2.26%. However, volatility increases as well with a dismal effect on utility. The RS2-models give the highest utility of all methods. The average returns of these strategies are rather small, ranging from 0.97 to 1.52%, but volatility (9.5–10.1%) is also much lower than for the rules-based methods. As a consequence, the Sharpe ratios of 0.10–0.15 exceed those of the PS-methods, though they are still lower than the those of the market and the LT-methods. The reason for this good performance is the smaller magnitude of the portfolios that the investor takes when adopting an RS2-model. The small magnitude is caused by the smaller magnitude of the expected returns in the RS2-models and the higher volatility of the bearish regime (see Table 2 and Figure D.1). The stronger focus on volatility of the RS2-model leads to more prudent forecasts and investments. 33 Allowing for time-variation in the transition probabilities improves the performance during bull markets at the expense of the performance during bear markets. In total, the RS2Lmodel outperforms the RS2C-model with a higher Sharpe ratio, and higher utility. The performance of the RS3-models if Figure3(g–h) is at first better than for the RS2models, but reverts after 1997. The RS3C-model yields the lowest average return of all strategies at 0.34% per year. Though its volatility is also low, the Sharpe ratio remains lowest for this strategy. In terms of utility, this strategy performs worse than the RS2models, but better than the rules-based models and the market strategy. Also here, adding predictive variables produces better results, since the average return, the Sharpe ratio and utility for the RS3L-model exceed those for the RS2L-model. However, the performance of the RS2L-model is even better, indicating that more regimes are not that beneficial for forecasting. The fees for exchanging one method for another method in Table 7b are derived in a utility setting. A higher utility of a strategy corresponds with a higher fee for that strategy. Since the RS2L-model yields the highest utility, the investor is willing to pay a fee to adopt this method instead of any other. In particular, she wants to pay a considerable maximum fee of around 17% per year to change from the rules based methods to it. The confidence intervals indicate that this amount is significant. The same conclusion applies to all switches from a rules-based method to a regime-switching model. The investor does not want to significantly pay for other switches. While the LTmethods yield higher average returns and utilities than the PS-methods, the confidence intervals for the fees all contain zero. The magnitude of these fees is also low, with a maximum of 2.75% per year. The fees to switch between the regime-switching models do not exceed 2% per year, and are also not significantly different from zero. The investor actually wants to pay a fee to discard predictive variables in the rules-based methods. When working with regime-switching models, the investor would pay a small fee for these variables. However, in both cases the confidence intervals contain zero. We can draw several conclusions from this analysis. First, regime-switching models perform best from a utility perspective. Since they take both means and volatilities into account, they lead to more accurate predictions, less extreme positions and a higher utility. A risk averse investor is willing to pay a significant fee to use a regime-switching model. 34 This fee mainly represents the lower volatility that results from regime-switching models. Also for other economic decisions where both the predicted direction and volatility of the stock market are important, regime switching models are best used. Second, looking just at average returns, volatilities and Sharpe ratios, the LT-methods perform best. It produces a higher average return and a better Sharpe ratio than all other strategies, including a long position in the market. While this method is slow in picking up switches in the state of market, as indicated by the low hit ratios, its performance does not suffer much from false alarms. Third, the predictions from the rules-based methods and the regime switching models differ considerably. As expected, this difference is obvious for the rules-based methods on the one hand, and the regime switching models on the other hand. Regime switching models take both means and volatilities into account, whereas the rules-based method just look at the trend of the market. However, we find difference between the LT and PS-methods that are just as large. Similarity in identification does not imply similarity in predictions. Restrictions on price changes lead to quite different results compared to restrictions on duration. Fourth, most strategies perform better during bull markets that during bear markets. Hit rates are higher during bull markets, and most strategies actually lose money during bear markets. The rules-based methods are too late spotting the beginning of a bear market and suffer from false alarms. The regime-switching models classify volatile periods as bearish, and take short positions. If prices increase, they end up with a loss. 7 Robustness checks Our results and conclusions may be sensitive to some arbitrary choices that we have made in our analyses. The rules-based methods require parameters to determine which peaks and troughs signal switches between bull and bear markets. In the economic comparison of the different methods, the coefficient of relative risk aversion is a crucial coefficient. In this section we analyse how our results change when we change these settings. 35 7.1 Thresholds in the LT-method The LT-method requires an increase in exceedance of λ1 since the last trough for a bull market to start, and a decrease of more than λ2 for a bear market to begin. The values of 20% and 15% we have used so far have been argued by LT to be most conventional. They also consider lower threshold combinations of (0.20, 0.10), (0.15, 0.15) and (0.15, 0.10). We consider these values as well. Lower thresholds may be particularly interesting, because our results show that the speed with which the current regime is identified is crucial for good predictions. Lower thresholds make it easier to identify a switch, but may also lead to more false alarms. We compare the performance of the LT-methods with different thresholds as in Sections 5 and 6. We discuss the main results here. The full results and a detailed discussion are available in Appendix E.1. The full sample results on identification show that lower thresholds leads to more cycles, in particular when both thresholds are lowered. However, the IADs between the different identifications indicate that they differ in less than 5% of the weeks. The choice of thresholds does not much affect the pattern of bull and bear markets in Figure 2a. Even though the changes are statistically small, the economic comparison indicates that they present improvements. The fees that an investor is willing to pay to switch to identification with (0.15, 0.10)-thresholds are positive and significant. These results strengthen our conclusion that the LT-method works better for identification than the regime-switching methods. For predictions the results are more mixed. We find that the predictive accuracy increases for lower thresholds, in particular for bear markets. The hit rates and Kuipers score of the (0.15, 0.10)-thresholds come close to the results for the two-state regime switching models. The IADs between the predictions of the LT-methods with different thresholds indicate a different prediction for 7–16% of the weeks. While larger than the differences for identification, they are smaller than the differences between the different methods in Table 6. The improvements in predictive accuracy are not fully matched by economic improvements. Lower thresholds lead to investments that produce higher average returns, but also higher volatilities. Sharpe ratios still increase, but utility is sometimes lower. Only a lower threshold for λ2 of 0.10 leads to an increase in utility and positive,though insignificant, fees. The utility of the (0.20, 0.10)-thresholds is still substantially lower than 36 the utility for the RS2-models in Table 7. We conclude that lower thresholds bring the statistical quality of the predictions at the same level as the regime-switching models, but do not improve the economic quality enough to beat them. 7.2 Parameters in the PS-method The PS-method use minimum constraints on the length of cycles and phases to select the peaks and troughs that indicates switches between bull and bear markets, and censors a set of first and last observations. The settings for these constraints are based on the algorithm of Bry and Boschan (1971) for business cycle identification and common market lore. As PS do not consider robustness checks themselves, we consider some changes in the parameters based on our results so far. The LT-method performs better with relaxed restrictions, so we mainly investigate relaxations. We consider a lower minimum on cycle duration of 52 weeks (instead of 70 weeks). For the minimum on phase duration we consider 12 and 20 weeks (standard at 16 weeks). We relax the minimum price change to overrule the phase constraint to 15%. We also investigate the consequences of censoring more (26 weeks) and less (7 weeks). We discuss the main results here. For the full results and a discussion, we refer to Appendix E.2. For identification, changes are negligible. Identification does not change at all when we change the constraints on phase duration or price change. Censoring 7 weeks leads to an extra bear market at the end of the sample period, but censoring 26 weeks has no effect. Lowering the cycle constraint to 52 weeks leads to one extra cycle. Together, it means that the conclusion that the PS-method works well for identification is robust to these parameter changes. Neither does a relaxation of the constraints on minimum length or minimum price change largely impact predictions. The overall predictive accuracy stays the same, do we sometimes see accuracy improvements for bull markets at the expense of bear markets. The IAD between the predictions for different settings and the predictions produced by the basic setting are not significantly different from zero. The fees for switching are economically small and insignificant. Censoring shows a larger impact, in particular for the economic difference measures. 37 Predictive accuracy is largely unaffected by censoring more or less data. However, the IADs between the predictions of the standard approach and approaches with more and less sensoring are quite substantial. The economic comparison shows that more censoring leads to less extreme portfolio weights, higher means, lower volatilities, higher Sharpe ratios and higher utility, while censoring less has the opposite effect. These effects can all be explained by the effect that censoring has on a prediction. When more observations at the end are censored, an investor essentially makes a forecast for a longer horizon, applying a longer recursion of Eq. (2). If the prediction horizon rises, the prediction converges to the long-term average (see e.g., Table C.1b) and typically become less extreme. When predictive accuracy is determined, predictions are rounded, so shrinkage to a long-term average does not matter. To the contrary, less extreme predictions lead to less extreme investments, and to a better performance. When 26 weeks are censored, utility is close to the utility of long position in the market and only slightly below the utility of the two-state regime switching models. Overall, we find that our conclusions are robust to changes in the design of the PSmethod. Constraints on the duration of cycles or phases or on price changes only have a small effect. The effects for censoring show that rules-based methods rely too much on the past direction of the market when making predictions. Paying more attention to volatility as regime switching models do leads to better performance. Shrinking predictions towards their long-term average also leads to less extreme investments and a better performance. 7.3 Risk Aversion The economic measure for comparison that we propose in Section 3.2 depends on the coefficient of relative risk aversion γ. Throughout our analysis so far, we have set γ = 5, which is conservative though still in the generally accepted range for this parameter. In this subsection we show how sensitive our results are to the choice for γ. The optimal portfolio w m in Eq. (14) is proportional to the inverse of γ. This relation caries over to the expected return of an investment strategy, which is proportional to 1/γ as well, and the variance which is proportional to 1/γ 2 . Consequently, both effects of riskaversion exactly cancel out in the Sharpe ratio. The expected utility being the sum of the 38 expected return and the second moment weighted by −γ/2 is also inversely related to γ, " m 2 # m µ 1 µ 1 r − γ r V (w m ) = EGr w m r − γ(w m r)2 = EGr 2 γψ m 2 γψ m " 2 # 1 µm 1 µm = EGr r− r . γ ψm 2 ψm (21) Finally, the switching fee is proportional to 1/γ. To derive this result, we make the effect of γ in Eq. (18) explicit, where we use V (µm /ψ m ) for the expected utility for γ = 1 corresponding with model m m m n 1 µt µ µ 1 2 V − V − 1 − E γη =0 r η − G t+1 m,n r m γ ψm ψn ψt 2 m,n Solving for ηm,n produces m µt 1 1 − EGr rt+1 ± ηm,n = − γ ψtm s n 2 m m µ µt µ − V r +2 V 1 − E Gr m t+1 ψt ψm ψn (22) (23) An increase (reduction) in risk-aversion will lead to less (more) extreme results in Tables 4a and 7a. Means, volatilities, utilities and switching will all decrease (increase) by the same relative amount. However, the ordening of the strategies will not be affected and our conclusions regarding the preferred method are robust. The confidence intervals around the fees are also proportional to 1/γ, as the proportionality of the switching fee applies similarly to simulated fees. So, fees that are significantly different from zero remain significant independent of the choice of γ. 8 Conclusion In this article we compare the identification and prediction of bull and bear markets by four different methods. One way to address identification and prediction is by formulating rules to determine bullish and bearish periods, and then as a second step use binary models for prediction. In this category, we consider the approaches of Lunde and Timmermann 39 (2004) and Pagan and Sossounov (2003). Both base identification on peaks and troughs in price data. To find switch points Pagan and Sossounov (2003) impose restrictions on the length of cycles and phases, whereas Lunde and Timmermann (2004) impose restrictions on price changes. As an alternative, an investor can formulate a model that simultaneously handles identification and prediction. We consider a simple regime switching model with a bull and a bear state, and an extended version that includes three states. We apply the different methods to the S&P 500. To deal with the latent nature of bull and bear markets, we propose two new measures to compare the identifications and predictions of the various methods. The Integrated Absolute Distance is a generally applicable statistic quantifying the difference between the inferred or predicted probability of regimes. The switching fee quantifies the preference for one method over another from the perspective of a risk averse investor who wants to time the market. From the identification we conclude that the rules based approaches produce more or less the same results. A market that has exhibited price decreases since the last peak is bearish; price increases after the last trough qualify as bullish. To the contrary, the regime switching models also pay attention to volatility. Periods with an attractive risk-return trade-off, meaning positive average returns and low volatility, are bullish, while negative average returns and high volatility are qualified as bearish. Since periods with low volatility but price decreases can be identified as bull markets, and volatile price increases as bearish, rules-based methods are preferable ex post. Their identification is significantly different from the identification by regime-switching models, and is worth a significant fee. Our results for predictions show that paying attention to expected returns and volatility is preferable ex ante. Regime-switching models are able to react quicker to bull-bear switches than the rules-based methods, and they lead to more prudent investments. Because they yield the highest utility, a risk-averse investor is willing to pay a significant fee of around 16% to switch from rules-based methods to regime-switching models. Looking at Sharpe ratios, the method by Lunde and Timmermann (2004) outperforms the other methods and a long position in the US stock markets. The PS-method performs less, as it produces many false alarms. In line with the somewhat depressing results on return predictability in Welch and 40 Goyal (2008), we also find that the inclusion of predictive variables is limited and subject to changes. For the rules-based approaches, the effect is clearly detrimental, with performance uniformly worse. For the regime switching models we observe improvements in performance, but the variables that are selected for predictions and their coefficients vary considerably over our sample period. Overall, the IAD’s between models with and without predictive variables are small, and fees are negligible. Harding and Pagan (2003a,b) and Hamilton (2003) have already discussed the difference between rules-based and model-based approaches, applied to dating business cycles. As the resulting identifications were largely similar, the main differences were the larger transparency for the rules-based approaches versus the deeper insight into the data generating process for the regime switching models. For financial time series, differences are larger where the rules-based approaches purely reflect the tendency of the market, while the regime switching models reflect the risk-return trade-off. For applications that require an ex post series of bull and bear markets, rules-based methods are preferable. If the state of the stock market is needed in a predictive setting, regime-switching model are best used. 41 References Ang, A. and Bekaert, G. (2002). International asset allocation with regime shifts. Review of Financial Studies, 15(4):1137–1187. Ang, A., Chen, J., and Xing, Y. (2006). Downside risk. Review of Financial Studies, 19(4):1191– 1239. Avramov, D. and Chordia, T. (2006). Predicting stock returns. Journal of Financial Economics, 82:387–415. Avramov, D. and Wermers, R. (2006). Investing in mutual funds when returns are predictable. Journal of Financial Economics, 81(2):339–377. Beltratti, A. and Morana, C. (2006). Breaks and persistency: Macroeconomic causes of stock market volatility. Journal of Econometrics, 131(1-2):151–177. Bohl, M. T., Siklos, P. L., and Werner, T. (2007). Do central banks react to the stock market? the case of the Bundesbank. Journal of Banking & Finance, 31(3):719–733. Bry, G. and Boschan, C. (1971). Cyclical Analysis of Time Series: Selected Procedures and Computer Programs. NBER, New York, NY, USA. Campbell, J. Y. (1991). 101(405):157–179. A variance decomposition for stock returns. Economic Journal, Candelon, B., Piplack, J., and Straetmans, S. (2008). On measuring synchronization of bulls and bears: The case of East Asia. Journal of Banking & Finance, 32(6):1022–1035. Chauvet, M. (1999). Stock market fluctuations and the business cycle. Journal of Economic and Social Measurement, 25(3/4):235–258. Chauvet, M. and Potter, S. (2000). Coincident and leading indicators of the stock market. Journal of Empirical Finance, 7(1):87–111. Chen, S.-S. (2009). Predicting the bear stock market: Macroeconomic variables as leading indicators. Journal of Banking & Finance, 33:211–223. Chiang, M., Lin, T., and Yu, C. J. (2009). Liquidity provision of limit order trading in the futures market under bull and bear markets. Journal of Business Finance & Accounting, 36(7-8):1007–1038. Christoffersen, P. F. and Diebold, F. X. (2006). Financial asset returns, direction-of-change forecasting, and volatility dynamics. Management Science, 52(8):1273–1287. Corradi, V. and Swanson, N. R. (2007). Nonparametric bootstrap procedures for predictive inference based on recursive estimation schemes. International Economic Review, 48(1):67– 109. 42 Das, S. R. and Uppal, R. (2003). Systemic risk and international portfolio choice. Working paper, Santa Clara University, Santa Clara, CA, USA. Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B, 39(1):1–38. Diebold, F. X., Lee, J.-H., and Weinbach, G. C. (1994). Regime switching with time-varying transition probabilities. In Hargreaves, C., editor, Non-Stationary Time Series Analysis and Cointegration, pages 283–302. Oxford University Press, Oxford, UK. Diebold, F. X. and Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business & Economic Statistics, 13(3):134–144. Edwards, S., G´ omez Biscarri, J., and P´erez de Gracia, F. (2003). Stock market cycles, financial liberalization and volatility. Journal of International Money and Finance, 22(7):925–955. Estrella, A. and Mishkin, F. S. (1998). Predicting US recessions: Financial variables as leading indicators. Review of Economics and Statistics, 80(1):45–61. Fleming, J., Kirby, C., and Ostdiek, B. (2001). The economic value of volatility timing. Journal of Finance, 56(1):329–352. G´ omez Biscarri, J. and P´erez de Gracia, F. (2004). Stock market cycles and stock market development in Spain. Spanish Economic Review, 6(2):127–151. Gordon, S. and St-Amour, P. (2000). A preference regime model of bull and bear markets. American Economic Review, 90(4):1019–1033. Granger, C. W. J. and Pesaran, M. H. (2000). Economic and statistical measures of forecast accuracy. Journal of Forecasting, 19(7):537–560. Guidolin, M. and Timmermann, A. (2006a). An econometric model of nonlinear dynamics in the joint distribution of stock and bond returns. Journal of Applied Econometrics, 21(1):1–22. Guidolin, M. and Timmermann, A. (2006b). Term structure of risk under alternative econometric specifications. Journal of Econometrics, 131:285–308. Guidolin, M. and Timmermann, A. (2007). Asset allocation under multivariate regime switching. Journal of Economics Dynamics & Control, 31(11):3503–3544. Guidolin, M. and Timmermann, A. (2008a). International asset allocation under regime switching, skew and kurtosis preference. Review of Financial Studies, 21(2):889–935. Guidolin, M. and Timmermann, A. (2008b). Size and value anomalies under regime shifts. Journal of Financial Econometrics, 6(1):1–48. Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57:357–384. 43 Hamilton, J. D. (1990). Analysis of time series subject to changes in regime. Journal of Econometrics, 45(1-2):39–70. Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press, Princeton, NJ, USA. Hamilton, J. D. (2003). Comment on “a comparison of two business cycle dating methods”. Journal of Economic Dynamics and Control, 27(9):1691–1693. Hamilton, J. D. and Lin, G. (1996). Stock market volatility and the business cycle. Journal of Applied Econometrics, 11:573–593. Harding, D. and Pagan, A. (2002a). A comparison of two business cycle dating methods. Journal of Economic Dynamics & Control, 27:1681–1690. Harding, D. and Pagan, A. (2002b). Dissecting the cycle: A methodological investigation. Journal of Monetary Economics, 49(2):365–381. Harding, D. and Pagan, A. (2003a). A comparison of two business cycle dating methods. Journal of Economic Dynamics and Control, 27(9):1681–1690. Harding, D. and Pagan, A. (2003b). Rejoinder to james hamilton. Journal of Economic Dynamics and Control, 27(9):1695–1698. Harvey, C. R. (1989). Forecasts of economic growth from the bond and stock markets. Financial Analysts Journal, 45(5):38–45. Harvey, C. R. and Siddique, A. (2000). Conditional skewness in asset pricing tests. Journal of Finance, 55(3):1263–1295. Jondeau, E. and Rockinger, M. (2006). Optimal portfolio allocation under higher moments. European Financial Management, 12(1):29–55. Kaminsky, G. L. and Schmukler, S. L. (2008). Short-run pain, long-run gain: Financial liberalization and stock market cycles. Review of Finance, 12(2):253–292. Kim, C.-J. (1994). Dynamic linear models with markov-switching. Journal of Econometrics, 60(1):1–22. Kole, E., Koedijk, K., and Verbeek, M. (2006). Portfolio implications of systemic crises. Journal of Banking & Finance, 30(8):2347–2369. Kullback, S. and Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22(1):79–86. Lunde, A. and Timmermann, A. (2004). Duration dependence in stock prices: An analysis of bull and bear markets. Journal of Business & Economic Statistics, 22(3):253–273. Maheu, J. M. and McCurdy, T. H. (2000). Identifying bull and bear markets in stock returns. Journal of Business & Economic Statistics, 18(1):100–112. 44 Maheu, J. M., McCurdy, T. H., and Song, Y. (2009). Extracting bull and bear markets from stock returns. Working paper, University of Toronto, CA. Marcellino, M. (2006). Leading indicators: What have we learned? In Elliot, G., Granger, C. W., and Timmermann, A., editors, Handbook of Economic Forecasting, pages 879–960. Elsevier, Amsterdam, Netherlands. Marquering, W. and Verbeek, M. (2004). The economic value of predicting stock index returns and volatility. Journal of Financial and Quantitative Analysis, 39(2):407–429. Mitchell, W. C. and Burns, A. F. (1938). Statistical indicators of cyclical revivals. NBER (New York), reprinted in: G. H. Moore, ed., (1961), Business cycle indicators, Princeton University Press (Princeton), ch. 6. Newey, W. K. and West, K. D. (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55(3):703–708. Pagan, A. and Ullah, A. (1999). Nonparametric Econometrics. Cambridge University Press, Cambridge, UK. Pagan, A. R. and Sossounov, K. A. (2003). A simple framework for analysing bull and bear markets. Journal of Applied Econometrics, 18(1):23–46. Perez-Quiros, G. and Timmermann, A. (2000). Firm size and cyclical variations in stock returns. Journal of Finance, 55(3):1229–1262. Pesaran, M. H. and Skouras, S. (2002). Decision-based methods for forecast evaluations. In Clements, M. P. and Hendry, D. F., editors, A Companion to Economic Forecasting, chapter 11, pages 241–267. Basil Blackwell, Oxford, UK. Pesaran, M. H. and Timmermann, A. (1995). Predictability of stock returns: Robustness and economic significance. Journal of Finance, 50(4):1201–1228. Politis, D. N. and Romano, J. P. (1994). The stationary bootstrap. Journal of the American Statistical Association, 89(428):1303–1313. Rapach, D. E., Wohar, M. E., and Rangvid, J. (2005). Macro variables and international stock return predictability. International Journal of Forecasting, 21(1):137–166. Rigobon, R. and Sack, B. (2003). Measuring the reaction of monetary policy to the stock market. Quarterly Journal of Economics, 118(2):639–669. Sarno, L. and Valente, G. (2004). Comparing the accuracy of density forecasts from competing models. Journal of Forecasting, 23(8):541–557. Schwert, G. W. (1990). Indexes of u.s. stock prices from 1802 to 1987. Journal of Business, 63(3):399–426. Shiller, R. J. (2000). Irrational Exuberance. Princeton University Press, Princeton NJ, USA. 45 Stock, J. H. and Watson, M. W. (2003a). Forecasting output and inflation: The role of asset prices. Journal of Economic Literature, 41(3):778–829. Stock, J. H. and Watson, M. W. (2003b). How did leading indicator forecasts do during the 2001 recession? Federal Reserve Bank of Richmond Economic Quarterly, 89:71–90. Timmermann, A. (2000). Moments of markov switching models. Journal of Econometrics, 96(1):75–111. Veronesi, P. (1999). Stock market overreaction to bad news in good times: A rational expectations equilibrium model. Review of Financial Studies, 12(5):975–1007. Welch, I. and Goyal, A. (2008). A comprehensive look at the empirical performance of equity premium prediction. Review of Financial Studies, 21(4):1455–1508. West, K. D. (1996). Asymptotic inference about predictive ability. Econometrica, 64(5):1067– 1084. West, K. D., Edison, H. J., and Cho, D. (1993). A utility-based comparison of some models of exchange rate volatility. Journal of International Economics, 35:23–45. White, H. (2000). A reality check for data snooping. Econometrica, 68(5):1097–1126. 46 Figure 1: Performance US Market 400 350 300 250 200 150 100 50 Jan-09 Jan-07 Jan-05 Jan-03 Jan-01 Jan-99 Jan-97 Jan-95 Jan-93 Jan-91 Jan-89 Jan-87 Jan-85 Jan-83 Jan-81 Jan-79 Jan-77 Jan-75 Jan-73 Jan-71 Jan-69 Jan-67 Jan-65 Jan-63 Jan-61 Jan-59 Jan-57 Jan-55 0 This figure show the weekly observations of the US stock market in excess of the risk-free rate over the period January 7, 1955 until July 2, 2010 (1/7/1955 = 100). The excess stock market index is calculates as the ratio Pt /Bt , where Pt is the value of the stock market index and Bt is the cumulation of a riskless Qt−1 bank account, Bt ≡ τ =0 (1 + rτf ). For the stock market index, we use the S&P500. The risk-free rates is the three-month T-Bill rate. Data have been taken from FRED at the Federal Reserve Bank of St. Louis and Schwert (1990). 47 Figure 2: Identification of Bull and Bear Markets 1 400 1 400 0.9 350 0.9 350 0.8 300 0.8 300 0.7 250 0.7 250 0.6 200 0.5 0.6 200 0.5 0.4 150 0.4 150 0.3 100 0.3 100 0.2 0.2 50 50 0.1 0.1 (a) LT Jan-09 Jan-07 Jan-05 Jan-03 Jan-01 Jan-99 Jan-97 Jan-95 Jan-93 Jan-91 Jan-89 Jan-87 Jan-85 Jan-83 Jan-81 Jan-79 Jan-77 Jan-75 Jan-73 Jan-71 Jan-69 Jan-67 Jan-65 Jan-63 Jan-61 Jan-59 Jan-57 0 Jan-55 0 Jan-09 Jan-07 Jan-05 Jan-03 Jan-01 Jan-99 Jan-97 Jan-95 Jan-93 Jan-91 Jan-89 Jan-87 Jan-85 Jan-83 Jan-81 Jan-79 Jan-77 Jan-75 Jan-73 Jan-71 Jan-69 Jan-67 Jan-65 Jan-63 Jan-61 Jan-59 Jan-57 0 Jan-55 0 (b) PS 1 400 1 400 0.9 350 0.9 350 0.8 300 0.8 300 0.7 250 0.7 250 0.6 200 0.5 0.6 200 0.5 0.4 150 0.4 150 0.3 100 0.3 100 0.2 0.2 50 50 0.1 0.1 (c) RS2C Jan-09 Jan-07 Jan-05 Jan-03 Jan-01 Jan-99 Jan-97 Jan-95 Jan-93 Jan-91 Jan-89 Jan-87 Jan-85 Jan-83 Jan-81 Jan-79 Jan-77 Jan-75 Jan-73 Jan-71 Jan-69 Jan-67 Jan-65 Jan-63 Jan-61 Jan-59 Jan-57 0 Jan-55 0 Jan-09 Jan-07 Jan-05 Jan-03 Jan-01 Jan-99 Jan-97 Jan-95 Jan-93 Jan-91 Jan-89 Jan-87 Jan-85 Jan-83 Jan-81 Jan-79 Jan-77 Jan-75 Jan-73 Jan-71 Jan-69 Jan-67 Jan-65 Jan-63 Jan-61 Jan-59 Jan-57 0 Jan-55 0 (d) RS2L 1 400 1 400 0.9 350 0.9 350 0.8 300 0.8 300 0.7 250 0.7 250 0.6 200 0.5 0.6 200 0.5 0.4 150 0.4 150 0.3 100 0.3 100 0.2 0.2 50 50 0.1 0.1 (e) RS3C Jan-09 Jan-07 Jan-05 Jan-03 Jan-01 Jan-99 Jan-97 Jan-95 Jan-93 Jan-91 Jan-89 Jan-87 Jan-85 Jan-83 Jan-81 Jan-79 Jan-77 Jan-75 Jan-73 Jan-71 Jan-69 Jan-67 Jan-65 Jan-63 Jan-61 Jan-59 Jan-57 0 Jan-55 0 Jan-09 Jan-07 Jan-05 Jan-03 Jan-01 Jan-99 Jan-97 Jan-95 Jan-93 Jan-91 Jan-89 Jan-87 Jan-85 Jan-83 Jan-81 Jan-79 Jan-77 Jan-75 Jan-73 Jan-71 Jan-69 Jan-67 Jan-65 Jan-63 Jan-61 Jan-59 Jan-57 0 Jan-55 0 (f) RS3L This figure shows the identification of bull and bear periods for the US, based on the different approaches. The thick blue line plots the excess stock market index (left y-axis). In panel (a), bull and bear markets are identified by the LT-algorithm, and in panel (b) by the PS-algorithm. Panels (c–d) and (e–f) reflect identification by two-state and three-state regime switching models, respectively. These models can have constant or time-varying transition probabilities (panels c and e, and panels d and f, respectively). A think black line indicates the smoothed inference probability of a bull market (right y-axis). Bullish (bearish) regimes are indicated with white (pink) areas. A bull (bear) market prevails in the RS-models, when the smoothed inference probability for regime 1 (2) exceeds 0.5. The strong bear regime prevails when the smoothed inference probability for regime 3 exceeds 0.5, and is indicated in purple (only for panels e–f). 48 1-7-05 1-7-06 1-7-07 1-7-92 1-7-93 1-7-94 1-7-95 1-7-96 1-7-97 1-7-98 1-7-99 1-7-00 1-7-01 1-7-02 1-7-03 1-7-04 1-7-05 1-7-06 1-7-07 1-7-08 1-7-08 1-7-09 1-7-09 1-7-06 1-7-07 1-7-08 1-7-09 1-7-08 300 1-7-05 1-7-07 1 1-7-04 1-7-06 250 1-7-03 1-7-05 0.9 1-7-02 1-7-04 200 1-7-01 1-7-03 0.8 1-7-00 1-7-02 0.7 1-7-99 1-7-01 150 1-7-98 1-7-00 0.6 1-7-97 1-7-99 100 1-7-96 1-7-98 0.5 1-7-95 1-7-97 50 (f) RS2L 1-7-94 1-7-96 0.4 1-7-93 1-7-95 0 1-7-92 1-7-94 0.3 1-7-91 1-7-93 -50 1-7-90 1-7-92 0.2 1-7-89 1-7-91 -100 1-7-88 1-7-90 0.1 1-7-87 1-7-89 1-7-1983 (b) LT, time-varying transition probabilities 1-7-86 1-7-88 100 80 60 40 20 0 -20 -40 -60 -80 (d) PS, time-varying transition probabilities 1-7-85 1-7-87 0 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 60 50 40 30 20 10 0 -10 -20 -30 1-7-83 1-7-84 1-7-86 1-7-09 0 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 49 1-7-83 1-7-85 1-7-84 1-7-85 1-7-86 1-7-87 1-7-88 1-7-89 1-7-90 1-7-91 1-7-92 1-7-93 1-7-94 1-7-95 1-7-96 1-7-97 1-7-98 1-7-99 1-7-00 1-7-01 1-7-02 1-7-03 1-7-04 1-7-05 1-7-06 1-7-07 1-7-08 1-7-09 1-7-1984 1-7-1985 1-7-1986 1-7-1987 1-7-1988 1-7-1989 1-7-1990 1-7-1991 1-7-1992 1-7-1993 1-7-1994 1-7-1995 1-7-1996 1-7-1997 1-7-1998 1-7-1999 1-7-2000 1-7-2001 1-7-2002 1-7-2003 1-7-2004 1-7-2005 1-7-2006 1-7-2007 1-7-2008 1-7-2009 Figure 3: Predictions and performance 1-7-04 1-7-91 300 1-7-03 1-7-90 250 1-7-02 1-7-89 200 1-7-01 1-7-88 150 1-7-00 1-7-87 100 1-7-99 1-7-86 50 1-7-98 1-7-84 1-7-85 0 1-7-97 1-7-83 1-7-84 -50 1-7-96 -100 1-7-95 1-7-83 (a) LT, constant transition probabilities 1-7-94 100 1-7-93 80 1-7-92 60 1-7-91 40 1-7-90 20 1-7-89 0 1-7-88 -20 1-7-87 -40 1-7-86 -60 (c) PS, constant transition probabilities 1-7-85 -80 -100 60 50 40 30 20 10 0 -10 -20 -30 1-7-84 (e) RS2C Figure continues on next page. 1-7-83 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 Figure 3: Predictions and performance – continued 70 1 80 1 60 0.9 70 0.9 0.8 60 0.8 0.7 50 0.7 0.6 40 0.6 0.5 30 0.5 0.4 20 0.4 0.3 10 0.3 0.2 0 0.2 -10 0.1 -10 0.1 -20 0 -20 0 50 40 30 20 (g) RS3C 1-7-09 1-7-08 1-7-07 1-7-06 1-7-05 1-7-04 1-7-03 1-7-02 1-7-01 1-7-00 1-7-99 1-7-98 1-7-97 1-7-96 1-7-95 1-7-94 1-7-93 1-7-92 1-7-91 1-7-90 1-7-89 1-7-88 1-7-87 1-7-86 1-7-85 1-7-84 1-7-09 1-7-08 1-7-07 1-7-06 1-7-05 1-7-04 1-7-03 1-7-02 1-7-01 1-7-00 1-7-99 1-7-98 1-7-97 1-7-96 1-7-95 1-7-94 1-7-93 1-7-92 1-7-91 1-7-90 1-7-89 1-7-88 1-7-87 1-7-86 1-7-85 1-7-84 1-7-83 0 1-7-83 10 (h) RS3L This figure shows for each method the predictions and the evolution of the investment performance over time. The first prediction is made for July 1, 1983 and the last for July 2, 2010, giving a total of 1410 predictions. To make a prediction for week t+1, the LT-method follows the recursion in Eq. (2) from the last extremum onwards. The PS-method starts this recursion 13 weeks prior to week t. The regime-switching models apply the standard filter technique with the last available model parameters. All approaches use weekly explanatory variables up to week t and monthly explanatory variables up to the last month prior to week t. The parameters in the Markovian logit models for predictions in the rules-based approaches, the m state-dependent mean µm s and second moment ψs , and the parameters of the regime switching models are updated every 52 weeks. We plot the predicted probability of a bull market with a black line (secondary y-axis). For the RS3-models, we calculate the probability of a bull market as the sum of the predictions for regimes with positive means. The pink areas indicate the periods of bear markets as identified based m on the full sample as in Figure 2. The asset allocation for method m at time t equals wtm = µm t /(γψt ), m m with risk aversion parameter γ = 5. The values for the mean µt+1 and second moment ψt are calculated as in Eqs. (15) and (16). The blue line shows the cumulative result (in %) of this portfolio. 50 Table 1: Number and Duration of Market Regimes approach LT PS RS2C RS2L RS3C RS3C∗ RS3L RS3L∗ bull number avg. duration med. duration std. dev. duration max. duration min. duration 16 119 90 97 405 15 19 95 74 61 276 27 34 64 24 77 336 2 43 52 23 90 337 2 17 102 64 90 314 13 17 102 64 90 314 13 17 98 62 91 313 5 17 98 62 91 313 5 (mild) bear number avg. duration med. duration std. dev. duration max. duration min. duration 16 62 60 44 187 7 18 61 60 30 132 15 34 21 12 17 78 1 43 16 10 23 78 1 24 45 41 23 91 8 17 69 42 66 261 8 25 46 41 32 163 5 17 72 42 79 327 9 strong bear number avg. duration med. duration std. dev. duration max. duration min. duration 7 11 8 10 28 1 8 9 3 12 29 2 This table shows for every method the number of spells of the different market regimes, their average and median duration, the standard deviation of the duration, and its maximum and minimum. For the two-state regime switching models, a period is qualified as bullish if the smoothed inference probability for the bull regime exceeds 0.5 and bearish otherwise. For the three-state regime switching models, the highest smoothed inference probability determines the prevailing regime. In the columns RS3C∗ and RS3L∗ , the mildly and strongly bear markets have been aggregated. 51 Table 2: Return Characteristics of Bull and Bear Markets regime bull (mild) bear 52 strong bear LT mean vol. mean vol. mean vol. 0.38 1.82 −0.60 2.46 (0.04) (0.06) (0.08) (0.16) PS 0.40 1.86 −0.54 2.36 (0.04) (0.07) (0.07) (0.15) RS2C 0.16 1.47 −0.27 3.28 (0.04) (0.04) (0.13) (0.24) RS2L 0.16 1.47 −0.30 3.34 (0.04) (0.04) (0.14) (0.25) RS3C 0.15 1.38 −0.04 2.43 −0.95 5.65 (0.04) (0.04) (0.08) (0.16) (0.73) (1.53) RS3L 0.16 1.39 −0.06 2.36 −0.74 5.50 (0.04) (0.04) (0.08) (0.15) (0.62) (1.30) This table shows the mean and the volatility (in % per week) for the different regimes under the different approaches. In the approach of LT and PS identification is based on peaks and troughs in the prices series. Conditioning on the regimes, we estimate the means and volatilities of the returns distributions. In the regime switching approach, we estimate the model for the return distributions with two regimes (bull and bear, columns RS2C and RS2L) and with three regimes (bull, mild bear, and strong bear, columns RS4C and RS4L). We report standard errors in parentheses. For the LT- and PS-methods we calculate HAC consistent standard errors of Newey and West (1987) in a GMM setting. For the regime switching models, we use the Fisher information matrix to compute standard errors. Table 3: Integrated Absolute Differences of Identification LT PS RS2C PS RS2C RS2L RS3C RS3L 0.068 [0.034, 0.141] 0.247 [0.187, 0.355] 0.291 [0.220, 0.365] 0.234 [0.171, 0.355] 0.277 [0.214, 0.368] 0.032 [0.015, 0.061] 0.268 [0.201, 0.454] 0.311 [0.220, 0.483] 0.154 [0.088, 0.555] 0.168 [0.098, 0.564] 0.272 [0.200, 0.505] 0.313 [0.214, 0.486] 0.167 [0.084, 0.602] 0.182 [0.081, 0.591] 0.032 [0.022, 0.271] RS2L RS3C This table reports the integrated absolute distance between the identification of the different approaches, based on two states. For the difference between the LT- and PS-method, and the RS2-models we use Eq. (10). For comparisons between the LT- and PS-method on the one hand and the RS-models on the other hand, we incorporate Eq. (11) into the calculation. When the RS3-models are part of the comparison, we aggregate the mild and strong bear regimes. We report the 5% and 95% percentiles between brackets for each statistic based on the 200 bootstraps of both the return and the explanatory variable series following Politis and Romano (1994). 53 Table 4: Performance Measures and Fees Based on Full-sample Identification (a) Performance Measures Mean Volatility Sharpe ratio Utility LT PS RS2C RS2L RS3C RS3L 48.1 30.3 1.59 0.241 48.4 30.4 1.59 0.242 9.4 11.3 0.83 0.061 9.9 11.3 0.87 0.066 7.3 10.6 0.69 0.045 8.0 11.3 0.71 0.048 (b) Fees LT PS 54 LT 0 PS 0.13 [-4.92, 5.40] -18.26 [-23.45, -10.15] -17.74 [-23.44, -9.71] -19.88 [-22.13, -2.88] -19.63 [-21.91, 2.21] RS2C RS2L RS3C RS3L -0.13 [-5.37, 4.94] 0 -18.39 [-23.45, -10.20] -17.87 [-22.80, -9.85] -20.01 [-21.32, -2.64] -19.76 [-21.01, -0.34] RS2C RS2L RS3C RS3L 18.63 [10.30, 24.08] 18.77 [10.36, 24.02] 0 18.10 [9.86, 24.02] 18.24 [10.05, 23.34] -0.52 [-1.96, 1.07] 0 20.29 [2.93, 22.65] 20.42 [2.67, 21.82] 1.62 [-12.57, 2.34] 2.14 [-12.02, 2.58] 0 20.03 [-2.25, 22.38] 20.16 [0.34, 21.49] 1.37 [-17.11, 2.17] 1.89 [-16.97, 1.90] -0.25 [-11.80, 1.29] 0 0.52 [-1.07, 1.96] -1.62 [-2.35, 12.62] -1.37 [-2.17, 17.17] -2.14 [-2.58, 12.06] -1.89 [-1.90, 17.03] 0.25 [-1.29, 11.85] This table reports performance measures of investment strategies that use the different methods as input for an asset allocation and the fees m an agent would be willing to pay to exchange two methods. The asset allocation for method m at time t equals wtm = µm t /(γψt ), with risk m m aversion parameter γ = 5. The values for the mean µt+1 and second moment ψt are calculated as in Eqs. (15) and (16). The state-dependent m mean µm s and second moment ψs are based on the full-sample and reported in Table 2. The state-dependent probabilities are taken as the identification at time t, which are binary for the rules-based methods and smoothed inference probabilities for the regime-switching approaches. Based on the realized returns of the asset allocations, we calculate the mean and volatility (in % per year), the yearly Sharpe ratio, and the annualized realized utility as in Eq. (13). The fee ηm,n to switch from strategy m (in rows) to strategy n (in columns) solves Eq. (18). We express the fees in % per year. We report the 5% and 95% percentiles between brackets for each statistic based on the 200 bootstraps of both the return and the explanatory variable series following Politis and Romano (1994). Table 5: Predictive Accuracy bull correct bull wrong % bull correct bear correct bear wrong % bear correct total correct total false % correct default bull improvement Kuipers Score IAD LTC LTL PSC PSL RS2C RS2L RS3C RS3L 852 214 79.9% 150 194 43.6% 1002 408 71.1% 75.6% -4.5% 23.5% 0.194 851 215 79.8% 132 212 38.4% 983 427 69.7% 75.6% -5.9% 18.2% 0.222 768 193 79.9% 281 168 62.6% 1049 361 74.4% 68.2% 6.2% 42.5% 0.177 779 182 81.1% 261 188 58.1% 1040 370 73.8% 68.2% 5.6% 39.2% 0.197 868 66 92.9% 340 136 71.4% 1208 202 85.7% 66.2% 19.4% 64.4% 0.158 935 65 93.5% 269 141 65.6% 1204 206 85.4% 70.9% 14.5% 59.1% 0.157 708 3 99.6% 295 404 42.2% 1003 407 71.1% 50.4% 20.7% 41.8% 0.283 681 0 100.0% 268 461 36.8% 949 461 67.3% 48.3% 19.0% 36.8% 0.318 This table shows the predictive accuracy of the different methods. The predictions are constructed as in Figure 3. The predicted probabilities are rounded, and compared with the identification that results from the full sample. For the regime switching models the resulting smoothed inference probabilities are rounded, too. “Bull (bear) correct” gives the number of true bullish (bearish) weeks that were correctly predicted. “Bull (bear) wrong” gives the number of true bullish (bearish) weeks that were wrongly predicted. The percentages are calculated with respect to the number of true bullish (bearish) weeks. The row “default bull” reports the percentage of total correct predictions, when the models would always predict a bullish week. The row “improvement” shows by how much a method’s percentage of correct predictions exceeds the “default bull” prediction. The Kuipers Score is calculated as the percentage of correctly predicted bull markets minus the percentage of incorrectly predicted bear markets. The row IAD reports the Integrated Absolute Distance between the predictions and the realizations. We apply Eq. (10) to calculate the IAD for the rules-based methods, since their predictions are probabilities, while their realizations are binary. For the RS3-models, we calculate the probability of a bull market as the sum of the predictions for regimes with positive means. 55 Table 6: Integrated Absolute Differences of Predictions LTC LTL PSC PSL 56 RS2C RS2L RS3C LTL PSC PSL RS2C RS2L RS3C RS3L 0.060 [0.057, 0.110] 0.250 [0.210, 0.288] 0.279 [0.216, 0.303] 0.276 [0.271, 0.367] 0.275 [0.237, 0.345] 0.082 [0.060, 0.145] 0.239 [0.203, 0.356] 0.272 [0.223, 0.374] 0.284 [0.253, 0.332] 0.331 [0.294, 0.386] 0.213 [0.193, 0.332] 0.244 [0.212, 0.360] 0.262 [0.236, 0.320] 0.305 [0.282, 0.379] 0.061 [0.048, 0.077] 0.239 [0.189, 0.338] 0.269 [0.201, 0.364] 0.284 [0.239, 0.332] 0.327 [0.278, 0.384] 0.090 [0.059, 0.103] 0.086 [0.072, 0.114] 0.228 [0.178, 0.324] 0.255 [0.194, 0.352] 0.281 [0.230, 0.328] 0.318 [0.276, 0.374] 0.116 [0.074, 0.126] 0.095 [0.071, 0.118] 0.034 [0.029, 0.054] This table reports the integrated absolute distance between the predictions of the different approaches, based on two states. The predictions are constructed as in Figure 3. For the difference between all two-state methods we apply Eq. (10). When the RS3-models are part of the comparison, we concentrate on the predictions of bullish regimes. We calculate the probability of a bull market as the sum of the predictions for regimes with positive means. We report the 5% and 95% percentiles between brackets for each statistic based on the 200 bootstraps of both the return and the explanatory variable series following Politis and Romano (1994). Table 7: Performance Measures and Fees Based on Prediction (a) Performance Measures Av. Abs. Weight Mean Mean Bull Mean Bear Volatility Sharpe Ratio Utility market LTC LTL PSC PSL RS2C RS2L RS3C RS3L 3.44 16.7 0.21 -0.036 1.82 8.73 8.96 8.01 32.4 0.27 -0.176 1.90 7.13 10.30 -2.69 31.9 0.22 -0.184 1.70 1.48 0.97 2.56 28.0 0.05 -0.181 1.87 2.26 5.61 -4.91 30.1 0.08 -0.203 0.62 0.97 3.62 -4.24 9.5 0.10 -0.013 0.66 1.52 4.63 -5.50 10.1 0.15 -0.010 0.61 0.34 4.34 -3.74 10.8 0.03 -0.026 0.64 1.17 6.66 -3.96 10.9 0.11 -0.018 (b) Fees LTC 57 LTC 0 LTL -0.81 [-1.86, 5.17] -0.50 [-8.15, 7.77] -2.74 [-9.56, 8.54] 16.21 [8.34, 24.91] 16.48 [8.70, 25.36] 14.93 [7.51, 23.85] 15.70 [7.07, 24.94] PSC PSL RS2C RS2L RS3C RS3L LTL PSC PSL RS2C RS2L RS3C RS3L 0.82 [-5.18, 1.86] 0 0.50 [-7.82, 8.19] -0.32 [-6.15, 12.39] 0 2.75 [-8.62, 9.64] 1.94 [-6.82, 11.50] 2.24 [-5.69, 6.12] 0 -16.60 [-25.66, -8.53] -17.40 [-23.32, -6.64] -16.98 [-26.38, -8.33] -19.28 [-26.11, -8.97] 0 -16.86 [-26.28, -8.87] -17.67 [-24.04, -7.42] -17.25 [-26.30, -9.44] -19.55 [-27.11, -10.32] -0.26 [-1.82, 0.64] 0 -15.27 [-24.60, -7.67] -16.07 [-22.59, -5.69] -15.66 [-25.12, -8.09] -17.96 [-25.73, -8.02] 1.29 [-1.15, 2.44] 1.56 [-0.98, 3.36] 0 -16.06 [-25.71, -7.18] -16.86 [-23.28, -5.53] -16.45 [-25.05, -8.05] -18.75 [-25.90, -8.92] 0.52 [-1.68, 2.96] 0.78 [-1.43, 3.35] -0.77 [-1.82, 2.26] 0 0.31 [-12.37, 6.12] -1.93 [-11.50, 6.80] 17.01 [6.51, 22.76] 17.28 [7.30, 23.42] 15.73 [5.60, 21.93] 16.50 [5.45, 22.78] -2.25 [-6.14, 5.68] 16.70 [8.23, 25.86] 16.97 [9.30, 25.87] 15.42 [8.02, 24.74] 16.19 [7.92, 24.51] 18.90 [8.84, 25.48] 19.17 [10.13, 26.68] 17.62 [7.88, 25.10] 18.39 [8.80, 25.35] 0.26 [-0.64, 1.82] -1.30 [-2.44, 1.15] -0.52 [-2.97, 1.68] -1.56 [-3.36, 0.98] -0.78 [-3.37, 1.43] 0.77 [-2.26, 1.82] This table reports performance measures of investment strategies that use the different methods as input for an asset allocation and the fees an agent would be willing to pay to exchange two methods. The optimal portfolio is constructed similarly as for Figure 3. Based on the realized returns of the asset allocations, we calculate the mean and volatility (in % per year), the yearly Sharpe ratio, and the annualized realized utility as in Eq. (13). Mean Bull (Mean Bear) is the annualized mean during the subperiods identified ex post as bull (bear) markets. For the purpose of comparison, we report the statistics of a long position in the market. The fee ηm,n to switch to strategy m (in rows) from strategy n (in columns) solves Eq. (18). We express the fees in % per year. We report the 5% and 95% percentiles between brackets for each statistic based on the 200 bootstraps of both the return and the explanatory variable series following Politis and Romano (1994). A Multinomial logit transitions In Section 2.2 we propose a regime switching model with time-varying transition probabilities. The probability of a transition from regime q to regime s at time t is linked to predicting variables zt−1 by a multinomial logit transformation ′ πsq,t ≡ πsq (zt−1 ) ≡ Pr[St = s|St−1 = q, zt−1 ] = P eβsq zt−1 , s, q ∈ S, βςq ′ zt−1 ς∈S e (24) with ∃s ∈ S : βsq = 0 to ensure identification. We have dropped the model-superscript m for notational convenience. A.1 Estimation To estimate the parameters βsq , we extend the approach of Diebold et al. (1994), based on the EM-algorithm by Dempster et al. (1977). Diebold et al. (1994) consider estimation when the transition probabilities are linked via a standard (binomial) logit transformation. We maintain the attractive feature of the EM-algorithm that the expectation of the complete-data log likelihood can be split in terms related to only a subset of the parameter space. Therefore, we can focus on the part of the log likelihood function related to the parameters βsq . The transition part of the expectation of the likelihood function is given by ℓ(B) = T XX X ξsq,t log πsq,t , (25) t=1 s∈S q∈S where B = {βsq : s, q ∈ S} is the set of all parameters βsq and ξsq,t ≡ Pr[St = s|St−1 = q, ΩT ] is a smoothed inference probability. These probabilities are based on the complete data set of returns and predictive variables ΩT , and are calculated with the method of Kim (1994). In the expectation step the set of smoothed inference probabilities is determined. In the maximization step new parameters values are calculated that maximize the likelihood function. We derive the first order conditions that apply to βsq by differentiating Eq. (25) 1 ∂πςq ∂ℓ(B) X X = ξςq,t . ∂βsq π ∂β ςq sq t=1 ς∈S T 58 Based on Eq. (24) we find πsq (1 − πsq )zt−1 ∂πςq = ∂βsq −πςq πsq zt−1 if ς = s . if ς 6= s Combining these two expressions yields the first order condition T X (ξsq,t − ξq,t−1 πsq,t ) zt−1 = 0 ∀q, s ∈ S (26) t=1 where ξq,t = Pr[St = q|ΩT ]. For each departure state q the set of the first order conditions for the different s ∈ S comprise a system that determines the set Bq = {βsq : s ∈ S}. Numerical techniques can be used to find parameters βsq that solve this system. A.2 Marginal Effects Because the multinomial logit transformation is non-linear, the coefficients on the explanatory variables cannot be interpreted in a straightforward way. To solve this problem, we calculate the marginal effect of the change in one variable zi , evaluated at specific values ¯ The marginal effect is given by the first derivative of (5) with respect for all variables z. to zi : ! X ∂πsq (z) ¯ βsqi − ¯ ςqi , = πsq (z) πςq (z)β ∂zi z=z¯ ς∈S (27) where βsqi denotes the coefficient on zi . It is easy to verify that the sum of this expression over the destination states s is equal to zero. Since the probabilities for the destination states should add up to one, a marginal increase in one probability should be accompanied by decreases in the other probabilities. When only two regimes are available, the above ¯ expression reduces to the familiar expression for marginal effects in logit models, πsq (z)(1− ¯ sqi . πsq (z))β B Predictive variables The predictive variables that we use in this study are based on data from the Federal Reserve Bank of St. Louis available via FRED8, except the unemployment rate and the 8 See https://research.stlouisfed.org/fred2/ 59 dividend-to-price ratio. For each predictive variable we test whether it has a unit root. If so, we transform the data to create a stationary time series. The results are reported in Table B.1, together with the mean and standard deviation of the (transformed) series. We construct the inflation rate as the monthly relative change in the seasonally adjusted consumer price index (All Urban Consumers All Items, Series ID: CPIAUCSL). We reject a unit for the inflation rate at conventional confidence levels, and find an average inflation rate of 3.78% per year with a standard deviation 1.10%. We construct the growth rate of industrial production as the yearly relative change in industrial production (seasonally adjusted, Series INDPRO). It does not show evidence of a unit root, and has an average of 3.06% with a standard deviation of 5.27%. We obtain the unemployment rate from the Bureau of Labor Stastics (seasonally adjusted, Series ID: LNS14000000). Because of the high AR(1)-coefficient and the p-value for the ADF-statistic close to 0.05, we transform the series by taking a yearly change. The yearly change in unemployment is close to zero with a standard deviation of 1.14%. These three variables are constructed at the monthly frequency. [Table B.1 about here.] The T-Bill rate corresponds with a maturity of three months and is again taken from FRED (Series ID: WTB3MS). Because a unit root is not rejected, we transform this series by taking the difference with respect to the yearly moving average as in Campbell (1991); Rapach et al. (2005). The resulting difference is on average zero and shows a standard deviation of 1.04%. We construct the term spread as the yield on a 10-year government bond (constant maturity, Series ID: WGS10YR) minus the 3-month T-Bill rate. A weekly series for the 1-year yield becomes available from 1962 onwards. The AR(1)-coefficient and ADF-test are based on this smaller subsample. They indicate stationarity, so we do not transform this series. We use the monthly series for the 10-year government bond and 3-month T-Bill rate to construct the observations from 1955 to 1962, and assume that the difference stays constant within a month. The average term spread is 1.42% with a volatility of 1.23%. We determine the credit spread as the difference between the yield on BAA-rated and AAA-rated corporate bonds (as calculated by Moody’s, seasonally adjusted). As weekly data become available starting in 1962 again, we follow the same 60 procedure as for the term spread. We reject the null-hypothesis of a unit root for the subsample with weekly data, and find an average spread of 0.99% and a volatility of 0.46% for the whole sample. The dividend-to-price ratio is constructed from several series. We use the dividends series for the S&P500 as in Shiller (2000), which are available on Robert Shiller’s homepage.9 The series consists of monthly observations of the moving total dividends over the past twelve months. The price index for the S&P500 is spliced together from Schwert (1990) and FRED, as explained in Section 4. To calculate the D/P-ratio for time t (in weeks), we divide the dividends over the last 12 months prior to the month corresponding with time t by the current price level. As the resulting series shows evidence of a unit root, we take again the difference with respect to the moving yearly average. This difference is on average zero and has a volatility of 0.34%. 9 See http://www.econ.yale.edu/~shiller/data.htm for more information. 61 Table B.1: Characteristics of Predicting Variables Frequency 62 Inflation Ind. Prod., yearly growth rate Unemployment Tbill rate Term spread Credit spread D/P ratio monthly monthly monthly weekly weekly weekly weekly AR(1) ADF p-value 0.623 0.966 0.997 0.998 0.991 0.993 0.998 -3.26 -5.20 -3.05 -2.36 -4.28 -3.81 -1.78 0.017 < 0.001 0.031 0.152 < 0.001 0.003 0.390 Transformation yearly change change to yearly average change to yearly average Mean Std. Dev. 3.78 3.06 0.08 -0.01 1.42 0.99 -0.02 1.10 5.27 1.14 1.04 1.23 0.46 0.34 This table shows the set of predicting variables with its source and frequency. For each variable we conduct an adjusted Dickey-Fuller test. We report the first order autocorrelation coefficient, the ADF test-statistic and the p-value for the hypothesis of the presence of a unit root. If this hypothesis is not rejected, the next column shows the transformation that is applied to the variable. The last two columns show the mean and standard deviation of the (transformed) variables in %. The monthly series run from December 1954 to May 2010; the weekly series from January 7, 1955 until June 25, 2010. All data series except the D/P ratio are obtained from the Federal Reserve Bank of St. Louis via FRED. To construct the D/P ratio, we have taken the dividend series from Shiller (2000) and spliced the prices series for the S&P500 together from Schwert (1990) (up to January 4, 1957) and again from FRED (from that date onwards). The dividend series, which is a twelve month moving total, is then divided by the price series. C Additional Results on Identification C.1 Constant versus Time Varying Transition Probabilities The results in Section 5 indicate that the difference between constant and time-varying transition probabilities in the regime switching models are minor. Here we investigate in more detail how different the results are. Table C.1 reports the transition probabilities under the assumption that they are constant over time. Bull and bear markets are quite persistent with probabilities of around 0.95 or higher that the current state prevails for another week. Bull markets tend to be slightly more persistent than bear markets. Only the strongly bearish regime of is somewhat less persistent, though its probability of continuation for another week is still 0.89. [Table C.1 about here.] As an alternative to constant transition probabilities, we link the probabilities to predicting variables, which makes them time-varying. In the rules-based approaches, we first use the LT- or PS-algorithm to label periods as bullish or bearish. We then use these labeled periods as input for the estimation of the Markovian logit model in (6). The twostate regime switching model uses the same logistic transformation to link the transition probabilities to predicting variables. For the four-state regime switching model, the logistic transformation is extended to the multinomial logistic transformation in (5). We determine the variables to include for specific departure-destination combinations by the specific-to-general procedure proposed in Section 4.3. We use a significance level for the likelihood ratio test of 10%. According to Table C.2, time-variation can be related to a few economic variables. The D/P ratio is selected for all models. A rise in the D/P-ratio increases the likelihood of a bull market in the next period in the LT, PS and RS3L-models. This results comes as no surprise, since an increase in the D/P ratio can point at higher future expected returns. In the RS2L-model the D/P-ratio decreases the likelihood of a bull market, which is more puzzling. It is similarly puzzling than an increase in the D/P-ratio strengthens the persistence of the strongly bearish regime in the RS3L-model. 63 A rise in the inflation rate negatively affects the persistence of bull markets in the LT, PS and RS3L-models, but has no effect when the market is bearish. Other variables show up less consistently. An increase in the unemployment rate during a bull market leads to a higher probability of continuation in the PS and RS3L-model. In the PS-approach, a higher term spread or a lower credit spread increases the probability of a switch from a bear to a bull market, while in the RS2L-model they increase the probability that a bull market continues. A higher T-bill rate decreases the probability of a bull market in the LT-model. To determine by how much the transition probabilities change when the predicting variables change, we calculate the marginal effect that a one-standard deviation change in a predicting variable has on a reference probability π. As a reference point we use the average forecast probability PT Pr[St+1 = s|St = q, zt−1 ] Pr[St = q|Ωt ] , π ¯sq = t=1 PT t=1 Pr[St = q|Ωt ] (28) where Ωt denotes the information set (predicting and dependent variables) up to time t. In this expression, each forecast probability Pr[St+1 = s|St = q, zt−1 ] of a switch from state q to state s is weighted by the likelihood of an occurrence of state q at time t, Pr[St = q|Ωt ]. In the rules based approaches, the weights are either zero or one. In the regime-switching approaches the weights are the so-called inference probabilities. We report these probabilities for the different regimes and models in the last row of Table C.2. They confirm the strong persistence reported in Table C.1. The marginal effects are calculated from the reference probability π and the coefficient β. When logit transformations, the marginal effect of variable zi with coefficient βi is given by π(1 − π)βi . For the multinomial logit transformation we derive the marginal effects in Appendix A. Overall, marginal effects are small, with changes of around 0.01– 0.02. However, the probability of a switch from a bear to a bull market can still double. For instance, in the PS-model, the marginal effect of a one-standard deviation rise in the D/P-ratio increases the probability a bear-bull switch from 0.02 to 0.04. [Table C.2 about here.] 64 Combining Tables C.1 and C.2, we conclude that all methods identify persistent bull and bear markets. The evidence for time-variation is limited. However, if the sentiment of the stock market is a good predictor of other economic processes, it is not surprising that other economic variables fail to predict the stock market sentiment well. The D/Pratio which is closely related to expected returns in the stock market is most consistently selected. C.2 Model choice To judge the quality of the different models, we calculate and compare log likelihood values in Table C.3. For all models, we can determine the added value of predictive variables in the transition probabilities. By construction, these improvements are all significant. Also, the Akaike Information Criterion favors the models with time-variation in the transition probabilities over the models where they are constant. However, the improvements of the likelihood are not enough to improve the Bayesian Information Criteration, which puts a heavier penalty on additional parameters. We conclude that the evidence favoring timevarying transition probabilities is at best marginal. [Table C.3 about here.] For the regime-switching models we can also evaluate the added value of an extra regime. For both cases (constant and time-varying transition probabilities), an additional regime leads to large improvements in the likelihood values. Both information criteria prefer a model with three regimes over one with only two. When transition probabilities are constant, the model with two regimes is nested in the model with three regimes, and we could conduct a likelihood ratio test (the LR-statistic equals 105.50). However, due to presence of nuisance parameters under the null-hypothesis of two regimes, the statistic does not follow a standard χ2 distribution and simulations can be used. As our interest is not in selecting the best statistical model we do not conduct this test here. Given the size of the LR-statistic and the values of the information criteria, the evidence so far supports a model with three regimes. 65 Table C.1: Constant Transition Probabilities (a) Probability Estimates from to bull bull bear crash bull bear crash bull bear crash bear crash LT PS RS2C RS3C 0.992 0.008 0.990 0.010 0.981 0.019 0.015 0.985 0.016 0.984 0.052 0.948 0.986 0.014 < 0.001 0.019 0.972 0.009 < 0.001 0.106 0.894 (b) Unconditional Regime Probabilities bull (mild) bear strong bear LT PS RS2C RS3C 0.652 0.348 0.615 0.385 0.730 0.27 0.570 0.396 0.034 This table shows the transition probabilities between the different regimes under the different approaches and the resulting unconditional probabilities. We assume that the probabilities are constant over time. In the approaches of LT and PS, we first apply their algorithms to identify the sequences of bull and bear markets. As a second step we estimate the probabilities. For the regime switching models the probabilities result directly from the estimation. The regime switching model can either have 2 regimes (RS2C) or 3 ¯ satisfy π ¯ mP m = π ¯ m. regimes (RS3C). The unconditional probabilities π 66 Table C.2: Time-varying transition probabilities, (multinomial) logit models model from to constant inflation prod. growth unempl. t-bill rate term spread LT bull bull 5.16 −0.34 [−0.003] − − −0.77 [−0.006] − PS bear bull −5.32 − − − bull bull 5.23 −0.85 [−0.008] − − 0.73 [0.007] − − − RS2L bear bull bull bull bear bull −5.35 − 3.62 − −2.33 − bull 88.58 −1.04 [−0.013] − 83.35 − [0.013] − −0.95 [−0.012] − 67 − − − − − − − − − − [0.012] − − − − − credit spread − − − D/P ratio 0.76 [0.006] 1.01 [0.015] 1.05 [0.010] 0.70 [0.011] −0.40 [−0.006] 1.25 [0.020] π ¯sq 0.99 0.02 0.99 0.02 0.32 [0.009] −0.68 [−0.018] − 0.97 bull mild bear RS3L mild bear bull mild bear −0.29 − 3.72 − strong bear bull mild bear −63.27 − 0.18 − − − − − − − − − − − − − − − − − − − − − − − − − − −1.14 [−0.207] 0.02 0.96 < 0.01 0.24 −0.43 [−0.031] 1.14 [0.014] − [−0.014] 0.08 0.99 0.01 This table shows the estimated coefficients and marginal effects of the predicting variables, when they are linked to the transition probabilities by (multinomial) logit models. The predicting variables have been standardized by subtracting their full-sample mean and dividing by their full-sample standard deviation. In the approaches of LT and PS, we first apply the algorithms to identify bullish and bearish periods. In the second step we estimate a Markovian logit model as in (1), where the coefficients depend on the departure state. In the two-state regime switching model, RS2L, the logistic transformation in (6) is used to link the predicting variables to the transition probabilities. For the threestate regime switching model, RS3L, the multinomial logistic transformation in (5) is used. In that case the coefficients for a switch to the strong bear regime have been fixed at zero. The variable-transition combinations that subsequently produce the biggest increase in the likelihood function are included in the models. The procedure stops when the remaining variable-transition combinations fail to produce an increase in the likelihood function that is significant on the 10%-level. For the RS3L-model we allow a maximum of four combinations to be included. The marginal effects in brackets are calculated for the average forecast probability π ¯sq reported in the last row of the table. The average forecast probability is calculated as in (28). For the two-state approaches, the marginal effect of variable i is calculated as π ¯sq (1 − π ¯sq )βqi . For the three-state approaches, the marginal effect is given by (5). Table C.3: Log likelihood values and information criteria of different model specifications constant time-varying LR df Pr(0.01) # parameters log L AIC BIC # parameters log L AIC BIC LT PS RS2 RS3 2 -170.12 0.119 0.123 6 -153.82 0.110 0.123 32.60 4 13.28 2 -192.61 0.134 0.139 8 -168.63 0.122 0.139 47.96 6 16.81 7 -5979.75 4.136 4.150 11 -5970.87 4.133 4.155 17.78 4 13.28 14 -5927.01 4.104 4.133 18 -5916.96 4.100 4.137 20.09 4 13.28 This table shows the log likelihood values of the different models. For the rules-based approaches LT and PS, we report the log likelihood values of the Markovian logit models as in (1). For the regime switching models with two or three states, we report the likelihood of the complete model. The transition probabilities can be constant (corresponding with Table C.1) or time-varying (corresponding with Table C.2). Based on the log likelihood values we calculate the Akaike and Bayesian Information Criterion (AIC and BIC). In the row labeled “LR” we report the likelihood ratio statistic for time-varying vs. constant transition probabilities, which has a χ2 distribution with degrees of freedom listed in the row below. The last row reports 1% critical values of the corresponding χ2 distribution. 68 D Additional Results on Predictions In the experiment to compare the predictions of the various methods, we reestimate the parameters of the different models every 52 weeks. Here we show how the different parameters evolve. Besides aiding our understanding of the predictions, it also shows how robust the parameters and characteristics are when more information becomes available. Figure D.1 shows the evolution of the means and volatilities of the different regimes. In the rules-based approaches, we first do the identification, and estimate means and volatilities based on that. In the regime-switching approaches, we estimate means and volatilities directly. Generally, we see that the means and volatilities are stable. As we concluded in the full-sample analysis, the difference between the mean for the bull and for the bear regime is more extreme for the rules-based approaches, while the difference between the volatilities for these two regimes is more extreme for the RS2-models. [Figure D.1 about here.] The evolution of the means and volatilities of the regimes in the RS3-models in Figures D.1e and D.1f indicates the influence of the credit crisis. When data until May 2008 is used, two bullish and one bearish regime are identified. One regime has strongly bullish characteristics with a high mean and a low volatility, while the other is more mild, with a mean slightly above zero, and a higher volatility. The characteristics of the bearish regime match quite well with those of the bearish regime in the RS2-models. After 2008, one bullish and two bearish regimes show up. One bearish regime has mild characteristics, while the other has much stronger bearish features. It stresses the exceptional behavior of the US stock market during the credit crisis. The evolution of the transition probabilities when assumed constant within an estimation window are in Figure D.2. The methods with two states produce transition probabilities that are also quite stable over time. Persistence is high for both regimes. The probability of remaining in a bull state never falls below 0.95. For the rules-based approaches, the same applies to the bear regime. In the RS2C model, a bear market seems slightly less persistent, but the probability of continuation almost always exceeds 0.90. [Figure D.2 about here.] 69 The transition probabilities in the RS3C-model vary a bit more than in the two-state models, in particular the probabilities for a switch to another regime. For example, the probability of a switch from a (strongly) bearish to a mildly bullish/bearish regime ranges from 0.04 to 0.11. The probabilities that a specific regime continues are high (around 0.90 or more) and quite stable. The parameter dynamics of the models for time-varying transition probabilities in Figure D.3 indicate whether the parameters are stable, but also whether the variable selection is stable. In the rules-based method, the selection and the parameters are quite stable. The D/P-ratio is consistently present for all switches in both the LT and PS-methods. The T-bill rate is selected for predictions from the bullish regime in both models. In the PS-method, the inflation rate, the unemployment rate and the term spread also show up consistently. In the LT-approach, these are sometimes selected. In the RS-models the selection shows a more haphazard pattern. However, given that a variable is selected, the corresponding coefficient is stable. [Figure D.3 about here.] [Figure D.3 (continued) about here.] 70 Figure D.1: Evolution of means and volatilities per regime LT PS RS2C RS2L LT PS RS2C RS2L 0 0.5 -0.1 0.4 -0.2 0.3 -0.3 -0.4 0.2 -0.5 0.1 -0.6 L bull C mild L mild C (strong) bear 9 7 6 5 4 3 2 1 0 9 8 7 8 ju n0 ju n0 ju n0 ju n0 ju n0 ju n0 ju n0 ju n0 ju n0 ju n0 ju n9 ju n9 5 4 3 2 1 0 9 8 7 6 ju n9 ju n9 ju n9 ju n9 ju n9 ju n9 ju n9 ju n9 ju n8 5 4 6 ju n8 ju n8 9 8 ju n0 7 ju n0 6 ju n0 5 ju n0 ju n0 4 ju n0 3 2 ju n0 1 ju n0 0 ju n0 9 ju n0 8 7 RS2L ju n9 ju n9 6 RS2C ju n9 5 ju n9 4 PS ju n9 ju n9 3 ju n9 2 1 ju n9 0 ju n9 9 ju n9 8 ju n8 ju n8 ju n8 ju n8 ju n8 ju n8 ju n8 (c) volatility of bull regimes in two-state models 7 1 ju n8 1 6 1.5 ju n8 1.2 5 2 3 1.4 7 ju n88 ju n89 ju n90 ju n91 ju n92 ju n93 ju n94 ju n95 ju n96 ju n97 ju n98 ju n99 ju n00 ju n01 ju n02 ju n03 ju n04 ju n05 ju n06 ju n07 ju n08 ju n09 2.5 6 1.6 5 3 4 1.8 3 3.5 C bull ju n8 3 LT RS2L ju n8 RS2C 2 ju n8 PS (b) mean of bear regimes in two-state models ju n8 LT ju n8 ju n8 ju n8 (a) mean of bull regimes in two-state models 4 6 5 7 ju n88 ju n89 ju n90 ju n91 ju n92 ju n93 ju n94 ju n95 ju n96 ju n97 ju n98 ju n99 ju n00 ju n01 ju n02 ju n03 ju n04 ju n05 ju n06 ju n07 ju n08 ju n09 ju n8 ju n8 ju n8 ju n8 ju n8 4 -0.7 3 0 (d) volatility of bear regimes in two-state models L (strong) bear C bull 0.6 L bull C mild L mild C (strong) bear L (strong) bear 6 0.4 5 0.2 4 0 -0.2 3 -0.4 2 -0.6 1 -0.8 9 8 ju n0 6 7 ju n0 ju n0 5 ju n0 ju n0 4 2 3 ju n0 ju n0 1 ju n0 9 0 ju n0 ju n0 7 8 ju n9 ju n9 5 6 ju n9 ju n9 4 ju n9 3 ju n9 1 2 ju n9 ju n9 0 ju n9 9 ju n9 7 8 ju n8 ju n8 6 ju n8 5 ju n8 ju n8 4 3 (e) mean of regimes in RS3-models ju n8 ju n8 7 ju n88 ju n89 ju n90 ju n91 ju n92 ju n93 ju n94 ju n95 ju n96 ju n97 ju n98 ju n99 ju n00 ju n01 ju n02 ju n03 ju n04 ju n05 ju n06 ju n07 ju n08 ju n09 ju n8 6 ju n8 5 ju n8 ju n8 ju n8 4 0 3 -1 (f) volatility of regimes in RS3-models This figure shows the evolution of the means and volatilities when estimated with an expanding window (end date on the x-axis). The first window comprises the period January 7, 1955 – June 24, 1983 (1485 observations), and is continuously expanded with 52 weeks until we reach the end of the sample (July 2, 2010). In the approaches of LT and PS, we first apply their algorithms to identify the sequences of bull and bear markets for each estimation window. As a second step we calculate means and volatilities per regime. The regime switching models can either have two regimes or three regimes, and constant or time-varying transition probabilities. The means and volatilities of the regimes follow directly from the estimation. 71 Figure D.2: Evolution of constant transition probabilities LTC bull LTC bear PS bull PS bear RS2C bull RS2C bear 1 0.99 0.98 0.97 0.96 0.95 0.94 0.93 0.92 0.91 jun-09 jun-08 jun-07 jun-06 jun-05 jun-04 jun-03 jun-02 jun-01 jun-00 jun-99 jun-98 jun-97 jun-96 jun-95 jun-94 jun-93 jun-92 jun-91 jun-90 jun-89 jun-88 jun-87 jun-86 jun-85 jun-84 jun-83 0.9 (a) 2-state models bull bull bull mild mild mild mild (strong) bear (strong) bear (strong) bear (strong) bear bull bull mild (strong) bear mild 1 0.12 0.98 0.1 0.96 0.94 0.08 0.92 0.06 0.9 0.88 0.04 0.86 0.02 0.84 jun-09 jun-08 jun-07 jun-06 jun-05 jun-04 jun-03 jun-02 jun-01 jun-00 jun-99 jun-98 jun-97 jun-96 jun-95 jun-94 jun-93 jun-92 jun-91 jun-90 jun-89 jun-88 jun-87 jun-86 jun-85 jun-84 0 jun-83 0.82 (b) RS3C This figure shows the evolution of the transition probabilities when estimated with an expanding window (end date on the x-axis). We assume that the probabilities are constant within each estimation window. The first window comprises the period January 7, 1955 – June 24, 1983 (1485 observations), and is continuously expanded with 52 weeks until we reach the end of the sample (July 2, 2010). In the approaches of LT and PS, we first apply their algorithms to identify the sequences of bull and bear markets for each estimation window. As a second step we estimate the probabilities. For the regime switching models the probabilities result directly from the estimation. The regime switching models can either have two regimes (RS2C) or three regimes (RS3C). For the methods with two states, we plot the probabilities of a bull-bull and a bear-bear switch in Panel (a). For the RS3C-model in Panel (b) we indicate the transition in the legend above the subfigure. Dashed lines correspond with the secondary y-axis. We do not show transition probabilities that never exceed 0.001. 72 -1.5 -1 -0.5 0 0.5 1 1.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 jun-83 jun-84 jun-84 jun-85 jun-85 jun-85 jun-86 jun-86 jun-86 jun-87 jun-87 jun-87 jun-88 jun-88 jun-88 jun-89 jun-89 jun-89 jun-90 jun-90 jun-90 jun-91 jun-92 jun-93 jun-94 jun-96 jun-97 jun-98 jun-99 jun-00 jun-01 jun-02 jun-03 jun-04 jun-05 jun-06 jun-91 jun-92 jun-93 jun-94 jun-95 jun-96 jun-97 jun-98 jun-99 jun-00 jun-01 jun-02 jun-03 jun-04 jun-05 jun-06 -1 1.5 -2 jun-83 jun-84 -1.5 -1 -0.5 0 jun-03 jun-04 jun-05 jun-06 0.5 1 1.5 -0.5 0 jun-84 jun-85 jun-86 jun-86 jun-87 jun-87 jun-88 jun-88 jun-89 jun-89 jun-85 jun-86 jun-87 jun-88 jun-91 jun-92 jun-93 jun-94 jun-95 jun-96 jun-97 jun-98 jun-99 jun-00 jun-01 jun-02 jun-03 jun-04 jun-05 jun-06 jun-90 (b) LT, from bear to bull (d) PS, from bear to bull (f) RS2L, from bear to bull jun-89 jun-90 jun-91 jun-92 jun-93 jun-94 jun-95 jun-96 jun-97 jun-98 jun-99 jun-00 jun-01 jun-02 jun-03 jun-04 jun-05 jun-06 jun-07 jun-07 jun-08 jun-08 jun-09 jun-09 1.5 jun-02 jun-83 jun-85 1 jun-01 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio 1 0.5 jun-00 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio -1.5 jun-99 jun-09 73 -2 jun-98 jun-08 0.5 1.5 jun-97 jun-07 0 1 jun-96 jun-09 -0.5 0.5 jun-95 jun-08 jun-84 0 jun-94 jun-09 jun-83 -0.5 jun-93 jun-08 -1 -1 jun-92 jun-07 -1.5 -1.5 jun-91 jun-07 -2 -2 jun-90 jun-91 jun-92 jun-93 jun-94 jun-95 jun-96 jun-97 jun-98 jun-99 jun-00 jun-01 jun-02 jun-03 jun-04 jun-05 jun-06 jun-07 jun-08 jun-09 Figure D.3: Evolution of parameters in logit models jun-95 (a) LT, from bull to bull jun-83 jun-84 (c) PS, from bull to bull (e) RS2L, from bull to bull Figure note on next page. -2 jun-83 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio 3 2 1 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio -1 -2 jun-09 jun-08 jun-07 jun-06 jun-05 jun-04 jun-03 jun-02 jun-01 jun-00 jun-99 jun-98 jun-97 jun-96 jun-95 jun-94 jun-93 jun-92 jun-91 jun-90 jun-89 jun-88 2 1 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio -1 -2 jun-09 jun-08 jun-07 jun-06 jun-05 jun-04 jun-03 jun-02 jun-01 jun-00 jun-99 jun-98 jun-97 jun-96 jun-95 jun-94 jun-93 jun-92 jun-91 jun-90 jun-89 jun-88 jun-87 jun-86 jun-85 -3 (j) RS3L, from mild bear to mild bear 1 2 3 (i) RS3L, from mild bear to bull jun-84 jun-83 jun-09 jun-08 jun-07 jun-06 jun-05 jun-04 jun-03 jun-02 jun-01 jun-00 jun-99 jun-98 jun-97 jun-96 jun-95 jun-94 jun-93 jun-92 jun-91 jun-90 jun-89 jun-88 jun-87 jun-86 jun-85 jun-84 jun-83 -3 -2 -1 0 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio 0 1 2 3 (h) RS3L, from bull to mild bear 3 (g) RS3L, from bull to bull jun-87 jun-86 jun-85 -3 jun-84 jun-83 jun-09 jun-08 jun-07 jun-06 jun-05 jun-04 jun-03 jun-02 jun-01 jun-00 jun-99 jun-98 jun-97 jun-96 jun-95 jun-94 jun-93 jun-92 jun-91 jun-90 jun-89 jun-88 jun-87 jun-86 jun-85 jun-84 jun-83 -3 -2 -1 0 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio 0 1 2 3 Figure D.3: Evolution of parameters in logit models – continued jun-09 jun-08 jun-07 jun-06 jun-05 jun-04 jun-03 jun-02 jun-01 jun-00 jun-99 jun-98 jun-97 jun-96 jun-95 jun-94 jun-93 jun-92 jun-91 jun-90 jun-89 jun-88 jun-87 jun-86 jun-85 jun-84 jun-83 -3 -2 -1 0 inflation prod. growth unempl. t-bill rate term spread credit spread D/P ratio (k) RS3L, from strong bear to mild bear This figure plots the evolution of the coefficients in the (multinomial) logit transitions for the predicting variables in Table B.1, when estimated with an expanding window (end date on the x-axis). The first window comprises the period January 7, 1955 – June 24, 1983 (1485 observations), and is continuously expanded with 52 weeks until we reach the end of the sample (July 2, 2010). The predicting variables have been standardized by subtracting their full-sample mean and dividing by their full-sample standard deviation. In the approaches of LT and PS, we first apply the algorithms to identify bullish and bearish periods in the subperiod under consideration. In the second step we estimate a Markovian logit model as in (1), where the coefficients depend on the departure state. In the RS2L-model the logistic transformation in (6) is used to link the predicting variables to the transition probabilities. In the RS3L, the multinomial logistic transformation in (5) is used. For identification, all coefficients for a switch to the (strong) bear regime have been fixed at zero. The variable-transition combinations that subsequently produce the biggest increase in the likelihood function are included in the models. The procedure stops when the remaining 74 variable-transition combinations fail to produce an increase in the likelihood function that is significant on the 10%-level. In each subfigure, we only plot the variables that have been selected at least once. E Result of Robustness Checks E.1 Robustness of the LT-method Lunde and Timmermann (2004) consider four combinations for the values of the thresholds λ1 and λ2 to identify switches between bull and bear markets. They argue that a value of 20% for λ1 is conventionally used. A lower value of 15% for λ2 subsequently accounts for the positive drift of the stock market. Other combinations they consider for (λ1 , λ2 ) are (0.20, 0.10), (0.15, 0.15) and (0.15, 0.10). Since we conclude that a quick identification of the current state is important when making predictions, we follow LT and also consider these combinations of lower thresholds. Lower thresholds lead to a more rapid alternation of bull and bear markets, that as a consequence will last briefer. When we consider identification based on the full sample, the number of cycles increases from 16 in the original (0.20, 0.15) setting to 24 for the lowest thresholds (0.15, 0.10). Lowering both thresholds has a much stronger effect than lowering only one of the thresholds. Average and median durations decrease when we work with lower threholds, though the result is less pronounced than in LT. This is due the fact that the longest bull market is unaffected by the choice of thresholds. [Table E.1 about here.] We apply the techniques of Section 3 to determine how different the identifications and predictions are that results from the LT-method with different thresholds. Table E.2a shows that the integrated absolute difference between these series is quite small. Their average difference varies between 0.011 and 0.059, while the upper bound of the 90% confidence intervals is around 0.10. Comparing these difference with those in Table 3 shows that they are slightly smaller than the difference between the standard LT and the PS identifications, and much smaller than the differences between the standard LT-method and the RS-methods. So from a statistical point of view, the type of method matters more than the specific thresholds. [Table E.2 about here.] 75 The economic comparison in Tables E.2b and E.2c shows larger differences. A faster identification of bull and bear markets leads to a considerably higher mean, rising from 48.1% to 66.3% per year, at the cost of a slight increase in the volatility from 30.3% to 35.2%. As a consequence, both the Sharpe ratio and the realized utility increase. An investor would be willing to pay large fees up to 9.67% to switch to lower thresholds. Since fees to switch from the standard (0.20, 0.15) are all positive, the fees in Table 4b may underestimate the preference of the standard LT-method over the regime switching methods. The fees to switch from (0.20, 0.15) to (0.15, 0.10) are significant, as confidence intervals are substantially bounded from zero. Only the joint lowering of thresholds commands a significant fee. However, these fees correspond with identification ex post, and it is not obvious that working with lower thresholds leads to better predictions or allocations ex ante. We turn to the predictions made with different thresholds in Table E.3. Lowering thresholds leads to a higher predictive accuracy. In particular bear markets are better predicted with lower thresholds. We conclude from these results that an exceedance of a lower threshold suffices to mark a definite switch between bull and bear markets. The lowest thresholds produce the largest hit rate and Kuipers score and the lowest IAD with the final identification. These numbers put the LT-method with lowest thresholds at par with the two state regime switching models in Table 5. [Table E.3 about here.] The IAD-measures in Table E.4 indicate that the differences between predictions based on the different threshold combinations are larger than those between identification. Logically, IADs are larger when both threshold values are different, with maxima around 0.16, meaning that 16% of the predictions are different. Confidence intervals indicate that the differences are significant. A comparison with Table 6 shows that the predictions by LTmethod with different thresholds still have more in common than the standard LT-method with the other methods, including the PS method. In line with earlier results, the use of predictive variables only leads to marginal differences in the predictions. [Table E.4 about here.] 76 Finally, we compare the performance of investment strategies that are based on the LTmethods with different thresholds in Table E.5. Lower thresholds lead to higher means, which rise from around 8% to 12%, but also to higher volatilities. Part of the increases come from a larger position in futures contracts. Judged by Sharpe ratio and realized utility, the threshold combination (0.20, 0.15) works best, while a lowering of the λ1 threshold actually leads lower utility because of the increase in volatility. The highest utility in Table E.5a is still lower that the utilities for the regime switching models in Table 7a. [Table E.5 about here.] The fees in Table E.5b indicate that the economic improvement of a strategy by one threshold combination compared to another combination is still relatively small and mostly insignificant. Changing from the standard (0.20, 0.15) combination to the (0.20, 0.10) commands a fee of 4–6% per year, but the confidence intervals are quite wide and stretch beyond zero. The only significant fees correspond with a switch from the (0.15, 0.10) combination with constant transition probabilities. The magnitude of the fees is small compared to the fees reported in Table 7. Lowering the thresholds leads to better identification but also to more risk-taking, which in the end makes a slower, more cautious strategy more attractive. Overall, we conclude that lowering the thresholds improves identification and predictions. Lower thresholds lead to better identification, both in a statistical and an economic performance. We already concluded that the rules-based methods are better suited for identification. The results for lower thresholds indicate an even larger difference. For predictions, the results are more mixed. If hits and false alarms are equally important, lower thresholds improve the performance of the LT-methods to a level comparable to the twostate regime switching models. However, the economic comparison shows that the false alarms actually become costlier, and the overall balance in terms of economic performance is negative. Nonetheless, lowering the threshold λ2 makes sense as a switch to this setting commands a positive fee and leads to higher utility. Since utility is still considerably lower than the level produced by the regime-switching models, these models lead to predictions that are economically more valuable. 77 E.2 Robustness of the PS-method Pagan and Sossounov (2003) base their algorithm on the algorithm for business cycle identification of Bry and Boschan (1971), and adjust some of the original settings based on early literature on bull and bear markets, the so-called Dow theory after Charles Dow. They do not investigate how robust their findings are to changes in these settings. We conduct a small robustness investigation, where we change one parameter at a time. In comparison with the LT-method, the PS-method identifies a few more cycles, so we consider only a relaxation of the constraint on cycle length, which we lower to 52 weeks instead of 70. For the constraint on the length of a phase, we consider a lower value of 12 weeks and a higher value of 20 weeks. A higher value may lead to less false alarms. The bound on price change that must be crossed to overrule the phase constraint is currently at 20%. As in the robustness checks for the LT-method, we consider a value of 15%. Censoring (currently 13 weeks) will not influence the identification much, but may have considerable impact on predictions. We consider a lower value of 7 weeks and a higher value of 26 weeks, which corresponds with the setting of PS. We find that the identification is quite robust to these changes. In the basic setting, we identify 18 bull and 18 bear markets (see Table 1). The results do not change at all, when we change the phase constraint or the price change bound. Censoring 7 weeks at the beginning and end leads to one extra switch to a bear market at the end of of the sample. Censoring 26 weeks does not lead to different results. Lowering the cycle to 52 weeks leads to 19 bull and 19 bear markets. Because results change at most slightly, the further analyses based on identification will also change only slightly. Our conclusion that the PS-method works well for identification remains unaffected. The predictions and allocations based on the PS-method change more when the parameters are changed. A prediction for week t + 1 based on the PS-method uses only information up to time t. The identification over the subsample until week t may differ from the full-sample identification and may be more sensitive to parameter choices. This larger sensitivity then carries over to the predictions. Table E.6 show the sensitivity of predictive accuracy for the restrictions. The hit rate for bull markets changes a bit (at most 10%), while the hit rate for bear markets shows 78 more variation. However, an improvement in correctly predicting bull markets is offset by a deterioration for bear markets. The overall hit rates remain steadily between 70–75%. This result contrasts with the results for the LT-method in Table E.3, which shows more variation. [Table E.6 about here.] Table E.7 shows a further analysis of the impact of parameter changes. Censoring leads to predictions that are quite different from the predictions in the basic setting, with IADs of around 0.l7 when 7 weeks are censored and 0.22 when 26 weeks are censored. Less censoring leads to a more aggressive allocation with weights larger than two, while more censoring leads to a less aggressive allocation with weights close to one. This pattern is caused by predictions being closer to the steady state distribution of bull and bear markets, when more observations are censored. Effectively, more censoring necessitates a prediction for more steps ahead. More censoring and less aggressive investments lead to a better performance: the mean increases while the volatility decreases. Less censoring shows the opposite effect. It means that the predictions are too extreme, and that shrinking them towards their long-term average makes sense. The effect of more censoring is quite impressive with a Sharpe ratio that increases to 0.37 when predictive variables are used. This ratio is similar in magnitude to the best LT-method in the previous subsection. Moreover, utility increases compared to the basis strategies. The resulting utility is only slightly lower than the utility the results from using two-state regime switching models. Because more censoring leads to higher utility, an investor is willing to pay significant fees of around 14% per year to switch from the basic PS-method. [Table E.7 about here.] All other parameter choices do not matter much. Changing the phase constraint does not give different results. When we put the constraint at 12 or 20 weeks, the predictions are exactly the same as in the basic setting. Therefore we have not include the phase constraint in a further analysis. A relaxation of the constraints on the duration of a cycle or the price change to trigger a switch do not lead to large statistical or economic differences 79 in Table E.7. The characteristics of the investment strategies are largely similar. Consequently, fees for switching differ only a little bit from zero. The confidence intervals for the IADs and the fees encompass zero, indicating that the differences are not significant. 80 Table E.1: Number and Duration of Market Regimes for Different Thresholds in the LT-method λ1 λ2 bull bear number avg. duration med. duration std. dev. duration max. duration min. duration number avg. duration med. duration std. dev. duration max. duration min. duration 0.20 0.15 0.20 0.10 0.15 0.15 0.15 0.10 16 119 90 97 405 15 16 62 60 44 187 7 18 98 59 97 405 15 18 63 55 49 187 7 18 108 84 97 405 6 18 53 60 29 91 7 24 77 50 90 405 5 24 44 39 30 91 4 This table shows the number of spells of the different market regimes, their average and median duration, the standard deviation of the duration, and its maximum and minimum for different thresholds λ1 and λ2. 81 Table E.2: Statistical and Economic Comparison of the Identification by the LTmethod for Different Thresholds (a) Integrated Absolute Distances (λ1 , λ2 ) (0.20, 0.15) (0.20, 0.10) (0.15, 0.15) (0.15, 0.10) 0.047 [0.017, 0.102] 0.011 [0.000, 0.059] 0.059 [0.031, 0.126] 0.045 [0.034, 0.112] 0.038 [0.014, 0.087] 0.033 [0.017, 0.089] (0.20, 0.10) (0.15, 0.15) (b) Performance Measures (λ1 , λ2 ) Mean Volatility Sharpe ratio Utility (0.20, 0.15) (0.20, 0.10) (0.15, 0.15) (0.15, 0.10) 48.1 30.3 1.59 0.241 51.8 31.4 1.65 0.259 52.8 31.7 1.67 0.264 66.3 35.2 1.88 0.331 (c) Fees (λ1 , λ2 ) (0.20, 0.15) (0.20, 0.10) (0.15, 0.15) (0.15, 0.10) (0.20, 0.15) 0 (0.20, 0.10) 1.94 [0.41, 7.62] 2.49 [0.00, 6.77] 9.67 [5.92, 17.09] -1.94 [-7.57, -0.41] 0 -2.48 [-6.73, 0.00] -0.54 [-4.45, 4.98] 0 -9.59 [-16.83, -5.89] -7.66 [-12.61, -3.78] -7.12 [-13.66, -3.36] 0 (0.15, 0.15) (0.15, 0.10) 0.54 [-4.96, 4.47] 7.72 [3.80, 12.76] 7.17 [3.37, 13.83] This table shows the statistical and economic comparison of the identification that results from the LTmethod when different thresholds λ1 and λ2 are used. The statistics are calculated as in Tables 3 and 4. 82 Table E.3: Predictive Accuracy based on the LT-method with Different Thresholds (λ1 , λ2 ) Transition Bull correct Bull wrong % Bull correct Bear correct Bear wrong % Bear correct Total correct Total false % Correct Default bull Improvement Kuipers Score IAD (0.20, 0.15) C L 852 214 79.9% 150 194 43.6% 1002 408 71.1% 75.6% -4.5% 23.5% 0.194 851 215 79.8% 132 212 38.4% 983 427 69.7% 75.6% -5.9% 18.2% 0.222 (0.20, 0.10) C L 816 214 79.2% 263 117 69.2% 1079 331 76.5% 73.0% 3.5% 48.4% 0.153 819 211 79.5% 249 131 65.5% 1068 342 75.7% 73.0% 2.7% 45.0% 0.171 (0.15, 0.15) C L 983 89 91.7% 137 201 40.5% 1120 290 79.4% 76.0% 3.4% 32.2% 0.160 977 95 91.1% 137 201 40.5% 1114 296 79.0% 76.0% 3.0% 31.7% 0.159 (0.15, 0.10) C L 934 111 89.4% 232 133 63.6% 1166 244 82.7% 74.1% 8.6% 52.9% 0.132 941 104 90.0% 232 133 63.6% 1173 237 83.2% 74.1% 9.1% 53.6% 0.127 See Table 5 for explanation. A C in the row transition means that transition probabilities are constant, and an L means they are modelled with the Markovian logit-model. 83 Table E.4: Integrated Absolute Difference of Predictions based on the LT-method with Different Settings (0.20, 0.15, L) (0.20, 0.10, C) (0.20, 0.10, L) (0.15, 0.15, C) 84 (0.15, 0.15, L) (0.15, 0.10, C) (0.15, 0.10, L) (0.20, 0.15, C) (0.20, 0.15, L) (0.20, 0.10, C) (0.20, 0.10, L (0.15, 0.15, C) (0.15, 0.15, L) (0.15, 0.10, C) 0.060 [0.057, 0.110] 0.067 [0.035, 0.136] 0.116 [0.096, 0.185] 0.113 [0.096, 0.206] 0.098 [0.073, 0.181] 0.063 [0.041, 0.105] 0.067 [0.032, 0.099] 0.113 [0.083, 0.162] 0.128 [0.094, 0.204] 0.163 [0.121, 0.247] 0.106 [0.097, 0.166] 0.077 [0.038, 0.119] 0.161 [0.142, 0.242] 0.155 [0.108, 0.213] 0.052 [0.056, 0.104] 0.127 [0.092, 0.180] 0.165 [0.122, 0.207] 0.080 [0.052, 0.116] 0.121 [0.089, 0.161] 0.082 [0.055, 0.134] 0.118 [0.107, 0.181] 0.150 [0.139, 0.239] 0.162 [0.119, 0.217] 0.116 [0.101, 0.172] 0.101 [0.065, 0.135] 0.108 [0.104, 0.192] 0.109 [0.086, 0.165] 0.054 [0.055, 0.093] This table reports the integrated absolute distance between the predictions of the LT-method constructed with different thresholds and with constant or time-varying transition probabilities. The settings are reported in the headings in parentheses, where the first two elements indicate the values for λ1 and λ2 and the third element indicates constant (C) or time-varying (L) transition probabilities. See Table 6 for further explanation. Table E.5: Performance Measures and Fees when Predictions are based on the LT-method with Different Settings (a) Performance Measures (λ1 , λ2 ) Transition Av. Abs. Weight Mean Mean Bull Mean Bear Volatility Sharpe Ratio Utility (0.20, 0.15) C L 1.82 1.90 8.73 7.13 8.96 10.30 8.01 -2.69 32.4 31.9 0.27 0.22 -0.176 -0.184 (0.20, C 1.91 13.33 12.28 16.17 32.8 0.41 -0.136 0.10) L 2.04 14.10 15.19 11.15 32.7 0.43 -0.127 (0.15, C 1.91 9.18 10.31 5.60 33.8 0.27 -0.194 0.15) L 1.96 10.36 13.90 -0.89 33.3 0.31 -0.175 (0.15, C 2.15 12.04 9.87 18.27 37.7 0.32 -0.235 0.10) L 2.26 14.36 14.07 15.17 36.8 0.39 -0.195 (b) Fees (λ1 , λ2 ) Transition (0.20, 0.15) C 85 (0.20, 0.15, C) 0 (0.20, 0.15, L) -0.81 [-1.86, 5.17] 3.99 [-3.22, 5.89] 4.95 [-2.45, 8.72] -1.77 [-7.97, 1.73] 0.10 [-3.82, 5.70] -5.98 [-16.21, -1.67] -1.94 [-10.99, 3.77] (0.20, 0.10, C) (0.20, 0.10, L) (0.15, 0.15, C) (0.15, 0.15, L) (0.15, 0.10, C) (0.15, 0.10, L) L 0.82 [-5.18, 1.86] 0 4.81 [-6.90, 5.82] 5.76 [-3.92, 6.90] -0.95 [-11.89, 2.08] 0.92 [-5.39, 2.73] -5.16 [-20.99, -1.83] -1.12 [-13.78, 2.05] (0.20, 0.10) C -3.99 [-5.89, 3.21] -4.80 [-5.82, 6.85] 0 0.96 [-0.87, 5.55] -5.77 [-11.01, 2.87] -3.89 [-7.10, 6.20] -10.00 [-16.56, -2.85] -5.96 [-11.60, 1.66] L -4.94 [-8.76, 2.44] -5.76 [-6.88, 3.91] -0.96 [-5.58, 0.87] 0 -6.73 [-14.89, 2.38] -4.85 [-8.77, 4.25] -10.97 [-19.60, -4.53] -6.92 [-13.61, -1.47] (0.15, 0.15) C 1.77 [-1.72, 7.94] 0.95 [-2.08, 11.81] 5.76 [-2.87, 11.00] 6.71 [-2.38, 14.75] 0 1.87 [-1.06, 8.79] -4.20 [-13.25, 0.72] -0.16 [-9.42, 8.95] L -0.10 [-5.71, 3.81] -0.92 [-2.73, 5.36] 3.89 [-6.22, 7.11] 4.84 [-4.25, 8.74] -1.87 [-8.86, 1.06] 0 -6.08 [-18.93, -1.98] -2.04 [-12.26, 3.14] (0.15, 0.10) C 5.93 [1.66, 16.00] 5.11 [1.81, 20.64] 9.92 [2.83, 16.35] 10.87 [4.50, 19.26] 4.17 [-0.71, 13.15] 6.04 [1.97, 18.68] 0 4.03 [0.39, 11.06] L 1.93 [-3.76, 10.90] 1.11 [-2.03, 13.67] 5.92 [-1.66, 11.50] 6.87 [1.46, 13.52] 0.16 [-8.96, 9.36] 2.03 [-3.13, 12.16] -4.03 [-11.13, -0.39] 0 See Table 7 for explanation. The different settings in the LT-method are indicated by the values for λ1 and λ2 and the choice for transition probabilities, which can be constant (C) or time-varying (L). Table E.6: Predictive Accuracy of the PS-method with Different Settings Setting Transition Bull correct Bull wrong % Bull correct Bear correct Bear wrong % Bear correct Total correct Total false % Correct Default bull Improvement Kuipers Score IAD basic C 768 193 79.9% 281 168 62.6% 1049 361 74.4% 68.2% 6.2% 42.5% 0.177 L 779 182 81.1% 261 188 58.1% 1040 370 73.8% 68.2% 5.6% 39.2% 0.197 censoring: 7 C L 706 245 74.2% 295 164 64.3% 1001 409 71.0% 67.4% 3.5% 38.5% 0.239 714 237 75.1% 281 178 61.2% 995 415 70.6% 67.4% 3.1% 36.3% 0.262 censoring: 26 C L 760 201 79.1% 235 214 52.3% 995 415 70.6% 68.2% 2.4% 31.4% 0.144 792 169 82.4% 207 242 46.1% 999 411 70.9% 68.2% 2.7% 28.5% 0.161 cycle: 52 C L 768 193 79.9% 261 188 58.1% 1029 381 73.0% 68.2% 4.8% 38.0% 0.194 874 87 90.9% 145 304 32.3% 1019 391 72.3% 68.2% 4.1% 23.2% 0.210 change: 15% C L 770 191 80.1% 281 168 62.6% 1051 359 74.5% 68.2% 6.4% 42.7% 0.176 781 180 81.3% 261 188 58.1% 1042 368 73.9% 68.2% 5.7% 39.4% 0.196 This table show the predictive accuracy of the PS-method when one parameter is changed. The basis setting is used throughout the paper. It censors 13 weeks at the beginning and end. It requires cycles to be longer than 70 weeks. Phases should be longer than 16 weeks unless the price changes with more then 20% since the last peak or trough. The column headings indicate which value is used for a parameter. All other parameters are set to their basic value. Transition probabilities can be constant (indicated by a C) or time-varying (indicated by an L). See Table 5 for further explanation. 86 Table E.7: Statistical and Economic Comparison of Predictions based on the PS-method with Different Settings Setting 87 Basic, C Basic, L Censoring: 7, C Censoring: 7, L Censoring: 26, C Censoring: 26, L Cycle: 52, C Cycle: 52, L Change: 15%, C Change: 15%, L IAD 0 0 0.175 0.169 0.212 0.222 0.009 0.028 0.001 0.001 [0.158, [0.153, [0.186, [0.181, 0.188] 0.185] 0.228] 0.243] [0.000, 0.023] [0.000, 0.029] Av. Abs. Weight Mean Mean Mean Bull Mean Bear Vol. Sharpe Ratio Utility 1.70 1.87 2.01 2.13 1.20 1.42 1.71 1.81 1.70 1.87 1.48 2.26 -5.32 -3.84 5.33 9.05 1.57 0.48 2.10 3.08 0.97 5.61 -9.70 -5.65 7.67 16.04 1.38 4.07 1.89 6.82 2.56 -4.91 3.74 -0.07 0.33 -5.91 1.98 -7.21 2.56 -4.91 28.0 30.1 32.9 34.4 19.7 24.6 28.0 29.7 28.0 30.1 0.05 0.08 -0.16 -0.11 0.27 0.37 0.06 0.02 0.08 0.10 -0.181 -0.203 -0.324 -0.335 -0.043 -0.061 -0.181 -0.216 -0.175 -0.195 Fee 0 0 -14.31 -13.19 13.73 14.22 0.02 -1.25 0.62 0.82 [-22.54, -3.91] [-22.23, -1.95] [6.22, 19.57] [1.91, 16.03] [-0.16, 2.03] [-0.21, 2.65] The column IAD gives the Integrated Absolute Distance between the predictions based on a specific setting and the basic setting for the PS-method. The next columns subsequently report performance measures of investment strategies that use the different predictions as input for an asset allocation as discussed in Section 3.2. We report the average absolute weight of each strategy. Based on the realized returns of the asset allocations, we calculate the mean and volatility (in % per year), the yearly Sharpe ratio, and the annualized realized utility as in Eq. (13). Mean Bull (Mean Bear) is the annualized mean during the subperiods identified ex post as bull (bear) markets. The column fee reports the maximum fee that an investor is willing to pay to switch from a basic setting to another setting in the PS-method. We express the fees in % per year. When calculating the IAD or the fee, the setting for the transition probabilities (constant or time-varying) in the basic strategy corresponds with the setting in the alternative method. We report the 5% and 95% percentiles between brackets for the IAD and the fee based on the 200 bootstraps of both the return and the explanatory variable series following Politis and Romano (1994). We do not report confidence intervals for the different cycle restrictions, as changing cycle minima cannot be combined with a stationary bootstrap.
© Copyright 2024