Econometrica, Vol. 68, No. 3 (May, 2000), 575–603

SAMPLE SPLITTING AND THRESHOLD ESTIMATION

BY BRUCE E. HANSEN¹

Threshold models have a wide variety of applications in economics. Direct applications include models of separating and multiple equilibria. Other applications include empirical sample splitting when the sample split is based on a continuously-distributed variable such as firm size. In addition, threshold models may be used as a parsimonious strategy for nonparametric function estimation. For example, the threshold autoregressive (TAR) model is popular in the nonlinear time series literature. Threshold models also emerge as special cases of more complex statistical frameworks, such as mixture models, switching models, Markov switching models, and smooth transition threshold models. It may be important to understand the statistical properties of threshold models as a preliminary step in the development of statistical tools to handle these more complicated structures.

Despite the large number of potential applications, the statistical theory of threshold estimation is undeveloped. It is known that threshold estimates are super-consistent, but a distribution theory useful for testing and inference has yet to be provided.

This paper develops a statistical theory for threshold estimation in the regression context. We allow for either cross-section or time series observations. Least squares estimation of the regression parameters is considered. An asymptotic distribution theory for the regression estimates (the threshold and the regression slopes) is developed. It is found that the distribution of the threshold estimate is nonstandard. A method to construct asymptotic confidence intervals is developed by inverting the likelihood ratio statistic. It is shown that this yields asymptotically conservative confidence regions. Monte Carlo simulations are presented to assess the accuracy of the asymptotic approximations. The empirical relevance of the theory is illustrated through an application to the multiple equilibria growth model of Durlauf and Johnson (1995).

KEYWORDS: Confidence intervals, nonlinear regression, growth regressions, regime shifts.

1. INTRODUCTION

A ROUTINE PART OF AN EMPIRICAL ANALYSIS of a regression model such as y_i = β'x_i + e_i is to see if the regression coefficients are stable when the model is estimated on appropriately selected subsamples. Sometimes the subsamples are selected on categorical variables, such as gender, but in other cases the subsamples are selected based on continuous variables, such as firm size. In the latter case, some decision must be made concerning what is the appropriate threshold (i.e., how big must a firm be to be categorized as "large") at which to split the sample. When this value is unknown, some method must be employed in its selection.

¹ This research was supported by a grant from the National Science Foundation and an Alfred P. Sloan Foundation Research Fellowship. Thanks go to Robert de Jong and James MacKinnon for insightful comments and Mehmet Caner and Metin Celebi for excellent research assistance. Special thanks go to two diligent referees whose careful readings of multiple drafts of this paper have eliminated several serious errors.

Such practices can be formally treated as a special case of the threshold regression model. These take the form

(1)    y_i = θ_1'x_i + e_i,    q_i ≤ γ,

(2)    y_i = θ_2'x_i + e_i,    q_i > γ,

where q_i may be called the threshold variable, and is used to split the sample into two groups, which we may call "classes" or "regimes," depending on the context. The random variable e_i is a regression error.

Formal threshold models arise in the econometrics literature. One example is the Threshold Autoregressive (TAR) model of Tong (1983, 1990), recently explored for U.S. GNP by Potter (1995). In Potter's model, y_i is GNP growth and x_i and q_i are lagged GNP growth rates. The idea is to allow important nonlinearities in the conditional expectation function without over-parameterization. From a different perspective, Durlauf and Johnson (1995) argue that models with multiple equilibria can give rise to threshold effects of the form given in model (1)–(2). In their formulation, the regression is a standard Barro-styled cross-country growth equation, but the sample is split into two groups, depending on whether the initial endowment is above a specific threshold.

The primary purpose of this paper is to derive a useful asymptotic approximation to the distribution of the least-squares estimate γ̂ of the threshold parameter γ. The only previous attempt (of which I am aware) is Chan (1993), who derives the asymptotic distribution of γ̂ for the TAR model. Chan finds that n(γ̂ − γ_0) converges in distribution to a functional of a compound Poisson process. Unfortunately, his representation depends upon a host of nuisance parameters, including the marginal distribution of x_i and all the regression coefficients. Hence, this theory does not yield a practical method to construct confidence intervals.

We take a different approach, taking a suggestion from the change-point literature, which considers an analog of model (1)–(2) with q_i = i. Let δ_n = θ_2 − θ_1 denote the "threshold effect." The proposed solution is to let δ_n → 0 as n → ∞. (We will hold θ_2 fixed and thereby make θ_1 approach θ_2 as n → ∞.) Under this assumption, it has been found (see Picard (1985) and Bai (1997)) that the asymptotic distribution of the change-point estimate is nonstandard yet free of nuisance parameters (other than a scale effect). Interestingly, we find in the threshold model that the asymptotic distribution of the threshold estimate γ̂ is of the same form as that found for the change-point model, although the scale factor is different.

The change-point literature has confined attention to the sampling distribution of the threshold estimate. We refocus attention on test statistics, and are the first to study likelihood ratio tests for the threshold parameter. We find that the likelihood ratio test is asymptotically pivotal when δ_n decreases with sample size, and that this asymptotic distribution is an upper bound on the asymptotic distribution for the case that δ_n does not decrease with sample size. This allows us to construct asymptotically valid confidence intervals for the threshold based on inverting the likelihood ratio statistic. This method is easy to apply in empirical work. A GAUSS program that computes the estimators and test statistics is available on request from the author or from his Web homepage.

The paper is organized as follows. Section 2 outlines the method of least squares estimation of threshold regression models. Section 3 presents the asymptotic distribution theory for the threshold estimate and the likelihood ratio statistic for tests on the threshold parameter.
Section 4 outlines methods to construct asymptotically valid confidence intervals. Methods are presented for the threshold and for the slope coefficients. Simulation evidence is provided to assess the adequacy of the asymptotic approximations. Section 5 reports an application to the multiple equilibria growth model of Durlauf and Johnson Ž1995.. The mathematical proofs are left to an Appendix. 2. ESTIMATION n The observed sample is yi , x i , qi 4is1 , where yi and qi are real-valued and x i is an m-vector. The threshold ¨ ariable qi may be an element of x i , and is assumed to have a continuous distribution. A sample-split or threshold regression model takes the form Ž1. ] Ž2.. This model allows the regression parameters to differ depending on the value of qi . To write the model in a single equation, define the dummy variable d i Žg . s qi F g 4 where ?4 is the indicator function and set x i Žg . s x i d i Žg ., so that Ž1. ] Ž2. equal Ž3. yi s u 9 x i q dnX x i Ž g . q e i where u s u 2 . Equation Ž3. allows all of the regression parameters to switch between the regimes, but this is not essential to the analysis. The results generalize to the case where only a subset of parameters switch between regimes and to the case where some regressors only enter in one of the two regimes. To express the model in matrix notation, define the n = 1 vectors Y and e by stacking the variables yi and e i , and the n = m matrices X and Xg by stacking the vectors xXi and x i Žg .9. Then Ž3. can be written as Ž4. Y s Xu q Xg dn q e. The regression parameters are Ž u , dn , g ., and the natural estimator is least squares ŽLS.. Let Ž5. S n Ž u , d , g . s Ž Y y Xu y Xg d . 9 Ž Y y Xu y Xg d . be the sum of squared errors function. Then by definition the LS estimators uˆ, dˆ, g ˆ jointly minimize Ž5.. For this minimization, g is assumed to be restricted to a bounded set w g , g x s G . Note that the LS estimator is also the MLE when e i is iid N Ž0, s 2 .. The computationally easiest method to obtain the LS estimates is through concentration. Conditional on g , Ž4. is linear in u and dn , yielding the condi- 578 B. E. HANSEN tional OLS estimators uˆŽg . and dˆŽg . by regression of Y on XgU s w X Xg x. The concentrated sum of squared errors function is S n Ž g . s Sn Ž uˆŽ g . , dˆŽ g . , g . s Y9Y y Y 9 XgU Ž XgUX XgU . y1 XgUX Y , and g ˆ is the value that minimizes SnŽg .. Since SnŽg . takes on less than n distinct values, g ˆ can be defined uniquely as gˆ s argmin Sn Ž g . gg Gn where Gn s G l q1 , . . . , qn4 , which requires less than n function evaluations. The slope estimates can be computed via uˆs uˆŽg ˆ . and dˆs dˆŽgˆ .. If n is very large, G can be approximated by a grid. For some N - n, let qŽ j. denote the Ž jrN .th quantile of the sample q1 , . . . , qn4 , and let GN s G l qŽ1. , . . . , qŽ n.4 . Then gˆN s argming g G N SnŽg . is a good approximation to g ˆ which only requires N function evaluations. From a computational standpoint, the threshold model Ž1. ] Ž2. is quite similar to the changepoint model Žwhere the threshold variable equals time, qi s i .. Indeed, if the observed values of qi are distinct, the parameters can be estimated by sorting the data based on qi , and then applying known methods for changepoint models. When there are tied values of qi , estimation is more delicate, as the sorting operation is no longer well defined nor appropriate. From a distributional standpoint, however, the threshold model differs considerably from the changepoint model. 
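Before turning to these distributional differences, it may help to make the concentration and grid-search steps just described concrete. The author distributes a GAUSS program for this purpose; the following is only a minimal illustrative sketch in Python, with hypothetical function and argument names, and with an arbitrary 15% trimming fraction standing in for the bounded set Γ = [γ_low, γ_high].

```python
import numpy as np

def threshold_lse(y, X, q, trim=0.15, n_grid=None):
    """Concentrated least-squares estimation of the threshold regression
    y_i = theta'x_i + delta'x_i*1{q_i <= gamma} + e_i.

    For each candidate gamma, y is regressed on [X, X*1{q <= gamma}] and the
    sum of squared errors S_n(gamma) is recorded; gamma_hat minimizes it.
    Candidates are the observed values of q between the `trim` and 1-trim
    sample quantiles, a stand-in for the bounded set Gamma."""
    y, X, q = np.asarray(y, float), np.asarray(X, float), np.asarray(q, float)
    lo, hi = np.quantile(q, [trim, 1.0 - trim])
    candidates = np.unique(q[(q >= lo) & (q <= hi)])
    if n_grid is not None and candidates.size > n_grid:
        # approximate the full search by a grid of n_grid quantiles
        candidates = np.quantile(candidates, np.linspace(0.0, 1.0, n_grid))

    ssr = np.empty(candidates.size)
    coefs = []
    for j, gamma in enumerate(candidates):
        Z = np.hstack([X, X * (q <= gamma)[:, None]])   # [x_i, x_i(gamma)]
        b, *_ = np.linalg.lstsq(Z, y, rcond=None)
        e = y - Z @ b
        ssr[j] = e @ e                                   # S_n(gamma)
        coefs.append(b)

    j_min = int(np.argmin(ssr))
    gamma_hat = candidates[j_min]
    m = X.shape[1]
    theta_hat = coefs[j_min][:m]    # slopes in the regime q_i > gamma_hat
    delta_hat = coefs[j_min][m:]    # shift in slopes when q_i <= gamma_hat
    return gamma_hat, theta_hat, delta_hat, candidates, ssr
```

The vector of S_n(γ) values returned here is also what one would plot, or invert, to form the likelihood-ratio confidence regions discussed in Section 4.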
One way to see this is to note that if the regressors x i contain qi , as is typical in applications, then sorting the data by qi induces a trend into the regressors x i , so the threshold model is somewhat similar to a changepoint model with trended data. The presence of trends is known to alter the asymptotic distributions of changepoint tests Žsee Hansen Ž1992, 2000. and Chu and White Ž1992... More importantly, the distribution of changepoint estimates Žor the construction of confidence intervals. has not been studied for this case. Another difference is that the stochastic process R nŽg . s Ý nis1 x i e i qi F g 4 is a martingale Žin g . when qi s i Žthe changepoint model., but it is not necessarily so in the threshold model Žunless the data are independent across i .. This difference may appear minor, but it requires the use of a different set of asymptotic tools. 3. DISTRIBUTION THEORY 3.1. Assumptions Define the moment functionals Ž6. Ž7. M Ž g . s E Ž x i xXi qi F g 4 . , D Ž g . s E Ž x i xXi < qi s g . , and V Ž g . s E Ž x i xXi e i2 < qi s g . . SAMPLE SPLITTING 579 Let f Ž q . denote the density function of qi , g 0 denote the true value of g , D s DŽg 0 ., V s V Žg 0 ., f s f Žg 0 ., and Ms EŽ x i xXi .. ASSUMPTION 1: 1. Ž x i , qi , e i . is strictly stationary, ergodic and r-mixing, with r-mixing coefficients satisfying Ý`ms 1 rm1r2 - `. 2. EŽ e i < Fiy1 . s 0. 3. E < x i < 4 - ` and E < x i e i < 4 - `. 4. For all g g G , EŽ< x i < 4e i4 < qi s g . F C and EŽ< x i < 4 ¬ qi s g . F C for some C `, and f Žg . F f - `. 5. f Žg ., DŽg ., and V Žg . are continuous at g s g 0 . 6. dn s cnya with c / 0 and 0 - a - 12 . 7. c9Dc ) 0, c9Vc ) 0, and f ) 0. 8. M) M Žg . ) 0 for all g g G . Assumption 1.1 is relevant for time series applications, and is trivially satisfied for independent observations. The assumption of stationarity excludes time trends and integrated processes. The r-mixing assumption 2 controls the degree of time series dependence, and is weaker than uniform mixing, yet stronger than strong mixing. It is sufficiently flexible to embrace many nonlinear time series processes including threshold autoregressions. Indeed, Chan Ž1989. has demonstrated that a strong form of geometric ergodicity holds for TAR processes and, as discussed in the proof of Proposition 1 of Chan Ž1993., this implies rm s O Ž r m . with < r < - 1, which implies Assumption 1.1. Assumption 1.2 imposes that Ž1. ] Ž2. is a correct specification of the conditional mean. Assumptions 1.3 and 1.4 are unconditional and conditional moment bounds. Assumption 1.5 requires the threshold variable to have a continuous distribution, and essentially requires the conditional variance EŽ e i2 < qi s g . to be continuous at g 0 , which excludes regime-dependent heteroskedasticity. Assumption 1.6 may appear unusual. It specifies that the difference in regression slopes gets small as the sample size increases. Conceptually, this implies that we are taking an asymptotic approximation valid for small values of dn . The parameter a controls the rate at which dn decreases to zero, i.e., how small we are forcing dn to be. Smaller values of a are thus less restrictive. The reason for Assumption 1.6 is that Chan Ž1993. found that with dn fixed, nŽg ˆy g 0 . converged to an asymptotic distribution that was dependent upon nuisance parameters and thus not particularly useful for calculation of confidence sets. The difficulty is due to the Op Ž ny1 . rate of convergence. 
By letting dn tend towards zero, we reduce the rate of convergence and find a simpler asymptotic distribution. Assumption 1.7 is a full-rank condition needed to have nondegenerate asymptotic distributions. While the restriction c9Dc ) 0 might appear innocuous, it 2 For a definition, see Ibragimov Ž1975. and Peligrad Ž1982.. 580 B. E. HANSEN excludes the interesting special case of a ‘‘continuous threshold’’ model, which is Ž1. ] Ž2. with x i s Ž1 qi .9 and dnX g 0U s 0 where g 0U s Ž1 g 0 .9. In this case the conditional mean takes the form of a continuous linear spline. From definition Ž7. we can calculate for this model that c9Dc s c9EŽ x i xXi < qi s g 0 . c s c9g 0Ug 0UX c s 0. A recent paper that explores the asymptotic distribution of the least squares estimates in this model is Chan and Tsay Ž1998.. Assumption 1.8 is a conventional full-rank condition which excludes multicollinearity. Note that this assumption restricts G to a proper subset of the support of qi . This is a technical condition which simplifies our consistency proof. 3.2. Asymptotic Distribution A two-sided Brownian motion W Ž r . on the real line is defined as ¡W Žyr . , ¢W Ž r . , W Ž r . s~0, 1 2 r - 0, r s 0, r ) 0, where W1Ž r . and W2 Ž r . are independent standard Brownian motions on w0, `.. THEOREM 1: Under Assumption 1, n1y 2 a Žg ˆy g 0 . ªd v T, where c9Vc vs Ž c9Dc . 2 f and 1 T s argmax y < r < q W Ž r . . 2 y`-r-` Theorem 1 gives the rate of convergence and asymptotic distribution of the threshold estimate g ˆ. The rate of convergence is n1y 2 a , which is decreasing in a . Intuitively, a larger a decreases the threshold effect dn , which decreases the sample information concerning the threshold g , reducing the precision of any estimator of g . Theorem 1 shows that the distribution of the threshold estimate under our ‘‘small effect’’ asymptotics takes a similar form to that found for changepoint Ž1991., estimates. For the latter theory, see Picard Ž1985., Yao Ž1987., Dumbgen ¨ Ž . and Bai 1997 . The difference is that the asymptotic precision of g ˆ is proportional to the matrix EŽ x, xXi < qi s g 0 . while in the changepoint case the asymptotic precision is proportional to the unconditional moment matrix EŽ x i xXi .. It is interesting to note that these moments are equal when x i and qi are independent, which would not be typical in applications. The asymptotic distribution in Theorem 1 is scaled by the ratio v . In the leading case of conditional homoskedasticity Ž9. E Ž e i2 < qi . s s 2 , 581 SAMPLE SPLITTING then V s s 2 D and v simplifies to vs s2 Ž c9Dc . f . The asymptotic distribution of g ˆ is less dispersed when v is small, which occurs when s 2 is small, f Žg 0 . is large Žso that many observations are near the threshold., andror < c < is large Ža large threshold effect.. The distribution function for T is known. ŽSee Bhattacharya and Brockwell Ž1976... Let F Ž x . denote the cumulative standard normal distribution function. Then for xG 0, P ŽT F x . s 1 q ( x 2p exp y x ž / 8 3 3'x q exp Ž x . F y 2 2 ž y xq 5 F y 'x / ž /ž / 2 2 . and for x- 0, P ŽT F x . s 1 y P ŽT F yx .. A plot of the density function of T is given in Figure 1. FIGURE 1.}Asymptotic density of threshold estimator. 582 B. E. HANSEN 3.3. Likelihood Ratio Test To test the hypothesis H0 : g s g 0 , a standard approach is to use the likelihood ratio statistic under the auxiliary assumption that e i is iid N Ž0, s 2 .. Let LR n Ž g . s n Sn Ž g . y Sn Ž g ˆ. Sn Ž g ˆ. . 
The likelihood ratio test of H0 is to reject for large values of LR nŽg 0 .. THEOREM 2: Under Assumption 1, LR n Ž g 0 . ªd h 2j , where j s max w 2W Ž s . y < s <x sgR and h2s c9Vc s 2 c9Dc . The distribution function of j is P Ž j F x . s Ž1 y eyx r2 . 2 . If homoskedasticity Ž9. holds, then h 2 s 1 and the asymptotic distribution of LR nŽg 0 . is free of nuisance parameters. If heteroskedasticity is suspected, h 2 must be estimated. We discuss this in the next section. Theorem 2 gives the large sample distribution of the likelihood ratio test for hypotheses on g . The asymptotic distribution is nonstandard, but free of nuisance parameters under Ž9.. Since the distribution function is available in a simple closed form, it is easy to generate p-values for observed test statistics. Namely, 1 2 pn s 1 y 1 y exp y LR n Ž g 0 . 2 ž ž 2 // is the asymptotic p-value for the likelihood ratio test. Critical values can be calculated by direct inversion of the distribution function. Thus a test of H0 : g s g 0 rejects at the asymptotic level of a if LR nŽg 0 . exceeds cj Ž1 y a., where cj Ž z . s y2 lnŽ1 y 'z .. Selected critical values are reported in Table I. TABLE I ASYMPTOTIC CRITICAL VALUES PŽ j F x. .80 .85 .90 .925 .95 .975 .99 4.50 5.10 5.94 6.53 7.35 8.75 10.59 SAMPLE SPLITTING 583 3.4. Estimation of h 2 The asymptotic distribution of Theorem 2 depends on the nuisance parameter h 2 . It is therefore necessary to consistently estimate this parameter. Let r 1 i s Ž dnX x i . 2 Ž e i2rs 2 . and r 2 i s Ž dnX x i . 2 . Then Ž 10. h2s E Ž r 1 i < qi s g 0 . E Ž r 2 i < qi s g 0 . is the ratio of two conditional expectations. Since r 1 i and r 2 i are unobserved, let ˆr1 i s Ž dˆX x i . 2 Ž ˆe i2rsˆ 2 . and ˆr 2 i s Ž dˆX x i . 2 denote their sample counterparts. A simple estimator of the ratio Ž10. uses a polynomial regression, such as a quadratic. For j s 1 and 2, fit the OLS regressions ˆr ji s m ˆ j0 q m ˆ j1 qi q m ˆ j2 qi2 q «ˆji , and then set h ˆ s 2 m ˆ 10 q m ˆ 11 gˆq m ˆ 12 gˆ 2 m ˆ 20 q m ˆ 21 gˆq m ˆ 22 gˆ 2 . An alternative is to use kernel regression. The Nadaraya-Watson kernel estimator is h ˆ2s Ý nis1 K h Ž g ˆy qi . tˆ1 i Ý nis1 K h Ž g ˆy qi . ˆt 2 i where K hŽ u. s hy1 K Ž urh. for some bandwidth h and kernel K Ž u., such as the Epanechnikov K Ž u. s 34 Ž1 y u 2 .< u < F 14 . The bandwidth h may be selected according to a minimum mean square error criterion Žsee Hardle and Linton Ž1994... 4. CONFIDENCE INTERVALS 4.1. Threshold Estimate A common method to form confidence intervals for parameters is through the inversion of Wald or t statistics. To obtain a confidence interval for g , this would involve the distribution T from Theorem 1 and an estimate of the scale parameter v . While T is parameter-independent, v is directly a function of dn and indirectly a function of g 0 Žthrough DŽg 0 ... When asymptotic sampling distributions depend on unknown parameters, the Wald statistic can have very poor finite sample behavior. In particular, Dufour Ž1997. argues that Wald statistics have particularly poorly-behaved sampling distributions when the parameter has a region where identification fails. The threshold regression model is an example where this occurs, as the threshold g is not identified when 584 B. E. HANSEN dn s 0. These concerns have encouraged us to explore the construction of confidence regions based on the likelihood ratio statistic LR nŽg .. Let C denote the desired asymptotic confidence level Že.g. C s .95., and let c s cj Ž C . 
be the C-level critical value for j Žfrom Table I.. Set Gˆs g : LR n Ž g . F c 4 . Theorem 2 shows that P Žg 0 g Gˆ . ª C as n ª ` under the homoskedasticity assumption Ž9.. Thus Gˆ is an asymptotic C-level confidence region for g . A graphical method to find the region Gˆ is to plot the likelihood ratio LR nŽg . against g and draw a flat line at c. ŽNote that the likelihood ratio is identically zero at g s g ˆ.. Equivalently, one may plot the residual sum of squared errors SnŽg . against g , and draw a flat line at SnŽg ˆ . q sˆ 2 c. If the homoskedasticity condition Ž9. does not hold, we can define a scaled likelihood ratio statistic: LRUn Ž g . s LR n Ž g . h ˆ 2 s Sn Ž g . y Sn Ž g ˆ. sˆ 2h ˆ2 and an amended confidence region Gˆ * s g : LRUn Ž g . F c 4 . Since h ˆ 2 is consistent for h 2 , P Žg 0 g Gˆ *. ª C as n ª ` whether or not Ž9. holds, so Gˆ * is a heteroskedasticity-robust asymptotic C-level confidence region for g . These confidence intervals are asymptotically correct under the assumption that dn ª 0 as n ª `, which suggests that the actual coverage may differ from the desired level for large values of dn . We now consider the case of a s 0, which implies that dn is fixed as n increases. We impose the stronger condition that the errors e i are iid N Ž0, s 2 ., strictly independent of the regressors x i and threshold variable qi . THEOREM 3: Under Assumption 1, modifying part 6 so that a s 0, and the errors e i are iid N Ž0, s 2 . strictly independent of the regressors x i and threshold ¨ ariable qi , then P Ž LR n Ž g 0 . G x . F P Ž j G x . q o Ž 1 . . Theorem 3 shows that at least in the case of iid Gaussian errors, the likelihood ratio test is asymptotically conservative. Thus inferences based on the confidence region Gˆ are asymptotically valid, even if dn is relatively large. Unfortunately, we do not know if Theorem 3 generalizes to the case of non-normal errors or regressors that are not strictly exogenous. The proof of Theorem 3 relies on the Gaussian error structure and it is not clear how the theorem would generalize. 585 SAMPLE SPLITTING 4.2. Simulation E¨ idence We use simulation techniques to compare the coverage probabilities of the confidence intervals for g . We use the simple regression model Ž3. with iid data and x i s Ž1 z i .9, e i ; N Ž0, 1., qi ; N Ž2, 1., and g s 2. The regressor z i was either iid N Ž0, 1. or z i s qi . The likelihood ratio statistic is invariant to u . Partitioning dn s Ž d 1 d 2 .9 we set d 1 s 0 and assessed the coverage probability of the confidence interval Gˆ as we varied d 2 and n. We set d 2 s .25, .5, 1.0, 1.5, and 2.0 and n s 50, 100, 250, 500, and 1000. Using 1000 replications Table II reports the coverage probabilities for nominal 90% confidence intervals. The results are quite informative. For all cases, the actual coverage rates increase as n increases or d 2 increases, which is consistent with the prediction of Theorem 3. For small sample sizes and small threshold effects, the coverage rates are lower than the nominal 90%. As expected, however, as the threshold effect d 2 increases, the rejection rates rise and become quite conservative. 4.3. Slope Parameters Letting u s Ž u , dn . and uˆs Ž uˆ, dˆ.. Lemma A.12 in the Appendix shows that Ž 11. 'n Ž uˆy u . ªd Z ; N Ž0, Vu . where Vu is the standard asymptotic covariance matrix if g s g 0 were fixed. This means that we can approximate the distribution of uˆ by the conventional ˆ Žg . denote the normal approximation as if g were known with certainty. 
Let Q conventional asymptotic C-level confidence region for u constructed under the ˆ Žgˆ .. ª C as n ª `. assumption that g is known. Ž11. shows that P Ž u g Q In finite samples, this procedure seems likely to under-represent the true sampling uncertainty, since it is not the case that g ˆs g 0 in any given sample. It may be desirable to incorporate this uncertainty into our confidence intervals for u . This appears difficult to do using conventional techniques, as uˆŽg . is not differentiable with respect to g , and g ˆ is non-normally distributed. A simple yet constructive technique is to use a Bonferroni-type bound. For any r - 1, let TABLE II CONFIDENCE INTERVAL CONVERGENCE FOR g AT 10% LEVEL x i s qi d2 s .25 .5 n s 50 n s 100 n s 250 n s 500 n s 1000 .86 .82 .83 .90 .90 .87 .90 .93 .93 .93 1.0 .93 .96 .97 .97 .98 x i ; N Ž0, 1 . 1.5 .97 .98 .98 .98 .99 2.0 .99 .99 .99 .99 .99 .25 .5 .90 .84 .80 .81 .86 .87 .86 .92 .93 .93 1.0 .93 .92 .94 .95 .94 1.5 .93 .96 .96 .96 .96 2.0 .97 .95 .98 .98 .97 586 B. E. HANSEN Gˆ Ž r . denote the confidence interval for g with asymptotic coverage r . For each ˆ Žg . and then set g g Gˆ Ž r . construct the pointwise confidence region Q D Qˆ Ž g . . ˆr s Q g g Gˆ Ž r . ˆr > Qˆ Žgˆ ., it follows that P Ž u g Qˆr . G P Ž u g Qˆ Žgˆ .. ª C as n ª `. Since Q This procedure is assessed using a simple Monte Carlo simulation. In Table III we report coverage rates of a nominal 95% confidence interval Ž C s .95. on d 2 . The same design is used as in the previous section, although the results are reported only for the case x i independent of qi and a more narrow set of n and ˆ0 d 2 to save space. We tried r s 0, .5, .8, and .95. As expected, the simple rule Q is somewhat liberal for small u 2 and n, but is quite satisfactory for large n or d 2 . In fact, all choices for r lead to similar results for large d 2 . For small d 2 and n, the best choice may be r s .8, although this may produce somewhat conservative confidence intervals for small d 2 . ˆ 0 s Qˆ Žgˆ . works fairly well for large n In summary, while the naive choice Q andror large threshold effects, it has insufficient coverage probability for small n or threshold effect. This problem can be solved through the conservative ˆr with r ) 0, and the choice r s .8 appears to work reasonably well procedure Q in a simulation. 5. APPLICATION: GROWTH AND MULTIPLE EQUILIBRIA Durlauf and Johnson Ž1995. suggest that cross-section growth behavior may be determined by initial conditions. They explore this hypothesis using the Summers-Heston data set, reporting results obtained from a regression tree methodology. A regression tree is a special case of a multiple threshold regression. The estimation method for regression trees due to Breiman et al. Ž1984. is somewhat ad hoc, with no known distributional theory. To illustrate the usefulness of our estimation theory, we apply our techniques to regressions similar to those reported by Durlauf-Johnson. TABLE III CONFIDENCE INTERVAL CONVERGENCE FOR d 2 n s 100 d2s .25 .5 ˆ0 Q ˆ.5 Q ˆ.8 Q ˆ.95 Q .90 .90 .93 .95 .96 .96 .95 .97 .99 .99 1.0 AT 5% LEVEL n s 250 2.0 .25 .5 1.0 .95 .95 .90 .94 .95 .96 .95 .96 .97 .96 .97 .98 .95 .94 .99 .99 n s 500 2.0 .25 .5 1.0 2.0 .94 .94 .91 .94 .97 .98 .94 .94 .94 .95 .96 .94 .97 .98 .95 .95 .97 .94 .99 .99 .95 .95 SAMPLE SPLITTING 587 The model seeks to explain real GDP growth. The specification is ln Ž YrL . i , 1985 y ln Ž YrL . i , 1960 s z q b ln Ž YrL . i , 1960 q p 1 ln Ž IrY . i q p 2 ln Ž n i q g q d . q p 3 ln Ž SCHOOL . 
i q e i , where for each country i: v Ž YrL. i, t s real GDP per member of the population aged 15]64 in year t; v Ž IrY . i s investment to GDP ratio; v n i s growth rate of the working-age population; Ž SCHOOL. i s fraction of working-age population enrolled in secondary school. v The variables not indexed by t are annual averages over the period 1960]1985. Following Durlauf-Johnson, we set g q d s 0.05. Durlauf-Johnson estimate Ž12. for four regimes selected via a regression tree using two possible threshold variables that measure initial endowment: per capita output YrL and the adult literacy rate LR, both measured in 1960. The authors argue that the error e i is heteroskedastic so present their results with heteroskedasticity-corrected standard errors. We follow their lead and use heteroskedasticity-consistent procedures, estimating the nuisance parameter h 2 using an Epanechnikov kernel with a plug-in bandwidth. Since the theory outlined in this paper only allows one threshold and one threshold variable, we first need to select among the two threshold variables, and verify that there is indeed evidence for a threshold effect. We do so by employing the heteroskedasticity-consistent Lagrange multiplier ŽLM. test for a threshold of Hansen Ž1996.. Since the threshold g is not identified under the null hypothesis of no threshold effect, the p-values are computed by a bootstrap analog, fixing the regressors from the right-hand side of Ž12. and generating the bootstrap dependent variable from the distribution N Ž0, ˆ e i2 ., where ˆ e i is the OLS residual from the estimated threshold model. Hansen Ž1996. shows that this bootstrap analog produces asymptotically correct p-values. Using 1000 bootstrap replications, the p-value for the threshold model using initial per capital output was marginally significant at 0.088 and that for the threshold model using initial literacy rate was insignificant at 0.214. This suggests that there might be a sample split based on output. Figure 2 displays a graph of the normalized likelihood ratio sequence LRUn Žg . as a function of the threshold in output. The LS estimate of g is the value that minimizes this graph, which occurs at g ˆs $863. The 95% critical value of 7.35 is also plotted Žthe dotted line., so we can read off the asymptotic 95% confidence set Gˆ * s w$594, $1794x from the graph from where LRUn Žg . crosses the dotted line. These results show that there is reasonable evidence for a two-regime 588 B. E. HANSEN FIGURE 2.}First sample split: Confidence interval construction for threshold. specification, but there is considerable uncertainty about the value of the threshold. While the confidence interval for g might seem rather tight by viewing Figure 2, it is perhaps more informative to note that 40 of the 96 countries in the sample fall in the 95% confidence interval, so cannot be decisively classified into the first or second regime. If we fix g at the LS estimate $863 and split the sample in two based on initial GDP, we can Žmechanically . perform the same analysis on each subsample. It is not clear how our theoretical results extend to such procedures, but this will enable more informative comparisons with the Durlauf-Johnson results. Only 18 countries have initial output at or below $863, so no further sample split is possible among this subsample. 
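For readers wishing to replicate this construction, the confidence region read off Figure 2 amounts to the inversion described in Section 4.1 and can be coded in a few lines. The sketch below is illustrative rather than the author's implementation: function names are hypothetical, the candidate grid and concentrated sums of squared errors S_n(γ) are assumed to come from the estimation step (as in the earlier sketch), and η̂² is supplied by the user (η² = 1 under conditional homoskedasticity).

```python
import numpy as np

def xi_critical_value(level=0.95):
    """Critical value of xi: P(xi <= x) = (1 - exp(-x/2))^2, so
    c_xi(level) = -2*log(1 - sqrt(level)); e.g. c_xi(0.95) is about 7.35."""
    return -2.0 * np.log(1.0 - np.sqrt(level))

def lr_pvalue(lr0):
    """Asymptotic p-value of LR_n(gamma_0): 1 - (1 - exp(-LR/2))^2."""
    return 1.0 - (1.0 - np.exp(-0.5 * lr0)) ** 2

def lr_confidence_set(candidates, ssr, n, eta2=1.0, level=0.95):
    """Invert the (scaled) likelihood ratio: {gamma : LR*_n(gamma) <= c}.

    LR_n(gamma) = n*(S_n(gamma) - S_n(gamma_hat))/S_n(gamma_hat); the
    heteroskedasticity-robust version divides by eta2."""
    candidates, ssr = np.asarray(candidates), np.asarray(ssr, float)
    s_min = ssr.min()
    sigma2_hat = s_min / n                    # sigma_hat^2 = S_n(gamma_hat)/n
    lr = (ssr - s_min) / (sigma2_hat * eta2)  # LR*_n(gamma)
    keep = lr <= xi_critical_value(level)
    return candidates[keep], lr

# Illustrative use with the earlier sketch (variable names hypothetical):
# gamma_hat, theta_hat, delta_hat, grid, ssr = threshold_lse(y, X, q)
# conf_set, lr_curve = lr_confidence_set(grid, ssr, n=len(y), eta2=eta2_hat)
```

Reporting the smallest and largest elements of the resulting set reproduces an interval of the kind quoted above, provided the LR curve falls below the critical value on a single connected region.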
Among the 78 countries with initial output above $863, a sample split based on initial output produces an insignificant p-value of 0.152, while a sample split based on the initial literacy rate produces a p-value of 0.078, suggesting a possible threshold effect in the literacy rate. The point estimate of the threshold in the literacy rate is 45%, with a 95% asymptotic confidence interval w19%, 57%x. The graph of the normalized likelihood ratio statistic as a function of the threshold in the literacy rate is displayed in Figure 3. This confidence interval contains 19 of the 78 countries in the subsample. We could try to further split these two subsamples, but none of the bootstrap test statistics were significant at the 10% level. SAMPLE SPLITTING 589 FIGURE 3.}Second sample split: Confidence interval construction for threshold. Our point estimates are quite similar to those of Durlauf and Johnson Ž1995.. What is different are our confidence intervals. The confidence intervals for the threshold parameters are sufficiently large that there is considerable uncertainty regarding their values, hence concerning the proper division of countries into convergence classes as well. 6. CONCLUSION This paper develops asymptotic methods to construct confidence intervals for least-squares estimates of threshold parameters. The confidence intervals are asymptotically conservative. It is possible that more accurate confidence intervals may be constructed using bootstrap techniques. This may be quite delicate, however, since the sampling distribution of the likelihood ratio statistic appears to be nonstandard and nonpivotal. This would be an interesting subject for future research. Dept. of Economics, Social Science Bldg., Uni¨ ersity of Wisconsin, Madison, WI 53706-1393, U.S.A.; bhansen@ssc.wisc.edu;www.ssc.wisc.edur ; bhansen. Manuscript recei¨ ed July, 1996; final re¨ ision recei¨ ed April, 1999. 590 B. E. HANSEN APPENDIX : MATHEMATICAL PROOFS Define h i Žg 1 , g 2 . s < x i e i < < d i Žg 2 . y d i Žg 1 .< and k i Žg 1 , g 2 . s < x i < < d i Žg 2 . y d i Žg 1 .<. LEMMA A.1: There is a C1 - ` such that for g F g 1 F g 2 F g , and r F 4, Ž 12. Eh ri Ž g 1 , g 2 . F C1 < g 2 y g 1 < , Ž 13. Ek ir Ž g 1 , g 2 . F C1 < g 2 y g 1 < . PROOF: For any random variable Z, Ž 14 . d dg E Ž Zd i Ž g .. s E Ž Z < qi s g . f Ž g . . Thus under Assumption 1.4, d dg Ž < x i e i < rd i Ž g .. s E Ž < x i e i < r < qi s g . f Ž g . F w E Ž < x i e i < 4 < qi s g .x rr4 f Žg . F C r r4 f F C1 , setting C1 s maxw1, C x f. Since d i Žg 2 . y d i Žg 1 . equals either zero or one, Eh ri Ž g 1 , g 2 . s E Ž < x i e i < rd i Ž g 2 .. y E Ž < x i e i < rd i Ž g 1 .. F C1 < g 2 y g 1 < , by a first-order Taylor series expansion, establishing Ž12.. The proof of Ž13. is identical. Q.E.D. LEMMA A.2: There is a K- ` such that for all g F g 1 F g 2 F g , Ž 15. E Ž 16. E 1 n 1 n 2 F K <g 2 y g1 <, Ý Ž h2i Ž g 1 , g 2 . y Eh2i Ž g 1 , g 2 .. 'n is1 2 F K <g 2 y g1 <. Ý Ž k i2 Ž g 1 , g 2 . y Ek i2 Ž g 1 , g 2 .. 'n is1 PROOF: Lemma 3.4 of Peligrad Ž1982. shows that for r-mixing sequences satisfying Assumption 1.1, there is a K 9 - ` such that E 1 2 n Ý 'n is1 Ž h2i Ž g 1 , g 2 . y Eh2i Ž g 1 , g 2 .. F K 9E Ž h 2i Ž g 1 , g 2 . y Eh2i Ž g 1 , g 2 .. 2 F 2 K 9Eh4i Ž g 1 , g 2 . F 2 K 9C1 < g 2 y g 1 < where the final inequality is Ž12.. This is Ž15. with Ks 2 K 9C1 . The proof of Ž16. is identical. Q.E.D. 591 SAMPLE SPLITTING Let Jn Ž g . s 1 n Ý x i ei di Ž g . . 
'n is1 LEMMA A.3: There are finite constants K 1 and K 2 such that for all g 1 , « ) 0, h ) 0, and d G ny1 , if 'n G K 2rh, then P ž sup g 1F gF g 1q d K1 d 2 Jn Ž g . y Jn Ž g 1 . ) h F / h4 . PROOF: Let m be an integer satisfying n dr2 F m F n d , which is possible since n d G 1. Set dm s drm. For k s 1, . . . , m q 1, set g k s g 1 q dm Ž k y 1., h i k s h i Žg k , g kq1 ., and h i jk s h i Žg j , g k .. Note that Ž12. implies Eh ri k F C1 dm and Eh ri jk F C1 < k y j < dm for r F 4. Letting Hn k s ny1 Ý nis1 h i k , observe that for g k F g F g kq1 , Jn Ž g . y Jn Ž g k . F 'n Hn k F 'n < Hn k y EHn k < q 'n EHn k . Thus Ž 17. sup g 1F gF g 1q d Jn Ž g . y Jn Ž g 1 . F Jn Ž g k . y Jn Ž g 1 . max 2FkFmq1 q max 1FkFm 'n < Hn k y EHn k < q max 1FkFm 'n EHn k . For any 1 F j - k F m q 1, by Burkholder’s inequality Žsee, e.g., Hall and Heyde Ž1980, p. 23.. for some C9 - `, Minkowski’s inequality, Eh2i k F C1 dm , Ž15., ny1 F dm and Ž k y j .1r 2 F Ž k y j ., Ž 18. 4 E Jn Ž g k . y Jn Ž g j . s E Ý x i ei Ž d i Ž g k . y d i Ž g j .. 'n is1 F C9E F C9 4 n 1 1 n 2 n Ý h 2i jk is1 EHi2jk q ž E n F C9 C1 Ž k y j . dm q Ý Ž h2i jk y Eh2i jk . is1 ž 1r2 2 2 n 1 K Ž k y j . dm n / 1r2 2 / 2 2 F C9 w C1 q 'K x ŽŽ k y j . dm . . The bound Ž18. and Theorem 12.2 of Billingsley Ž1968, p. 94. imply that there is a finite K 0 such that Ž 19. P ž max 2FkFmq1 Jn Ž g k . y Jn Ž g 1 . ) h F K 0 / Ž m dm . 2 h4 which bounds the first term on the right-hand side of Ž17.. sK0 d2 h4 , 592 B. E. HANSEN We next consider the second term. Lemma 3.6 of Peligrad Ž1982. shows there is a K - - ` such that Ž 20 . E <'n Ž Hn k y EHGn k . < 4 F K - ny1 Eh4i k q Ž Eh2i k . Ž 2 . 2 F K - Ž ny1 C1 dm q Ž C1 dm . . F K - Ž C1 q C12 . dm2 , where the third inequality is ny1 F dm . Markov’s inequality and Ž20. yield Ž 21 . P ž max <'n Ž Hn k y EHn k . < ) h F m / 1FkFm F K - Ž C1 q C12 . dm2 h4 K - Ž C1 q C12 . d 2 h4 , where the final inequality uses m dm s d and dm F d . Finally, note that Ž 22. 'n EHn k s 'n Eh i k F 'n C1 dm F 2C1 'n , since dm F 2rn. Together, Ž17., Ž19., Ž21., and Ž22. imply that when 2C1 r 'n F h , P ž sup g 1F gF g 1q d < Jn Ž g . y Jn Ž g 1 < ) 3h F / w K 0 q K - Ž C1 q C12 .x d 2 h4 , which implies the result setting K 1 s 3 4 w K 0 q K -Ž C1 q C12 .x and K 2 s 6C1. Q.E.D. Let ‘‘« ’’ denote weak convergence with respect to the uniform metric. LEMMA A.4: JnŽg . « J Žg ., a mean-zero Gaussian process with almost surely continuous sample paths. PROOF: For each g , x i e i d i Žg . is a square integrable stationary martingale difference, so JnŽg . converges pointwise to a Gaussian distribution by the CLT. This can be extended to any finite collection of g to yield the convergence of the finite dimensional distributions. Fix « ) 0 and h ) 0. Set d s «h 4rK 1 and n s maxw dy1 , K 22 rh 2 x, where K 1 and K 2 are defined in Lemma A.3. Then by Lemma A.3, for any g 1 , if n G n, P ž sup g 1F gF g 1q d Jn Ž g . y Jn Ž g . ) h F / K1 d 2 h4 s d« . This establishes the conditions for Theorem 15.5 of Billingsley Ž1968.. ŽSee also the proof of Theorem 16.1 of Billingsley Ž1968... It follows that JnŽg . is tight, so JnŽg . « J Žg .. Q.E.D. LEMMA A.5: g ˆªp g 0 . PROOF: Lemma 1 of Hansen Ž1996. shows that Assumption 1 is sufficient for Ž 23 . Mn Ž g . s 1 n X Xg Xg s 1 n n Ý x i xXi d i Ž g . ªp M Ž g . , is1 uniformly over g g R. Observe that Ž23. implies Mn s Ž1rn. X 9 X ªp M. 593 SAMPLE SPLITTING Let X 0 s Xg 0 . 
Then since Y s Xu q X 0 dn q e and X lies in the space spanned by Pg s XgU Ž XgUX XgU .y1 XgUX , X X X X Sn Ž g . y e9e s Y9 Ž I y Pg . Y y e9e s ye9Pg e q 2 dn X 0 Ž I y Pg . e q dn X 0 Ž I y Pg . X 0 dn . Using Assumption 1.6, Lemma A.4, and Ž23., we see that uniformly over g g G , X ny1 q2 a Ž Sn Ž g . y e9e . s ny1 c9 Ž X 0 Ž I y Pg . X 0 . c q op Ž 1 . . The projection Pg can be written as the projection onto w Xg , Zg x, where Zg s X y Xg is a matrix X X X X X whose ith row is x i Ž1 y d i Žg ... Observe that Xg Zg s 0, and for g G g 0 , X 0 Xg s X 0 X 0 and X 0 Zg s 0. Then using Ž23. we calculate that uniformly over g g w g 0 , g x X ny1 c9 Ž X 0 Ž I y Pg . X 0 . c s c9 Ž Mn Ž g 0 . y Mn Ž g 0 . Mn Ž g . ªp c9 Ž M Ž g 0 . y M Ž g 0 . M Ž g . y1 y1 Mn Ž g 0 .. c M Ž g 0 .. c ' b1Ž g . , say. Note that M Žg .y1 exists under Assumption 1.8 so b1Žg . is well defined. A calculation based on Ž14. shows Ž 24. d dg M Žg . s D Žg . f Žg . , where M Žg . and DŽg . are defined in Ž6. and Ž7.. Using Ž24., d dg b1Ž g . s c9M Ž g 0 . M Ž g . y1 DŽg . f Žg . M Žg . y1 M Žg 0 . c G 0 so b1Žg . is continuous and weakly increasing on w g 0 , g x. Additionally, d dg b1Ž g 0 . s c9Dfc ) 0 by Assumption 1.7, so b1Žg . is uniquely minimized at g 0 on w g 0 , g x. X Symmetrically, we can show that uniformly over g g w g , g 0 x, ny1 c9Ž X 0 Ž I y Pg . X 0 . c ªp b 2 Žg ., a weakly decreasing continuous function which is uniquely minimized at g 0 . Thus uniformly over g g G , ny1 q2 a Ž SnŽg . y e9e . ªp b1Žg .g G g 0 4 q b 2 Žg .g - g 0 4, a continuous function with unique minimizer g 0 . Since g ˆ minimizes SnŽg . y e9e, it follows that gˆªp g 0 Žsee, e.g., Theorem 2.1 of Newey and McFadden Ž1994... Q.E.D. LEMMA A.6: n a Ž uˆy u . s op Ž1. and n a Ž dˆy dn . s op Ž1.. PROOF: We show the result for dˆ, as the derivation for uˆ is similar. Let PX s I y X Ž X 9 X .y1 X 9. Then on g g G , using Ž23., 1 n X Ž . Ž . Ž . y1 M Ž g . s M* Ž g . , Xg PX Xg s Mn Ž g . y Mn Ž g . My1 n Mn g « M g y M g M say. Observe that for g g G , M*Žg . is invertible since My M Žg . is invertible by Assumption 1.8. Also, 1 n X Xg PX X 0 « M Ž g n g 0 . y M Ž g . My1 M Ž g 0 . s M0U Ž g . , 594 B. E. HANSEN say. Thus n a Ž dˆŽ g . y dn . s « M* Ž g . ž y1 1 y1 X Xg PX Xg n / ž 1 n X Xg PX ŽŽ X 0 y Xg . c q en a . / Ž M0U Ž g . y M* Ž g .. c s d Žg . , M0U Žg 0 . s M*Žg 0 . so d Žg 0 . s 0. Since M Žg . is continuous and M*Žg . is invertible, say. Note that d Žg . is continuous in G . Lemma A.5 allows us to conclude that n a Ž dˆy dn . s n a Ž dˆŽ g ˆ . y dn . ªp d Ž g 0 . s 0. Q.E.D. Define GnŽg . s ny1 Ý nis1Ž c9 x i . 2 < d i Žg . y d i Žg 0 .< and K nŽg . s ny1 Ý nis1 k i2 Žg 0 , g .. Set a n s n1y 2 a . LEMMA A.7: There exist constants B ) 0, 0 - d- `, and 0 - k - `, such that for all h ) 0 and « ) 0, there exists a ¨ - ` such that for all n, Ž 25. P P ž ¨ an ž Gn Ž g . inf F < g y g 0 <FB K nŽg . sup ¨ an <g y g 0 < F < g y g 0 <FB <g y g 0 < / - Ž1 y h . d F « , / ) Ž1 q h . k F « . PROOF: We show Ž25., as the proof of Ž26. is similar. First, note that for g G g 0 , Ž 27. EGn Ž g . s c9 Ž M Ž g . y M Ž g 0 .. c, so by Ž24., dEGnŽg .rdg s c9DŽg . f Žg . c, and the sign is reversed if g - g 0 . Since c9DŽg 0 . f Žg 0 . c ) 0 ŽAssumption 1.8. and c9DŽg . f Žg . c is continuous at g 0 ŽAssumption 1.5., then there is a B sufficiently small such that ds min < g y g 0 <FB c9D Ž g . f Ž g . c ) 0. Since EGnŽg 0 . s 0, a first-order Taylor series expansion about g 0 yields Ž 28. inf < g y g 0 <FB EGn Ž g . G d < g y g 0 < . 
Lemma A.2 Ž16. yields Ž 29. 2 2 E Gn Ž g . y EGn Ž g . F < c < 4E K n Ž g . y EK n Ž g . F < c < 4 ny1 K < g y g 0 < . For any h and « , set Ž 30. bs 1 y hr2 1yh )1 and Ž 31. ¨s 8 < c < 4K h d 1 y 1rb . « 2 2Ž . 595 SAMPLE SPLITTING We may assume that n is large enough so that ¨ ra n F B, else the inequality Ž25. is trivial. For j s 1, 2, . . . , N q 1, set g j s g 0 q ¨ b jy 1ra n , where N is the integer such that g N y g 0 s ¨ b Ny1 ra n F B and g Nq 1 y g 0 ) B. ŽNote that N G 1 since ¨ ra n F B.. Markov’s inequality, Ž29., and Ž28. yield P ž Gn Ž g j . sup y1 ) EGn Ž g j . 1FjFN h 2 2 N 2 E Gn Ž g j . y EGn Ž g j . / ž /Ý F F h h Ý js1 s ny2 a d 2 < gj y g 0 < 2 4 < c < 4K h d ¨ 2 2 4< c< K 4 F 2 < c < 4Kny1 < g j y g 0 < N 4 2 EGn Ž g j . js1 2 ` Ý js1 1 h d ¨ 1 y 1rb 2 2 1 b jy 1 s «r2 where the final equality is Ž31.. Thus with probability exceeding 1 y «r2, Ž 32. Gn Ž g j . EGn Ž g j . y1 F h 2 for all 1 F j F N. So for any g such that Ž ¨ ra n . F g y g 0 F B, there is some j F N such that g j - g - g jq1 , and on the event Ž32., Ž 33. Gn Ž g . <g y g 0 < G Gn Ž g j . EGn Ž g j . EGn Ž g j . < g jq1 y g 0 < G 1y ž h 2 / d <gj y g 0 < < g jq 1 y g 0 < s Ž 1 y h . d, using Ž28., the construction Žg j y g 0 .rŽg jq1 y g 0 . s 1rb, and Ž30.. Since this event has probability exceeding 1 y «r2, we have established P ž ¨ an inf Fg y g 0 FB Gn Ž g . <g y g 0 < / - Ž 1 y h . d F «r2. A symmetric argument establishes a similar inequality where the infimum is taken over yŽ ¨ ra n . G g y g 0 G yB, establishing Ž25.. Q.E.D. LEMMA A.8: For all h ) 0 and « ) 0, there exists some ¨ - ` such that for any B - `, Ž 34. P sup ¨ an F < g y g 0 <FB Jn Ž g . y Jn Ž g 0 . 'a n <g y g 0 < 0 )h F« . 596 B. E. HANSEN PROOF: Fix h ) 0. For j s 1, 2, . . . , set g j y g 0 s ¨ 2 jy 1ra n , where ¨ - ` will be determined later. A straightforward calculation shows that Ž 35. Jn Ž g . y Jn Ž g 0 . sup ¨ 'a F < g y g 0 <FB an n <g y g 0 < Jn Ž g j . y Jn Ž g 0 . F 2 sup 'a j)0 q 2 sup n <gj y g 0 < Jn Ž g . y Jn Ž g j . sup 'a j)0 g jF g F g jq 1 n < gj y g 0 < . Markov’s inequality, the martingale difference property, and Ž12. imply Ž 36. ž Jn Ž g j . y Jn Ž g 0 . P 2 sup 'a j)0 n <gj y g 0 < / )h F s F s ` 4 h an < g j y g 0 < js1 4 ` h2 js1 js1 C1 < g j y g 0 < an < g j y g 0 < 2 ` 4C1 h an < g j y g 0 < 2 ` Ý 2 1 Ý 2 2 2 E Ž < x i ei < 2 di Ž gj . y di Ž g 0 . . Ý 4 h E Jn Ž g j . y Jn Ž g 0 . Ý 2 js1 ¨2 jy1 s 8C1 h2¨ . Set d j s g jq1 y g j and hj s a n < g j y g 0 <h. Then ' Ž 37. ž P 2 sup Jn Ž g . y Jn Ž g j . sup j)0 g jF g F g jq 1 'a n )h <gj y g 0 < / ` F2 ÝP js1 ž sup g jF g F g jq d j Jn Ž g . y Jn Ž g j . ) hj . / y1 Observe that if ¨ G 1, then d j G ay1 . Furthermore, if ¨ G K 2 rh , then hj s ay1r2 2 jy1 ¨ h G n Gn n y1 r2 y1r2 K 2 an G K2 n . Thus if ¨ G maxw1, K 2 rh x the conditions for Lemma A.3 hold, and the right-hand side of Ž37. is bounded by ` Ž 38. 2 Ý js1 K 1 d j2 hj4 ` s Ý js1 K 1 < g jq1 y g j < 2 a2n < g j y g 0 < 4h 4 s 8 K1 3h 4 ¨ 2 . Ž35., Ž36., Ž37., and Ž38. show that if ¨ G maxw1, K 2rh x, P sup ¨ an F < g y g 0 <FB Jn Ž g . y Jn Ž g 0 . 'a < < n gyg0 0 ) 2h F 8C1 h2¨ q 8 K1 3h 4 ¨ 2 , which can be made arbitrarily small by picking suitably large ¨ . Q.E.D. LEMMA A.9: a nŽg ˆy g 0 . s Op Ž1.. PROOF: Let B, d, and k be defined as in Lemma A.7. Pick h ) 0 and k ) 0 small enough so that Ž 39. Ž 1 y h . dy 2 Ž < c < q k . h y 2 Ž < c < q k . k Ž 1 q h . k y Ž 2 < c < q k . k Ž 1 q h . k ) 0. 
597 SAMPLE SPLITTING Let En be the joint event that < g ˆy g 0 < F B, n a < uˆy u 0 < F k , n a < dˆy d 0 < F k , Ž 40. ¨ an Ž 41. inf F < g y g 0 <FB sup ¨ an F < g y g 0 <FB Gn Ž g . G Ž 1 y h . d, <g y g 0 < K nŽg . F Ž1 q h . k , <g y g 0 < and Ž 42. sup ¨ an F < g y g 0 <FB Jn Ž g . y Jn Ž g 0 . 'a n <g y g 0 < -h. Fix « ) 0, and pick ¨ and n so that P Ž En . G 1 y « for all n G n, which is possible under Lemmas A.5]A.8. Let SUn Žg . s SnŽ uˆ, dˆ, g ., where SnŽ?, ? , ? . is the sum of squared errors function Ž5.. Since Y s Xu q X 0 dn q e, Y y Xuˆy Xg dˆs Ž e y X Ž uˆy u . y X 0 Ž dˆy dn .. y D Xg dˆ, where D Xg s Xg y X 0 . Hence Ž 43. SUn Ž g . y SUn Ž g 0 . s Y y Xuˆy Xg dˆ 9 Y y Xuˆy Xg dˆ Ž .Ž . y Ž Y y Xuˆy X 0 dˆ . 9 Ž Y y Xuˆy X 0 dˆ . X X X s dˆ9 D Xg D Xg dˆy 2 dˆ9 D Xg e q 2 dˆ9 D Xg D Xg Ž uˆy u . X X X s d 9n D Xg D Xg dn y 2 dˆ9 D Xg e q 2 dˆ9 D Xg D Xg Ž uˆy u . X q Ž dn q dˆ . 9 D Xg D Xg Ž dˆy dn . . Suppose g g w g 0 q ¨ ra n , g 0 q B x and suppose En holds. Let ˆ c s n adˆ so that < ˆ c y c < F k . By Ž39. ] Ž43., SUn Ž g . y SUn Ž g 0 . an Ž g y g 0 . X s c9 D Xg D Xg c nŽg y g 0 . q G X y Ž c q c . 9 D XgX ˆ Gn Ž g . Žg y g 0 . X 2ˆ c D Xg e n1y a Ž g y g 0 . X q 2ˆ c9 D Xg D Xg n a Ž uˆy u . nŽg y g 0 . D Xg Ž ˆ c y c. nŽg y g 0 . y2<ˆ c< y< cqˆ c< < ˆ cyc< Jn Ž g . y Jn Ž g 0 . 'a n Žg y g 0 . y2<ˆ c < < n a Ž uˆy u . < K nŽg . Žg y g 0 . K nŽg . Žg y g 0 . G Ž 1 y h . dy 2 Ž < c < q k . h y 2 Ž < c < q k . k Ž 1 q h . k y Ž2 < c < q k . k Ž1 q h . k ) 0. We have shown that on the set En , if g g w g 0 q ¨ ra n , g 0 q B x, then SUn Žg . y SUn Žg 0 . ) 0. We can similarly show that if g g w g 0 y B, g 0 y ¨«ra n x then SUn Žg . y SUn Žg 0 . ) 0. Since SUn Žg ˆ . y SUn Žg 0 . F 0, 598 B. E. HANSEN this establishes that En implies < g ˆy g 0 < F ¨ ran . As P Ž En . G 1 y « for n G n, this shows that P Ž an < g Q.E.D. ˆy g 0 < ) ¨ . F « for n G n, as required. Ž . Ž . Let GnU Ž ¨ . s a n GnŽg 0 q ¨ ra n . and K U n ¨ s a n K n g 0 q ¨ ra n . LEMMA A.10: Uniformly on compact sets C , Ž 44 . GnU Ž ¨ . ªp m < ¨ < , and Ž 45 . Ž . < << < KU n ¨ ªp Df ¨ , where m s c9Dcf. PROOF: We show Ž44.. Fix ¨ g C . Using Ž27., Ž 46 . EGnU Ž ¨ . s a n c9 Ž M Ž g 0 q ¨ ra n . y M Ž g 0 .. c ª < ¨ < c9Dfc s < ¨ < m as n ª `. By Ž29., Ž 47. E GnU Ž ¨ . y EGnU Ž ¨ . s a2n E Gn Ž g 0 q ¨ ra n . y EGn Ž g 0 q ¨ ra n . 2 F a2n n < c < 4K ¨ an 2 s < c < 4K < ¨ < ny2 a ª 0. Markov’s inequality, Ž46. and Ž47. show that GnU Ž ¨ . ªp m < ¨ <. Suppose C s w0, ¨ x. Since GnU Ž ¨ . is monotonically increasing on C and the limit function is continuous, the convergence is uniform over C . To see this, set GŽ ¨ . s m¨ . Pick any « ) 0, then set J s ¨ mr« and for j s 0, 1, . . . , J, set ¨ j s ¨ m jrJ. Then pick n large enough so that max j F J < GnU Ž ¨ j . y GŽ ¨ j .< F « with probability greater than 1 y « , which is possible by pointwise consistency. For any j G 1, take any ¨ g Ž ¨ jy 1 , ¨ j .. Both GnU Ž ¨ . and GŽ ¨ . lie in the interval w GŽ ¨ jy1 . y « , GŽ ¨ j . q « x Žwith probability greater than 1 y « ., which has length bounded by 3 « . Since ¨ is arbitrary, < GnU Ž ¨ . y GŽ ¨ .< F 3 « uniformly over C . An identical argument yields uniformity over sets of the form wy¨ , 0x, and thus for arbitrary compact sets C . Q.E.D. Let R nŽ ¨ . s a n Ž JnŽg 0 q ¨ ra n . y JnŽg 0 ... ' LEMMA A.11: On any compact set C , RnŽ ¨ . « BŽ¨ . where B Ž ¨ . is a ¨ ector Brownian motion with co¨ ariance matrix EŽ B Ž1. B Ž1.9. s Vf. 
PROOF: Our proof proceeds by establishing the convergence of the finite dimensional distributions of R nŽ ¨ . to those of B Ž ¨ . and then showing that R nŽ ¨ . is tight. Fix ¨ g C . Define dUi Ž ¨ . s d i Žg 0 q ¨ ra n . y d i Žg 0 . and u n i Ž ¨ . s a n x i e i dUi Ž ¨ . so that R nŽ ¨ . s ny1 r2 Ý nis 1 u n i Ž ¨ ., and let VnŽ ¨ . s ny1 Ý nis1 u n i Ž ¨ . u n i Ž ¨ .9. Under Assumption 1.2, u n i Ž ¨ ., Fi 4 is a martingale difference array ŽMDA.. By the MDA central limit theorem Žfor example, Theorem 24.3 of Davidson Ž1994, p. 383.. sufficient conditions for R nŽ ¨ . ªd N Ž0, < ¨ < Vf . are ' Ž 48. Vn Ž ¨ . ªp < ¨ < Vf 599 SAMPLE SPLITTING and Ž 49. ny1 r2 max 1FiFn u n i Ž ¨ . ªp 0. X First, a calculation based on Ž14. shows dEŽ x i x i e i2 d i Žg ..rdg s V Žg . f Žg ., where V Žg . is defined Ž in 8.. Thus by the continuity of V Žg . f Žg . at g 0 ŽAssumption 1.5., if ¨ ) 0, Ž 50. X EVn Ž ¨ . s a n E Ž x i x i e i2 dUi Ž ¨ . . X ¨ s a n E x i x i e i2 d i g 0 q ž ž an X y E Ž x i x i e i2 d i Ž g 0 .. // ª ¨ Vf , with the sign reversed if ¨ - 0. By Ž15., Ž 51. 2 E Vn Ž ¨ . y EVn Ž ¨ . s F a2n n a2n n E K n 1 Ý 'n is1 ¨ 2 Ž < x i e i < 2 dUi Ž ¨ . y E Ž < x i e i < 2 dUi Ž ¨ . .. s ny2 a K¨ ª 0 an as n ª `, which with Ž50. combines to yield Ž48.. By Ž12., E ny1r2 max 1FiFn uni Ž ¨ . 4 F s F 1 n E uni Ž ¨ . a2n n a2n n 4 E Ž < x i e i < 4 dUi Ž ¨ . . C1 <¨ < an s ny2 a C1 < ¨ < ª 0, which establishes Ž49. by Markov’s inequality. We conclude that R nŽ ¨ . ªd N Ž0, < ¨ < Vf .. This argument can be extended to include any finite collection w ¨ 1 , . . . , ¨ k x to yield the convergence of the finite dimensional distributions of R nŽ ¨ . to those of B Ž ¨ .. We now show tightness. Fix « ) 0 and h ) 0. Set d s «h 4 rK 1 and n s Žmaxw dy1r2 , K 2rh x.1r a , where K 1 and K 2 are defined in Lemma A.3. Set g 1 s g 0 q ¨ 1ra n . By Lemma A.3, for n G n, P ž sup ¨ 1 F¨ F¨ 1 qd RnŽ ¨ . y RnŽ ¨ 1 . ) h s P / ž K1 F sup g 1 F g F g 1 q d ra n d ž / an y2 4 an h Jn Ž g . y Jn Ž g 1 . ) h a1r2 n / 2 F d« . 1r2 ŽThe conditions for Lemma A.3 are met since dran G ny1 when n a G dy1r2 , and hra1r2 n G K 2 rn when n a G K 2 rh , and these hold for n G n.. As discussed in the proof of Lemma A.4, this shows Q.E.D. that R nŽg . is tight, so R nŽ ¨ . « B Ž ¨ .. 600 B. E. HANSEN LEMMA A.12: 'n Ž uˆy uˆŽg 0 .. ªp 0, and 'n Ž uˆŽg 0 . y u . ªd Z ; NŽ0, Vu .. PROOF: Let Jn s ny1 r2 X 9e, and observe that Lemma A.4 implies 1 XgUX e s 'n Jn J , J Žg . ž / ž / « Jn Ž g . a Gaussian process with continuous sample paths. Thus on any compact set C , Jn JnU Ž ¨ . s ž 0 Jn g 0 q ¨ « J J Žg 0 . / ž / an and Ž23. implies MnU Ž ¨ . s ž Mn g 0 q Mn Mn g 0 q ¨ an ž / ž Mn g 0 q ¨ an ¨ an / / 0 ªp M ž M Žg 0 . M Žg 0 . M Žg 0 . / uniformly on C . Also AnŽ ¨ . s an n Ž . D XgX 0q¨ r a nD Xg 0q¨ r a n F K U n ¨ ªp 0 uniformly on C by Lemma A.10, and hence Ž 52. 'n uˆ žž g0 q ¨ an y u s MnU Ž ¨ . / / ž « y1 ž JnU Ž ¨ . y M Žg 0 . M Žg 0 . M M Žg 0 . I A n Ž ¨ . cay1r2 n I ž/ y1 / J s Z, J Žg 0 . / ž / which establishes 'n Ž uˆŽg 0 . y u . ªd Z as stated. Pick « ) 0 and d ) 0. Lemma A.9 and Ž52. show that there is a ¨ - ` and n - ` so that for all n G n, P Ž a n < g ˆy g 0 < ) ¨ . F « and P ž sup 'n uˆ y¨ F¨ F¨ ž g0 q ¨ an / y uˆŽ g 0 . ) d F « . / Hence, for n G n, P Ž 'n < uˆy uˆŽ g 0 . < ) d . F P ž sup 'n uˆ y¨ F¨ F¨ ž g0 q ¨ an / y uˆŽ g 0 . ) d / qP Ž a n < g ˆy g 0 < ) ¨ . F 2 « , establishing that 'n < uˆy uˆŽg 0 .< ªp 0 as stated. Q.E.D. Let Q nŽ ¨ . s SnŽ uˆ, dˆ, g 0 . y SnŽ uˆ, dˆ, g 0 q ¨ ra n .. 
LEMMA A.13: On any compact set C , Q nŽ ¨ . « QŽ ¨ . s ym < ¨ < q 2'l W Ž ¨ ., where l s c9Vcf. X PROOF: From Ž43. we find Q nŽ ¨ . s yGnU Ž ¨ . q 2 c R nŽ ¨ . q L nŽ ¨ ., where Ž . L n Ž ¨ . F 2'n < dˆy dn < R n Ž ¨ . q Ž 2 n a < ˆ c < < uˆy u < q < c q ˆ c< < ˆ c y c <. K U n ¨ «0 601 SAMPLE SPLITTING ' <ˆ < Ž . Ž . Ž . Ž . ' <ˆ < Ž . Ž . since Žuniformly on C . K U n ¨ s Op 1 , R n ¨ s Op 1 , n u y u s Op 1 , and n d y d n s Op 1 , by Lemmas A.10, A.11, and A.12. Applying again Lemmas A.10 and A.11, Q nŽ ¨ . « ym < ¨ < q 2 c9B Ž ¨ .. The process c9B Ž ¨ . is a Brownian motion with variance l s c9Vcf, so can be written as 'l W Ž ¨ ., Q.E.D. where W Ž ¨ . is a standard Brownian motion. PROOF OF THEOREM 1: By Lemma A.9, a nŽg ˆy g 0 . s argmax ¨ QnŽ ¨ . s Op Ž1., and by Lemma A.13, Q nŽ ¨ . « QŽ ¨ .. The limit functional QŽ ¨ . is continuous, has a unique maximum, and lim < ¨ < ª ` QŽ ¨ . s y` almost surely Žwhich is true since lim ¨ ª ` W Ž ¨ .r¨ s 0 almost surely.. It therefore satisfies the conditions of Theorem 2.7 of Kim and Pollard Ž1990. which implies an Ž g ˆy g 0 . ªd argmax Q Ž ¨ . . ¨ gR Making the change-of-variables ¨ s Ž lrm2 . r, noting the distributional equality W Ž a2 r . ' aW Ž r ., and setting v s lrm2 , we can re-write the asymptotic distribution as argmax w ym < ¨ < q 2'l W Ž ¨ .x s y`- n -` l y`-r-` ' v argmax y s v argmax y y`-r-` y`-r-` PROOF OF y argmax m2 l m <r< 2 l m < r < q 2'l W < r <q2 l m l ž / m2 r WŽr. qWŽ r . . Q.E.D. THEOREM 2: Lemma A.12 and Ž23. show that ˆ , gˆ .. y Ž Sn Ž uˆ, g 0 . y Sn Ž uˆ, gˆ .. sˆ LR n Ž g 0 . y Q n Ž ¨ˆ. s Ž Sn Ž uˆŽ g 0 . , g 0 . y Sn Ž Q s Sn Ž uˆŽ g 0 . , g 0 . y Sn Ž uˆ, g 0 . s Ž uˆŽ g 0 . y uˆ. 9 XgUX XgU Ž uˆŽ g 0 . y uˆ. ªp 0. Now applying Lemma A.13 and the continuous mapping theorem, LR n Ž g 0 . s Qn Ž ¨ . sˆ 2 q op Ž1. s supn Q n Ž ¨ . sˆ 2 q op Ž 1 . ªd supn Q Ž ¨ . s2 . This limiting distribution equals 1 s2 sup w ym < ¨ < q 2'l W Ž ¨ .x s n ' 1 s2 sup ym l s 2m r l m2 r q 2'l W l ž / m2 r sup w y< r < q 2W Ž r .x s h 2j r by the change-of-variables ¨ s Ž lrm2 . r, the distributional equality by W Ž a2 r . ' aW Ž r ., and the fact h 2 s lrŽ s 2m .. To find the distribution function of j , note that j s 2 maxw j 1 , j 2 x, where j 1 s sup s F 0 w W Ž s . y 12 < s <x and j 2 s sup 0 F s w W Ž s . y 21 < s <x. j 1 and j 2 are iid exponential random variables with distribution function P Ž j 1 F x . s 1 y eyx Žsee Bhattacharya and Brockwell Ž1976... Thus 2 P Ž j F x . s P Ž 2 max w j 1 , j 2 x F x . s P Ž j 1 F xr2. P Ž j 2 F xr2. s Ž 1 y eyx r2 . . PROOF OF THEOREM 3: Note that by the invariance property of the likelihood ratio test, LR nŽg 0 . is invariant to reparameterizations, including those of the form g ª g * s FnŽg .. Since the threshold variable qi only enters the model through the indicator variables qi F g 4, by picking FnŽ x . to be the 602 B. E. HANSEN empirical distribution function of the qi , we see that qi F g 4 s Fn Ž qi . F Fn Ž g .4 s ½ j n Fg* 5 for some 1 F j F n. Without loss of generality, we therefore will assume that qi s irn for the remainder of the proof. Let j0 be the largest integer such that j0 rn- g 0 . Without loss of generality, we can also set s 2 s 1. If we set a s 0, the proof of Lemma A.13 shows that uniformly in ¨ Ž 53. Q n Ž g 0 q ¨ rn . s yGnU Ž ¨ . q 2 d 9R n Ž ¨ . q op Ž 1 . where for ¨ ) 0 GnU Ž ¨ . s n Ý Ž d 9 x i . 2 g 0 - qi F g 0 q ¨ rn4 is1 n s Ý Ž d 9 x i .2 is1 ½ j0 n - qi F j0 q ¨ n 5 and n RnŽn . s Ý x i ei is1 ½ j0 n - qi F j0 q n n 5 . 
While Lemma A.13 assumed that a ) 0, it can be shown that Ž53. continues to hold when a s 0. X Note that the processes GnU Ž ¨ . and dn R nŽ ¨ . are step functions with steps at integer-valued ¨ . Let Nq denote the set of positive integers and DnŽ ¨ . be any continuous, strictly increasing function such that GnU Ž k . s DnŽ k . for k g Nq. Let NnŽ ¨ . be a mean-zero Gaussian process with covariance kernel E Ž Nn Ž ¨ 1 . Nn Ž ¨ 2 .. s Dn Ž ¨ 1 n ¨ 2 . . Since d 9R nŽ ¨ . is a mean-zero Gaussian process with covariance kernel s 2 GnU Ž ¨ 1 n ¨ 2 ., the restriction of d 9R nŽ ¨ . to the positive integers has the same distribution as NnŽ ¨ .. Since DnŽ ¨ . is strictly increasing, there exists a function ¨ s g Ž s . such that DnŽ g Ž s .. s s. Note that NnŽ g Ž s .. s W Ž s . is a standard Brownian motion on Rq. Let Gqs s: g Ž s . g Nq4 . It follows from Ž53. that max g) g 0 Qn Ž g . sˆ 2 s max w yGnU Ž k . q 2 d 9R n Ž k .x q op Ž 1 . kgN q ' max w yDn Ž k . q 2 Nn Ž k .x q op Ž 1 . kgN q s max w yDn Ž g Ž s .. q 2 Nn Ž g Ž s ..x q op Ž 1 . sgG q s max w ys q 2W Ž s .x q op Ž 1 . sgG q F max w ys q 2W Ž s .x q op Ž 1 . . sgR q We conclude that LR n Ž g 0 . s max max g) g 0 Qn Ž g . sˆ 2 , max g- g 0 Qn Ž g . sˆ 2 F max max w y< s < q 2W Ž s .x , max w y< s < q 2W Ž s .x q op Ž 1 . sG0 ªd max y`-s-` sF0 w y< s < q 2W Ž s .x s j . This shows that P Ž LR nŽg 0 . G x . F P Ž j G x . q oŽ1., as stated. Q.E.D. SAMPLE SPLITTING 603 REFERENCES BAI, J. Ž1997.: ‘‘Estimation of a Change Point in Multiple Regression Models,’’ Re¨ iew of Economics and Statistics, 79, 551]563. BHATTACHARYA, P. K, and P. J. Brockwell Ž1976.: ‘‘The Minimum of an Additive Process with Applications to Signal Estimation and Storage Theory,’’ Z. Wahrschein. Verw. Gebiete, 37, 51]75. BILLINGSLEY, P. Ž1968.: Con¨ ergence of Probability Measures. New York: Wiley. BREIMAN, L., J. L. FRIEDMAN, R. A. OLSHEN, AND C. J. STONE Ž1984.: Classification and Regression Trees. Belmont, CA: Wadsworth. CHAN, K. S. Ž1989.: ‘‘A Note on the Ergodicity of a Markov Chain,’’ Ad¨ ances in Applied Probability, 21, 702]704. }}} Ž1993.: ‘‘Consistency and Limiting Distribution of the Least Squares Estimator of a Threshold Autoregressive Model,’’ The Annals of Statistics, 21, 520]533. CHAN, K. S., AND R. S. TSAY Ž1998.: ‘‘Limiting Properties of the Least Squares Estimator of a Continuous Threshold Autoregressive Model,’’ Biometrika, 85, 413]426. CHU, C. S., AND H. WHITE Ž1992.: ‘‘A Direct Test for Changing Trend,’’ Journal of Business and Economic Statistics, 10, 289]299. DAVIDSON, J. Ž1994.: Stochastic Limit Theory: An Introduction for Econometricians. Oxford: Oxford University Press. DUFOUR, J. M. Ž1997.: ‘‘Some Impossibility Theorems in Econometrics with Applications to Structural Variables and Dynamic Models,’’ Econometrica, 65, 1365]1387. D¨ UMBGEN, L. Ž1991.: ‘‘The Asymptotic Behavior of Some Nonparametric Change Point Estimators,’’ The Annals of Statistics, 19, 1471]1495. DURLAUF, S. N., AND P. A. JOHNSON Ž1995.: ‘‘Multiple Regimes and Cross-Country Growth Behavior,’’ Journal of Applied Econometrics, 10, 365]384. HALL, P., AND C. C. HEYDE Ž1980.: Martingale Limit Theory and Its Application. New York: Academic Press. HANSEN, B. E. Ž1992.: ‘‘Tests for Parameter Instability in Regressions with IŽ1. Processes,’’ Journal of Business and Economic Statistics, 10, 321]335. }}} Ž1996.: ‘‘Inference when a Nuisance Parameter is not Identified under the Null Hypothesis,’’ Econometrica, 64, 413]430. 
——— (2000): "Testing for Structural Change in Conditional Models," Journal of Econometrics, forthcoming.
HÄRDLE, W., AND O. LINTON (1994): "Applied Nonparametric Methods," in Handbook of Econometrics, Vol. IV, ed. by R. F. Engle and D. L. McFadden. Amsterdam: Elsevier Science, pp. 2295–2339.
IBRAGIMOV, I. A. (1975): "A Note on the Central Limit Theorem for Dependent Random Variables," Theory of Probability and Its Applications, 20, 135–141.
KIM, J., AND D. POLLARD (1990): "Cube Root Asymptotics," The Annals of Statistics, 18, 191–219.
NEWEY, W. K., AND D. L. MCFADDEN (1994): "Large Sample Estimation and Hypothesis Testing," in Handbook of Econometrics, Vol. IV, ed. by R. F. Engle and D. L. McFadden. New York: Elsevier, pp. 2113–2245.
PELIGRAD, M. (1982): "Invariance Principles for Mixing Sequences of Random Variables," The Annals of Probability, 10, 968–981.
PICARD, D. (1985): "Testing and Estimating Change-Points in Time Series," Advances in Applied Probability, 17, 841–867.
POTTER, S. M. (1995): "A Nonlinear Approach to U.S. GNP," Journal of Applied Econometrics, 10, 109–125.
TONG, H. (1983): Threshold Models in Nonlinear Time Series Analysis, Lecture Notes in Statistics, 21. Berlin: Springer.
——— (1990): Non-Linear Time Series: A Dynamical System Approach. Oxford: Oxford University Press.
YAO, Y.-C. (1987): "Approximating the Distribution of the ML Estimate of the Changepoint in a Sequence of Independent R.V.'s," The Annals of Statistics, 15, 1321–1328.