Finite-Sample Properties of Some Alternative GMM Estimators

Lars Peter HANSEN
Department of Economics, University of Chicago, Chicago, IL 60637

John HEATON
Kellogg Graduate School of Management, Northwestern University, Evanston, IL 60208

Amir YARON
G.S.I.A., Carnegie Mellon University, Pittsburgh, PA 15213

We investigate the small-sample properties of three alternative generalized method of moments (GMM) estimators of asset-pricing models. The estimators that we consider include ones in which the weighting matrix is iterated to convergence and ones in which the weighting matrix is changed with each choice of the parameters. Particular attention is devoted to assessing the performance of the asymptotic theory for making inferences based directly on the deterioration of GMM criterion functions.

KEY WORDS: Asset pricing; Generalized method of moments; Monte Carlo.

© 1996 American Statistical Association, Journal of Business & Economic Statistics, July 1996, Vol. 14, No. 3

The purpose of this article is to investigate the small-sample properties of generalized method of moments (GMM) estimators applied to asset-pricing models. Our alternative asymptotically efficient estimators include ones in which the weighting matrix is estimated using an initial (consistent) estimator of the parameter vector, ones in which the weighting matrix is iterated to convergence, and ones in which the weighting matrix is changed for every hypothetical parameter value. The last of these three approaches has not been used very much in the empirical asset-pricing literature, but it has the attraction of being insensitive to how the moment conditions are scaled. In addition, we study the advantages to basing statistical inferences directly on the criterion function rather than on quadratic approximations to it.

We address the following issues:

1. How does the procedure for constructing the weighting matrix affect the small-sample behavior of the GMM criterion function?
2. How do the confidence regions of parameter estimators constructed using the GMM criterion function perform relative to confidence regions based on the usually constructed standard errors?
3. Is the small-sample overrejection often found in studies of GMM estimators reduced when using an estimator in which the weighting matrix is continuously altered?
4. How are the small-sample biases of the GMM estimators affected by the choice of procedure for constructing the weighting matrix?

Because there has been an extensive body of empirical work investigating the consumption-based intertemporal capital asset-pricing model (CAPM) using GMM estimation methods, we use such models as laboratories for our Monte Carlo experiments. As in the work of Tauchen (1986) and Kocherlakota (1990a), all of our experiments come from single-consumer economies with power utility functions. Within the confines of these economies, there is still considerable flexibility in the experimental design. Some of our experimental economies are calibrated to annual time series data presuming a century of data. Other economies are calibrated to monthly postwar data. In the experiments calibrated to annual data and several of the experiments calibrated to monthly data, the moment conditions are nonlinear in at least one of the parameters of interest. Moreover, some of the specifications introduce time nonseparabilities in the consumer preferences that are motivated by either local durability or habit persistence. In the other experiments calibrated to monthly data, the moment conditions are, by design, linear in the parameter of interest. Some of these setups are special cases of the classical simultaneous-equations model. For other setups, the observed data are modeled as being time averaged, introducing a moving-average structure in the disturbance terms.

The article is organized as follows. Section 1 describes the alternative estimators we study and the related econometric literature. Section 2 specifies the Monte Carlo environments we use. Section 3 gives an overview of the calculations, including a description of how inferences are made based directly on the shape of the criterion functions. Section 4 then presents the results of the Monte Carlo experiments using a lognormal model calibrated to monthly data. Section 5 presents the results calibrated to annual and monthly data using a Markov-chain approximation. Finally, our concluding remarks are in Section 6.

1. ALTERNATIVE ESTIMATORS AND RELATED LITERATURE

One of the goals of our study is to compare the finite-sample properties of three alternative GMM estimators, each of which uses a given collection of moment conditions in an asymptotically efficient manner. Write the moment conditions as

$$E[\varphi(X_t, \beta)] = 0, \tag{1}$$

where $\beta$ is the $k$-dimensional parameter vector of interest. In (1) the function $\varphi$ has $n \geq k$ coordinates. We assume that $\{(1/\sqrt{T}) \sum_{t=1}^{T} \varphi(X_t, \beta)\}$ converges in distribution to a normally distributed random vector with mean 0 and covariance matrix $V(\beta)$. Let $V_T(\beta)$ denote (an infeasible) consistent estimator of this covariance matrix. This latter estimator is typically made operational by substituting a consistent estimator for $\beta$, denoted $\{b_T^1\}$. An efficient GMM estimator of the parameter vector $\beta$ is then constructed by choosing the parameter vector $b$ that minimizes

$$\left[\frac{1}{T}\sum_{t=1}^{T}\varphi(X_t, b)\right]' [V_T(b_T^1)]^{-1} \left[\frac{1}{T}\sum_{t=1}^{T}\varphi(X_t, b)\right]. \tag{2}$$

The first two GMM estimators that we consider differ in the way in which this is accomplished.

Two-Step Estimator. The first estimator, called the two-step estimator, uses an identity matrix to weight the moment conditions so that $b_T^1$ is chosen to minimize

$$\left[\frac{1}{T}\sum_{t=1}^{T}\varphi(X_t, b)\right]' \left[\frac{1}{T}\sum_{t=1}^{T}\varphi(X_t, b)\right]. \tag{3}$$

Let $b_T^2$ denote the estimator obtained by minimizing (2).

Iterative Estimator. The second estimator continues from the two-step estimator by reestimating the matrix $V(\beta)$ using $V_T(b_T^{j-1})$ and constructing a new estimator $b_T^j$. This is repeated until $b_T^j$ converges or until $j$ attains some large value. Let $b_T^\infty$ denote this estimator.

Continuous-Updating Estimator. Instead of taking the weighting matrix as given in each step of the GMM estimation, we also consider an estimator in which the covariance matrix is continuously altered as $b$ is changed in the minimization. Formally, let $b_T^c$ be the minimizer of

$$\left[\frac{1}{T}\sum_{t=1}^{T}\varphi(X_t, b)\right]' [V_T(b)]^{-1} \left[\frac{1}{T}\sum_{t=1}^{T}\varphi(X_t, b)\right]. \tag{4}$$

Allowing the weighting matrix to vary with $b$ clearly alters the shape of the criterion function that is minimized. Although the first-order conditions for this minimization problem have an extra term relative to problems with fixed weighting matrices, this term does not distort the limiting distribution for the estimator. [See Pakes and Pollard (1989, pp. 1044-1046) for a more formal discussion and provision of sufficient conditions that justify this conclusion.] An advantage of this estimator relative to the previous two is that it is invariant to how the moment conditions are scaled even when parameter-dependent scale factors are introduced. A simple example of a continuous-updating estimator is a minimum chi-squared estimator used for restricted multinomial models in which the efficient distance matrix is constructed from the probabilities implied by the underlying parameters and hence is parameter dependent.

The three GMM estimators have antecedents in the classical simultaneous-equations literature. Consider estimating a single equation, say $y_t = \beta' x_t + u_t$, where $\beta$ is the parameter of interest. Let $z_t$ denote the vector of predetermined variables at time $t$ that by definition are orthogonal to $u_t$. One way to estimate $\beta$ is to use two-stage least squares, which is our two-step estimator under the additional restrictions that the disturbance term is conditionally homoscedastic and serially uncorrelated. In this case the iterative estimator converges after two steps and hence is the two-step estimator. It is well known that the two-stage least squares estimator is not invariant to normalization. In fact, Hillier (1990) criticized the two-stage least squares estimator by arguing that the object that is identified is the direction $[1, -\beta']$ but not its magnitude. Hillier then showed that the conventional two-stage least squares estimator of direction is distorted by its dependence on normalization. As an alternative, Sargan (1958) suggested an instrumental-variables-type estimator that minimizes

$$\frac{\left[\frac{1}{T}\sum_{t=1}^{T}(y_t - b'x_t)z_t'\right] \left[\frac{1}{T}\sum_{t=1}^{T} z_t z_t'\right]^{-1} \left[\frac{1}{T}\sum_{t=1}^{T}(y_t - b'x_t)z_t\right]}{\frac{1}{T}\sum_{t=1}^{T}(y_t - b'x_t)^2} \tag{5}$$

by choice of $b$. Under the additional restrictions imposed on the disturbance term discussed in the previous paragraph, this is our continuous-updating estimator. Notice that if we ignore the denominator term in (5) and minimize, the solution is the two-stage least squares estimator. By including the denominator term, Sargan showed that, for an appropriate choice of $z_t$, the solution is the (limited information) quasi-maximum likelihood estimator (using a Gaussian likelihood), which as an estimator of direction is invariant to normalization.
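To make the three definitions concrete, the following is a minimal sketch in Python with NumPy for a scalar-parameter linear instrumental-variables model. It is not the authors' code: the simulated data, the function names, the grid search used for the continuous-updating criterion, and the covariance estimator (which assumes serially uncorrelated moments) are all illustrative assumptions. The final lines check numerically that rescaling the moment conditions by a parameter-dependent factor leaves the continuous-updating criterion (4) unchanged.

```python
import numpy as np

def moments(b, y, x, z, scale=lambda b: 1.0):
    """T x n array whose rows are scale(b) * z_t * (y_t - b * x_t)."""
    return scale(b) * z * (y - b * x)[:, None]

def cov_inverse(b, y, x, z, scale=lambda b: 1.0):
    """Inverse of V_T(b) = (1/T) sum_t phi_t phi_t' (assumes no serial correlation)."""
    phi = moments(b, y, x, z, scale)
    return np.linalg.inv(phi.T @ phi / len(y))

def criterion(b, W, y, x, z, scale=lambda b: 1.0):
    """Quadratic form g_T(b)' W g_T(b) in the sample mean of the moments."""
    g = moments(b, y, x, z, scale).mean(axis=0)
    return float(g @ W @ g)

def linear_gmm(W, y, x, z):
    """Closed-form minimizer of the criterion for a fixed W (moments linear in b)."""
    szx, szy = z.T @ x / len(y), z.T @ y / len(y)
    return float(szx @ W @ szy) / float(szx @ W @ szx)

def two_step(y, x, z):
    b1 = linear_gmm(np.eye(z.shape[1]), y, x, z)          # identity weighting, as in (3)
    return linear_gmm(cov_inverse(b1, y, x, z), y, x, z)  # efficient weighting, as in (2)

def iterated(y, x, z, max_iter=500, tol=1e-12):
    b = two_step(y, x, z)
    for _ in range(max_iter):
        b_new = linear_gmm(cov_inverse(b, y, x, z), y, x, z)
        if abs(b_new - b) < tol:
            break
        b = b_new
    return b_new

def cu_criterion(b, y, x, z, scale=lambda b: 1.0):
    """Criterion (4): the weighting matrix is recomputed at every trial value b."""
    return criterion(b, cov_inverse(b, y, x, z, scale), y, x, z, scale)

def continuous_updating(y, x, z, grid):
    """Grid search over candidate values (an assumption made for simplicity)."""
    return float(grid[np.argmin([cu_criterion(b, y, x, z) for b in grid])])

# Hypothetical data: one endogenous regressor, three instruments, beta = 2.
rng = np.random.default_rng(0)
T, beta = 400, 2.0
z = rng.standard_normal((T, 3))
v = rng.standard_normal(T)
x = z @ np.array([1.0, 0.5, 0.5]) + v
u = 0.5 * v + rng.standard_normal(T)      # correlated with x, orthogonal to z
y = beta * x + u

b_two = two_step(y, x, z)
b_iter = iterated(y, x, z)
grid = np.append(np.linspace(1.0, 3.0, 2001), b_iter)
b_cu = continuous_updating(y, x, z, grid)

# Parameter-dependent rescaling of the moments leaves criterion (4) unchanged.
unscaled = cu_criterion(1.7, y, x, z)
rescaled = cu_criterion(1.7, y, x, z, scale=lambda b: 1.0 + b ** 2)
print(b_two, b_iter, b_cu, unscaled, rescaled)
```

Because the grid includes the iterated estimate, the minimized continuous-updating criterion in this sketch can be no larger than the criterion evaluated at the iterated estimate. A derivative-based optimizer would normally replace the grid search when the parameter vector has more than one or two components.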
The estimation environments that we study are more complicated than the one just described. Sometimes the moment conditions are not linear in the parameters, and the disturbance terms are often conditionally heteroscedastic and/or serially correlated. As a consequence, the two-step and iterative estimators no longer coincide and the continuous-updating estimator can no longer be interpreted as a quasi-likelihood estimator. The estimation methods remain limited information, however, in that the moment conditions used are typically not sufficient to fully characterize the time series evolution of the endogenous variables.

The second goal of our analysis is to compare the reliability of confidence regions computed using quadratic approximations to criterion functions to ones based directly on the deterioration of the original criterion functions. The former approach is more commonly used in the empirical asset-pricing literature partially because it is easier to implement. The latter approach exploits the chi-squared feature of the appropriately scaled criterion functions. From the vantage point of hypothesis testing, the plausibility of an observed deterioration of the criterion function caused by imposing parameter restrictions can be assessed by using the appropriate chi-squared distribution. Using this same insight, confidence regions can be computed by using the appropriate chi-squared distribution to prespecify some increment in the criterion function and inferring the set of parameter values that imply no more than that increment. Such confidence regions can have unusual shapes and, in fact, may not even be connected. One of the key questions of this investigation is whether or not they lead to more reliable statistical inferences.

One of our reasons for studying the performance of criterion-function-based inference comes from the work of Magdalinos (1994). Within the confines of the classical simultaneous-equations paradigm, Magdalinos studied the performance of alternative tests of instrument admissibility. As a result of his analysis, Magdalinos recommended altering the weighting matrix to embody the restrictions as is done in the continuous-updating method. In addition, he found that test statistics are better behaved using the limited-information maximum likelihood estimator than the two-stage least squares estimator. Recall that the former estimator coincides with our continuous-updating estimator and the latter with our two-step and iterated estimators in the classical simultaneous-equations estimation environment considered by Magdalinos.

Another reason is Nelson and Startz's (1990) criticism of the use of instrumental-variables methods for studying consumption-based asset-pricing models. These authors were concerned about the behavior of instrumental-variables estimators when the instruments are poorly correlated with the endogenous variables. Their arguments were based on analogies to results derived formally for t statistics and overidentifying restrictions tests in the classical simultaneous-equations setting. The question of interest to us is the extent to which criterion-function-based inference and continuous updating can help overcome the concerns of Nelson and Startz. Furthermore, Stock and Wright (1995) provided a theoretical rationale for considering the continuous-updating criterion function instead of other GMM implementations in situations in which the model is poorly identified.

The finite-sample properties of the two-step and iterative GMM estimators in an asset-pricing setting have been studied previously by Tauchen (1986), Kocherlakota (1990a), and Ferson and Foerster (1994). These investigators did not study the properties of the continuous-updating estimator, nor did they study the behavior of criterion-function-based confidence regions. Furthermore, Tauchen (1986) and Kocherlakota (1990a) considered only the case of time-separable preferences for the representative consumer.

2. MONTE CARLO ENVIRONMENT

We consider several Monte Carlo environments to assess the finite-sample properties of the estimators described in Section 1. The data-generating mechanisms are constructed to be consistent with a representative agent consumption-based capital asset-pricing model (CCAPM). Estimators of the parameters of the representative agent's utility function are considered along with tests of the overidentifying conditions implied by the model. We use the CCAPM as the basis of our experiments because GMM has been used extensively in studying this model (e.g., see Dunn and Singleton 1986; Eichenbaum and Hansen 1990; Epstein and Zin 1991; Ferson and Constantinides 1991; Hansen and Singleton 1982). Furthermore, the CCAPM forms the basis for two studies of the finite-sample properties of GMM conducted by Tauchen (1986) and Kocherlakota (1990a).

Preferences and Euler Equations. In the model the representative consumer is assumed to have preferences over consumption given by

$$U = E\sum_{t=0}^{\infty} \delta^t \, \frac{(c_t + \theta c_{t-1})^{1-\gamma} - 1}{1 - \gamma}, \qquad \gamma > 0, \tag{6}$$

where $c_t$ is consumption at date $t$. The parameter $\theta$ captures some time nonseparability in preferences. Examples of models with time nonseparability in preferences can be found in the work of Abel (1990), Constantinides (1990), Detemple and Zapatero (1991), Dunn and Singleton (1986), Eichenbaum and Hansen (1990), Gallant and Tauchen (1989), Heaton (1993, 1995), Novales (1990), Ryder and Heal (1973), and Sundaresan (1989). If $\theta > 0$, consumption is durable or substitutable over time. If $\theta = 0$, the preferences of the consumer are time-additive. If $\theta < 0$, consumption is complementary over time and the preferences of the representative consumer exhibit habit persistence.

We consider estimators of the parameters $\delta$, $\gamma$, and $\theta$, as well as tests of the model based on implications of the Euler equations. Let $\mathrm{mus}_t \equiv (c_t + \theta c_{t-1})^{-\gamma}$, which can be interpreted as the indirect marginal utility for consumption "services" as measured by $s_t = c_t + \theta c_{t-1}$. Similarly, let $\mathrm{muc}_t \equiv (c_t + \theta c_{t-1})^{-\gamma} + \theta\delta E[(c_{t+1} + \theta c_t)^{-\gamma} \mid \mathcal{F}_t]$, where $\mathcal{F}_t$ gives the information set at time $t$. The Euler equation for a representative agent's portfolio allocation decision is given by

$$\mathrm{muc}_t = E(\delta \, \mathrm{muc}_{t+1} R_{t+1} \mid \mathcal{F}_t), \tag{7}$$

where $R_{t+1}$ is a gross return on an asset from $t$ to $t + 1$. Because aggregate consumption is growing over time, we divided (7) by $\mathrm{mus}_t$ to induce stationarity. The (normalized) Euler equation that we consider is then given by

$$\frac{\mathrm{muc}_t}{\mathrm{mus}_t} = E\left(\delta \, \frac{\mathrm{muc}_{t+1}}{\mathrm{mus}_t} R_{t+1} \,\Big|\, \mathcal{F}_t\right). \tag{8}$$

Removing conditional expectations from (8) results in the Euler-equation error

$$\phi_{t+2}(\delta, \gamma, \theta) \equiv \frac{\mathrm{muc}^*_t}{\mathrm{mus}_t} - \delta \, \frac{\mathrm{muc}^*_{t+1}}{\mathrm{mus}_t} R_{t+1}, \tag{9}$$

where $\mathrm{muc}^*_t \equiv (c_t + \theta c_{t-1})^{-\gamma} + \theta\delta (c_{t+1} + \theta c_t)^{-\gamma}$. Note that $E(\phi_{t+2} \mid \mathcal{F}_t) = 0$. Furthermore, notice that, when $\theta$ is not 0, $\phi_{t+2}$ has a first-order moving average [MA(1)] structure.

By choosing instruments, $z_t$, in $\mathcal{F}_t$, unconditional moment conditions are given by

$$E[\varphi_{t+2}(\delta, \gamma, \theta)] = E[z_t \phi_{t+2}(\delta, \gamma, \theta)] = 0. \tag{10}$$

Finally, notice that $\phi_{t+2}$ can be expressed in terms of consumption ratios and returns, which we take to be stationary processes.

One unpleasant feature of these moment conditions is that they can always be made to be satisfied in a degenerate fashion. Suppose that $\gamma = 0$ and $\delta\theta = -1$; then clearly $\mathrm{muc}^*_t = 0$ and the moment conditions are trivially satisfied. Without imposing additional constraints on the parameter vectors, this degeneracy in the moment conditions creates problems for the two-step and iterative estimators. Of course, when time separability is imposed ($\theta = 0$), these problematic parameter values cannot be reached. When $\theta$ is permitted to be different from 0, Eichenbaum and Hansen (1990) were led to divide the moment conditions by $1 + \delta\theta$ so that the two-step and iterative estimators not be driven to the degenerate values. We will do likewise. An attractive property of the continuous-updating estimator is its insensitivity to parameter-dependent scale factors and hence to the moment transformation used by Eichenbaum and Hansen (1990). Moreover, the criterion of the continuous-updating estimator does not necessarily tend to 0 in the vicinity of $\gamma = 0$ and $\delta\theta = -1$. Although the estimated mean for $\{\varphi_{t+2}\}$ becomes small in the neighborhood of these parameter values, so does the estimated asymptotic covariance matrix, and the criterion function for the continuous-updated estimator plays off this tension.

We build several Monte Carlo environments to simulate returns and consumption growth that are consistent with (8). These are used to assess the finite-sample properties of the estimators of Section 1 based on the moment conditions (10).

2.1 Lognormal Model

Time-Additive Model, No Time Averaging. In our first Monte Carlo environment we model consumption growth and returns directly by assuming that they are jointly lognormally distributed as in the work of Hansen and Singleton (1983). Let $Y_t = [\log(c_t/c_{t-1}) \;\; \log R_t^s \;\; \log R_t^b]'$, where $c_t$ is aggregate consumption at time $t$, $R_t^s$ is the gross return on a stock index at time $t$, and $R_t^b$ is the gross return on a bond at time $t$. We assume that

$$Y_t = \mu + B(L)\varepsilon_t, \tag{11}$$

where $\varepsilon_t$ is a normally distributed three-dimensional random vector that is independent over time and has zero mean and covariance matrix $I$ and where $\mu$ is the mean of $Y_t$. Furthermore, $B(L)$ is a matrix of polynomials in the lag operator. To use this assumption about the dynamics of consumption and returns along with the Euler equation (7), we assume that the preferences of the representative agent are time additive. In this case the Euler-equation error for each return can be written as

$$-\gamma \log(c_{t+1}/c_t) + \log R_{t+1} - \kappa = \eta_{t+1}, \tag{12}$$

where $E(\eta_{t+1} \mid \mathcal{F}_t) = 0$ and $\kappa$ is a constant (e.g., see Hansen and Singleton 1983).

The relation (12) implies a set of restrictions on the law of motion (11). To impose these restrictions, we consider several finite-order parameterizations of $B(L)$ and use the methods described by Hansen and Sargent (1991). These parameterizations are of the form

$$B(L) = \frac{C(L)}{a(L)}, \tag{13}$$

where $C(L)$ is a $3 \times 3$ matrix of polynomials in the lag operator and $a(L)$ is a $1 \times 1$ polynomial in the lag operator. We restrict the polynomial $a(L)$ to be second order and considered several different orders of $C(L)$.

To estimate the constrained law of motion, we used monthly data from 1959.2 to 1992.12. Aggregate consumption is seasonally adjusted real aggregate consumption of nondurables plus services for the United States taken from CITIBASE. These data were converted to a per capita measure by dividing by total U.S. population for each month, obtained from CITIBASE. The equity return is the value-weighted return from the Center for Research in Security Prices (CRSP), and the bond return is the Fama-Bliss risk-free return from CRSP. Each of these return series was converted into a real return using the implicit price deflator for nondurables and services from CITIBASE.

For simplicity we removed the sample mean from the vector $Y_t$ so that the constants in (11) and (12) did not have to be estimated. As a result, the only preference parameter to be estimated is $\gamma$. The results of estimating the law of motion (11) using exact maximum likelihood for different orders of the polynomial $C(L)$ are given in Table 1. The column labeled "Unrestricted log-likelihood" reports the log-likelihood in the case of unrestricted estimation of the polynomials in the lag operator. The columns labeled "No time avg." report the log-likelihood and the estimated value of $\gamma$ under the restrictions implied by (12). In searching for the maximized log-likelihood, larger values of the log-likelihood were found for values of $\gamma$ larger than 50. The results reported in Table 1 for the constrained models correspond to local maxima. We used the local maximizers for our simulations because they result in plausible values for $\gamma$. Notice that there is substantial improvement in the log-likelihood in moving from a first- to a second-order polynomial for $C(L)$ in the unrestricted case. This indicates that more than a first-order polynomial is needed for $C(L)$. There is little improvement in going to a third-order polynomial in the unrestricted case. For both the second- and third-order polynomial cases, there is great deterioration in the log-likelihood when the model restrictions are imposed. This is consistent with the results reported by Hansen and Singleton (1983). Moreover, there is little improvement in the log-likelihood in moving from a second-order to a third-order $C(L)$. For this reason we used the point estimates from the restricted model with a second-order $C(L)$ to conduct our Monte Carlo experiments for the lognormal model with no time averaging.

Table 1. Maximum Likelihood Estimator of Monthly Law of Motion

                                  No time avg.               Time averaged
Order of   Unrestricted
C(L)       log-likelihood   Log-likelihood   γ        Log-likelihood   γ
1          3,214.9
2          3,237.3          3,218.1          4.55
3          3,238.8          3,220.6          4.21     3,230.3          4.99

In assessing the finite-sample properties of GMM estimators in this case, we constructed 500 Monte Carlo samples, each with a sample size of 400.
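As an illustration of the Euler-equation error in (9) and of the degenerate parameter values discussed earlier in this section, a minimal sketch in Python with NumPy follows. The simulated consumption and return series are placeholders (independent lognormal draws with made-up parameters), not the calibrated law of motion estimated in the article, and the function name `euler_errors` is hypothetical.

```python
import numpy as np

def euler_errors(delta, gamma, theta, c, R):
    """Euler-equation errors as in (9).

    c[t] is the consumption level at date t and R[t] is the gross return
    from t-1 to t; both are arrays of length T. Errors are returned for
    t = 1, ..., T-3 (dates are lost to the lag in s_t and the two leads).
    """
    s = c[1:] + theta * c[:-1]                  # services s_t, t = 1..T-1
    mus = s ** (-gamma)                         # mus_t
    muc = mus[:-1] + theta * delta * mus[1:]    # muc*_t, t = 1..T-2
    return muc[:-1] / mus[:-2] - delta * (muc[1:] / mus[:-2]) * R[2:-1]

# Placeholder data: sample size 400, as in the monthly experiments.
rng = np.random.default_rng(0)
T = 400
c = np.cumprod(np.exp(0.002 + 0.005 * rng.standard_normal(T)))
R = np.exp(0.005 + 0.04 * rng.standard_normal(T))

# Moderate habit persistence (delta, gamma, theta chosen for illustration).
errs = euler_errors(0.97, 1.3, -1.0 / 3.0, c, R)
scaled = errs / (1.0 + 0.97 * (-1.0 / 3.0))     # division by 1 + delta * theta

# Degenerate point: gamma = 0 and delta * theta = -1 make every error vanish.
degenerate = euler_errors(0.97, 0.0, -1.0 / 0.97, c, R)

# With theta = 0 the error reduces to 1 - delta * (c_{t+1}/c_t)^(-gamma) * R_{t+1}.
separable = euler_errors(0.97, 1.3, 0.0, c, R)
print(errs.mean(), scaled.mean(), np.abs(degenerate).max())
```

Multiplying these errors by instruments dated $t$ or earlier and averaging over the sample gives sample counterparts of the moment conditions (10).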
A sample size of 400 approximates the size of available monthly consumption data. We constructed moment conditions based on the Euler-equation errors for both the bond and the stock returns simultaneously. For each Euler-equation error we used one-period lagged (log) bond and (log) stock returns and one-period lagged (log) consumption growth as instruments. We did not include constants as instruments because the data are simulated under the assumption that they have a zero mean.

Time-Additive Model With Time Averaging. As a further data-generating mechanism, we also consider an example in which the decision interval of the representative agent is much smaller than the interval of the data. Suppose that there are $n$ decision periods within each observation period. For example, if the representative agent's decision interval is a week and data are observed monthly, then $n$ would be approximately 4. The representative agent's utility function at time $t$ is given by

$$U = E\sum_{h=0}^{\infty} (\delta^{1/n})^{h} \, \frac{(c_{t+h/n})^{1-\gamma} - 1}{1 - \gamma}. \tag{14}$$

If we maintain the assumption that consumption and returns are jointly lognormally distributed, the Euler-equation error $\eta_{t+1}$ is again given by (12). Moreover, this error can be decomposed as

$$\eta_{t+1} = \sum_{h=1}^{n} \xi_{t+h/n}, \tag{15}$$

where

$$\xi_{t+h/n} \equiv E(\eta_{t+1} \mid \mathcal{F}_{t+h/n}) - E(\eta_{t+1} \mid \mathcal{F}_{t+(h-1)/n}). \tag{16}$$

In this environment, we presume that observed consumption does not correspond to the actual point-in-time consumption of the representative agent but instead is an average of actual consumption over one unit of time. Specifically, suppose that observed consumption, $c_t^a$, is a geometric average of actual consumption,

$$c_t^a = \prod_{h=1}^{n} (c_{t-1+h/n})^{1/n}, \tag{17}$$

and similarly for the observed return, $R_t^a$. Averaging (12) over time implies that

$$-\gamma \log(c_{t+1}^a / c_t^a) + \log R_{t+1}^a - \kappa = \frac{1}{n}\sum_{\tau=1}^{n}\sum_{h=1}^{n} \xi_{t-1+(\tau/n)+(h/n)}. \tag{18}$$

Notice that in this case the Euler-equation error

$$\eta_{t+1}^a \equiv \frac{1}{n}\sum_{\tau=1}^{n}\sum_{h=1}^{n} \xi_{t-1+(\tau/n)+(h/n)} \tag{19}$$

is predictable at time $t$. However, $E(\eta_{t+1}^a \mid \mathcal{F}_{t-1}) = 0$ so that instruments can be chosen from the information set at time $t - 1$. An instrumental-variables estimator of this model must account for the MA(1) structure of the moment condition.

We estimated the log-linear law of motion (13) for consumption and asset returns under the restriction implied by (18). Because the consumption data are an arithmetic average of consumption expenditures over a period, in applying (18) to actual consumption data we assume that geometric averages and arithmetic averages are approximately the same. This assumption was made by Grossman, Melino, and Shiller (1987), Hall (1988), and Hansen and Singleton (1996). Note also that the returns in (18) are time averaged. For simplicity, in estimating the law of motion (13), we use the monthly CRSP series directly. Furthermore, the model with time averaging imposes a weaker set of restrictions than does the model that takes no account of time averaging because we do not impose the restriction on the first-order autocorrelation of the Euler-equation error of the stock return. In the limit case of continuous decision making, the first-order autocorrelation of the error should be .25, as discussed by Grossman et al. (1987) and Hall (1988).

We consider the case of a third-order polynomial for $C(L)$ and a second-order polynomial for $a(L)$. The results of this estimation are reported in Table 1 in the columns labeled "Time averaged." Notice that the log-likelihood function improves somewhat compared to the case in which time averaging is ignored. The model, however, is still substantially at odds with the data. The estimated value of $\gamma$ is slightly larger as well.

We used this model to create 500 Monte Carlo draws, each with a sample size of 400. As in the case of no time averaging, we studied estimators based on the Euler equations for both returns. In this case the instruments were (log) stock returns, (log) bond returns, and (log) consumption growth, all lagged two periods.

2.2 Discrete-State Models

In our second set of Monte Carlo environments we follow Tauchen (1986) and Kocherlakota (1990a) and consider a Markov-chain model for aggregate consumption and dividend growth. Aggregate consumption is assumed to represent the endowment of the representative consumer, and dividends represent the cash flow from holding stock. We form one-period stock returns and the returns to holding a one-period (real) discount bond. Each of these returns can be represented as functions of the state of the Markov chain. Construction of these returns in the case of time-additive utility was described by Kocherlakota (1990a) and Tauchen (1986).

Annual Model. To calibrate the first Markov chain, we used the method described by Tauchen and Hussey (1991) to approximate a first-order vector autoregression (VAR) for consumption and dividend growth. The parameters of the VAR are taken from Kocherlakota (1990a) and are given by

$$\begin{bmatrix} \ln \chi_t \\ \ln \lambda_t \end{bmatrix} = \begin{bmatrix} .004 \\ .021 \end{bmatrix} + \begin{bmatrix} .117 & .414 \\ .017 & .161 \end{bmatrix} \begin{bmatrix} \ln \chi_{t-1} \\ \ln \lambda_{t-1} \end{bmatrix} + \varepsilon_t, \tag{20}$$

where $\chi_t$ is the gross growth rate of real annual dividends on the Standard & Poor 500, where $\lambda_t$ is the gross growth rate of U.S. per capita real annual consumption, and where

$$E(\varepsilon_t \varepsilon_t') = \begin{bmatrix} .01400 & .00177 \\ .00177 & .00120 \end{bmatrix}. \tag{21}$$

Furthermore, $\varepsilon_t$ is assumed to be normally distributed and uncorrelated over time. The Markov chain for $[\chi_t, \lambda_t]'$ is chosen to have 16 states.

In simulating data from this model we chose several values of the preference parameters of the representative consumer. These are presented in Table 2. Preference setting TS1 was used by Tauchen (1986) and preference setting TS2 was used by Kocherlakota (1990a). As shown by Kocherlakota (1990a), these latter parameters, along with the Markov-chain model of endowments, imply first and second moments for asset returns that mimic their sample counterparts. The large value of $\delta$ in TS2 is not inconsistent with the existence of an equilibrium in the model because of the large value of $\gamma$ (Kocherlakota 1990b).

Table 2. Preference Settings for Discrete State-Space Model

$$U = E\sum_{t=0}^{\infty} \delta^t \, \frac{(c_t + \theta c_{t-1})^{1-\gamma} - 1}{1 - \gamma}$$

Preference case    δ        γ       θ
TS1                .97      1.3     0
TS2                1.139    13.7    0
TNS1               .97      1.3     1/3
TNS2               .97      1.3     -1/3
TNS3               .97      1.3     -2/3

We restrict our attention to "moderate" values of $\delta$ and $\gamma$ in our examination of time-nonseparable preferences, and we consider a range of values of $\theta$. Parameter setting TNS1 introduces a modest degree of durability by letting $\theta = 1/3$, and TNS2 introduces habit persistence with $\theta = -1/3$. TNS3 results in a more extreme amount of habit persistence by setting $\theta = -2/3$. The asymmetry (in magnitude) across the specifications of $\theta$ is guided in part by the a priori notion that there should only be a limited amount of durability in the goods classified as "nondurable" in National Income and Product Accounts and by empirical evidence for a substantial degree of habit persistence reported by Ferson and Constantinides (1991).

In implementing the estimators, the Euler equations for the stock and bond returns are multiplied by instrumental variables to construct moment conditions. The two instrument sets are listed in Table 3. Monte Carlo results for other moment conditions were given by Hansen, Heaton, and Yaron (1994). GMM estimates and test statistics were computed for 500 replications of a sample size of 100. This sample size corresponds approximately to the length of most annual datasets.

Table 3. Moment Conditions

Moment set    Instruments, z_t              Number of moment conditions
M1            R_t^s, R_t^b, λ_t, 1          8
M2            λ_t, 1                        4

Monthly Model. We repeated some of the experiments using a law of motion calibrated to postwar monthly data. As in the construction of the Markov chain for the annual model, we started with a first-order VAR for consumption and dividend growth. The consumption data used in estimating the VAR were aggregate U.S. expenditures on nondurables and services described in Subsection 2.1. We constructed dividends implied by the monthly CRSP value-weighted portfolio return. These dividends were converted to real dividends using the implicit price deflator for monthly nondurables and services taken from CITIBASE. This dividend series is highly seasonal because of the regular dividend payout policies of most companies. To avoid modeling this seasonality, we let $\chi_t = \log(d_t/d_{t-12})/12$, and we assumed that $\chi_t$ represents the one-period dividend growth of the model. The series $\{\log(d_t/d_{t-12})\}$ appears to be stationary.

The parameters of the VAR estimated using these data are given by

$$\begin{bmatrix} \ln \chi_t \\ \ln \lambda_t \end{bmatrix} = \begin{bmatrix} .0012 \\ .0019 \end{bmatrix} + \begin{bmatrix} -.1768 & .1941 \\ -.2150 & .0267 \end{bmatrix} \begin{bmatrix} \ln \chi_{t-1} \\ \ln \lambda_{t-1} \end{bmatrix} + \varepsilon_t, \tag{22}$$

where

$$E(\varepsilon_t \varepsilon_t') = \begin{bmatrix} .1438 & .0001 \\ .0001 & .0145 \end{bmatrix} \times 10^{-3}. \tag{23}$$

As in the case of the annual Markov-chain model, we approximated the VAR of (22) and (23) with a 16-state Markov chain using the methods of Tauchen and Hussey (1991). The Monte Carlo data consisted of 500 replications of a sample size of 400. We focused exclusively on the more "moderate" preference configuration (TS1), adjusting $\delta$ for the shorter sampling interval. (More precisely, we used the twelfth root of .97 in place of .97 for $\delta$.) In addition, we generated Monte Carlo data using nonseparable specifications TNS1 and TNS2, again with $\delta$ adjusted appropriately.

3. OVERVIEW OF MONTE CARLO RESULTS

3.1 Descriptive Statistics

In describing the results of the various Monte Carlo experiments in Sections 3 and 4, we focus most of our discussion on the following calculations:

1. We found the minimum value of the criterion function. Call this $J_T$. The limiting distribution of $TJ_T$ is chi-squared with degrees of freedom equal to the number of moment conditions minus the number of parameters estimated. We used this limiting distribution to test the overidentifying moment conditions.
2. We evaluated the criterion function at the true parameter vector. Call this $J_T^{T}$. The limiting distribution of $TJ_T^{T}$ is chi-squared with degrees of freedom equal to the number of moment conditions. With this limiting distribution, we characterize the family of parameter vectors that look plausible from the standpoint of the moment conditions. Stock and Wright (1995) advocated the use of this type of inference when it is suspected that the parameters are poorly identified. Here we examine whether the limiting distribution of $TJ_T^{T}$ provides a reasonable small-sample approximation for conducting this inference.
3. We found the minimum value of the criterion function when $\gamma$ is constrained to be its true value. Call this $J_T^{\gamma}$. Because $\gamma$ is the only parameter estimated in the lognormal model, in this case $J_T^{\gamma}$ coincides with $J_T^{T}$. The limiting distribution of $T(J_T^{\gamma} - J_T)$ is chi-squared with 1 df. This limiting distribution allows us to construct a confidence region for $\gamma$ based on the increments of the criterion function from its unconstrained minimum. By evaluating $T(J_T^{\gamma} - J_T)$ we determined whether the true value of $\gamma$ is in the resulting interval for alternative confidence levels.
4. We constructed the more standard confidence intervals for $\gamma$ based on a quadratic approximation to the criterion function. In particular, let $\gamma_T$ be an estimator of $\gamma$ and $\sigma_T$ be the estimated asymptotic standard error of the estimator $\gamma_T$. We study $T(\gamma_T - \gamma)^2/(\sigma_T)^2$, which has an asymptotic chi-squared distribution with 1 df. Notice that this object is just the Wald statistic for the hypothesis that the true value of the parameter is $\gamma$. In constructing this statistic for the continuous-updating estimator, the standard errors include a term that reflects the derivative of the GMM weighting matrix with respect to the parameters.

Our Monte Carlo calculations are greatly simplified by our knowledge of the true parameter vector. In empirical work, the corresponding computations would be more complicated. For instance, to construct a confidence interval for $\gamma$ based on the original criterion function, a researcher would have to characterize numerically the hypothetical values of this parameter that are consistent with a prespecified deterioration in the criterion while concentrating out all of the other parameters. When there are very few remaining components in the parameter vector (in our examples 0, 1, or 2), this concentration is tractable. This approach may become very difficult, however, when the parameter vector is large.

In reporting our Monte Carlo results we use one of the graphical methods advocated by Davidson and McKinnon (1994). For each Monte Carlo setup we computed the empirical distributions of the statistics and compared them to the corresponding chi-squared distributions. The results are plotted on a set of figures constructed as follows. For each probability value (depicted on the x axis), we computed the corresponding chi-squared critical value and the fraction of the actual computed statistics that are above that value (depicted on the y axis). Thus the 45-degree line (depicted as ...) is the appropriate reference for assessing the quality of the limiting distribution. Following Davidson and McKinnon (1994), these plots are referred to as p-value plots, and we present the results for the interval [0, .5] because this bounds the region of probability values used in most applications. Although we use probability values as our basis of comparison, confidence intervals at alternative significance levels can be assessed by simply subtracting the probability values from 1.

The figures are organized as follows. For each Monte Carlo setup we first consider a single figure with four graphs titled as follows: "(a) Minimized," "(b) True," "(c) Constrained-Minimized," and "(d) Wald," corresponding to the statistics $TJ_T$, $TJ_T^{T}$, $T(J_T^{\gamma} - J_T)$, and $T(\gamma_T - \gamma)^2/(\sigma_T)^2$, respectively. To provide a formal statistical measure of the distance between the empirical distributions and their theoretical counterparts, on each figure a band around the 45-degree line is plotted using dotted lines. This band is a 90% confidence region based on the Kolmogorov-Smirnov test. This states that the probability that the maximal difference between the empirical distribution and the theoretical one will lie within those lines is 90%. Maximal differences within these bands are not statistically significant at the 10% significance level. Although we present results for the interval [0, .5], the Kolmogorov-Smirnov confidence region is based on calculating the supremum between the empirical distribution and the 45-degree line over the region [0, 1].

In each graph, the dashed line gives the Monte Carlo results for the two-step estimator, the dot-dash line for the iterated estimator, and the solid line for the continuous-updating estimator. For the minimized criterion function results, there is a necessary ordering between the continuous-updating and the iterative estimator. When the iterative estimator converges, the value of the criterion function can also be obtained by the continuous-updating estimator. Because the continuous-updating estimator minimizes its criterion, this minimized value must be smaller than the criterion for the iterative estimator. As a result, the plot for the minimized criterion of the continuous-updating estimator must lie below the plot for the iterative estimator unless the iterative estimator fails to converge. There is no natural ordering between the results for the two-step estimator and the continuous-updating estimator or between the two-step estimator and the iterative estimator.

To complement our p-value plots, we also provide some results summarizing the performance of the implied parameter estimators. The finite-sample properties of the point estimates are of interest in their own right and in some cases provide additional insights into the behavior of the p-value plots.

3.2 Numerical Search Routines

The two-step and iterative estimators are given by the minimizers of the objective function (2), and the [...] ering an estimator of the form

$$V_T(b) = \frac{1}{T}\sum_{t=1}^{T}\varphi(X_t, b)\varphi(X_t, b)' + \frac{1}{T}\sum_{t=2}^{T}\cdots$$

[Figure: p-value plots with panels "(a) Minimized" and "(b) True"; both axes run from 0 to 0.5.]
0.1 Figure 1. 0.2 0.3 0.4 0.2 0.3 0.4 0.5 0 .. . 0.1 0.2 0.3 0.4 0.5 Criterion Functions, Monthly Lognormal Model, No Time Averaging: ---., Iterative;-- -, Two-Step; - , Continuous- Updating. continuous-updating estimator is given by the minimizer of the objective function (4). For some of our experiments, the estimators are given by solutions to linear equations. In the other cases we used the numerical optimization routines fminu.m and fmins.m, which are part of the "Optimization Toolbox" for use with MATLAB. The routine fminu.m implements a quasi-Newton method, which is dependent on an initial setting for the parameters. To check whether the results were sensitive to initialization, we considered several different starting values that included the true parameter vector. When this gradient method failed to converge or resulted in unusual estimates, we also used the routine fmins.m, which is a simplex search method. Details on these MATLAB programs can be found in the MATLAB Optimization Toolbox manual. In Section 4 we show that the continuous-updating criterion can make numerical search for the minimizer difficult. As a further check on our numerical results, when we obtained extreme parameter estimates we also examined the continuous-updating criterion over a grid of the parameters. This gave us additional assurance that the estimated parameters were indeed minimizers of the criterion. When (Xt, b) - 1/T ZT, •T(Xt,b). this estimator was not positive definite, we used an estimator proposed by Durbin (1960) (see also Eichenbaum, Hansen, and Singleton 1988). Durbin's estimator is obtained by first approximating the MA(1) model with a finite-order autoregression. The residuals from this autoregression are used to approximate the innovations. Then the parameters of the MA(1) model are estimated by running a regression of the original time series onto a one-period lag of the "approximate" innovations. 
Finally, an estimate of VT(b) is formed using the estimated MA coefficients and sample covariance matrix for the residuals. This procedure has the advantage that the finite-order MA structure of {ot } is imposed and the estimator is positive semidefinite by construction. It does, however, rely on the choice of a finite-order autoregression to use in the approximation. In implementing the estimator we ran a 12th-order autoregression in the initial stage. Although this covariance matrix estimator was used in searching for the parameter estimates, it was not needed at any of the converged parameter values. (a) Minimized / 0.5 0.4 (b) True 0.5 , 0.41 0.3 ' / / 0.2 0.3 / , -." 0.2 / 0.1 0.1 O 0 0.1 0.2 0.3 0.4 0.5 C0 0 0.1 - 0.3 0.4 0.5 0.5 ' 0.4 0.2 (d) Wald (c) Constrained-Minimized 0.5 0.4 / 0.3 " / 0.3' 0.2 . 0.2 0.1 4. (24) where pT(Xt, b) = / 0 + POT (Xt,b)OPT(Xt-l,b)']}, 0.5 (d) Wald (c) Constrained-Minimized 0.5 00 [(pT(Xt-1, b)(pT(Xt,b)' 0.1 MONTE CARLO RESULTS, LOGNORMALMODEL For the case of time-averaged data, {pt } has an MA(1) structure, as we discussed in Section 2.1. To account for this, the estimator of VT(b) was computed by first consid- 0 0.1 0.2 0.3 0.4 0.5 0 0.1 0.2 0.3 0.4 0.5 Figure2. CriterionFunctions,MonthlyLognormalModel,TimeAveraging:---.., Iterative; - - -, Two-Step;--• , Continuous-Updating. Journalof Business & EconomicStatistics,July 1996 270 Table4. Propertiesof Estimatorsof y, LognormalModel confidence intervals built from the Wald criteria, but the Continuous-updating Iterative Two-step distortion is substantially smaller than with the other two Property estimators. Finally, the coverage rates of the confidence reA. 
No timeaveraging,true7 = 4.55 gions implied by the true and constrained-minimized crite3.72 1.64 1.73 Median ria for the continuous-updating estimator accord well with Mean 1.81 2.06 7,171.17 the asymptotic distribution and are clearly better than the 4.47 1.81 2.06 Truncatedmean* -.23 10%quantile -.20 -3.67 coverage rates for the other estimators. 18.75 4.00 4.33 90% quantile These Monte Carlo results for the lognormal model support the following remedy for the concerns raised by NelB. Timeaveraging,true7 = 4.99 son and Startz (1990). From the standpoint of hypothe3.48 1.74 1.75 Median sis testing and confidence-interval construction, use of the -518.14 1.88 2.04 Mean criterion is much more reliable than continuous-updating 3.18 1.88 2.04 Truncatedmean* the other methods we study. The tests of the overidentifying -10.48 -.29 10%quantile -.66 restrictions based on the continuous-updating estimator do 90% quantile 11.52 4.22 4.85 * Estimateswithabsolutevaluesgreaterthan100 were excludedfromthe computation of the not reject too often and in fact are quite well approximated truncatedmeans. by the limiting distribution. Although confidence intervals based on the Wald criteria can be badly distorted, particCriterion Functions. Figures 1 and 2 report the prop- ularly for the two-step and iterative estimators, confidence erties of the criterion functions for the two Monte Carlo regions constructed from the continuous-updating criteria experiments. Figure 1 is for the case of no time averaging have coverage probabilities that are close to the ones imof the data, and Figure 2 is for the case of time averaging. plied by the asymptotic theory. This occurs for confidence Figures 1(d) and 2(d) report the results using the Wald (ap- sets based on both the constrained-minimized criterion in proximate quadratic) criteria. The results for the Wald and panel (c) and the true criterion in panel (b). 
The latter result constrained-minimized criteria are identical for the iterative supports the recommendation of Stock and Wright (1995) and two-step estimators. This occurs because the model is to base confidence intervals on the level of the continuouslinear in the parameters and the weighting matrix is fixed updating criterion. in constructing (J'r - JT). For the continuous-updating Parameter Estimates. Table 4 reports summaries of meaestimator, the results for the Wald and the constrainedminimized criteria are different due to the dependence of sures of central tendency for the three estimators of - along the weighting matrix on the hypothetical parameter values. with 10% and 90% quantiles. The medians for the two-step Notice that the small-sample distributions of the mini- and iterative estimators are considerably lower than the true mized criterion functions for the iterative and two-step es- value of -, whereas the median bias for the continuoustimators are greatly distorted. The small-sample tests of the updating estimator is much smaller. The distribution for the overidentifying restrictions based on the minimized crite- continuous-updating estimator, however, is also more disrion values are too large, leading to overrejections of the persed, as evidenced by the larger increment between the model when using these estimators. The minimized crite- 10% and 90% quantiles. Moreover, the Monte Carlo samrion function for the continuous-updatingestimator is much ple means for the continuous-updating estimator are much better behaved, and the small-sample distribution is very more severely distorted than they are for the other two esclose to being X2 for both Monte Carlo experiments. Tests timators. 
The enormous sample means for the continuousof the overidentifying restrictions of the models using the updating estimator occur because in the case of no time minimized value of the criterion function of the continuous- averaging and of time averaging there were 23 and 31 samupdating estimator have the correct size for the model with- ples, respectively, in which the estimates are, in absolute out time averaging and similarly for the model with time value, larger than 100. When these are removed from the averaging for probability values less than about .1. Even Monte Carlo samples, the sample means of the continuousfor probability values greater than .1, the distribution of the updating estimator are closer to the true values than are the minimized criterion for the continuous-updating estimator means for the other two estimators. Recall that the analog to the two-step and iterative estiis not greatly distorted. The finite-sample coverage probabilities of the three mator in the classical simultaneous-equations model is twoways of constructing confidence regions for 7 are depicted stage least squares and that the analog to the continuousin subplots (b), (c), and (d) in Figures 1 and 2. Recall that updating estimator is limited-information (quasi) maximum the constrained-minimized and Wald criteria coincide when likelihood. It is known from the literature that there are they are based on the two-step and iterative estimators but settings in which the two-stage least squares estimator has differ when the continuous-updating estimator is used. The finite moments, but the limited-information maximum likesmall-sample coverage probabilities are greatly distorted lihood estimator does not (e.g., see Mariano and Sawa 1972; for the intervals constructed with the iterative and two-step Sawa 1969). In light of these theoretical results and our estimators in all cases. 
In particular,they do not contain the Monte Carlo findings, the continuous-updating estimator true parametervalues as often as is to be expected from the is not an attractive alternative to the other estimators we limiting distribution. In the case of the continuous-updating consider if our bases of comparison are the (untruncated) estimator, for low p values coverage rates are too small for moment properties [or even relative squared errors as in Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators (b) Iterative (a) Continuous-Updating 1.5 1.5 0.5 0.5 -1.5 -1 0 Angle 1 .5 -1.5 (c) Two Step 2 1.5 -1 0 Angle 1 271 be imposed for identification,we restrictattentionto the interval[-ir/2, ir/2]. In smoothingthe histogram,we used Gaussiankernel with a bandwidthof .1. The value of the densityestimateis plottedat each of the samplepoints using a circle. The shapeof the smootheddistributionalong with the massof the plottedcirclesprovidesevidenceabout the small-sampledistributionof the parameterestimators. SC Notice that the primarymodes of the continuous-updating angle estimatorare very close to the trueparametervalues, but the modes of the other two angle estimatorsare distorted.Moreover,the densityestimatesfor the modalangle are largerfor the continuous-updating method.The Monte Carlo distributionsfor the continuous-updating angle estimatoralso, however,have secondarymodes near -r/2, estimatesof 7ywiththe correspondingto large-in-magnitude wrong sign. 0.5 CriterionFunction. The criterion Continuous-Updating function for the estimatorcan somecontinuous-updating 0 1 -1.5 -1 1.5 timesleadto extremeoutliersfor the minimizingvalueof y. Angle This occursin the two Monte Carloexperimentsfor some of EstimatedAngle Impliedby (1, Figure3. Smoothed Distribution of the trials, as we discussedpreviously.To see why this MonthlyLognormalModel,No TimeAveraging. 
"YT), can occur, supposefor simplicitythat there is a single return under consideration,no time averaging,and several the work of Zellner(1978)].On the otherhand,Anderson, instruments. The momentconditionsare constructedusing Kunitomo,and Sawa (1982) advocateduse of the limited informationestimatorover the two-stageleast squaresesti(26) b(Xt, g) - [log Rt+l - g log(ct+1/ct)]zt, matorbecause,amongotherthings,the medianbias of the formerestimatoris smaller.We also find less distortionin the mediansfor the continuous-updating estimatorin our where zt is a vector of instrumentsand E[I(Xt,7y)] = 0 [see Eq. (12)].Becausethe momentconditionsare linearin experiments. the criteriafor the iteratedand two-stepestimatorsare g, As we noted previously,one attractiveattributeof the estimatoris its invarianceto ad hoc quadraticin g. In contrast,the criterionfor the continuouscontinuous-updating of the momentcon- updatingestimatorconvergesas g gets large. To see this, (parameterdependent)transformations ditions. For example supposethat we reparameterize(12) observe that for a large value of g the sample average of b(Xt,g) is approximatelyg times the sample mean of as log(ct+l/ct)zt and the samplecovarianceis approximately + (25) = yolog(ct+l/ct) y11log Rt+l rt+1, (b) Iterative (a) Continuous-Updating 2 assumingthat = 0. In (25) the parametersyo and yi are 2 not uniquelyidentifiedby the momentconditions,whereas 1.5 in (12) allows identifi- 1.5 their ratio is. The parameterization cation of yo by setting -y = 1. The continuous-updating estimatorof y in (12) is invariantto how this identification is achievedvia restrictionson (25). 
Notice howeverthatthe 0.5 B 0.51 two-step and iterativeestimatorsare sensitive to the chosen normalization.Hillier (1990) achievedidentificationin 1 1 1.5 -1.5 -1 0 0 1.5 a differentmannerby makingthe directionfrom the origin -1.5 -1 Angle Angle definedby (Y70, 71) in two-dimensionalspace the object of (c) Two Step interest.Althoughthis angle is identified,the lengthof the ray from the origin along which the true value of (Y70, 71) 1.5 is not. Hillier's defense of limited informationmaximum likelihoodover conventionaltwo-stageleast squaresis that the formeris a betterestimatorof direction. To see whether such a conclusion might well extend 0.5 to comparisonsbetween the continuous-updating estimator and the other two estimatorswe consider,we report smootheddistributionsof the estimated"direction"in FigAngle ures 3 and 4. We measuredirectionby the angle (as measuredin radians)betweenthe horizontalaxis and the point of EstimatedAngle Impliedby (1, Figure4. Smoothed Distribution (1, 7). Becausethereis still a sign normalizationthatmust "/r), Monthly Lognormal Model, Time Averaging. 272 Journalof Business & EconomicStatistics,July 1996 Criterion with"Large" Estimateof Gamma 120 100 80 0 60 S40 0 20 -20 -15 -10 -5 0 5 10 15 20 10 15 20 g AverageCriterion 80 S60 " 40020 0 -20 -15 -10 -5 0 5 g Figure5. CriterionFunctionforContinuous-Updating Estimator,MonthlyLognormalModel,No TimeAveraging. times the sample covarianceof log(ct+l/ct)zt. There- the parametervectoris of largedimension,however,implefore, for large g, the criterionfunction is approximately mentingthe continuous-updating estimatormay sometimes a quadraticform that is (1/T) times the chi-squaredtest be difficult. statisticfor the null hypothesisthat Estimationof y and 6. The Monte Carloenvironments (27) of the lognormaland Markov-chainmodels differ in the E[log(ct+1/ct)zt] = 0. 
As a result it is possible for the minimizedcriterionfor assumedlaw of motion for consumptionand returnsand the continuous-updating criterionto be minimizedat a very in the numberof estimatedparameters.To allow a more direct comparisonwith the Markov-chainresults, we now large value of g., To furtherillustratethis potentialproblem,the upperplot considera MonteCarloexperimentwith the log-linearlaw in Figure5 is of the criterionfunctionfor the continuous- of motion,whereboth y and 6 are estimated.This second updating estimator for a Monte Carlo draw in which parameterallows us to check whetherthe favorableperestimatorreportedin the value of g that minimizes the criterion function is formanceof the continuous-updating 2 and is to the 1 The Figures lower in unique 5 the single-parameter setup. 673,750.4. plot Figure gives average To implement this case, we assume that 6 = .971/12 and value of the criterionfunction over the Monte Carlo experiments.These results are for the case of no time aver- that the meanof the (log of) consumptiongrowthis equal aging. Notice that in the upperplot the criterionfunction to its samplemean.These parametervalues,alongwith the approachesits lowest value as g becomeslargein absolute model for B(L) in (13), imply a set of restrictionson the value.Even in the lower plot the criterionfunctionasymp- unconditionalmeanof Yt given in (11). MonteCarlosamtotes to a local minimumfor largenegativevaluesof g. The ples of Yt were drawnunderthese restrictions.To estimate numericalsearchused to implementthe estimatorcouldbe 6 and y we used momentconditionsMl of Table3. The resultsof this MonteCarloexperimentaredisplayed complicatedby the flatsectionsof the criterionfunctionand the searchroutinecould end up spuriouslysearchingin the in Figure 6. As in Figures 1 and 2, rejectionrates of the directionof very largevaluesof g. 
This is particularlytrue overidentifyingrestrictionsin the case of the continuousfor a gradient-based method(suchas fminu.min MATLAB) updatingestimatorare similar to those predictedby the in which the routineattemptsto set the gradientof the cri- asymptoticdistributions,at least for probabilityvalues of terion functionto 0. When the parametervector is of low less than.2. Moreover,as before,the small-sampledistribudimension,this problemcan easily be assessedby gridding tions of the minimizedcriteriaof the two-stepanditerative the parametervectorandevaluatingthe criterionfunctionat estimatorsaregreatlydistorted.In this case,however,a test the gridpoints.This is whatwe did to makesurethatlarge basedon the minimizedcriterionof the two-stepestimator estimatesof g were not due to numericalproblems.When would underrejectthe restrictionsof the model. Coverage 92 Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators (a) Minimized - 0.5 (b) True 0.5 . 0.4 0.4 0.3 / .0.3 0.2 0.1 0.2 - 0.1 : 00 0.1 0.2 0.3 0.4 00 0.5 0.1 (c) Constrained-Minimized 0.5 . 0.4 0.5 0.3 0.3 0.2 . 0.1 0.1 0 0.1 0.2 0.3 0.4 o00? 0.5 . ?" 0 0.1 0.2 0.3 0.4 0.5 Figure 6. CriterionFunctions,TimeAdditiveModel,6 and -7 Estimated:I---. , Iterative;- - -, Two-Step; , Continuous-Updating. probabilitiesof the truevaluesof both6 and-ybasedon the "true"criterionaccordwell withthe asymptoticdistribution estimator.The coverwhen using the continuous-updating age probabilitiesfor the othertwo estimators,however,are substantiallyat odds with the theory. 0.5 0.5 0.4 Results for y = 1.3 and 6 = .97. We start by discussing the results obtainedusing the more "moderate" values of the preferenceparametersTS1 that were used by Tauchen(1986). Figure 7 displays the results for the first set of eight momentconditionsMl. 
Note from panel (b) that the GMMcriterionfunctionsevaluatedat the true parametervalues are largerthan predictedby the asymp(b) True 0.5 0.5 0.4 0.4 0.3 0.3 0.4 - 0.3 First we considerthe results for time-separablepreferences (0 = 0) using the Markov-chainmodel calibratedto annualdata. For these runs VT(b)is computedas a simple covariancematrixestimator.The resultingp-valueplots are depictedin Figures 7-9 for differentcombinationsof the preferencesettingsand momentsets given in Tables2 and 3. Featuresof the Monte Carlo distributionsfor the point estimatesof -yare reportedin Table5. (a) Minimized (b) True (a) Minimized Referring now to panel (c), the coverage rates for the true-constrained criterionare somewhattoo small for the continuous-updatingestimator.The coverage probabilities are much more seriously distorted,however, for the two-step and iterativeestimatorsjust as in Figures 1 and 2. Moreover,confidenceintervalsbased on the trueestimator constrainedcriterionfor the continuous-updating are a little better behavedthan those based on the Wald statisticfor all threeestimators.To summarize,the results displayedin Figure6 are largelyconsistentwith the results in which-yis the only parameterestimated. 5. MONTECARLORESULTS, MARKOV-CHAIN MODELS 5.1 Time-SeparablePreferences,AnnualData 0.5 0.4 0 0.3 (d) Wald 0.4! 0.2 0.2 273 0.3 .- 0.2 0.2 0.2 0' , 0.1 00 0.1 0.2 0.3 0.4 0.5 00 0 0.1 . 0.4 - 0.3 ..' 0.2 0.1 0.1 0 0.1 0.2 0.3 0.4 0.5 - 0.2 0.3 0.4 0.5 0 0 0.1 0.2 0.3 0.4 0.5 0.4 0.5 (d) Wald 0.5 0.5 0.4 0.4 0.3 0.3 0.2 0.2 / ' - 0.1 00 " 0.1 1.3, 8 = O, Moment Conditions MI:--.-., , Continuous-Updating. 0.1 ' (c) Constrained- Minimized 0.2 0.3 0.4 Iterative;- - -, Two-Step; . '. /0.1-/..:. 0 0.5 Model,6 = .97, Figure7 CriterionFunctions,AnnualMarkov-Chain 7= 0 0.5 0.5 0.2 0 0.4 0.4 .' 0.3 0.3 (d) Wald (c) Constrained- Minimized 0.5 0.2 . 
.,I 0.1 0.1 0.1 0.2 , 0.1 0.2 0.3 0.4 0.5 0 0.1 0.2 0.3 Model,6 = .97 Figure8 CriterionFunctionsAnnualMarkov-Chain 7= - 1.3, = O, Moment Conditions M2: ----., , Continuous-Updating. Iterative;- - -, Two-Step; Journalof Business & EconomicStatistics,July 1996 274 (a) Minimized 0.5 (b) True 0.5 0.4 0.4 . 0.3 0.3 0.2 0.2 / 0.1 0.1 0' 0 0.1 0.2 0.3 0.4 0.5 00 , / 0.1 0.2 0.3 - Minimized (c) Constrained 0.5 0.3 / A' 0.2 , 0.4 /. 0.3 0.5 (d) Wald 0.5 0.4 0.4 / 0.2 . 0.1 0.1 00 0.1 0.2 0.3 0.4 0.5 0" 0 0.1 0.2 0.3 0.4 0.5 Model, S = Figure 9. CriterionFunctions,Annual Markov-Chain 1.139, -y = 13.7, = 0, Moment ConditionsM2: I---., iterative; - - -, Two-Step; , Continuous-Updating. totic theory.Althoughthe asymptoticapproximationsare betterwith the continuous-updating estimator,at least for smallerprobabilityvalues,the criterionevaluatedat the true parametersis still too large. On the other hand,the minimizedcriteriafunctionsaremuchbetterbehaved,especially for the continuous-updating estimatorwith probabilityvalues less than.15 [see Fig. 7(a)].Confidenceintervalsbased Table5. Propertiesof Estimatorsof 9y, Markov-Chain Model,AnnualData Property Continuous-updating Iterative Two-step A. TS1-M1, true -y = 1.3, 6 = .97 Mean Truncatedmean* Median 10%quantile 90% quantile 1.90 1.90 1.15 0.61 5.76 .85 .85 .70 .13 1.60 1.08 1.08 .78 -1.25 4.09 1.35 1.35 1.10 -1.35 3.51 1.35 1.35 1.25 -2.71 5.59 B. TS1-M2,true y = 1.3, 6 = .97 Mean Truncatedmean* Median 10%quantile 90%quantile 1.36 1.53 1.17 .61 4.10 Results for y -- 13.7 and 6 = 1.139. C. 
TS2-M2,truey = 13.7,6 = 1.139 Mean Truncatedmean* Median 10%quantile 90% quantile 6.29 14.80 14.14 8.96 22.68 13.68 13.68 13.10 9.01 19.40 on criteria-function behaviorperformedpoorly in this setting, althoughthey performedbetter for the continuousupdatingestimatorthanfor the othertwo estimators.Moreconfidenceintervalsfor over, the criterion-function-based the continuous-updating estimatorprovedto be more reliable than the confidenceintervalsbased on the Wald statistic. Figure 8 reportsplots for the set of four momentconditions M2. The reductionin momentconditions(relative to M ) leads to an improvementin the underlyingcentral limit approximationsdepictedin panel (b). This is especriterionfunction cially true for the continuous-updating exceptat largeprobabilityvalues.The minimizedcriterionfunctionvaluesused to test the overidentifyingrestrictions behaveas predictedby the asymptotictheoryfor all three estimators.The criterion-function-based confidenceintervals work quite well for the continuous-updating estimator but not for confidenceintervalsbasedon the Waldstatistic [comparepanels(c) and (d)]. In summary,asymptotictheory gives a poor guide for statisticalinferencein the case of momentconditionsM1. Presumablythis occursbecauseof the manymomentconditionsrelativeto the samplesize. For the smallermoment conditionset M2, the asymptoticapproximations workconsiderablybetter.Forbothmomentconditionsthe asymptotic distributionsfor the overidentifyingrestrictionstests and criterion-function-based inferenceare more reliablewhen basedon the continuous-updating estimator. Recall that the minimized value of the continuousupdatingcriterionis alwaysless thanor equalto thatof the limitingcriterionof the iterativeestimator.Tauchen(1986) reportedcasesin whichthe minimizedcriterionfunctionfor the iterativeestimatorled to underrejection of the overidenIn restrictions. 
these the cases of the tifying underrejection restrictions more was when usoveridentifying pronounced the criterion function. See Hansen ing continuous-updating et al. (1994) for an analysisof one of these cases. Of course,assessingthe reliabilityof the asymptotictheory as appliedto the differentparameterestimatorsis a differentquestionthanassessingthe performanceof the parameterestimatorsthemselves.In regardto this latterquestion, the resultsin the firstportionof Table5 showthatthe estimatorhas considerablyless bias in continuous-updating the medianthanthe othertwo estimatorsin the case of M1. The iterativeestimator,however,has much less dispersion in this case as measuredby the widthbetweenthe .10 and .90 quantiles.In the case of M2, thereis little medianbias for all three estimators.On the otherhand,the dispersion in the continuous-updating estimatoris smallerthanthatof the othertwo estimators. 14.10 14.10 13.43 9.85 19.54 * Estimateswithabsolutevaluesgreaterthan100 wereexcludedfromthe computation of the truncatedmeans. Next we con- sider resultsusing the preferencespecificationconsidered by Kocherlakota(1990a) (TS2 in Table 1). We only consider the performanceof GMM estimatorsobtainedusing momentconditionsM2. Ourresultsare displayedin Figure 9 and Table 5. With this change in parameterconfiguration, the resultsfor the M2 momentconditionsare similar to those in Figure 8. In regardto the parameterestimates, Hansen, Heaton,and Yaron:Finite-SamplePropertiesof Some AlternativeGMMEstimators (a) Minimized 0.5 275 (a) Minimized (b) True 0.5 0.4 0.4 0.3 0.3 0.2 0 0 0.1 0.2 0.3 0.4 0.5 - Minimized (c) Constrained 0.5 0.4 0.5 0.1 0.2 0.3 0.4 0.5 7 0.1 0.2 = 1.3, 0 = 0, Moment Conditions Ml: ---. Two-Step;- 0 0 0.1 0.2 0.3 0.4 0.5 0.3 0 0 0.1 0.2 0.3 0.4 0.5 , Iterative;- --, , Continuous-Updating. again the median bias is small relative to the dispersion for all three estimators.The continuous-updating estimator displayssomewhatmore dispersionthanthe other two estimators. 
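The flattening of the continuous-updating criterion for large g, described around (26) and (27) in Section 4, can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are simulated with arbitrary parameters (not the paper's calibration), there are two instruments (a constant and one random variable), the weighting matrix is taken to be the inverse of the demeaned sample covariance of the moment vector, and all function names are hypothetical. It evaluates the criterion on a grid of g values, in the spirit of the grid check described in Section 3.2, and compares the value at very large |g| with (1/T) times the chi-squared statistic for (27).

```python
import random


def moments(log_r, log_c, z, g):
    """phi_t(g) = (log R_{t+1} - g * log(c_{t+1}/c_t)) * z_t for each period t."""
    return [[(r - g * c) * zi for zi in zt] for r, c, zt in zip(log_r, log_c, z)]


def mean_and_cov(vectors):
    """Sample mean and demeaned sample covariance of 2-dimensional vectors."""
    n = len(vectors)
    m = [sum(v[i] for v in vectors) / n for i in range(2)]
    cov = [[sum((v[i] - m[i]) * (v[j] - m[j]) for v in vectors) / n
            for j in range(2)] for i in range(2)]
    return m, cov


def quad_form(m, cov):
    """m' cov^{-1} m, using the closed-form inverse of a symmetric 2x2 matrix."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    return (cov[1][1] * m[0] ** 2 - 2.0 * cov[0][1] * m[0] * m[1]
            + cov[0][0] * m[1] ** 2) / det


def cu_criterion(log_r, log_c, z, g):
    """Continuous-updating criterion: the weighting matrix is recomputed at g."""
    return quad_form(*mean_and_cov(moments(log_r, log_c, z, g)))


def large_g_limit(log_c, z):
    """(1/T) times the chi-squared statistic for E[log(c_{t+1}/c_t) z_t] = 0."""
    return quad_form(*mean_and_cov([[c * zi for zi in zt]
                                    for c, zt in zip(log_c, z)]))


# Simulated data (assumed values, for illustration only).
rng = random.Random(0)
T = 200
log_c = [0.002 + 0.005 * rng.gauss(0.0, 1.0) for _ in range(T)]
log_r = [4.55 * c + 0.04 * rng.gauss(0.0, 1.0) for c in log_c]
z = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(T)]

# Grid evaluation of the criterion, including very large |g|.
grid = [-1e6, -20.0, -10.0, 0.0, 5.0, 10.0, 20.0, 1e6]
criterion_on_grid = [cu_criterion(log_r, log_c, z, g) for g in grid]
limit = large_g_limit(log_c, z)

for g, value in zip(grid, criterion_on_grid):
    print(f"g = {g:>12.1f}  criterion = {value:.6f}")
print(f"large-|g| limit = {limit:.6f}")
```

Because the criterion levels off at the same value in both directions, a gradient-based search that wanders into a flat region has little information to pull it back, which is why a grid evaluation of this kind is a useful cross-check when the parameter vector is small.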
5.2 Time-Separable Preferences, Monthly Data

To explore the extent to which the limiting distribution provides a better guide for inference for larger sample sizes (with less extreme data points), we redid some of our calculations using simulations calibrated to monthly data as described in Subsection 2.2. We focused exclusively on the more "moderate" preference configuration adjusting $\delta$ accordingly. In this case we only looked at estimators constructed using moment conditions M1 and M2. We are particularly interested in moment set M1 because of its common use in practice when analyzing postwar data. Our results are reported in Figures 10 and 11 and Table 6.

Notice that all of the asymptotic approximations are consistently reliable for the continuous-updating estimation method. In sharp contrast, large-sample inferences for the two-step estimator are of particularly poor quality with the exception of the overidentifying restrictions test using M2. Moreover, it is of note that the iterative estimates and the continuous-updating estimates are very close to one another when M2 is used. This is reflected in quantiles reported in Table 6 as well as in the "Minimized" and "Constrained-Minimized" graphs. Presumably, the reason for this is that the weighting matrix tends to be a relatively "flat" function of the parameters.

In regard to the parameter estimates of $\gamma$, both the continuous-updating estimator and the iterative estimators have distributions that are much more concentrated around the true parameter value than the distributions for the two-step estimator (again see Table 6). The performance of the two-step estimator could potentially be improved by using a different weighting matrix in the first step. For example, the residuals from nonlinear two-stage least squares applied to each Euler equation could be used to estimate the asymptotic covariance matrix of the moment conditions. Another possibility is to use the covariance matrix of the prices of the "synthetic" securities implicit in the use of instrumental variables. See Hansen and Jagannathan (1993) for a discussion of this weighting matrix.

These particular Monte Carlo experiments are ones in which the unconditional moment restrictions provide much more identifying information about the power parameter $\gamma$ than any of the other experiments we report on. Not only are estimates more accurate than the estimates obtained from the Monte Carlo experiments calibrated to annual data, but they are also more accurate than the estimates from the lognormal Monte Carlo experiments reported in Section 4 that have the same sample size. Presumably the main reason for this latter disparity is that the Markov-chain models are calibrated to dividend behavior rather than return behavior. As is well known from the empirical asset-pricing literature, the dividend calibrations imply returns that are less volatile than historical time series because of some fundamental model misspecification.

Figure 10. Criterion Functions, Monthly Markov-Chain Model, $\delta = .97^{1/12}$, $\gamma = 1.3$, $\theta = 0$, Moment Conditions M1: dot-dash line, Iterative; dashed line, Two-Step; solid line, Continuous-Updating. Panels: (a) Minimized, (b) True, (c) Constrained-Minimized, (d) Wald.

Figure 11. Criterion Functions, Monthly Markov-Chain Model, $\delta = .97^{1/12}$, $\gamma = 1.3$, $\theta = 0$, Moment Conditions M2: dot-dash line, Iterative; dashed line, Two-Step; solid line, Continuous-Updating. Panels: (a) Minimized, (b) True, (c) Constrained-Minimized, (d) Wald.

Table 6. Properties of Estimators of $\gamma$, Markov-Chain Model, Monthly Data: True $\gamma$ = 1.3, $\delta = .97^{1/12}$

Property            Continuous-updating    Iterative    Two-step

A. TS1-M1
Mean                1.37                   1.25         1.57
Truncated mean*     1.37                   1.25         1.57
Median              1.28                   1.18         1.17
10% quantile        1.02                   .96          -2.12
90% quantile        1.85                   1.61         4.63

B. TS1-M2
Mean                1.38                   1.38         1.84
Truncated mean*     1.38                   1.38         1.84
Median              1.27                   1.27         1.27
10% quantile        1.00                   1.00         -2.95
90% quantile        1.89                   1.89         6.50

* Estimates with absolute values greater than 100 were excluded from the computation of the truncated means.

5.3 Time-Nonseparable Preferences, Annual Data

As we discussed in Subsection 5.1, the continuous-updating estimator generally provides more reliable inference in the case of time-separable preferences when data are generated from the annual Markov-chain model. Even for that estimator, however, it is only when moment conditions M2 are used that the distributions of the criteria $T J_T$ (minimized), $T J_T^{0}$ (true), and $T(J_T^{\gamma} - J_T)$ (constrained-minimized) accord well with the corresponding chi-squared distributions. For these reasons we consider results with time-nonseparable preferences using only moment conditions M2 and only for the continuous-updating estimator. To construct our Monte Carlo datasets we used the three time-nonseparable settings of the parameters listed in Table 2 as TNS1, TNS2, and TNS3 ($\theta$ = , $\theta$ = - , $\theta$ = - ). We also used data generated with $\theta = 0$ but still estimated with the parameter $\theta$. Unlike the case of time-separable preferences when the restriction $\theta = 0$ is imposed in the estimation, the estimator $V_T(b)$ of the asymptotic covariance matrix accommodates an MA(1) structure in the Euler-equation errors. As in Section 4, we used the $V_T(b)$ estimator given in (24) except when it was not positive definite, in which case we shifted to Durbin's (1960) estimator with a fourth-order autoregression.

Figure 12 presents the p-value plots for the criterion functions for the different settings for $\theta$. The p-value plots of Figures 1, 2, 6, and 7-11 considered results for fixed preference parameters and the three different estimators. The same Monte Carlo data were used for the experiments for each estimator in these plots.

In contrast to the time-separable case (the solid line in Fig. 8), the distributions of the minimized criteria imply small-sample overrejection of the moment conditions for each of the settings for $\theta$. Furthermore, even when evaluated at the true parameters, the criterion functions are not distributed as a chi-square. This occurs for all four settings of $\theta$ including $\theta = 0$ (time-separable preferences). Evidently the estimator of the asymptotic covariance matrix of the moment conditions, which assumes an MA(1) structure for the errors, causes small-sample distortion of the GMM criterion function. Notice that the distributions of the Wald statistics are very far from being chi-squared.

Figure 12. Criterion Functions, Annual Markov-Chain Model, $\delta = .97$, $\gamma = 1.3$, Time Nonseparable, Moment Conditions M2; the line styles distinguish the settings of $\theta$. Panels: Minimized, True, Constrained-Minimized, Wald.

Table 7. Properties of Estimators of $\gamma$ and $\theta$, Markov-Chain Model, Annual Data: Continuous-Updating Estimator, True $\gamma$ = 1.3, $\delta$ = .97

Property            $\gamma$       $\theta$

A. TNS1-M2, true $\theta$ =
Mean                4.56           5.99 x 10^5
Truncated mean*     4.56           .83
Median              1.71           .52
10% quantile        .64            .10
90% quantile        10.83          1.25

B. TS1-M2, true $\theta$ = 0
Mean                5.88           -.34
Truncated mean*     3.67           -.34
Median              .73            -.12
10% quantile        .00            -.92
90% quantile        10.92          .18

C. TNS2-M2, true $\theta$ = -
Mean                3.05           -.46
Truncated mean*     2.69           -.46
Median              .97            -.42
10% quantile        .01            -.93
90% quantile        8.28           -.01

D. TNS3-M2, true $\theta$ =
Mean                .41            .07
Truncated mean*     2.16           -.64
Median              1.22           -.69
10% quantile        .04            -.94
90% quantile        5.79           -.40

* Estimates with absolute values greater than 100 were excluded from the computation of the truncated means.

Consistent with the results reported in Section 4 and Subsections 5.1

equation errors is close to being singular at this value of $\theta$. Once again we used the estimator of $V(b)$ given by (24). When Durbin's (1960) estimator was necessary we used a twelfth-order autoregression. Figure 13 reports the results for criterion functions using moment conditions M1.
Under Ml, the minimizedcriterion performsreasonablywell for all three settings of 0. Tests of the overidentifyingrestrictionsof the model have the correct small-samplesize in this case. Notice further that the criterionT(Jd - JT) (constrained-minimized) is close to being chi-squareddistributedbut that the distribution of the Wald statistic is very far from chi-squared. Hence, we continueto find that inferencesbased directly on criterionfunctions are much more reliablethan those based on quadraticapproximationsto the criterionfuncTable 8. Propertiesof Estimatorsof 7yand 6, Markov-Chain Model, 0.2 0.2 0.1.. 0.1 C Monthly Data: Continuous-Updating Estimator, True 7 = 1.3, 6 = .971/12 Property 0 0.1 0.2 0.3 0.4 0.5 0 0= ;-, 3' 0 A. TNS1-M1, true 0 = 1 0.1 0.2 0.3 0.4 0.5 Model,6= Figure 13. CriterionFunctions,MonthlyMarkov-Chain .971/12, 7 = 1.3, Time Nonseparable, Moment Conditions Ml: ---. = 3. = 0;-, and 5.2, confidenceintervalsfor 7 constructedusing the criterionfunctionperformmuchbetterthanthose basedon the Waldstatistic. Table 7 reportsstatisticssummarizingthe propertiesof the estimatorsof -y and 0. The estimatorof -y does not in generalperformas well as in the time-separablepreference case (see Table5 for the comparison).The medianfor 7T is substantiallybelow the truevalue of -yin the case of 0 = 0 1 and above for 0 = 1. and 0 Furthermore, the disof is of the estimators 7 considerablylargerthan persion is when time separability correctlyimposed,at least in part due to having to estimate an additionalparameterand to accommodateMA(1) termsin the estimatorof the asymptotic covariancematrixV(b). Regardingthe estimatorsof 0, thereis substantialdispersionfor each case as evidenced by the 10% and 90% quantiles.Notice furtherthat there is some medianbias in the estimatorsof 0 for the cases of 0 = g,0, and - (panelsA, B, and C of Table 7). In summary,the annualdatado not permitsimultaneousestimation of 0 and 7 with any reasonableprecision,at least for momentconditionM2. 
5.4 Time-Nonseparable Preferences, Monthly Data Using the monthlyMarkov-chainmodel, we also exam- ined the time-nonseparable model for 0 = g, 0, and Mean Truncatedmean* Median 10%quantile 90% quantile 13.56 2.90 1.27 .79 7.35 .35 .35 .34 .22 .48 B. TS1-M1,true0 = 0 16.68 4.18 1.29 .38 13.46 -.01 -.01 .01 -.23 .19 C. TNS2-M1,true6 = -1 6.03 5.82 1.24 .00 17.95 -.42 -.42 -.35 -.98 .09 Mean Truncatedmean* Median 10%quantile 90%quantile D. TNS1-M2,true = 32 153.96 11.09 1.45 .48 410.29 .66 .66 .37 .16 1.16 E. TNS1-M2,true6 = 0 Mean 62.51 17.89 Truncatedmean* Median 1.40 10%quantile .03 142.79 90%quantile -.14 -.14 .04 -.66 .26 Mean Truncatedmean* Median 10%quantile 90% quantile Mean Truncatedmean* Median 10%quantile 90% quantile 1 andfor momentconditionsM2. We furtherconsideredmomentconditionsMl1becausetheseconditionsareoftenused in practice and because the continuous-updating estimareasonablesmall-samplepropertiesunder torsdemonstrated time-separablepreferences.We did not considerthe case of 0 = - with the monthly model because the Markovchain model implies that the covariance matrix of the Euler- F TNS2-M2,true = -• Mean Truncatedmean* Median 10%quantile 90% quantile 12.98 12.78 .35 .01 40.99 -.40 -.40 -.58 -.91 .17 * Estimateswithabsolutevaluesgreaterthan100 wereexcludedfromthe computation of the truncatedmeans. Journalof Business & EconomicStatistics,July 1996 278 True Minimized 0.5 0.5 0.4 0.4- 0.3 0.3 0.2 0.2 r 0.1 0.1 00 0.2 0.1 0.3 0.4 0.5 0 0 0.1 Constrained- Minimized 0.5 0.4 0.4 ' . o0 0.4 0.5 0.4 0.5 0.3 0.2 .2 0.2 0.1 0.3 Wald 0.5 1 0.3 0.2 ... 0.1 0.1 . 0.2 0.3 0.4 0.5 o0 0.1 0.2 0.3 Model,6 = Figure 14. CriterionFunctions,MonthlyMarkov-Chain .971/12, y = 1.3, Time Nonseparable, Moment Conditions M2: ---. 1 0= = "3'. 0O= ', " , ' 0;-, tions. Finally,we reportsummarystatisticsof the central tendencyof the estimatorsof -yand0 in Table8, panelsA, B, andC. 
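The row labels used in Tables 6-8 (Mean, Truncated mean, Median, 10% quantile, 90% quantile) can be illustrated with a short calculation. The following is a minimal sketch, not the authors' code: the function name `summarize`, the sample values, and the "inclusive" quantile interpolation rule are assumptions made for illustration. Only the truncation rule (dropping estimates whose absolute value exceeds 100) comes from the table footnote.

```python
import statistics

def summarize(estimates, cutoff=100.0):
    """Summary statistics of Monte Carlo estimates, in the style of Tables 6-8.

    The truncated mean drops estimates with absolute value above `cutoff`,
    following the table footnote. The quantile interpolation rule is an
    assumption; the text does not state which convention was used.
    """
    kept = [e for e in estimates if abs(e) <= cutoff]
    deciles = statistics.quantiles(estimates, n=10, method="inclusive")
    return {
        "mean": statistics.fmean(estimates),
        "truncated_mean": statistics.fmean(kept),
        "median": statistics.median(estimates),
        "q10": deciles[0],    # first decile cut point
        "q90": deciles[-1],   # ninth decile cut point
    }

# Hypothetical draws: four estimates near the true value and one extreme one.
draws = [1.1, 1.2, 1.3, 1.4, 250.0]
stats = summarize(draws)
print(stats)
```

With these hypothetical draws, the single extreme estimate dominates the mean but is excluded from the truncated mean and has no effect on the median. The same pattern appears in the table panels where the mean and the truncated mean differ sharply.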
Lack of prior knowledge of the parameter θ again causes the estimator of γ to be much less precise as measured by the distance between the 90% and 10% quantiles. (For instance, compare the first column of Table 6 to panel B of Table 8.)

Figure 14 presents the p-value plots for moment conditions M2. In this case the distributions of the minimized criterion functions and the constrained-minimized criteria are not chi-squared. The model's overidentifying conditions are rejected too often for all of the parameter settings, and confidence intervals for the parameter γ have the wrong coverage probabilities. With moment conditions M2, allowing for time nonseparability results in a substantial number of very large estimates of γ as reflected by the size of the 90% quantiles (see Table 8, panels D, E, and F). It appears that the addition of returns as instruments (the difference between moment conditions M1 and M2) improves the performance of the estimator and the quality of the central limit approximations.

6. CONCLUDING REMARKS

In this article we examined the finite-sample properties of three alternative GMM estimators that differ in the way in which the moment conditions are weighted. Particular attention was paid both to the performance of tests of overidentifying restrictions and to comparing alternative ways of constructing confidence sets. In documenting finite-sample properties, we used several different specifications of the CCAPM. The experiments differed substantially in the amount of sample information that there is about the parameters of interest. Although the experiments do not uniformly support the conclusion that one estimator dominates the others, some interesting patterns emerged.

1. Continuous updating in conjunction with criterion-function-based inference often performed better than other methods for annual data; however, the large-sample approximations are still not very reliable.

2. In monthly data the central limit approximations for the continuous-updating estimation method applied in conjunction with the criterion-function-based method of inference performed well in most of our experiments, including ones in which the parameters are estimated very accurately and ones in which there is a substantial amount of dispersion in the estimates.

3. Confidence intervals constructed using quadratic approximations to the criterion function performed very poorly in many of our experiments.

4. The continuous-updating estimator typically had less median bias than the other estimators, but the Monte Carlo sample distributions for this estimator sometimes had much fatter tails.

5. The tests for overidentifying restrictions are, by construction, more conservative when the weighting matrix is continuously updated, and in many cases this led to a more reliable test statistic.

Our reason for exploring criterion-function-based inferences and continuous updating is to assess some simple ways of making GMM inferences more reliable. Moreover, when continuous updating is used in conjunction with criterion-function-based inferences, the large-sample approximations become invariant to parameter-dependent transformations of the moment conditions. In this article we have made no attempt to explore the ramifications for power of the resulting statistical tests. Moreover, from the standpoint of obtaining point estimates, we see no particular advantage to using continuous updating when minimizing GMM criterion functions. For example, continuous updating can indeed fatten the tails of the distributions of the estimators. In this sense continuous updating sometimes inherits the defects of maximum likelihood estimators relative to two-stage least squares estimators in the classical simultaneous-equations environment.

Our Monte Carlo experiments for monthly data were sufficiently successful to convince us to reexamine some of the empirical evidence for the CCAPM. In most tests of the CCAPM, the model's overidentifying conditions are rejected (e.g., see Hansen and Singleton 1982).
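The invariance to parameter-dependent transformations mentioned in the concluding remarks can be illustrated numerically. The sketch below is a toy example under stated assumptions, not the design used in the article: it takes a linear model with one parameter and two instruments, simulated data, a weighting matrix equal to the uncentered sample second moment of the moment conditions (no serial-correlation correction), and a hypothetical rescaling 1 + β² of the moment conditions. All function and variable names are illustrative.

```python
import math
import random

def moments(beta, data, scale):
    """Per-observation moments g_t(beta) = scale(beta) * z_t * (y_t - beta * x_t)."""
    s = scale(beta)
    return [(s * z1 * (y - beta * x), s * z2 * (y - beta * x))
            for y, x, z1, z2 in data]

def criterion(beta, data, scale, weight_beta=None):
    """T * gbar(beta)' W^{-1} gbar(beta) for two moment conditions.

    weight_beta=None re-evaluates W at beta (continuous updating);
    otherwise W is held fixed at the preliminary value weight_beta.
    """
    g = moments(beta, data, scale)
    T = len(g)
    m1 = sum(a for a, _ in g) / T
    m2 = sum(b for _, b in g) / T
    gw = g if weight_beta is None else moments(weight_beta, data, scale)
    w11 = sum(a * a for a, _ in gw) / T
    w12 = sum(a * b for a, b in gw) / T
    w22 = sum(b * b for _, b in gw) / T
    det = w11 * w22 - w12 * w12
    # Quadratic form using the explicit inverse of the 2x2 matrix W.
    return T * (w22 * m1 * m1 - 2.0 * w12 * m1 * m2 + w11 * m2 * m2) / det

def unit(beta):
    return 1.0

def rescale(beta):
    return 1.0 + beta * beta  # hypothetical parameter-dependent scaling

# Hypothetical simulated data with an endogenous regressor x.
rng = random.Random(0)
data = []
for _ in range(200):
    z1, z2, v, e = (rng.gauss(0.0, 1.0) for _ in range(4))
    x = z1 + 0.5 * z2 + v
    y = 1.3 * x + 0.5 * v + e
    data.append((y, x, z1, z2))

cu_plain = criterion(2.0, data, unit)
cu_scaled = criterion(2.0, data, rescale)
fixed_plain = criterion(2.0, data, unit, weight_beta=1.0)
fixed_scaled = criterion(2.0, data, rescale, weight_beta=1.0)
print(math.isclose(cu_plain, cu_scaled, rel_tol=1e-9))
print(fixed_scaled / fixed_plain)
```

In this sketch the continuously updated criterion is unchanged by the rescaling because the scale factor enters the sample mean and the weighting matrix at the same parameter value and cancels. With the weighting matrix fixed at a preliminary value, the criterion is multiplied by the squared ratio of the scale factors at the two parameter values.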
Because the two-step or iterative estimator is typically used in practice, one potential explanation for these rejections could be the tendency of these estimators to result in overrejection of the model in small samples. To assess this possibility we estimated the time-separable and time-nonseparable models using the continuous-updating estimator. We used the consumption and return data described in Subsection 2.1 along with moment conditions M1 given in Table 3.

Figure 15. GMM Criterion Functions, Monthly Data, γ Constrained, Moment Conditions M1. Top panel: Time-Separable Preferences, Delta Estimated. Bottom panel: Time-Nonseparable Preferences, Delta and Theta Estimated. Horizontal axis in both panels: Gamma.

Estimation of the time-separable model resulted in point estimates of δ and γ of .25 and 720.65, respectively. This is an example in which the tail behavior of the criterion results in a large estimated value of γ. The minimized GMM criterion was 5.94 with an implied p value of .43; hence, it appears that the continuous-updating estimator implies that the model is not at odds with the data. The estimated parameters, however, are very far from being economically plausible. As we found in several of our Monte Carlo experiments, extreme point estimates of the parameters are possible with the continuous-updating estimator. In those cases, however, there typically was little deterioration in the criterion function when evaluated near the true parameter values, so in practice it is important to evaluate the criterion function at plausible values of the parameters. In this case we restricted γ to the range [0, 20] and estimated δ for each hypothetical value of γ. The resulting minimized criterion as a function of γ is plotted in the top panel of Figure 15. Notice that for this range of γ the minimized criterion function is well above 30, where the implied p value is essentially 0.
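The p values quoted in this section for the empirical application (.43 for a criterion of 5.94, and .035 for a criterion of 13.55) can be checked with a short calculation. The sketch below is illustrative only. It assumes six degrees of freedom, a value that is not restated in this section and is chosen here because it reproduces both quoted p values. It uses the closed form of the chi-squared upper tail for even degrees of freedom, so the helper `chi2_sf_even` (a hypothetical name) does not cover odd degrees of freedom.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variate with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i   # (x/2)^i / i!, built up recursively
        total += term
    return math.exp(-half) * total

ASSUMED_DF = 6  # assumption: not stated in this section

p_unrestricted = chi2_sf_even(5.94, ASSUMED_DF)
p_at_gamma_20 = chi2_sf_even(13.55, ASSUMED_DF)
# The tail probability rises with df, so df = 8 gives an upper bound for
# any df up to 8 when the criterion is at least 30.
p_above_30 = chi2_sf_even(30.0, 8)
print(round(p_unrestricted, 3), round(p_at_gamma_20, 3), p_above_30)
```

Under this assumption the first two values agree with the quoted .43 and .035. The bound for a criterion of 30 is a few parts in ten thousand, consistent with the statement that the implied p value is essentially 0.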
As a result, once a plausible set of parameters is considered, the model is still rejected when using the continuous-updating estimator.

In estimation of the time-nonseparable model, the point estimates of the parameters were also quite implausible, with estimates of δ, γ, and θ of 1.20, 267.96, and .32, respectively. The bottom panel of Figure 15 presents the criterion function for the continuous-updating estimator with γ restricted to the range [0, 20]. At γ = 20 the criterion reaches a minimum of 13.55 with an implied p value of .035. As a result, even at this extreme value for γ the model is still substantially at odds with the data.

In summary, although the continuous-updating estimator does not save the CCAPM, the experiments that we have presented provide evidence that it should be of use in many GMM estimation environments.

ACKNOWLEDGMENTS

We thank Renee Adams, Wen Fang Liu, Alexander Monge, Marc Roston, George Tauchen, and an anonymous referee for helpful comments and Marc Roston for assisting with the computations. Financial support from the National Science Foundation (Hansen and Heaton) and the Sloan Foundation (Heaton and Yaron) is gratefully acknowledged. Additional computer resources were provided by the Social Sciences and Public Policy Computing Center. Part of this research was conducted while Heaton was a National Fellow at the Hoover Institution.

[Received January 1995. Revised July 1995.]

REFERENCES

Abel, A. (1990), "Asset Prices Under Habit Formation and Catching Up With the Joneses," A.E.R. Papers and Proceedings, 80, 38-42.
Anderson, T. W., Kunitomo, N., and Sawa, T. (1982), "Evaluation of the Distribution Function of the Limited Information Maximum Likelihood Estimator," Econometrica, 50, 1009-1028.
Constantinides, G. (1990), "Habit Formation: A Resolution of the Equity Premium Puzzle," Journal of Political Economy, 98, 519-543.
Davidson, R., and MacKinnon, J. G. (1994), "Graphical Methods for Investigating the Size and Power of Hypothesis Tests," Discussion Paper 903, Queen's University, Dept. of Economics.
Detemple, J., and Zapatero, F. (1991), "Asset Prices in an Exchange Economy With Habit Formation," Econometrica, 59, 1633-1657.
Dunn, K., and Singleton, K. (1986), "Modeling the Term Structure of Interest Rates Under Nonseparable Utility and Durability of Goods," Journal of Financial Economics, 17, 27-55.
Durbin, J. (1960), "The Fitting of Time Series Models," Review of the International Statistical Institute, 28, 233-244.
Eichenbaum, M., and Hansen, L. (1990), "Estimating Models With Intertemporal Substitution Using Aggregate Time Series Data," Journal of Business & Economic Statistics, 8, 53-69.
Eichenbaum, M., Hansen, L., and Singleton, K. (1988), "A Time Series Analysis of a Representative Agent Model of Consumption and Leisure Choice Under Uncertainty," Quarterly Journal of Economics, 103, 51-78.
Epstein, L., and Zin, S. (1991), "Substitution, Risk Aversion, and Temporal Behavior of Consumption and Asset Returns: An Empirical Analysis," Journal of Political Economy, 99, 263-286.
Ferson, W., and Constantinides, G. (1991), "Habit Formation and Durability in Aggregate Consumption: Empirical Tests," Journal of Financial Economics, 29, 199-240.
Ferson, W., and Foerster, S. (1994), "Finite Sample Properties of the Generalized Method of Moments in Tests of Conditional Asset Pricing Models," Journal of Financial Economics, 36, 29-55.
Gallant, A., and Tauchen, G. (1989), "Seminonparametric Estimation of Conditionally Constrained Heterogeneous Processes," Econometrica, 57, 1091-1120.
Grossman, S. J., Melino, A., and Shiller, R. (1987), "Estimating the Continuous-Time Consumption-Based Asset-Pricing Model," Journal of Business & Economic Statistics, 5, 315-327.
Hall, R. (1988), "Intertemporal Substitution in Consumption," Journal of Political Economy, 96, 339-357.
Hansen, L., Heaton, J., and Yaron, A. (1994), "Finite Sample Properties of Some Alternative GMM Estimators," Working Paper 3728-94-EFA, Massachusetts Institute of Technology, Sloan School of Management.
Hansen, L., and Jagannathan, R. (1993), "Assessing Specification Errors in Stochastic Discount Factor Models," unpublished manuscript.
Hansen, L., and Sargent, T. (1991), "Exact Linear Rational Expectations Models," in Rational Expectations Econometrics, eds. L. Hansen and T. Sargent, Boulder: Westview Press, pp. 45-75.
Hansen, L., and Singleton, K. (1982), "Generalized Instrumental Variable Estimation of Nonlinear Rational Expectations Models," Econometrica, 50, 1269-1286.
Hansen, L., and Singleton, K. (1983), "Stochastic Consumption, Risk Aversion, and the Temporal Behavior of Asset Returns," Journal of Political Economy, 91, 249-265.
Hansen, L., and Singleton, K. (1996), "Efficient Estimation of Linear Asset Pricing Models With Moving-Average Errors," Journal of Business & Economic Statistics, 14, 53-68.
Heaton, J. (1993), "The Interaction Between Time-Nonseparable Preferences and Time Aggregation," Econometrica, 61, 353-385.
Heaton, J. (1995), "An Empirical Investigation of Asset Pricing With Time Nonseparable Preferences," Econometrica, 63, 681-717.
Hillier, G. (1990), "On the Normalization of Structural Equations: Properties of Direction Estimators," Econometrica, 58, 1181-1194.
Kocherlakota, N. R. (1990a), "On Tests of Representative Consumer Asset Pricing Models," Journal of Monetary Economics, 26, 285-304.
Kocherlakota, N. R. (1990b), "A Note on the 'Discount' Factor in Growth Economies," Journal of Monetary Economics, 25, 43-48.
Magdalinos, A. M. (1994), "Testing Instrument Admissibility: Some Refined Asymptotic Results," Econometrica, 62, 373-405.
Mariano, R. S., and Sawa, T. (1972), "The Exact Finite-Sample Distribution of the Limited-Information Maximum Likelihood Estimator in the Case of Two Included Endogenous Variables," Journal of the American Statistical Association, 67, 159-163.
Nelson, C. R., and Startz, R. (1990), "The Distribution of the Instrumental Variables Estimator and Its t-Ratio When the Instrument Is a Poor One," Journal of Business, 63, 125-164.
Novales, A. (1990), "Solving Nonlinear Rational Expectations Models: A Stochastic Equilibrium Model of Interest Rates," Econometrica, 58, 93-111.
Pakes, A., and Pollard, D. (1989), "Simulation and the Asymptotics of Optimization Estimators," Econometrica, 57, 1027-1057.
Ryder, H. E., Jr., and Heal, G. M. (1973), "Optimal Growth With Intertemporally Dependent Preferences," Review of Economic Studies, 40, 1-31.
Sargan, J. (1958), "The Estimation of Economic Relationships Using Instrumental Variables," Econometrica, 26, 393-415.
Sawa, T. (1969), "The Exact Sampling Distribution of Ordinary Least Squares and Two-Stage Least Squares Estimator," Journal of the American Statistical Association, 64, 923-937.
Stock, J., and Wright, J. (1995), "Asymptotics for GMM Estimators With Weak Instruments," unpublished manuscript, Harvard University, Kennedy School.
Sundaresan, S. (1989), "Intertemporally Dependent Preferences and the Volatility of Consumption and Wealth," Review of Financial Studies, 2, 73-89.
Tauchen, G. (1986), "Statistical Properties of Generalized Method-of-Moments Estimators of Structural Parameters Obtained From Financial Market Data," Journal of Business & Economic Statistics, 4, 397-425.
Tauchen, G., and Hussey, R. (1991), "Quadrature-Based Methods for Obtaining Approximate Solutions to Nonlinear Asset Pricing Models," Econometrica, 59, 371-396.
Zellner, A. (1978), "Estimation of Functions of Population Means and Regression Coefficients Including Structural Coefficients; A Minimum Expected Loss (MELO) Approach," Journal of Econometrics, 8, 127-158.