Generalized Estimating Equations (GEE) Siljander Ilona GEE provides a regression approach for GLMs when the responses are not independent THE MODEL IS USED IN u Longitudinal/ repeated measures studies u Clustered/ multilevel studies u Estimating Equation gives an approach to model the dataset parameters in order to predict such as mean for instance. u u It is based on maximum likelihood calculations which assumes mutual independent observations. GEE gives an approach on how to find “maximum likelihood estimators” for datasets, where the measurements/observations are not mutually independent. Estimating equations for normal linear model u The density for the normal distribution ( f y µ, σ 2 u ) "$ ( y − µ )2 &$ = exp #− ' 2 2 $% 2σ $( 2πσ 1 The likelihood function 2& "$ 1 $ y − µ ( ) i 2 2 L (µ, σ y1,..., yn ) = ∏ exp #− ln ( 2πσ ) − '= 2 2σ $( $% 2 i=1 2 -& "$ n * $ y − µ 1 ( ) i 2 /' exp #∑,− ln ( 2πσ ) − 2 2σ /.$( $% i=1 ,+ 2 n Estimating equations for normal linear model (continued) u Given a panel i ,the task is to model the expected value µi of the measured quantity, with some of the individual characteristics, for example biomarkers (vector xi ). u The linear combination of the covariates is called the linear predictor ηi = xi β g (µi ) = ηi = xi β u u Here the function g identifies whether the model is linear, log-linear etc. This is naturally related to the range restriction of the model. For instance if g ( x ) = ln ( x ) indicates that µ = exp (η ) > 0 which is natural if the expected value is always expected positive. Estimating equations for normal linear model (continued) u For the aforementioned modelling purposes we now subtitute likelihood function: µ = xi β in the 2& "$ 1 $ y − x β ( ) i i 2 2 L ( β, σ X, y1,..., yn ) = ∑ #− ln ( 2πσ ) − ' 2 2σ $ 2 i=1 % ($ n u By differentiation we obtain a sort of ”maximum likelihood estimate” for beta and sigma. In this case, these are the estimating equations. n 1$& ∂δ 4 (& 1 3% 6 = ∑ xij 2 ( yi − xi β )) &* j=1,..., p 6 3&' ∂β j i=1 σ 3 6 2 n + 3 ∂δ yi − xi β ) . 6 1 ( 0 6 3 2 = −∑-- 2 + 4 0 2σ 32 ∂σ i=1 , 2σ / 65 ( p+1) x1 = [ 0 ]( p+1) x1 Estimating equations for GLM u n Let’s consider a more general case with i panels consisting of i observations. Each parameter it is modeled with a regression model g (µit ) = xit β . We can also replace the normal density with a more general family of exponential distributions µ "$ y θ − b (θ ) &$ it it it exp # − c ( yit , φ )' a (φ ) $% $( u Here E ( yit ) = b' (θ it ) = µit , V yit = b'' θ it a φ and c is a normalizing constant to guarantee that the integral of the density = 1. ( ) ( ) ( ) Estimating equations for GLM (continued) u With this model the estimating equation will be 0)+ n ni 3 -+ " % y − µ ∂ µ 5 = [0] ψ ( β ) = 2*∑∑ it it $ ' x jit . px1 21+, i=1 t=1 a (φ ) V (µit ) # ∂η &it +/ 54 j=1,..., p px1 ∂µ u = 1 , but Observe that previously, in the case of normal linear model we had ∂ η in general it needs to be included in the formula. u Using a matrix notation, the above can be rewritten as /( n 2 ! y − µ $,* " % * −1 ∂ µ 4 = [0] ψ ( β ) = 1)∑ x Tji D $ '!"V (µi )#$ ## i i &&px1 1*+ i=1 4 ∂η & a (φ ) %*. # " 0 j=1,..., p 3 px1 u where u Here the identity matrix I corresponds to the assumption that the observations are mutually independent. 1/2 1/2 V (µi ) = "$ D (V (µit )) Ι (ni xni ) D (V (µit )) %' # &ni xni Generalized estimating equations u Our aim is to model the expected value with mutually dependent observations. Formally, we may do this by replacing the identity matrix with the corresponding correllation matrix R(alpha), where alpha is a parameter vector used to represent the correllation data: 1/2 1/2 V (µi ) = !# D (V (µit )) R (α )(n xn ) D (V (µit )) $& " %ni xni i i u Plugging this into the aforementioned estimating equations, we obtain the generalized estimating equations: (. n * " y − µ %20 " % 0 −1 ∂ µ 6 = [0] ψ ( β ) = 5/∑ x Tji D $ '()V (µi )*+ $$ i i ''3 px1 501 i=1 6 ∂η & a (φ ) &04 # # ) j=1,..., p + px1 u While the derivation of the estimating equations, we explained, is not valid anymore due to not having independent covariates, it turns out that the equations can still be justified (Liang, Zeger 1986). By solving the equations we find the coefficient matrix beta which models the expected value µit via g (µit ) = xit β Basic ideas of GEE u Objective: u u u Fit a model to repeated categorical responses, that is correlated and clustered responses Variables: u A response variable Yit can be either continuos or categorical (i for subjects and t for different time points) u X = ( x1, x2 ,..., xk ) as a set of explanatory variables which can be discrete, continous, or a combination. ( xi as in nixk matrix of covariates) Model u As said before, its form is like GLM, but full specification of the joint distribution is not required, and thus no likelihood function: g (µi ) = xiT β Basic ideas of GEE (continued) u Random component: u u Systematic component: u u A linear predictor of any combination of continuous and discrete variables Link function: u u Any distribution of the response that we can use for GLM, e.g., binomial, multinomial, normal, etc. Can be any g (µi ) , e.g., identity, log, logit, etc. Covariance structure: u Correlated data are modeled using the same link function and linear predictor setup (systematic component) as in the independence case, but the covariance structure of the correlated responses must also be specified and modeled Basic ideas of GEE (continued) u Assumptions: u The responses Y1,Y2 ,...,Ynare correlated or clustered, i.e., cases don’t need to be independent. u The homogeneity of variances does not need to be satisfied. u Errors can be correlated. u It uses quasi-likelihood estimation rather than maximum likelihood estimation (MLE) or ordinary least squares (OLS) to estimate the parameters, but at times these will coincide. u Covariance specification. These are typically four or more correlation structures that we assume apriori: 1. Independence: Correlation between time points is independent 2. Exchangable: All measurements on the same unit are equally correlated 3. Autoregressive order 1 (AR1): Correlation depends on time or distance between measurements 4. Unstructured: No assumptions about the correlation Basic ideas of GEE (continued) u Parameter Estimation: u GEE estimates of model parameters are valid even if the covariance is mis-specified, because they depend on the first moment, e.g. mean. However, if the correlation structure is misspecified, the standard errors are not good, and some adjustments based on the data are needed to get more appropriate standard errors. Basic ideas of GEE (continued) u Model Fit: u We don’t test for the model of the GEE, because this is really an estimating procedure; there is no likelihood function. u We look at the empirical estimates of the standard errors and the covariance. u Compare the empirical estimates with the model-based estimates. u Overdispersion Basic ideas of GEE (continued) u u Advantages: u Computationally more simple than MLE for categorical data. u Does not require multivariate distribution. u Consistent estimation even with mis-specified correlation structure. Limitations: u There is no likelihood function since the GEE does not specify completely the joint distribution. u Likelihood-based methods are not available for testing fit, comparing models, and conducting inferences about parameters. u Empirical based standard errors underestimate the true ones, unless very large sample size. Choosing the best model u GLM u u AIC GEE u QIC Example with R u Data ”Epilepsy” from a clinical trial of 59 epileptics. u For a baseline, patients were observed for 8 weeks and the number of seizures recorded. u The patients were then randomized to treatment by the drug Progabide (31 patients) or to the placebo group (28 patients). u They were observed for four 2-week periods and the number of seizures were recorded. u Primary question: Does Progabide reduce the rate of seizures? R code for the example # load the 'faraway' package: library(faraway) # Loading the 'gee' package: library(gee) # loading the 'epilepsy' dataset: data(epilepsy) # attaching the 'epilepsy' dataset: attach(epilepsy) epilepsy[1:10,] with(epilepsy,by(seizures/timeadj,list(treat,expind),mean)) # Plotting the square root of seizure counts against period # treatment group = solid, placebo group = dashed y <- matrix(epilepsy$seizures, nrow=5) matplot(1:4, sqrt(y[-1,]),type='l', lty=epilepsy$treat[5*(1:59)]+1, xlab="Period", ylab="Sqrt(Seizures)") R code for the example # Finding the weekly mean of the seizure count (removing the baseline period) for each column my <- apply(y[-1,],2,mean)/2 # Plotting the square root of the weekly mean seizure count for the experimental period # against the square root of the weekly mean seizure count for the baseline period plot(sqrt(epilepsy$seizures[epilepsy$expind==0]/8), sqrt(my), pch=epilepsy$treat[5*(1:59)]+2, xlab='Sqrt(Baseline seizures)', ylab='Sqrt(Experimental seizures)'); abline(0,1) # treatment group = '+', placebo group = triangle #Now we fit the GEE model. seiz.gee <- gee(seizures ~ offset(log(timeadj)) + expind + treat + I(expind*treat), id, family=poisson, corstr="AR-M", Mv=1, data = epilepsy, subset=(id!=49)) summary(seiz.gee) # We fit a Poisson-type rate model with the offset term accounting for the # number of weeks the seizures were counted for in each observation # The code: corstr="AR-M", Mv=1 # indicates a first-order autoregressive [AR(1)] correlation structure #The 49th observation is excluded because it has an extreme number of seizures: plot(unique(id),tapply(seizures,id,sum),xlab='ID',ylab='Total seizures') R code for the example #p-values of the estimators pvalue_intercept<-2*min(pnorm(8.2187244), 1-pnorm(8.2187244)) pvalue_intercept pvalue_expind<-2*min(pnorm(1.3257816), 1-pnorm(1.3257816)) pvalue_expind pvalue_treat<-2*min(pnorm(-0.4027256), 1-pnorm(-0.4027256)) pvalue_treat pvalue_both<-2*min(pnorm(-2.2421009), 1-pnorm(-2.2421009)) pvalue_both Example with R (continued) Seizures per week in epilepsy patients BASELINE EXPERIMENT PLACEBO 3.85 4.30 TREATMENT 3.96 3.98 References u Hardin, James W. and Hilbe, Joseph M. Generalized estimating equations, second edition. Chapman and Hall/CRC 2012. u Introduction to Generalized Estimating Equations. The Pennsylvania state University, Department of Statistics Online Programs. 2015. Available at: https://onlinecourses.science.psu.edu/stat504/node/180 u Faraway, Julian J. Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models. CRC press, 2005. Exercise u The data is from a 1996 study (Gregoire, Kumar Everitt, Henderson and Studd) on the efficacy of estrogen patches in treating postnatal depression. u Women were randomly assigned to either a placebo control group (group=0, n=27) or estrogen patch group (group=1, n=34). u Prior to the first treatment all patients took the Edinburgh Postnatal Depression Scale (EPDS). EPDS data was collected monthly for six months once the treatment began. Higher scores on the EDPS are indicative of higher levels of depression. u Variables: id=identifying number; subj=subject number; group= (0=placebo, 1=treatment); visit=time of monthly visit; dep=depression score (higher scores indicate higher level of depression); depd=dichotomous variable of depression scale (<11=0, >11=1) u Question: Use GEE to model and determine does estrogen patches reduce the level of depression?
© Copyright 2025