Generalized Estimating Equations

Neil J. Salkind

doi:10.4135/9781412952644

Edited by:
Neil J. Salkind
In: Encyclopedia of Measurement and Statistics
Chapter DOI: https://doi.org/10.4135/9781412952644.n187
Subject: Anthropology, Business and Management, Criminology and Criminal Justice, Communication and Media Studies, Counseling and Psychotherapy, Economics, Education, Geography, Health, History, Marketing, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Social Work, Sociology, Science, Technology, Computer Science, Engineering, Mathematics, Medicine

Correlated data sets arise from repeated measures studies, where multiple observations are collected from a specific sampling unit (for example, a specific patient's status over time), or from grouped or clustered data, where observations are grouped because they share some common characteristic (for example, animals in a specific litter). When measurements are collected over time, the term longitudinal or panel data is preferred. Generalized estimating equations (GEEs) provide a framework for analyzing correlated data. This framework extends the generalized linear models methodology, which assumes independent data. We discuss the estimation of model parameters and associated variances via generalized estimating equation methodology.

The usual practice in model construction is the specification of the systematic and random components of variation. Classical maximum likelihood models then rely on the validity of the specified components. Model construction proceeds from the specification of the components of variation to a likelihood and, ultimately, an estimating equation. The estimating equation for maximum likelihood estimation is obtained by setting the derivative of the loglikelihood with respect to the parameters of interest equal to zero. Point estimates of unknown parameters are obtained by solving the estimating equation.
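As a small worked illustration of this path from likelihood to estimating equation (the independent Poisson sample y_1, ..., y_n with common mean λ is an assumed example, not taken from the entry), the loglikelihood, its derivative, and the resulting point estimate are:

```latex
\begin{align*}
\ell(\lambda) &= \sum_{i=1}^{n} \bigl( y_i \log\lambda - \lambda - \log y_i! \bigr), \\
\frac{\partial \ell}{\partial \lambda} &= \sum_{i=1}^{n} \Bigl( \frac{y_i}{\lambda} - 1 \Bigr) = 0
\quad\Longrightarrow\quad \hat{\lambda} = \frac{1}{n} \sum_{i=1}^{n} y_i .
\end{align*}
```

Here the estimating equation has a closed-form solution, the sample mean; in regression settings it generally has to be solved iteratively.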

Generalized Linear Models

The theory and an algorithm appropriate for obtaining maximum likelihood estimates where the response follows a distribution in the exponential family were introduced in 1972 by Nelder and Wedderburn. They introduced the term generalized linear model (GLM) to refer to a class of models that could be analyzed by a single algorithm. The theoretical and practical application of GLMs has since received attention in many articles and books.

GLMs encompass a wide range of commonly used models such as linear regression, logistic regression for binary outcomes, and Poisson regression for count data outcomes. The specification of a particular GLM requires a link function that characterizes the relationship of the mean response to a vector of covariates. In addition, a GLM requires specification of a variance function that expresses the variance of the outcomes as a function of the mean.
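The pairing of link and variance function can be made concrete with a minimal sketch. The table below uses the conventional choices for the three models just named (identity, logit, and log links); the names `GLM_SPECS` and `mean_and_variance` are hypothetical and introduced only for illustration.

```python
import numpy as np

# Hypothetical lookup table (illustrative names, not from the entry).
# Each model pairs an inverse link, which maps the linear predictor eta
# to the mean mu, with a variance function V(mu).
GLM_SPECS = {
    "linear": {
        "inv_link": lambda eta: eta,                         # identity link
        "variance": lambda mu: np.ones_like(mu),             # V(mu) = 1
    },
    "logistic": {
        "inv_link": lambda eta: 1.0 / (1.0 + np.exp(-eta)),  # logit link
        "variance": lambda mu: mu * (1.0 - mu),              # V(mu) = mu(1 - mu)
    },
    "poisson": {
        "inv_link": lambda eta: np.exp(eta),                 # log link
        "variance": lambda mu: mu,                           # V(mu) = mu
    },
}


def mean_and_variance(model, eta):
    """Return (mu, V(mu)) for linear predictor values eta under the named model."""
    spec = GLM_SPECS[model]
    mu = spec["inv_link"](np.asarray(eta, dtype=float))
    return mu, spec["variance"](mu)
```

For instance, `mean_and_variance("logistic", [0.0])` maps a linear predictor of zero to a mean of one half, where the binomial variance function is at its maximum.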

The derivation of the iteratively reweighted least squares (IRLS) algorithm appropriate for fitting GLMs begins with the likelihood specification for the exponential family. Within an iterative algorithm, an updated estimate of the coefficient vector may be obtained via weighted ordinary least squares where the weights are related to the link and variance specifications. The estimation is then iterated to convergence where convergence may be defined, for example, as the change in the estimated coefficient vector being smaller than some tolerance.
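The iteration described above can be sketched for one specific case. The following is a minimal illustration, assuming a Poisson response with a log link; the function name `irls_poisson`, the zero starting values, and the default tolerance are assumptions made for the example rather than details from the entry.

```python
import numpy as np


def irls_poisson(X, y, tol=1e-8, max_iter=50):
    """Fit a Poisson regression with log link by IRLS.

    Returns the coefficient vector and the number of iterations used.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])        # assumed starting values
    for iteration in range(1, max_iter + 1):
        eta = X @ beta                 # linear predictor
        mu = np.exp(eta)               # inverse of the log link
        # For the log link, d mu / d eta = mu and V(mu) = mu, so the
        # weight (d mu / d eta)^2 / V(mu) reduces to mu.
        w = mu
        z = eta + (y - mu) / mu        # working response
        # Weighted least squares update: solve (X' W X) beta = X' W z.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        # Convergence: change in the coefficient vector below the tolerance.
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new, iteration
        beta = beta_new
    return beta, max_iter
```

Each pass is a weighted least squares fit of the working response on the covariates, and the loop stops when the coefficient vector changes by less than `tol`, matching the convergence criterion mentioned above.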

For any response that follows a member of the exponential family of distributions, f(y) = exp{[yθ − b(θ)]/φ + c(y, φ)}, where θ is the canonical parameter and φ is a proportionality constant, we can obtain maximum likelihood estimates of the p × 1 regression coefficient vector β by solving the estimating equation given by

$$\Psi(\beta) = \sum_{i=1}^{n} x_i^{T}\,\frac{y_i - \mu_i}{\phi\,V(\mu_i)}\left(\frac{\partial \mu_i}{\partial \eta_i}\right) = 0_{p \times 1}$$

In the estimating equation, x_i is the ith row of the n × p matrix of covariates X; μ_i = g⁻¹(x_iβ) represents the expected outcome E(y) = b′(θ) as a transformation of the linear predictor η_i = x_iβ via a monotonic (invertible) link function g(), so that g(μ_i) = η_i; and the variance function V(μ_i) is a function of the expected value that is proportional to the variance of the outcome, V(y_i) = φV(μ_i). The estimating equation is also known as the score equation because it equates the score vector Ψ(β) to zero.
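To see how these pieces fit together, consider Poisson regression with a log link (an assumed illustration, not part of the entry). Taking the contribution of observation i to the score in the usual GLM form, x_i^T (y_i − μ_i)/[φV(μ_i)] · ∂μ_i/∂η_i, and substituting φ = 1, V(μ_i) = μ_i, and μ_i = exp(x_iβ), so that ∂μ_i/∂η_i = μ_i, the variance function and the derivative cancel:

```latex
\Psi(\beta) = \sum_{i=1}^{n} x_i^{T}\,\frac{y_i - \mu_i}{\mu_i}\,\mu_i
            = \sum_{i=1}^{n} x_i^{T}\,\bigl(y_i - e^{x_i\beta}\bigr) = 0_{p \times 1}.
```

This cancellation occurs whenever the canonical link is used, which is why the score equation for such models reduces to a covariate-weighted sum of residuals.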

...
