Rare events pose a considerable analytical challenge. The binary logit model estimated by maximum likelihood (ML), the workhorse model in the social sciences, can produce heavily biased parameter estimates when events are rare. Specifically, the finite-sample bias of the ML estimates may be substantially larger than that observed in balanced data of the same sample size. Furthermore, the ML estimator is prone to overfitting rare event data even in low-dimensional models and is not identified in cases of perfectly separated data. Starting with a brief introduction to the standard binary logit as a reference model, this entry discusses several design issues (e.g., selection on the dependent variable) and analytical approaches (e.g., first-order bias correction, exact conditional inference, penalized ML estimation, and the specification of complementary log-log [cloglog] models) for overcoming these threats to valid inference. Finally, the entry briefly introduces the potential of Bayesian rare event modeling, which addresses some limitations of the frequentist perspective on probability.
By: Heinz Leitgöb | Edited by: Paul Atkinson, Sara Delamont, Alexandru Cernat, Joseph W. Sakshaug & Richard A. Williams
Published: 2020 | Length: 10
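For orientation, the models named in the abstract can be written compactly. This is a sketch in generic notation, not necessarily the notation the entry itself uses. The symbols (covariate vector x_i, coefficient vector β, design matrix X) are assumptions introduced here. The penalty shown is the Firth-type (Jeffreys prior) form, one common instance of penalized ML estimation; the abstract does not say which penalty the entry adopts.

```latex
% Standard binary logit (reference model); notation is assumed, not taken from the entry
\pi_i = \Pr(y_i = 1 \mid x_i) = \frac{\exp(x_i^\top \beta)}{1 + \exp(x_i^\top \beta)}

% Log-likelihood maximized by the ML estimator
\ell(\beta) = \sum_{i=1}^{n} \bigl[ y_i \log \pi_i + (1 - y_i) \log (1 - \pi_i) \bigr]

% Firth-type penalized log-likelihood (one possible penalized ML approach)
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2} \log \bigl| I(\beta) \bigr|,
\qquad I(\beta) = X^\top W X,
\quad W = \operatorname{diag}\bigl( \pi_i (1 - \pi_i) \bigr)

% Complementary log-log (cloglog) model: an asymmetric alternative to the logit link
\pi_i = 1 - \exp\bigl( -\exp(x_i^\top \beta) \bigr)
```

With perfectly separated data, the unpenalized log-likelihood has no finite maximizer. The penalty term in the third expression goes to negative infinity as the fitted probabilities approach 0 or 1, which keeps the penalized estimates finite.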