
The term computational statistics has two distinct but related meanings. An older meaning is synonymous with the term statistical computing, or simply computations for use in statistics. The more recent meaning emphasizes the extensive use made of computations in statistical analysis.

Statistical Computing: Numerical Analysis for Applications in Statistics

Statistical analysis requires computing, and applications in statistics have motivated many of the advances in numerical analysis. Particularly noteworthy among the subareas of numerical analysis are numerical linear algebra, numerical optimization, and the evaluation of special functions. Regression analysis, which is one of the most common statistical methods, as well as other methods involving linear models and multivariate analysis, requires fast and accurate algorithms for linear algebra. Linear regression analysis involves analysis of a linear model of the form y = Xb + e, where y is a vector of observed data, X is a matrix of observed data, b is a vector of unknown constants, and e is an unobservable vector of random variables with zero mean. Estimation of the unknown b is often performed by minimizing some function of the residuals r(b) = y - Xb with respect to b. Depending on the function, this problem may be a very difficult optimization problem.
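The setup above can be sketched in a few lines. This is a minimal illustration, not a production routine: the data are simulated with an assumed true coefficient vector, and the least-squares criterion (minimizing the sum of squared residuals) is used as one common choice of function of r(b).

```python
import numpy as np

# Simulated data for illustration: y = Xb + e with an assumed "true" b.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=20)])  # intercept + one predictor
b_true = np.array([2.0, 0.5])
y = X @ b_true + 0.1 * rng.normal(size=20)

def residuals(b):
    """The residual vector r(b) = y - Xb."""
    return y - X @ b

# One common choice: minimize the sum of squared residuals (least squares).
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Other choices of the function of r(b), such as the sum of absolute residuals, lead to harder optimization problems with no linear-algebra shortcut.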

Numerical Linear Algebra

A very common approach to linear regression analysis is to minimize the sum of the squares of the residuals. In this case, the optimization problem reduces to a linear problem: Solve XTXb = XTy. (Here, the superscript T means transpose.) This problem and other similar ones in the analysis of linear models are often best solved by decomposing X into the product of an orthogonal matrix and an upper triangular matrix without ever forming XTX. Methods for doing this have motivated much research in numerical linear algebra.
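The QR approach described above can be sketched as follows, assuming a generic simulated X and y. Decomposing X = QR (Q orthogonal, R upper triangular) gives the same solution as the normal equations XTXb = XTy, but avoids forming XTX, which squares the condition number of the problem.

```python
import numpy as np

# Simulated design matrix and response for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = rng.normal(size=30)

# Decompose X = QR, then solve the triangular system Rb = Q^T y.
Q, R = np.linalg.qr(X)
b_qr = np.linalg.solve(R, Q.T @ y)

# The normal equations X^T X b = X^T y give the same b,
# but forming X^T X squares the condition number of X.
b_normal = np.linalg.solve(X.T @ X, X.T @ y)
```

For well-conditioned problems the two answers agree to machine precision; for nearly collinear predictors the QR route is markedly more accurate.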

Other important applications of numerical linear algebra arise in such areas as principal components analysis, where the primary numerical method is the extraction of eigenvalues and eigenvectors.
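A minimal sketch of that eigenvalue computation, on simulated data: the principal components are the eigenvectors of the sample covariance matrix, ordered by eigenvalue.

```python
import numpy as np

# Simulated data matrix for illustration (100 observations, 4 variables).
rng = np.random.default_rng(2)
Z = rng.normal(size=(100, 4))
Zc = Z - Z.mean(axis=0)               # center each variable
S = Zc.T @ Zc / (Zc.shape[0] - 1)     # sample covariance matrix

# Principal components: eigenvectors of S, ordered by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(S)  # eigh exploits symmetry of S
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Zc @ eigvecs                 # data in principal-component coordinates
```

The component scores are uncorrelated by construction, with variances equal to the eigenvalues.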

Numerical Optimization

Many statistical methods, such as the regression analysis mentioned above, are optimization problems. Some problems, such as linear least squares, can be formulated as solutions to linear systems, and then the problems fall into the domain of numerical linear algebra. Others, such as nonlinear least squares and many maximum likelihood estimation problems, do not have closed-form solutions and must be solved by iterative methods, such as Newton's method, quasi-Newton methods, derivative-free direct search methods such as Nelder-Mead, or stochastic methods such as simulated annealing.
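Newton's method illustrates the iterative pattern. The sketch below uses a toy convex objective, f(b) = exp(b) - 2b, standing in for a negative log-likelihood whose minimizer has no closed form from the iteration's point of view; each step moves b by f'(b)/f''(b).

```python
import math

# Toy smooth convex objective f(b) = exp(b) - 2b, a stand-in for a
# negative log-likelihood. Its exact minimizer is b = ln 2.
def f_prime(b):
    return math.exp(b) - 2.0   # first derivative f'(b)

def f_second(b):
    return math.exp(b)         # second derivative f''(b) > 0 (convex)

b = 0.0                        # starting value
for _ in range(25):
    step = f_prime(b) / f_second(b)
    b -= step                  # Newton update: b <- b - f'(b)/f''(b)
    if abs(step) < 1e-12:      # stop when the update is negligible
        break
```

Quasi-Newton methods follow the same template but replace f'' with a cheaper approximation built from successive gradients.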

Another class of optimization problems includes those with constraints. Restricted maximum likelihood and constrained least squares are examples of statistical methods that require constrained optimization.
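As a sketch of constrained least squares, consider minimizing the sum of squared residuals subject to a linear equality constraint Ab = c; the example constraint (coefficients summing to 1) and the simulated data are illustrative assumptions. The Lagrange-multiplier conditions reduce the problem to one augmented ("KKT") linear system.

```python
import numpy as np

# Simulated data for illustration.
rng = np.random.default_rng(3)
X = rng.normal(size=(25, 3))
y = rng.normal(size=25)

# Hypothetical equality constraint: the coefficients must sum to 1.
A = np.ones((1, 3))
c = np.array([1.0])

# KKT system for: minimize ||y - Xb||^2 subject to Ab = c
#   [ X^T X  A^T ] [ b      ]   [ X^T y ]
#   [ A      0   ] [ lambda ] = [ c     ]
k = A.shape[0]
KKT = np.block([[X.T @ X, A.T], [A, np.zeros((k, k))]])
rhs = np.concatenate([X.T @ y, c])
sol = np.linalg.solve(KKT, rhs)
b_con = sol[:3]                # constrained estimate; sol[3:] is the multiplier
```

Inequality constraints (for example, nonnegative coefficients) cannot be handled by a single linear solve and call for iterative constrained-optimization algorithms.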

Evaluation of Special Functions

Methods for evaluation of cumulative distribution functions (probabilities) and inverse cumulative distribution functions (quantiles) are important in all areas of applied statistics. Some evaluations are straightforward, such as for z scores or for p values of common distributions such as t or F, but others are much more complicated. Computations involving posterior distributions in Bayesian analyses are often particularly difficult. Most of these computations are performed using Markov chain Monte Carlo methods.
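The Markov chain Monte Carlo idea can be sketched with a random-walk Metropolis sampler. The target below is a standard normal density standing in for a real posterior; as in Bayesian practice, the sampler only needs the log density up to an additive constant.

```python
import math
import random

random.seed(0)

def log_target(x):
    # Unnormalized log density. A posterior is typically known only up to
    # a normalizing constant, which is all Metropolis sampling requires.
    # Here a standard normal stands in for a real posterior.
    return -0.5 * x * x

x = 0.0
draws = []
for _ in range(20000):
    proposal = x + random.gauss(0.0, 1.0)   # random-walk proposal
    # Accept with probability min(1, target(proposal) / target(x)).
    if math.log(random.random()) < log_target(proposal) - log_target(x):
        x = proposal
    draws.append(x)

kept = draws[5000:]                          # discard burn-in
mean = sum(kept) / len(kept)
```

After burn-in, the retained draws behave like a (correlated) sample from the target, so their mean and variance approximate the target's moments.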

...
