Cross-validation is a data-dependent method for estimating the prediction error of a fitted model or a trained algorithm. The basic idea is to divide the available data into two parts, called the training data and the testing data. The training data are used to fit the model or train the algorithm, while the testing data are used to validate how well the fitted model or trained algorithm performs at prediction.
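The following is a minimal sketch of this basic idea, assuming a random division of the data and a simple straight-line model fitted by least squares. The function names (`holdout_split`, `fit_line`, `holdout_error`), the default `train_fraction` of one half, and the synthetic data are all illustrative assumptions rather than part of the entry.

```python
import random


def holdout_split(n, train_fraction=0.5, seed=0):
    """Randomly partition the indices 0..n-1 into training and testing sets."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = round(train_fraction * n)
    return indices[:n_train], indices[n_train:]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def holdout_error(xs, ys, train_fraction=0.5, seed=0):
    """Fit on the training part, then return mean squared error on the testing part."""
    train_idx, test_idx = holdout_split(len(xs), train_fraction, seed)
    intercept, slope = fit_line(
        [xs[i] for i in train_idx], [ys[i] for i in train_idx]
    )
    squared_errors = [
        (ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx
    ]
    return sum(squared_errors) / len(squared_errors)


# Synthetic data (an assumption for illustration): y = 1 + 2x + unit-variance noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(200)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0, 1) for x in xs]

print(holdout_error(xs, ys))
```

Because the testing points play no role in fitting, the printed value estimates the error on new data rather than the error on the data used for fitting.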

A typical proportion of the data assigned to training might be roughly 1/2 or 1/3 when the data size is large enough. The division of the data into a training part and a testing part can be done naturally or randomly. In some applications, a large enough subgroup of the available data is collected independently of the other parts ...
