Tuesday, April 14, 2020

An Introduction to Akaike's Information Criterion (AIC)

The Akaike Information Criterion (commonly referred to simply as AIC) is a criterion for selecting among candidate statistical or econometric models. The AIC is essentially an estimate of the quality of each available model relative to the others for a given set of data, which makes it a natural tool for model selection.

Using AIC for Statistical and Econometric Model Selection

The Akaike Information Criterion (AIC) was developed with a foundation in information theory, a branch of applied mathematics concerned with the quantification (the counting and measuring) of information. When AIC is used to measure the relative quality of econometric models for a given data set, it gives the researcher an estimate of the information that would be lost if a particular model were used to represent the process that produced the data. As such, the AIC balances the trade-off between the complexity of a model and its goodness of fit, the statistical term for how well the model fits the data or set of observations.

What AIC Will Not Do

Because of what the Akaike Information Criterion (AIC) can do with a set of statistical and econometric models and a given set of data, it is a useful tool in model selection. But even as a model selection tool, AIC has its limitations. For instance, AIC can only provide a relative test of model quality. That is, AIC does not and cannot say anything about the quality of a model in an absolute sense. So if all of the tested statistical models are equally unsatisfactory or ill-fitted to the data, AIC will give no indication of it.

AIC in Econometrics Terms

The AIC is a number associated with each model:

$$\mathrm{AIC} = \ln(s_m^2) + \frac{2m}{T}$$

where $m$ is the number of parameters in the model, $T$ is the number of observations, and $s_m^2$ (in an AR($m$) example) is the estimated residual variance:

$$s_m^2 = \frac{\text{sum of squared residuals for model } m}{T}$$

That is, $s_m^2$ is the average squared residual for model $m$. The criterion may be minimized over choices of $m$ to form a trade-off between the fit of the model (a better fit lowers the sum of squared residuals) and the model's complexity, which is measured by $m$. Thus an AR($m$) model and an AR($m+1$) model can be compared by this criterion for a given batch of data.

An equivalent formulation is this one:

$$\mathrm{AIC} = T \ln(\mathrm{RSS}) + 2K$$

where $K$ is the number of regressors, $T$ the number of observations, and $\mathrm{RSS}$ the residual sum of squares; minimize over $K$ to pick $K$. Multiplying the first form by $T$ gives $T \ln(\mathrm{RSS}/T) + 2m$, which differs from this expression only by the constant $T \ln T$ (with $K$ playing the role of $m$), so for a fixed sample both forms rank the models identically.

As such, given a set of econometric models, the preferred model in terms of relative quality is the one with the minimum AIC value.
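As an illustration of the AR(m) comparison described above, here is a minimal Python sketch using NumPy. It is not taken from the original post: the function names `ar_aic` and `select_order`, the least-squares fit without an intercept, and the simulated zero-mean series are all assumptions made for the example. Every candidate order is fitted on the same T observations (the first `max_lag` points are held back as lags), so the AIC values are comparable.

```python
import numpy as np


def ar_aic(y, m, max_lag):
    """Return AIC = ln(s_m^2) + 2m/T for an AR(m) fitted by least squares.

    Assumptions for this sketch: no intercept (zero-mean series), and the
    first `max_lag` observations are reserved as lags for every order m.
    """
    y = np.asarray(y, dtype=float)
    target = y[max_lag:]          # observations being predicted
    T = target.size               # common sample size for all orders
    if m == 0:
        resid = target            # no regressors: residual is the series itself
    else:
        # Column k holds the series lagged by k periods, aligned with target.
        X = np.column_stack(
            [y[max_lag - k : len(y) - k] for k in range(1, m + 1)]
        )
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ coef
    s2 = np.sum(resid ** 2) / T   # average squared residual, s_m^2
    return np.log(s2) + 2 * m / T


def select_order(y, max_lag):
    """Compute AIC for orders 0..max_lag and return (best order, all scores)."""
    scores = {m: ar_aic(y, m, max_lag) for m in range(max_lag + 1)}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: a simulated AR(2) series with made-up coefficients.
rng = np.random.default_rng(0)
n = 500
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

best_m, scores = select_order(y, max_lag=6)
for m, aic in scores.items():
    print(f"AR({m}): AIC = {aic:.4f}")
print("order with minimum AIC:", best_m)
```

The printed values only rank the candidate orders against each other; as noted earlier, a low AIC says nothing about whether any of the fitted models is adequate in an absolute sense.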
