  # Chi-Square Statistic


The chi-squared statistic summarizes the discrepancies between the expected number of times each outcome occurs (assuming that the model is true) and the observed number of times each outcome occurs.

The calculation is performed by summing the squares of the discrepancies, normalized by the expected numbers, over all the categories:

$$\chi^2 = \frac{(\text{observed}_1 - \text{expected}_1)^2}{\text{expected}_1} + \frac{(\text{observed}_2 - \text{expected}_2)^2}{\text{expected}_2} + \dots + \frac{(\text{observed}_k - \text{expected}_k)^2}{\text{expected}_k}$$
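As a minimal sketch of this sum, the Python function below adds up the squared discrepancies, each divided by its expected count. The function name `chi_squared_statistic` and the die-rolling counts are made up for illustration and do not come from any dataset.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: a die rolled 60 times, so a fair-die model
# expects 10 rolls per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10, 10, 10, 10, 10, 10]

stat = chi_squared_statistic(observed, expected)
print(stat)
```

With these counts the squared discrepancies are 4, 4, 1, 1, 0 and 0, so the statistic comes to 10/10 = 1.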

If the model is correct, then as the sample size n increases, the sampling distribution of the chi-squared statistic is approximated increasingly well by the chi-squared curve with k - 1 degrees of freedom (df), where k is the number of categories. That is, the chance that the chi-squared statistic falls in any given range grows closer and closer to the area under the chi-squared curve over the same range.
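The approximation can be checked with a small simulation. The sketch below assumes a model with k = 3 equally likely categories, so df = 2; that case is chosen because the area under the chi-squared curve with 2 degrees of freedom from 0 to x has the simple closed form 1 - exp(-x/2). The sample size, number of trials, seed, and threshold are arbitrary illustrative choices.

```python
import math
import random


def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_fraction_below(threshold, n=300, trials=2000, seed=0):
    """Fraction of simulated statistics that are <= threshold.

    Each trial draws n outcomes from 3 equally likely categories,
    so the model is true by construction.
    """
    rng = random.Random(seed)
    k = 3
    expected = [n / k] * k
    hits = 0
    for _ in range(trials):
        counts = [0] * k
        for _ in range(n):
            counts[rng.randrange(k)] += 1
        if chi_squared_statistic(counts, expected) <= threshold:
            hits += 1
    return hits / trials


def chi2_area_df2(x):
    """Area under the chi-squared curve with 2 df between 0 and x."""
    return 1.0 - math.exp(-x / 2.0)


empirical = simulate_fraction_below(2.0)
theoretical = chi2_area_df2(2.0)
print(empirical, theoretical)
```

The two printed numbers should be close, and the gap should tend to shrink as `n` and `trials` grow.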

Chi-squared tests use discrete count data, arranged in a table of rows and columns, to look for statistically significant differences among populations.
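For counts arranged in rows and columns, a common way to apply the same formula is to take each cell's expected count as (row total × column total) / grand total, with (rows - 1) × (columns - 1) degrees of freedom. The sketch below assumes that convention; the function name `contingency_chi_squared` and the 2 × 2 table are hypothetical.

```python
def contingency_chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a table of counts.

    Expected counts are computed from the row and column totals,
    i.e. assuming the populations do not differ.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical counts: rows are two populations, columns are two outcomes.
table = [[10, 20],
         [20, 10]]

stat, df = contingency_chi_squared(table)
print(stat, df)
```

Here every expected count is 15 and every discrepancy is 5, so the statistic is 4 × 25/15 = 20/3 with 1 degree of freedom.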