In general:
For simple linear regression:
Source | Sum of Squares | Degrees of Freedom | Mean Square | F |
---|---|---|---|---|
Regression | \(SSReg = \sum^n_{i=1}(\hat{y}_i - \bar{y})^2\) | \(k\) | \(MSReg = \frac{SSReg}{k}\) | \(F = \frac{MSReg}{MSE}\) |
Residual | \(SSE = \sum^n_{i=1}(y_i-\hat{y}_i)^2\) | \(n - k - 1\) | \(MSE = \frac{SSE}{n - k - 1} = s^2\) | |
Total | \(SST = \sum^n_{i=1}(y_i - \bar{y})^2\) | \(n - 1\) | | |
Note: For simple linear regression, \(k = 1\)
\(SSReg = \hat{\beta}_1^2 S_{xx}\)
\(E[\hat{\beta}_1^2] = Var(\hat{\beta}_1) + (E[\hat{\beta}_1])^2 = \frac{\sigma^2}{S_{xx}} + \beta_1^2\)
\(E(MSReg) = E(SSReg) = E(\hat{\beta}_1^2 S_{xx}) = \sigma^2 + \beta_1^2 S_{xx}\) (since \(k = 1\), \(MSReg = SSReg\))
\(E(MSE) = \sigma^2\)
When \(\beta_1 = 0\), \(E(MSReg) = E(MSE)\) -> same means, so F should be close to 1
When \(\beta_1 \neq 0\), \(E(MSReg) > E(MSE)\) -> F tends to be greater than 1
Reject \(H_0: \beta_1 = 0\) if \(F > F_{1-\alpha; 1, n-2}\)
Do not reject \(H_0\) if \(F \leq F_{1-\alpha; 1, n-2}\)
Note that \(F = t^2\) for simple linear regression
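
The table entries and the \(F = t^2\) identity can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set; the data and the function name `anova_slr` are hypothetical and only for illustration.

```python
# Hypothetical toy data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def anova_slr(x, y):
    """Build the ANOVA quantities for simple linear regression (k = 1)."""
    n = len(x)
    k = 1
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares estimates of slope and intercept.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]

    # Sums of squares from the ANOVA table.
    ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)

    # Mean squares and the F statistic.
    ms_reg = ss_reg / k
    mse = sse / (n - k - 1)
    f_stat = ms_reg / mse

    # t statistic for H0: beta_1 = 0, using se(b1) = sqrt(MSE / Sxx).
    t_stat = b1 / (mse / s_xx) ** 0.5

    return {
        "b1": b1, "Sxx": s_xx,
        "SSReg": ss_reg, "SSE": sse, "SST": sst,
        "MSReg": ms_reg, "MSE": mse,
        "F": f_stat, "t": t_stat,
    }


res = anova_slr(x, y)
for name, value in res.items():
    print(f"{name:>6}: {value:.4f}")
```

Up to floating-point error, `res["SSReg"] + res["SSE"]` matches `res["SST"]`, `res["SSReg"]` matches `res["b1"] ** 2 * res["Sxx"]`, and `res["F"]` matches `res["t"] ** 2`. The critical value \(F_{1-\alpha; 1, n-2}\) is not computed here; it would come from an F table or a statistics library.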
Leverage point: a point whose \(x\) value is distant from the other \(x\) values
Bad leverage point: a leverage point whose \(y\) value is also an outlier
We want a rule to help identify \(x_i\) that are leverage points
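
As an illustration of what such a rule might look like, the sketch below uses the standard leverage formula for simple linear regression, \(h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}}\), and flags points with \(h_{ii} > 4/n\) (twice the average leverage). The \(4/n\) cutoff is a common rule of thumb and is an assumption here, not something stated in these notes; the data and the function name `leverages` are hypothetical.

```python
def leverages(x):
    """Leverage h_ii = 1/n + (x_i - x_bar)^2 / Sxx for simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]


# Hypothetical x values: five clustered points and one distant point.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
h = leverages(x)

# Assumed rule of thumb: flag h_ii > 4/n (the average leverage is 2/n).
cutoff = 4 / len(x)
flagged = [xi for xi, hi in zip(x, h) if hi > cutoff]

for xi, hi in zip(x, h):
    print(f"x = {xi:5.1f}  leverage = {hi:.3f}")
print("cutoff:", round(cutoff, 3), "flagged x values:", flagged)
```

The leverages depend only on the \(x\) values, so this check says nothing about whether a flagged point is a bad leverage point; that also requires looking at its \(y\) value.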
\[f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)\]