Use of the lm function
Everything in this tutorial will be illustrated using pseudodata that we now create.
y <- c(-9, -11, 1, 19)
x <- c(-3, -1, 1, 3)
x2 <- x^2
Data <- data.frame(y = y, x = x, x2 = x2)
print(Data)
## y x x2
## 1 -9 -3 9
## 2 -11 -1 1
## 3 1 1 1
## 4 19 3 9
The lm function
We will be fitting the following linear model: \[ \mathsf{E}Y = \beta_0 + \beta_1\,x + \beta_2\,x^2. \]
m <- lm(y ~ x + x2)
print(m)
##
## Call:
## lm(formula = y ~ x + x2)
##
## Coefficients:
## (Intercept) x x2
## -6.25 4.80 1.25
class(m)
## [1] "lm"
The R function lm uses the QR decomposition of the model matrix \(\mathbf{X}\) to calculate the LSE and related quantities, which are now stored inside the object m.
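As a quick sanity check, the same estimates can be reproduced directly from a QR decomposition of the model matrix, using the base functions model.matrix, qr and qr.coef:
X <- model.matrix(m)   ## here identical (up to column names) to cbind(1, x, x2)
qr.coef(qr(X), y)      ## LSE via the QR decomposition; should match coef(m)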
Components of the object m
names(m)
## [1] "coefficients" "residuals" "effects" "rank" "fitted.values" "assign"
## [7] "qr" "df.residual" "xlevels" "call" "terms" "model"
The meaning of many elements of m should be quite clear from their names.
m$coefficients
## (Intercept) x x2
## -6.25 4.80 1.25
m$residuals
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
m$rank
## [1] 3
m$fitted.values
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
m$df.residual
## [1] 1
Components useful only in special circumstances
Components of the object m are useful only in some special circumstances. There is no need to explore them in much detail now.
m$assign
## [1] 0 1 2
m$xlevels
## named list()
m$call
## lm(formula = y ~ x + x2)
m$terms
## y ~ x + x2
## attr(,"variables")
## list(y, x, x2)
## attr(,"factors")
## x x2
## y 0 0
## x 1 0
## x2 0 1
## attr(,"term.labels")
## [1] "x" "x2"
## attr(,"order")
## [1] 1 1
## attr(,"intercept")
## [1] 1
## attr(,"response")
## [1] 1
## attr(,".Environment")
## <environment: 0xecf5b10>
## attr(,"predvars")
## list(y, x, x2)
## attr(,"dataClasses")
## y x x2
## "numeric" "numeric" "numeric"
Nevertheless, there are still some components of the m object that deserve a detailed exploration.
m$model
## y x x2
## 1 -9 -3 9
## 2 -11 -1 1
## 3 1 1 1
## 4 19 3 9
Note that m$model is the model frame; in more complex models (e.g., those involving factors), the model matrix does not appear here directly.
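If the model matrix itself is needed, it can be extracted from the fitted object with the base function model.matrix:
model.matrix(m)        ## the model matrix X, including the intercept column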
The QR decomposition of the model matrix is stored in a compact form:
m$qr
## $qr
## (Intercept) x x2
## 1 -2.0 0.0000000 -10.0000000
## 2 0.5 -4.4721360 0.0000000
## 3 0.5 0.4472136 8.0000000
## 4 0.5 0.8944272 -0.9296181
## attr(,"assign")
## [1] 0 1 2
##
## $qraux
## [1] 1.500000 1.000000 1.368524
##
## $pivot
## [1] 1 2 3
##
## $tol
## [1] 1e-07
##
## $rank
## [1] 3
##
## attr(,"class")
## [1] "qr"
qr.Q(m$qr)
## [,1] [,2] [,3]
## [1,] -0.5 0.6708204 0.5
## [2,] -0.5 0.2236068 -0.5
## [3,] -0.5 -0.2236068 -0.5
## [4,] -0.5 -0.6708204 0.5
qr.R(m$qr)
## (Intercept) x x2
## 1 -2 0.000000 -10
## 2 0 -4.472136 0
## 3 0 0.000000 8
qr.Q(m$qr, complete = TRUE)
## [,1] [,2] [,3] [,4]
## [1,] -0.5 0.6708204 0.5 0.2236068
## [2,] -0.5 0.2236068 -0.5 -0.6708204
## [3,] -0.5 -0.2236068 -0.5 0.6708204
## [4,] -0.5 -0.6708204 0.5 -0.2236068
We store all of the above matrices:
(Q <- qr.Q(m$qr))
## [,1] [,2] [,3]
## [1,] -0.5 0.6708204 0.5
## [2,] -0.5 0.2236068 -0.5
## [3,] -0.5 -0.2236068 -0.5
## [4,] -0.5 -0.6708204 0.5
(P <- qr.Q(m$qr, complete = TRUE))
## [,1] [,2] [,3] [,4]
## [1,] -0.5 0.6708204 0.5 0.2236068
## [2,] -0.5 0.2236068 -0.5 -0.6708204
## [3,] -0.5 -0.2236068 -0.5 0.6708204
## [4,] -0.5 -0.6708204 0.5 -0.2236068
(N <- matrix(P[, (m$rank + 1):(m$rank + m$df.residual)], ncol = m$df.residual))
## [,1]
## [1,] 0.2236068
## [2,] -0.6708204
## [3,] 0.6708204
## [4,] -0.2236068
(R <- qr.R(m$qr))
## (Intercept) x x2
## 1 -2 0.000000 -10
## 2 0 -4.472136 0
## 3 0 0.000000 8
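A small consistency check of the pieces stored above: the thin factors should reconstruct the model matrix, and the complete matrix \(\mathbf{P}\) should be orthonormal, so that \(\mathbf{P}^\top\mathbf{P} = \mathbf{I}\).
all.equal(Q %*% R, model.matrix(m), check.attributes = FALSE)  ## X = QR
zapsmall(crossprod(P))                                         ## P'P should be the identity matrix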
The effects component
m$effects
## (Intercept) x x2
## 0.000000 -21.466253 10.000000 1.788854
Compare it with:
crossprod(P, y)
## [,1]
## [1,] 0.000000
## [2,] -21.466253
## [3,] 10.000000
## [4,] 1.788854
It is seen that effects \(= \mathbf{P}^\top\mathbf{Y}\). This component is useful for several purposes, as will now be illustrated.
The first \(r\) elements of effects equal \(\mathbf{Q}^\top\mathbf{Y}\) and can be used to easily calculate the fitted values: \(\widehat{\mathbf{Y}} = \mathbf{H}\mathbf{Y} = \mathbf{Q}\mathbf{Q}^\top\mathbf{Y}\).
Q %*% m$effects[1:m$rank]
## [,1]
## [1,] -9.4
## [2,] -9.8
## [3,] -0.2
## [4,] 19.4
m$fitted ## the same
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
The remaining elements of effects equal \(\mathbf{N}^\top\mathbf{Y}\) and can be used to easily calculate the residuals: \(\mathbf{U} = \mathbf{M}\mathbf{Y} = \mathbf{N}\mathbf{N}^\top\mathbf{Y}\).
N %*% m$effects[(m$rank + 1):(m$rank + m$df.residual)]
## [,1]
## [1,] 0.4
## [2,] -1.2
## [3,] 1.2
## [4,] -0.4
m$residuals
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
The solution to the normal equations \[ \mathbf{X}^\top\mathbf{X}\,\mathbf{b} = \mathbf{X}^\top\mathbf{Y} \] is obtained by writing \(\mathbf{X} = \mathbf{Q}\mathbf{R}\) and using \(\mathbf{Q}^\top\mathbf{Q} = \mathbf{I}\), which reduces them to \[ \mathbf{R}\,\mathbf{b} = \mathbf{Q}^\top\mathbf{Y} = effects[1:r], \] a triangular system that is solved by backward substitution:
(b_from_QR_again <- backsolve(R, m$effects[1:m$rank]))
## [1] -6.25 4.80 1.25
print(m$coefficients) ## previously calculated b
## (Intercept) x x2
## -6.25 4.80 1.25
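For comparison, the normal equations can also be solved directly (numerically less stable than the QR route, but it should give the same estimates here):
X <- model.matrix(m)
solve(crossprod(X), crossprod(X, y))   ## solves X'X b = X'y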
Residual sum of squares is \[ \mbox{SS}_e = \mathbf{U}^\top\mathbf{U} = \mathbf{Y}^\top\mathbf{M}^\top\mathbf{M}\mathbf{Y} = \mathbf{Y}^\top\mathbf{N}\mathbf{N}^\top\mathbf{N}\mathbf{N}^\top\mathbf{Y} = \mathbf{Y}^\top\mathbf{N}\mathbf{N}^\top\mathbf{Y} = \bigl\|\mathbf{N}^\top\mathbf{Y}\bigr\|^2 = \bigl\|effects[r+1, \ldots]\bigr\|^2. \]
(SSe_from_QR <- crossprod(m$effects[(m$rank + 1):(m$rank + m$df.residual)]))
## [,1]
## [1,] 3.2
as.numeric(crossprod(m$residuals)) ## the same, computed directly from the residuals
## [1] 3.2
Methods for objects of class lm
Many methods that return the important quantities have been implemented for objects of class lm.
methods(class = "lm")
## [1] add1 alias anova Anova
## [5] avPlot Boot bootCase boxCox
## [9] case.names ceresPlot coerce confidenceEllipse
## [13] confint cooks.distance crPlot deltaMethod
## [17] deviance dfbeta dfbetaPlots dfbetas
## [21] dfbetasPlots drop1 dummy.coef durbinWatsonTest
## [25] effects ellipse3d extractAIC family
## [29] formula hatvalues hccm infIndexPlot
## [33] influence influencePlot initialize inverseResponsePlot
## [37] kappa labels leveneTest leveragePlot
## [41] linearHypothesis logLik mcPlot mmp
## [45] model.frame model.matrix ncvTest nextBoot
## [49] nobs outlierTest plot powerTransform
## [53] predict print proj qqnorm
## [57] qqPlot qr residualPlot residualPlots
## [61] residuals rstandard rstudent show
## [65] sigmaHat simulate slotsFromS3 spreadLevelPlot
## [69] summary variable.names vcov
## see '?methods' for accessing help and source code
coefficients(m)
## (Intercept) x x2
## -6.25 4.80 1.25
coef(m) ## the same
## (Intercept) x x2
## -6.25 4.80 1.25
m$coefficients ## the same
## (Intercept) x x2
## -6.25 4.80 1.25
residuals(m)
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
m$residuals ## the same
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
fitted(m)
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
m$fitted.values ## the same
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
deviance(m)
## [1] 3.2
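As one more illustration, the predict method evaluates the fitted model at new covariate values; both x and x2 have to be supplied because they enter the formula as separate variables:
predict(m, newdata = data.frame(x = 2, x2 = 4))   ## -6.25 + 4.80 * 2 + 1.25 * 4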
In full-rank models, the estimated covariance matrix of \(\widehat{\beta}\) is \[ \widehat{\mbox{var}}\bigl(\widehat{\beta}\bigr) = \frac{\mbox{SS}_e}{n-r}\,\bigl(\mathbf{X}^\top\mathbf{X}\bigr)^{-1}: \]
vcov(m)
## (Intercept) x x2
## (Intercept) 2.05 0.00 -0.25
## x 0.00 0.16 0.00
## x2 -0.25 0.00 0.05
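The same matrix can be computed by hand from the formula above:
X <- model.matrix(m)
deviance(m) / m$df.residual * solve(crossprod(X))   ## SS_e / (n - r) * (X'X)^{-1}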
In full-rank models, for a given \(j = 0,\ldots, k-1\), a confidence interval for \(\beta_j\) with coverage \(1 - \alpha\) is given as \[ \widehat{\beta}_j \;\pm\; \mbox{S.E.}\bigl(\widehat{\beta}_j\bigr)\,\mbox{t}_{n-r}\bigl(1 - \alpha/2\bigr), \] where \(\mbox{S.E.}\bigl(\widehat{\beta}_j\bigr)\) is the standard error of \(\widehat{\beta}_j\) (the square root of the appropriate diagonal element of the matrix \(\widehat{\mbox{var}}\bigl(\widehat{\beta}\bigr)\)).
confint(m)
## 2.5 % 97.5 %
## (Intercept) -24.4425166 11.942517
## x -0.2824819 9.882482
## x2 -1.5911938 4.091194
confint(m, level = 0.99)
## 0.5 % 99.5 %
## (Intercept) -97.39258 84.89258
## x -20.66270 30.26270
## x2 -12.98408 15.48408
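The interval for \(\beta_1\) (the coefficient of x), for instance, can be reproduced by hand from the formula above:
est <- coef(m)["x"]
se  <- sqrt(vcov(m)["x", "x"])
est + c(-1, 1) * se * qt(0.975, df = m$df.residual)   ## should match confint(m)["x", ]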
summary(m)
##
## Call:
## lm(formula = y ~ x + x2)
##
## Residuals:
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.2500 1.4318 -4.365 0.1434
## x 4.8000 0.4000 12.000 0.0529 .
## x2 1.2500 0.2236 5.590 0.1127
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.789 on 1 degrees of freedom
## Multiple R-squared: 0.9943, Adjusted R-squared: 0.983
## F-statistic: 87.62 on 2 and 1 DF, p-value: 0.07532
Note that the result of summary is an object of its own class:
sm <- summary(m)
class(sm)
## [1] "summary.lm"
names(sm)
## [1] "call" "terms" "residuals" "coefficients" "aliased" "sigma"
## [7] "df" "r.squared" "adj.r.squared" "fstatistic" "cov.unscaled"
It is then possible to extract various important elements (model statistics).
sm$residuals
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
sm$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.25 1.4317821 -4.365189 0.14336634
## x 4.80 0.4000000 12.000000 0.05292935
## x2 1.25 0.2236068 5.590170 0.11269007
\[ \widehat{\sigma} = \sqrt{\frac{1}{n-r}\mbox{SS}_e}. \]
sm$sigma
## [1] 1.788854
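This can be verified directly:
sqrt(deviance(m) / m$df.residual)   ## should equal sm$sigma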
The overall F-statistic (testing that all non-intercept coefficients are zero) is \[ F = \frac{\;\;\frac{\mbox{SS}_T - \mbox{SS}_e}{r - 1}\;\;}{\;\;\frac{\mbox{SS}_e}{n - r}\;\;}. \]
sm$fstatistic
## value numdf dendf
## 87.625 2.000 1.000
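The value can be reproduced from the formula above:
SST <- sum((y - mean(y))^2)                                           ## total sum of squares
((SST - deviance(m)) / (m$rank - 1)) / (deviance(m) / m$df.residual)  ## should match sm$fstatistic["value"]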
\[ R^2 = 1 - \frac{\mbox{SS}_e}{\mbox{SS}_T}, \qquad R^2_{adj} = 1 - \frac{n-1}{n-r}\;\frac{\mbox{SS}_e}{\mbox{SS}_T}. \]
sm$r.squared
## [1] 0.9943262
sm$adj.r.squared
## [1] 0.9829787
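Both values can again be reproduced by hand:
SST <- sum((y - mean(y))^2)
1 - deviance(m) / SST                                         ## R^2
1 - (length(y) - 1) / m$df.residual * deviance(m) / SST       ## adjusted R^2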