     Girth           Height       Volume
 Min.   : 8.30   Min.   :63   Min.   :10.20
 1st Qu.:11.05   1st Qu.:72   1st Qu.:19.40
 Median :12.90   Median :76   Median :24.20
 Mean   :13.25   Mean   :76   Mean   :30.17
 3rd Qu.:15.25   3rd Qu.:80   3rd Qu.:37.30
 Max.   :20.60   Max.   :87   Max.   :77.00
####################################
# Regression line:
# We can consider several lines passing through the scatterplot:
plot(trees$Diameter, trees$Volume, pch=19,
     ylab="Volume", xlab="Diameter")
abline(a=-40, b=5, col=1)
abline(h=mean(trees$Volume), col=2)
abline(a=10, b=2, col=3)
abline(a=-60, b=7, col=4)
abline(a=-100, b=9, col=6)

# Let's estimate the regression line,
# that is, the BEST line passing through the data:
lm1 <- lm(Volume ~ Diameter, data=trees)
summ1 <- summary(lm1); summ1
Call:
lm(formula = Volume ~ Diameter, data = trees)

Residuals:
   Min     1Q Median     3Q    Max
-8.065 -3.107  0.152  3.495  9.587

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -36.9435     3.3651  -10.98 7.62e-12 ***
Diameter      5.0659     0.2474   20.48  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.252 on 29 degrees of freedom
Multiple R-squared:  0.9353,    Adjusted R-squared:  0.9331
F-statistic: 419.4 on 1 and 29 DF,  p-value: < 2.2e-16

abline(lm1, col=2,lwd=2)
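
To see in what sense the fitted line is the "best" one, the following minimal sketch compares the residual sum of squares (RSS) of the candidate lines drawn earlier with that of the least squares line. It assumes that `Diameter` is the `Girth` column of R's built-in `trees` data under a new name (the summary at the top still shows `Girth`). The names `dat`, `rss` and `candidates` are illustrative and not part of the original script.

```r
# Assumption: Diameter is the built-in Girth column, renamed.
dat <- datasets::trees
names(dat)[names(dat) == "Girth"] <- "Diameter"

# Residual sum of squares of the line y = a + b * x
rss <- function(a, b) sum((dat$Volume - (a + b * dat$Diameter))^2)

# The candidate lines from the scatterplot, as (intercept, slope) pairs;
# the horizontal line at mean(Volume) has slope 0.
candidates <- data.frame(
  a = c(-40, mean(dat$Volume), 10, -60, -100),
  b = c(5, 0, 2, 7, 9)
)
candidates$RSS <- mapply(rss, candidates$a, candidates$b)

# Closed-form least squares estimates for simple linear regression
b1 <- cov(dat$Diameter, dat$Volume) / var(dat$Diameter)
b0 <- mean(dat$Volume) - b1 * mean(dat$Diameter)
rss_ls <- rss(b0, b1)

candidates
c(intercept = b0, slope = b1, RSS = rss_ls)
```

Under that assumption, `b0` and `b1` should agree with `coef(lm1)`, and no candidate line should have a smaller RSS than the least squares line.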
# Let's look at the regression line:
plot(trees$Diameter, trees$Volume, pch=19,
     ylab="Volume", xlab="Diameter")
abline(lm1, col=2)

# Compute the predicted values of the model:
y_prev <- coef(lm1)[1] + coef(lm1)[2]*trees$Diameter

#### Residuals:
# Once the model is fitted, we can compute the residuals:
e <- trees$Volume - y_prev

# Graphical representation of the residuals:
segments(x0=trees$Diameter, x1=trees$Diameter,
         y0=trees$Volume, y1=y_prev)

# Estimate of sigma^2 (n is the number of observations):
n <- nrow(trees)
SSres <- sum(e^2)
sigma2_hat <- SSres/(n-2)
sqrt(sigma2_hat)
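
As a cross-check, the manual computations can be compared with R's accessor functions `fitted()`, `residuals()` and `summary()$sigma`. This is a sketch under the assumption that `Diameter` is the built-in `Girth` column renamed; `dat`, `fit`, `y_hat` and `s_manual` are illustrative names.

```r
# Assumption: Diameter is the built-in Girth column, renamed.
dat <- datasets::trees
names(dat)[names(dat) == "Girth"] <- "Diameter"
fit <- lm(Volume ~ Diameter, data = dat)

# Manual versions of fitted values, residuals and residual standard error
y_hat <- coef(fit)[1] + coef(fit)[2] * dat$Diameter
e <- dat$Volume - y_hat
n <- nrow(dat)
s_manual <- sqrt(sum(e^2) / (n - 2))

# Built-in equivalents; each comparison should return TRUE
all.equal(unname(y_hat), unname(fitted(fit)))
all.equal(unname(e), unname(residuals(fit)))
all.equal(s_manual, summary(fit)$sigma)
```

The value of `s_manual` is the "Residual standard error" reported by `summary()`, computed on n - 2 degrees of freedom.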
# The anova function compares two nested models:
# if we do not reject the hypothesis that the two models
# have the same fit, then we prefer the simpler one.
anova(lm_poly_1, lm_poly_3)
Analysis of Variance Table
Model 1: Volume ~ Height
Model 2: Volume ~ Height + I(Height^2) + I(Height^3)
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     29 5204.9
2     27 5106.1  2    98.815 0.2613  0.772
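
The F statistic in this table can be reproduced by hand from the two residual sums of squares. The sketch below refits the two models using the formulas printed in the table, on R's built-in `trees` data (an assumption about how `lm_poly_1` and `lm_poly_3` were created, since their definitions are not shown here). `rss1`, `rss3`, `F_stat` and `p_val` are illustrative names.

```r
# Assumption: the models were fitted on the built-in trees data
# with the formulas shown in the ANOVA table.
dat <- datasets::trees
lm_poly_1 <- lm(Volume ~ Height, data = dat)
lm_poly_3 <- lm(Volume ~ Height + I(Height^2) + I(Height^3), data = dat)

# Residual sums of squares and residual degrees of freedom
rss1 <- deviance(lm_poly_1)
rss3 <- deviance(lm_poly_3)
df1 <- df.residual(lm_poly_1)
df3 <- df.residual(lm_poly_3)

# F = (drop in RSS per extra parameter) / (residual variance of larger model)
F_stat <- ((rss1 - rss3) / (df1 - df3)) / (rss3 / df3)
p_val <- pf(F_stat, df1 - df3, df3, lower.tail = FALSE)

c(F = F_stat, p = p_val)
anova(lm_poly_1, lm_poly_3)
```

With the values from the table, F = (98.815 / 2) / (5106.1 / 27), which is about 0.261. The large p-value gives no evidence that the quadratic and cubic terms improve the fit.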
###################################
# Let us suppose that a tree is a cylinder.
# We know that the volume of a cylinder
# is V = pi * (Diameter/2)^2 * Height
# Then log(V) = log(pi/4) + 2*log(Diameter) + log(Height)
Y <- log(trees$Volume)
X1 <- log(trees$Diameter)
X2 <- log(trees$Height)
lm_log <- lm(Y ~ X1 + X2)
summary(lm_log)
Call:
lm(formula = Y ~ X1 + X2)

Residuals:
      Min        1Q    Median        3Q       Max
-0.168561 -0.048488  0.002431  0.063637  0.129223

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.63162    0.79979  -8.292 5.06e-09 ***
X1           1.98265    0.07501  26.432  < 2e-16 ***
X2           1.11712    0.20444   5.464 7.81e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08139 on 28 degrees of freedom
Multiple R-squared:  0.9777,    Adjusted R-squared:  0.9761
F-statistic: 613.2 on 2 and 28 DF,  p-value: < 2.2e-16

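
The cylinder model predicts a coefficient of 2 for log(Diameter) and 1 for log(Height). One possible way to check this, sketched here, is to look at 95% confidence intervals and at t statistics for those two hypotheses. The code assumes the built-in `trees` data, where the diameter is stored in the `Girth` column; `ci`, `est`, `t_X1`, `t_X2`, `p_X1` and `p_X2` are illustrative names.

```r
# Assumption: built-in trees data; Girth is the column the script calls Diameter.
dat <- datasets::trees
Y  <- log(dat$Volume)
X1 <- log(dat$Girth)
X2 <- log(dat$Height)
lm_log <- lm(Y ~ X1 + X2)

# 95% confidence intervals for the coefficients
ci <- confint(lm_log, level = 0.95)
ci

# t statistics for H0: beta_X1 = 2 and H0: beta_X2 = 1
est <- coef(summary(lm_log))
t_X1 <- (est["X1", "Estimate"] - 2) / est["X1", "Std. Error"]
t_X2 <- (est["X2", "Estimate"] - 1) / est["X2", "Std. Error"]

# Two-sided p-values on the residual degrees of freedom
p_X1 <- 2 * pt(abs(t_X1), df.residual(lm_log), lower.tail = FALSE)
p_X2 <- 2 * pt(abs(t_X2), df.residual(lm_log), lower.tail = FALSE)
c(p_X1 = p_X1, p_X2 = p_X2)
```

From the summary above, the estimates 1.98265 and 1.11712 lie within about 0.23 and 0.57 standard errors of 2 and 1 respectively, so the data are compatible with the exponents implied by the cylinder formula.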
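# 3D scatterplot with the fitted regression plane
# (requires the scatterplot3d package):
library(scatterplot3d)
plot3d <- scatterplot3d(X1, X2, Y, angle=60, scale.y=0.7, pch=16,
                        color="red", main="Regression Plane")
plot3d$plane3d(lm_log, lty.box="solid")
plot3d$points3d(X1, X2, Y, col="blue", type="h", pch=16)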