Simple Linear Regression Exercises

 

17.6  In television's early years, most commercials were 60 seconds long. Now, however, commercials can be any length. The objective of commercials remains the same: to have as many viewers as possible remember the product in a favorable way and eventually buy it. In an experiment to determine how the length of a commercial affects people's memory of it, 60 randomly selected people were asked to watch a 1-hour television program. In the middle of the show, a commercial advertising a brand of toothpaste appeared. Some viewers watched a commercial that lasted for 20 seconds, others watched one that lasted for 24 seconds, 28 seconds, …, 60 seconds. The essential content of the commercials was the same. After the show, each person was given a test to measure how much he or she remembered about the product. The commercial times and test scores (on a 30-point test) are stored in file XR17-06.

 

a) Obtain a scatter diagram of the data to determine whether a linear model appears to be appropriate.

 

b) Determine the least squares line.

 

c) Interpret the coefficients.

 

To solve by hand:

x-bar length = 38
y-bar test = 13.8
SS length.test = 3060
SS length = 11440
SS test = 2829.6
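
The quantities listed under "To solve by hand" are the sample means, the sum of cross-deviations, and the two sums of squared deviations. As a minimal sketch of how they could be obtained from raw paired data, the Python function below computes all five. The function name `summary_sums` and the four data points are assumptions made for illustration only; they are not the contents of file XR17-06.

```python
from math import fsum


def summary_sums(x, y):
    """Return (x_bar, y_bar, ss_xy, ss_x, ss_y) for paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = fsum(x) / n
    y_bar = fsum(y) / n
    # Sum of cross-deviations and sums of squared deviations about the means.
    ss_xy = fsum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_x = fsum((xi - x_bar) ** 2 for xi in x)
    ss_y = fsum((yi - y_bar) ** 2 for yi in y)
    return x_bar, y_bar, ss_xy, ss_x, ss_y


# Made-up illustrative data (hypothetical), NOT the XR17-06 file.
lengths = [20, 24, 28, 32]
scores = [5, 9, 10, 14]
x_bar, y_bar, ss_xy, ss_x, ss_y = summary_sums(lengths, scores)
print(x_bar, y_bar, ss_xy, ss_x, ss_y)
```

With the real data file, the same call would be expected to reproduce the five values listed above.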

 

 

17.8  The growing interest in and use of the Internet has forced many companies into considering ways to sell their products on the web. Therefore, it is of interest to these companies to determine who is using the web. A statistician undertook a study to determine how education and Internet use are connected. She took a random sample of 200 adults (20 years of age and older) and asked each to report the years of education they had completed and the number of hours of Internet use the previous week. These data are stored in columns 1 and 2 (education and Internet use, respectively) in file XR17-08.

 

a) Perform a regression analysis to describe how the two variables are related.

 

b) Interpret the coefficients.

 

To solve by hand:

x-bar Education = 11.04
y-bar Internet = 6.67
SS Education.Internet = 612.92
SS Education = 776.1
SS Internet = 4409.84

 

17.20   Refer to Exercise 17.6.

 

a)  Determine the standard error of estimate and describe what this statistic tells you about the regression model.

 

b)  Determine the coefficient of determination. What does this statistic tell you about how well the linear regression model fits?

 

c)  Can we infer at the 5% significance level that the length of commercial and memory test score are linearly related?

 

17.22 Refer to Exercise 17.8.

a)  Determine the standard error of estimate, and describe what this statistic tells you about the regression line.

 

b) Can we conclude at the 1% significance level that educational level and Internet use are linearly related?

 

c) Determine the coefficient of determination and discuss what its value tells you about the two variables.

 

 

17.27  An economist wanted to investigate the relationship between office rents and vacancy rates. Accordingly, he took a random sample of monthly office rents and the percentage of vacant office space in 30 different cities. The results were stored in file XR17-27 (column 1 = vacancy rates in percent and column 2 = monthly rents in dollars per square foot).

 

a) Determine the regression line.

 

b) Interpret the coefficients.

 

c) Can we conclude at the 5% significance level that higher vacancy rates result in lower rents?

 

d) Measure how well the linear model fits the data. Discuss what this (these) measure(s) tells you.

 

To solve by hand:

x-bar Vacancy = 11.33
y-bar Rent = 17.20
SS Vacancy.Rent = -312.62
SS Vacancy = 1028.63
SS Rent = 325.96

 

17.42 Refer to Exercise 17.6.

 

a) Predict with 95% confidence the memory test score of a viewer who watches a 36-second commercial.

 

b) Estimate with 95% confidence the mean memory test score of people who watch 36-second commercials.

 

17.44 Refer to Exercise 17.8. Estimate with 90% confidence the mean amount of time spent on the Internet by people with 15 years of education.

 

Solutions

 

17.6


a) [Scatter diagram: memory test score versus commercial length]

b) \( b_1 = \dfrac{SS_{xy}}{SS_x} = \dfrac{3060}{11440} = .267 \); \( b_0 = \bar{y} - b_1\bar{x} = 13.8 - (.267)(38.0) = 3.65 \)

Excel Printout

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.5378
R Square             0.2893
Adjusted R Square    0.2770
Standard Error       5.89
Observations         60

ANOVA
              df     SS        MS       F        Significance F
Regression    1      818.5     818.5    23.61    0.0000
Residual      58     2011.1    34.7
Total         59     2829.6

              Coefficients    Standard Error    t Stat    P-value
Intercept     3.64            2.23              1.63      0.1078
Length        0.267           0.0551            4.86      0.0000


Sample regression line: \( \hat{y} = 3.64 + .267x \)

c) \( b_1 = .267 \): For each additional second of commercial length, the memory test score increases, on average, by .267 points. \( b_0 = 3.64 \) is the y-intercept.
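
The slope and intercept for Exercise 17.6 can be checked from the summary values alone. The following Python sketch is one way to do that; the function name `least_squares_line` is a hypothetical helper, not something taken from the textbook or from Excel.

```python
def least_squares_line(x_bar, y_bar, ss_xy, ss_x):
    """Return (b0, b1) for the sample regression line y-hat = b0 + b1*x."""
    if ss_x <= 0:
        raise ValueError("ss_x must be positive")
    b1 = ss_xy / ss_x          # slope: cross-deviations over SS of x
    b0 = y_bar - b1 * x_bar    # intercept: line passes through (x_bar, y_bar)
    return b0, b1


# Summary values listed for Exercise 17.6.
b0, b1 = least_squares_line(x_bar=38, y_bar=13.8, ss_xy=3060, ss_x=11440)
print(f"y-hat = {b0:.2f} + {b1:.3f}x")  # prints: y-hat = 3.64 + 0.267x
```

Carrying the unrounded slope into the intercept gives 3.64, which agrees with the Excel printout. Rounding the slope to .267 first gives 3.65, as in the hand calculation.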

 

 

17.8

a) \( b_1 = \dfrac{SS_{xy}}{SS_x} = \dfrac{612.92}{776.1} = .790 \); \( b_0 = \bar{y} - b_1\bar{x} = 6.67 - (.790)(11.04) = -2.05 \)

Excel Printout

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.3308
R Square             0.1094
Adjusted R Square    0.1050
Standard Error       4.45
Observations         200

ANOVA
              df     SS        MS       F        Significance F
Regression    1      482.7     482.7    24.33    0.0000
Residual      198    3927.8    19.8
Total         199    4410.6

              Coefficients    Standard Error    t Stat    P-value
Intercept     -2.03           1.79              -1.14     0.2575
Education     0.788           0.160             4.93      0.0000


Sample regression line: \( \hat{y} = -2.03 + .788x \)

b) \( b_1 = .788 \): For each additional year of education, Internet use increases, on average, by .788 hour per week. \( b_0 = -2.03 \) is the y-intercept.

 

17.20

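a) \( SSE = SS_y - \dfrac{SS_{xy}^2}{SS_x} = 2829.6 - \dfrac{(3060)^2}{11440} \approx 2011.3 \) (Excel: 2011.1)

\( s_\varepsilon = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{2011.3}{58}} = 5.89 \)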
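
As a cross-check on part a), the sum of squares for error and the standard error of estimate can be computed from the summary values. This is a minimal sketch assuming the shortcut SSE = SS_y - SS_xy^2 / SS_x; the function name `standard_error_of_estimate` is hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(ss_xy, ss_x, ss_y, n):
    """Return (sse, s_e), with SSE = SS_y - SS_xy**2 / SS_x."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    sse = ss_y - ss_xy ** 2 / ss_x   # variation in y left unexplained
    s_e = sqrt(sse / (n - 2))        # n - 2 degrees of freedom
    return sse, s_e


# Summary values listed for Exercise 17.6 (n = 60 viewers).
sse, s_e = standard_error_of_estimate(ss_xy=3060, ss_x=11440, ss_y=2829.6, n=60)
print(f"SSE = {sse:.1f}, s_e = {s_e:.2f}")  # prints: SSE = 2011.1, s_e = 5.89
```

Without intermediate rounding the result matches the Excel value of 2011.1, and the standard error of estimate is 5.89 either way.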

 

b) \( R^2 = 1 - \dfrac{SSE}{SS_y} = 1 - \dfrac{2011.3}{2829.6} = .2892 \) (Excel: .2893)

 

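c) \( H_0: \beta_1 = 0 \)

\( H_1: \beta_1 \neq 0 \)

\( s_{b_1} = \dfrac{s_\varepsilon}{\sqrt{SS_x}} = \dfrac{5.89}{\sqrt{11440}} = .0551 \)

\( t = \dfrac{b_1 - \beta_1}{s_{b_1}} = \dfrac{.267 - 0}{.0551} = 4.85 \) (Excel: 4.86, p-value = 0)

Rejection region: \( t > t_{\alpha/2,\,n-2} = t_{.025,58} \approx 2.000 \) or \( t < -2.000 \)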

Conclusion: Reject the null hypothesis. There is enough evidence to infer that the length of the commercial and memory test scores are linearly related.
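
The test in part c) can be sketched in Python as follows. The function name `slope_t_test` is hypothetical, and the critical value is passed in from a t table (2.000 here) rather than computed, so that only the standard library is needed.

```python
from math import sqrt


def slope_t_test(b1, s_e, ss_x, t_critical):
    """Two-tail test of H0: beta1 = 0. Returns (s_b1, t_stat, reject)."""
    s_b1 = s_e / sqrt(ss_x)          # standard error of the slope
    t_stat = (b1 - 0) / s_b1         # hypothesized slope is 0
    reject = abs(t_stat) > t_critical
    return s_b1, t_stat, reject


# Rounded hand values for Exercise 17.6; t_critical read from a t table
# for alpha/2 = .025 with roughly 58 degrees of freedom.
s_b1, t_stat, reject = slope_t_test(b1=0.267, s_e=5.89, ss_x=11440, t_critical=2.000)
print(f"s_b1 = {s_b1:.4f}, t = {t_stat:.2f}, reject H0: {reject}")
```

Because the inputs are the rounded hand values, the test statistic comes out as 4.85 rather than Excel's 4.86. The decision to reject is the same.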

 

17.22

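a) \( SSE = SS_y - \dfrac{SS_{xy}^2}{SS_x} = 4409.84 - \dfrac{(612.92)^2}{776.1} = 3925.8 \) (Excel: 3927.8)

\( s_\varepsilon = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{3925.8}{198}} = 4.45 \)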

 

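b) \( H_0: \beta_1 = 0 \)

\( H_1: \beta_1 \neq 0 \)

\( s_{b_1} = \dfrac{s_\varepsilon}{\sqrt{SS_x}} = \dfrac{4.45}{\sqrt{776.1}} = .16 \)

\( t = \dfrac{b_1 - \beta_1}{s_{b_1}} = \dfrac{.790 - 0}{.16} = 4.94 \) (Excel: 4.93, p-value = 0)

Rejection region: \( t > t_{\alpha/2,\,n-2} = t_{.005,198} \approx 2.601 \) or \( t < -2.601 \)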

Conclusion: Reject the null hypothesis. There is enough evidence to infer that educational level and Internet use are linearly related.

 

c) \( R^2 = 1 - \dfrac{SSE}{SS_y} = 1 - \dfrac{3925.8}{4409.84} = .1098 \) (Excel: .1094)
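
The coefficient of determination can be computed in two algebraically equivalent ways from the summary values. The Python sketch below checks that both agree for Exercise 17.8; the function names are hypothetical helpers.

```python
def r_squared_from_sse(sse, ss_y):
    """R^2 as the share of variation in y that is NOT left in the residuals."""
    return 1 - sse / ss_y


def r_squared_from_sums(ss_xy, ss_x, ss_y):
    """R^2 directly from the sums of squares and cross-deviations."""
    return ss_xy ** 2 / (ss_x * ss_y)


# Summary values listed for Exercise 17.8.
ss_xy, ss_x, ss_y = 612.92, 776.1, 4409.84
sse = ss_y - ss_xy ** 2 / ss_x

r2_a = r_squared_from_sse(sse, ss_y)
r2_b = r_squared_from_sums(ss_xy, ss_x, ss_y)
print(f"{r2_a:.4f} {r2_b:.4f}")
```

Both forms give about .11, so roughly 11% of the variation in Internet use is explained by the variation in years of education.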

 

17.27

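a) \( b_1 = \dfrac{SS_{xy}}{SS_x} = \dfrac{-312.62}{1028.63} = -.304 \); \( b_0 = \bar{y} - b_1\bar{x} = 17.20 - (-.304)(11.33) = 20.64 \)

Sample regression line: \( \hat{y} = 20.64 - .304x \)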

 

b) \( b_1 = -.304 \): For each one-percentage-point increase in the vacancy rate, the monthly rent decreases, on average, by $.304 (30.4 cents) per square foot. The intercept \( b_0 = 20.64 \) cannot be interpreted.

 

Excel Printout

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.5396