Practice Questions: Statistics for Business and Economics


12. Multiple Regression

1. A study was conducted to assess the influence of various factors on the start of new firms in the agricultural industry. For a sample of 70 countries the following model was estimated:

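$$\hat{y} = -59.31 + \underset{(1.156)}{4.983}x_1 + \underset{(0.210)}{2.198}x_2 + \underset{(2.063)}{3.816}x_3 - \underset{(0.330)}{0.310}x_4 - \underset{(3.055)}{0.886}x_5 + \underset{(1.568)}{3.215}x_6 + \underset{(0.354)}{0.85}x_7$$

$R^2 = 0.766$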

where:

$\hat{y}$ = new business starts in the industry

x1 = population in millions

x2 = industry size

x3 = measure of economic quality of life

x4 = measure of political quality of life

x5 = measure of environmental quality of life

x6 = measure of health and educational quality of life

x7 = measure of social quality of life

The numbers in parentheses under the coefficients are the estimated coefficient standard errors.

a. Interpret the estimated regression coefficients.

b. Interpret the coefficient of determination.

c. Find a 90% confidence interval for the increase in new business starts resulting from a one-unit increase in the economic quality of life, with all other variables unchanged.

d. Test, against a two-sided alternative at the 5% level, the null hypothesis that, all else remaining equal, the environmental quality of life does not influence new business starts.

e. Test, against a two-sided alternative at the 5% level, the null hypothesis that, all else remaining equal, the health and educational quality of life does not influence new business starts.

f. Test the null hypothesis that, taken together, these seven independent variables do not influence new business starts.
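
Parts c through f rely on only a few formulas: the t ratio b / s_b, the interval b ± t · s_b, and the overall F statistic computed from R². The following is a minimal Python sketch under stated assumptions, not a worked solution: the standard errors are read as (2.063), (3.055) and (1.568) for x3, x5 and x6, the function names are illustrative, and the critical value 1.670 is an approximate t-table figure for 62 degrees of freedom that should be checked against your own table.

```python
# Summary figures quoted in the exercise: n = 70 countries, K = 7 regressors.
n, K, r2 = 70, 7, 0.766
coef = {"x3": 3.816, "x5": -0.886, "x6": 3.215}
se = {"x3": 2.063, "x5": 3.055, "x6": 1.568}


def t_ratio(b, s):
    """t statistic for H0: the population coefficient is 0."""
    return b / s


def conf_interval(b, s, t_crit):
    """Two-sided interval b +/- t_crit * s."""
    return (b - t_crit * s, b + t_crit * s)


def overall_f(r2, n, K):
    """F statistic for H0: all K slope coefficients are 0."""
    return (r2 / K) / ((1 - r2) / (n - K - 1))


df_error = n - K - 1                      # error degrees of freedom
t_x5 = t_ratio(coef["x5"], se["x5"])      # part d
t_x6 = t_ratio(coef["x6"], se["x6"])      # part e
f_stat = overall_f(r2, n, K)              # part f

T_CRIT_90 = 1.670  # ASSUMPTION: approximate table value, 62 df, 5% per tail
ci_x3 = conf_interval(coef["x3"], se["x3"], T_CRIT_90)  # part c

print(f"error df = {df_error}")
print(f"t(x5) = {t_x5:.3f}, t(x6) = {t_x6:.3f}")
print(f"F = {f_stat:.2f} on ({K}, {df_error}) df")
print(f"90% CI for x3 coefficient: ({ci_x3[0]:.3f}, {ci_x3[1]:.3f})")
```

Each t ratio is compared with the two-sided 5% critical value for 62 degrees of freedom, and the F statistic with the upper-tail critical value of F with 7 and 62 degrees of freedom.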

2. Based on 25 years of annual data, an attempt was made to explain savings in Japan. The model fitted was as follows:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$$

where

y = change in real deposit rate

x1 = change in real per capita income

x2 = change in real interest rate

The least squares parameter estimates (with standard errors in parentheses) were (Ghatak and Deadman 1989) as follows:

$$b_1 = 0.0974\ (0.0215) \qquad b_2 = 0.374\ (0.209)$$

The adjusted coefficient of determination was as follows:

$\bar{R}^2 = .91$

a. Find and interpret a 99% confidence interval for $\beta_1$.

b. Test, against the alternative that it is positive, the null hypothesis that $\beta_2$ is 0.

c. Find the coefficient of determination.

d. Test the null hypothesis that $\beta_1 = \beta_2 = 0$.

e. Find and interpret the coefficient of multiple correlation.
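
Parts c through e all follow from the adjusted coefficient of determination once it is converted back to the unadjusted R². The sketch below assumes the usual definition, adjusted R² = 1 - (1 - R²)(n - 1)/(n - K - 1), with n = 25 observations and K = 2 regressors; the function and variable names are illustrative.

```python
import math

n, K, adj_r2 = 25, 2, 0.91


def r2_from_adjusted(adj_r2, n, K):
    """Invert adj_R2 = 1 - (1 - R2) * (n - 1) / (n - K - 1)."""
    return 1 - (1 - adj_r2) * (n - K - 1) / (n - 1)


r2 = r2_from_adjusted(adj_r2, n, K)              # part c
f_stat = (r2 / K) / ((1 - r2) / (n - K - 1))     # part d, on (K, n-K-1) df
multiple_r = math.sqrt(r2)                       # part e

print(f"R^2 = {r2:.4f}")
print(f"F = {f_stat:.2f} on ({K}, {n - K - 1}) df")
print(f"R = {multiple_r:.4f}")
```

The coefficient of multiple correlation is the correlation between the observed and the fitted values of the dependent variable.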

3. Based on data from 63 countries, the following model was estimated by least squares:

$$\hat{y} = 0.58 - \underset{(.019)}{.052}x_1 - \underset{(.042)}{.005}x_2 \qquad R^2 = .17$$

where:

$\hat{y}$ = growth rate in real gross domestic product

x1 = real income per capita

x2 = average tax rate, as a proportion of gross national product

The numbers in parentheses under the coefficients are the estimated coefficient standard errors.

a. Test against a two-sided alternative the null hypothesis that $\beta_1$ is 0. Interpret your result.

b. Test against a two-sided alternative the null hypothesis that $\beta_2$ is 0. Interpret your result.

c. Interpret the coefficient of determination.

d. Find and interpret the coefficient of multiple correlation.

13. Additional Topics in Regression Analysis

1. The following model was fitted to data on 90 French technical companies:

$$\hat{y} = 0.819 + \underset{(1.79)}{2.11}x_1 + \underset{(.94)}{0.96}x_2 - \underset{(0.144)}{0.059}x_3 + \underset{(4.08)}{5.87}x_4 + \underset{(0.00115)}{0.00226}x_5$$

$R^2 = .410$

where the numbers in parentheses are estimated coefficient standard errors and

y = share price

x1 = earnings per share

x2 = funds flow per share

x3 = dividends per share

x4 = book value per share

x5 = a measure of growth

a. Test at the 10% level the null hypothesis that the coefficient on x1 is 0 in the population regression against the alternative that the true coefficient is positive.

b. Test at the 10% level the null hypothesis that the coefficient on x2 is 0 in the population regression against the alternative that the true coefficient is positive.

c. The variable X2 was dropped from the original model, and the regression of Y on (X1, X3, X4, X5) was estimated. The estimated coefficient on X1 was 2.95 with standard error 0.63. How can this result be reconciled with the conclusion of part a?

2. A market researcher is interested in the average amount of money per year spent by students on books. From 30 years of annual data, the following regression was estimated by least squares:

$$y_t = 40.93 + \underset{(0.106)}{0.253}x_t + \underset{(0.134)}{0.546}y_{t-1} \qquad d = 1.86$$

where

yt = expenditure per student, in dollars, on books

xt = disposable income per student, in dollars, after payment of tuition, fees, and room and board

The numbers below the coefficients are the coefficient standard errors.

a. Find a 95% confidence interval for the coefficient on xt in the population regression.

b. What would be the expected impact over time of a $1 increase in disposable income per student on expenditure on books?

c. Test the null hypothesis of no autocorrelation in the errors against the alternative of positive autocorrelation.
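
Two calculations in this exercise are specific to models with a lagged dependent variable. The total effect over time of a one-unit increase in x is the coefficient on x divided by one minus the coefficient on the lagged y. Because the Durbin-Watson d is unreliable when a lagged dependent variable is present, Durbin's h statistic is commonly used instead. The sketch below assumes the standard errors are (0.106) and (0.134), takes n = 30 as the sample size, and uses illustrative variable names.

```python
import math

n = 30            # years of annual data
b_x = 0.253       # coefficient on x_t
b_lag = 0.546     # coefficient on y_{t-1}
se_lag = 0.134    # standard error of the lagged-y coefficient
d = 1.86          # reported Durbin-Watson statistic

# Part b: total expected effect over time of a $1 increase in x_t.
long_run_effect = b_x / (1 - b_lag)

# Part c: Durbin's h, with r = 1 - d/2 as the estimated autocorrelation.
r = 1 - d / 2
denominator = 1 - n * se_lag ** 2
if denominator <= 0:
    raise ValueError("Durbin's h is undefined when n * se^2 >= 1")
h = r * math.sqrt(n / denominator)

print(f"long-run effect = {long_run_effect:.3f}")
print(f"r = {r:.3f}, h = {h:.3f}")
```

Under the null hypothesis of no autocorrelation, h is approximately standard normal in large samples, so positive autocorrelation is indicated when h exceeds the upper-tail normal critical value.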

15. Analysis of Variance

1. In a study to estimate the effects of drinking alcohol on routine health risk, employees were classified as heavy drinkers, people who had recently cut back on alcohol, long-term drinkers, and those who never drank alcohol. Samples of 96, 34, 86, and 206 members of these groups were taken. The sample mean health risk rates per month were found to be 2.15, 2.21, 1.47, and 1.69, respectively.

The F ratio calculated from these data was 2.56.

a. Prepare the complete analysis of variance table.

b. Test the null hypothesis of equality of the four population mean health risk rates.
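
The analysis of variance table can be rebuilt from the group sizes, the group means, and the reported F ratio: the between-groups sum of squares comes directly from the means, and the within-groups mean square is the between-groups mean square divided by F. A minimal Python sketch of that arithmetic, with illustrative variable names:

```python
sizes = [96, 34, 86, 206]
means = [2.15, 2.21, 1.47, 1.69]
f_ratio = 2.56

n = sum(sizes)            # total sample size
k = len(sizes)            # number of groups

grand_mean = sum(ni * mi for ni, mi in zip(sizes, means)) / n

# Between-groups sum of squares from the group means.
ssg = sum(ni * (mi - grand_mean) ** 2 for ni, mi in zip(sizes, means))
df_g, df_w = k - 1, n - k
msg = ssg / df_g

# F = MSG / MSW, so the within-groups mean square follows from F.
msw = msg / f_ratio
ssw = msw * df_w
sst = ssg + ssw

print(f"{'Source':<10}{'SS':>12}{'df':>6}{'MS':>10}{'F':>8}")
print(f"{'Between':<10}{ssg:>12.3f}{df_g:>6}{msg:>10.3f}{f_ratio:>8.2f}")
print(f"{'Within':<10}{ssw:>12.3f}{df_w:>6}{msw:>10.3f}")
print(f"{'Total':<10}{sst:>12.3f}{n - 1:>6}")
```

The reported F ratio is then compared with the critical value of F with 3 and 418 degrees of freedom.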

2. For the two-way analysis of variance model with one observation per cell, write the observation from the ith group and jth block as

$$X_{ij} = \mu + G_i + B_j + \varepsilon_{ij}$$

Refer to Exercise 15.65 and consider the observation on agent B and house 1 ($x_{21} = 218$).

a. Estimate $\mu$.

b. Estimate and interpret $G_2$.

c. Estimate and interpret $B_1$.

d. Estimate $\varepsilon_{21}$.
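
The data of Exercise 15.65 are not reproduced here, so no numbers can be given. As a reminder, the usual estimates of the overall mean, group effect, block effect, and error term in the one-observation-per-cell model are built from the overall mean, the group means, and the block means, where a dot marks the index that has been averaged over:

```latex
\begin{align*}
\hat{\mu} &= \bar{x}_{\cdot\cdot} \\
\hat{G}_i &= \bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot} \\
\hat{B}_j &= \bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot} \\
\hat{\varepsilon}_{ij} &= x_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x}_{\cdot\cdot}
\end{align*}
```

By construction the four estimates add back up to the observation itself.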

16. Time-Series Analysis and Forecasting

1. In some experiments with several observations per cell the analyst is prepared to assume that there is no interaction between groups and blocks. Any apparent interaction found is then attributed to random error.

When such an assumption is made, the analysis is carried out in the usual way, except that what were previously the interaction and error sums of squares are now added together to form a new error sum of squares. Similarly, the corresponding degrees of freedom are added. If the assumption of no interaction is correct, this approach has the advantage of providing more error degrees of freedom and, hence, more powerful tests of the equality of group and block means.

For the study of Exercise 15.47, suppose that we now make the assumption of no interaction between dormitory ratings and student years.

a. State, in your own words, what is implied by this assumption.

b. Given this assumption, set up the new analysis of variance table.

c. Test the null hypothesis that the population mean ratings are the same for all dormitories.

d. Test the null hypothesis that the population mean ratings are the same for all four student years.
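
The pooling step described above is a small piece of arithmetic. The sketch below uses purely hypothetical sums of squares and degrees of freedom, since the data of Exercise 15.47 are not reproduced here; the function names are also illustrative.

```python
def pool_interaction_into_error(ss_inter, df_inter, ss_error, df_error):
    """Add interaction SS and df to the error SS and df; return (SS, df, MS)."""
    ss = ss_inter + ss_error
    df = df_inter + df_error
    return ss, df, ss / df


def f_ratio(ss_effect, df_effect, ms_error):
    """F statistic for a group or block effect against the pooled error."""
    return (ss_effect / df_effect) / ms_error


# HYPOTHETICAL placeholder values, not taken from Exercise 15.47.
ss_pooled, df_pooled, ms_pooled = pool_interaction_into_error(
    ss_inter=12.0, df_inter=6, ss_error=36.0, df_error=12
)
f_groups = f_ratio(ss_effect=20.0, df_effect=3, ms_error=ms_pooled)

print(f"pooled error: SS = {ss_pooled}, df = {df_pooled}, MS = {ms_pooled:.4f}")
print(f"F for groups = {f_groups:.3f} on (3, {df_pooled}) df")
```

With the real table, the same two calls are made once for the dormitory effect and once for the student-year effect, each against the pooled error mean square.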

2. In a study to estimate the effects of smoking on routine health risk, employees were classified as continuous smokers, recent ex-smokers, long-term ex-smokers, and those who never smoked. Samples of 96, 34, 86, and 206 members of these groups were taken. The sample mean health risk rates per month were found to be 2.15, 2.21, 1.47, and 1.69, respectively.

The F ratio calculated from these data was 2.56.

a. Prepare the complete analysis of variance table.

b. Test the null hypothesis of equality of the four population mean health risk rates.

17. Additional Topics in Sampling

1. A hospital has 100 doctors' offices. Information was obtained from the individuals responsible for managing correspondence in 61 of these offices. Of these, 38 specified a minimum number of complaints that must be received on an issue before action is undertaken.

a. Assume these observations constitute a random sample from the population, and find a 90% confidence interval for the proportion of all doctors' offices with this policy.

b. In fact, information was not obtained from a random sample of doctors' offices. Questionnaires were sent to all 100 offices, but only 61 responded. How does this information influence your view of the answer to part (a)?
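
Because 61 of only 100 offices were sampled, the interval in part (a) should include a finite population correction. The sketch below assumes one common textbook form of the variance estimate, p(1 - p)/(n - 1) multiplied by (N - n)/N; some texts use a slightly different correction factor, so check which one your course uses. Variable names are illustrative.

```python
import math
from statistics import NormalDist

N = 100   # population size (all offices)
n = 61    # offices that supplied information
x = 38    # offices with a minimum-complaints policy

p_hat = x / n

# ASSUMPTION: variance estimate with finite population correction (N - n) / N.
var_p = p_hat * (1 - p_hat) / (n - 1) * (N - n) / N
se_p = math.sqrt(var_p)

z = NormalDist().inv_cdf(0.95)   # 90% two-sided interval, 5% in each tail
ci = (p_hat - z * se_p, p_hat + z * se_p)

print(f"p_hat = {p_hat:.4f}, se = {se_p:.4f}")
print(f"90% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```

The interval is only meaningful if the 61 responses behave like a random sample, which is exactly what part (b) calls into question.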

2. Discuss the advantages and disadvantages of various sampling designs that might be used to select ballots to be recounted in a close election.
Follow the author: Dara Yapp