Statistics 359a Regression Analysis

Necessary Background Knowledge - Statistics

• expectations of sums
• variances of sums
• distributions of sums of normal random variables
• t distribution – assumptions and use
• calculation of confidence intervals
• simple tests of hypotheses and p-values

Necessary Background Knowledge – Linear Algebra

• multiplication of conformable matrices
• transpose of a matrix
• determinant of a square matrix
• inverse of a square matrix
• eigenvalues of a square matrix
• quadratic forms

Origin of Least Squares

Introduction of the metric system and the length of a meter

• 1790 – French National Assembly commissions the French Academy of Sciences to design a simple decimal-based system of weights and measures

• 1791 – French Academy defines the meter to be $10^{-7}$, or one ten-millionth, of the length of the meridian through Paris from the north pole to the equator.

Adrien-Marie Legendre

• Legendre on the French commission in 1792 to determine the length of the meridian quadrant

• measurements of latitude made in 1795

• complex calculations made from the measurements in 1799

• Legendre proposes the method of least squares in 1805 to determine the length of a meter

Data

• old French units of measurement: 1 module = 2 toises
• old French to imperial English: 1 toise = 6.395 feet
• metric to imperial: 1 meter = 3.2808 feet

From Spherical Geometry

• one unknown gives D, the length of one degree of an arc in modules (scaled by 28500); the other is related to the ellipticity of the earth
• arc length S is related to the length of the meridian quadrant (90D)
• the model equation involves S/28500 and sin(·), cos(·) terms in the latitudes L

Including measurement errors, the data and model reduce to five equations in C_1, ..., C_5, with the following numerical terms:

C_1:  0.003398  (4.912)  (0.590)
C_2:  0.000475  (2.720)  (0.027)
C_3:  0.002625  (0.048)  (0.324)
C_4:  0.001529  (2.914)  (0.277)
C_5:  0.000279  (4.765)  (0.014)

Solution is:

D = 28497.78 modules
90D = 2564800.2 modules = length of the meridian quadrant

Therefore

1 meter = 0.256480 modules = 0.512960 toises = 3.280 feet
modern meter = 3.2808 feet
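
The chain of conversions from D to the length of a meter in feet can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted on the slides; the variable names are illustrative and not part of the original.

```python
# Unit conversions quoted in the Data slide.
TOISES_PER_MODULE = 2.0   # 1 module = 2 toises
FEET_PER_TOISE = 6.395    # 1 toise = 6.395 feet

D = 28497.78              # least-squares estimate: modules per degree of arc

quadrant_modules = 90 * D                  # meridian quadrant, pole to equator
meter_modules = quadrant_modules * 1e-7    # a meter is 10^-7 of the quadrant
meter_toises = meter_modules * TOISES_PER_MODULE
meter_feet = meter_toises * FEET_PER_TOISE

print(f"quadrant = {quadrant_modules:.1f} modules")
print(f"1 meter  = {meter_modules:.6f} modules"
      f" = {meter_toises:.6f} toises = {meter_feet:.3f} feet")
```

The result, about 3.280 feet, agrees with the modern value of 3.2808 feet to three decimal places.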

Origin of the Term “Regression”

• Francis Galton, 1886, ‘Regression towards mediocrity in hereditary stature.’ Journal of the Anthropological Institute, 15: 246 – 263

• See JSTOR under UWO library databases

Data on Heights of Children and Parents

‘Regression Line’

Theoretical Basis

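For X and Y bivariate normal with equal means $\mu$, equal variances $\sigma^2$ and correlation $\rho$:

$$E(Y \mid X = x) = \mu + \rho\,(x - \mu)$$

$$E(Y \mid X = x) = x - (1 - \rho)\,(x - \mu)$$

For $\rho > 0$, $E(Y \mid X = x) < x$ for $x > \mu$ and $E(Y \mid X = x) > x$ for $x < \mu$.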
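
The conditional mean for the equal-mean, equal-variance bivariate normal case, E(Y | X = x) = μ + ρ(x − μ), can be evaluated directly to see the pull towards the mean. A minimal sketch follows; the function name `conditional_mean` and the values μ = 68 and ρ = 0.5 are hypothetical choices for illustration, not Galton's estimates.

```python
def conditional_mean(x, mu, rho):
    """E(Y | X = x) for bivariate normal X, Y with common mean mu,
    common variance, and correlation rho."""
    return mu + rho * (x - mu)


mu, rho = 68.0, 0.5  # hypothetical mean height (inches) and correlation

for x in (64.0, 68.0, 72.0):
    ey = conditional_mean(x, mu, rho)
    # The shift ey - x equals -(1 - rho) * (x - mu): towards mu on both sides.
    print(f"x = {x:5.1f}   E(Y|X=x) = {ey:5.1f}   shift = {ey - x:+.1f}")
```

With these illustrative values, a parent at x = 72 has an expected child height of 70, and a parent at x = 64 has an expected child height of 66: both are closer to the mean than the parent.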

Example in Data Analysis Through Regression

• Relationship between the price of a violin bow and its attributes such as age, shape and ornamentation on the bow

Violin Bow Example

The following data on violin bows made by W.E. Hill and Sons of London, England are taken from the internet site www.maestronet.com/pricehist.html. The data show the prices of the bows sold at auction at Sotheby’s auction house for the years 1994-97. Also given are data on various factors that may affect the price of the bow. These include: the year of the sale (in case of price inflation or deflation); the year of manufacture (or age – are antique bows more or less valuable?); weight of the bow in grams (do buyers like heavier or lighter bows?); the shape of the bow (is there an aesthetic effect on the price?); presence or absence of ornamental gold; presence or absence of ornamental pearl; and whether the bow has a tortoiseshell frog or an ebony frog. Only the bows for which the approximate year of manufacture has been given are included in the data set. Prices from other auction houses and for other bow makers, as well as violins, are available at the same site, but only Sotheby’s gives the year of manufacture. A Minitab file of the data is at O:\359\bows.mtb.

Shape: O = octagonal, R = round. Gold Accessories, Tortoise-shell Frog, Pearl Accessories: Y = present, N = absent.

Price in       Year of   Year the Bow   Weight in           Gold          Tortoise-shell   Pearl
U.S. Dollars   Sale      was Made       Grams       Shape   Accessories   Frog             Accessories
1874           1997      1957           59.0        O       N             N                N
2436           1997      1935           62.0        R       N             N                N
7498           1997      1920           62.0        R       Y             Y                N
1142           1996      1945           59.5        O       N             N                Y
1935           1996      1890           57.5        R       N             N                N
1759           1996      1900           56.0        O       N             N                N
5278           1996      1950           57.0        O       Y             Y                Y
4905           1995      1920           58.0        R       Y             N                N
7994           1995      1920           60.0        O       Y             Y                Y
2543           1995      1926           62.5        R       N             N                Y
1769           1994      1935           61.0        R       N             N                N
1592           1994      1960           61.0        R       N             N                Y
3716           1994      1935           55.0        O       Y             Y                Y
2477           1994      1925           59.0        R       N             N                Y
2654           1994      1930           58.0        R       N             N                N
3362           1994      1935           58.0        R       N             Y                Y

Price and Date of Sale

• 1995 seems to be a more expensive year
• Is the effect confounded with some other attribute common to 1995?

[Figure: Violin Bows - Price and Sale Date. Price (1000 to 8000) plotted against Year Sold (1994 to 1997).]
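
One way to look at the confounding question is to tabulate, for each sale year, the mean price alongside the share of bows with some other attribute. The sketch below is an assumption about how one might check this in Python (the course itself points to a Minitab file); the rows are transcribed from the table of 16 bows, and gold accessories are chosen here only as one illustrative candidate attribute.

```python
from collections import defaultdict

# Columns: price, year sold, year made, weight (g), shape, gold, frog, pearl
DATA = """
1874 1997 1957 59.0 O N N N
2436 1997 1935 62.0 R N N N
7498 1997 1920 62.0 R Y Y N
1142 1996 1945 59.5 O N N Y
1935 1996 1890 57.5 R N N N
1759 1996 1900 56.0 O N N N
5278 1996 1950 57.0 O Y Y Y
4905 1995 1920 58.0 R Y N N
7994 1995 1920 60.0 O Y Y Y
2543 1995 1926 62.5 R N N Y
1769 1994 1935 61.0 R N N N
1592 1994 1960 61.0 R N N Y
3716 1994 1935 55.0 O Y Y Y
2477 1994 1925 59.0 R N N Y
2654 1994 1930 58.0 R N N N
3362 1994 1935 58.0 R N Y Y
"""

# Group (price, has_gold) pairs by year of sale.
by_year = defaultdict(list)
for line in DATA.split("\n"):
    if not line.strip():
        continue
    fields = line.split()
    by_year[int(fields[1])].append((int(fields[0]), fields[5] == "Y"))

summary = {}
for year in sorted(by_year):
    bows = by_year[year]
    n = len(bows)
    mean_price = sum(price for price, _ in bows) / n
    gold_share = sum(gold for _, gold in bows) / n
    summary[year] = (n, mean_price, gold_share)
    print(f"{year}: n = {n}, mean price = {mean_price:7.1f}, "
          f"share with gold = {gold_share:.2f}")
```

In these 16 rows, 1995 has both the highest mean price and the highest share of bows with gold accessories (2 of 3), so gold is one plausible candidate for the confounding; with so few bows per year this is suggestive only.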

Price and Year of Manufacture

• Is there anything special about 1920?
• Is there a quadratic trend in the data?

[Figure: Violin Bows - Price and Year of Manufacture. Price (1000 to 8000) plotted against Year Made (1890 to 1960).]
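
A quadratic trend can be examined by fitting price = b0 + b1 t + b2 t² by least squares, here through the normal equations (X'X) b = X'y. This is a minimal sketch, assuming t = (year made − 1930)/10 as a centred and rescaled year (a choice made here for numerical convenience, not taken from the slides); the helper names `solve` and `least_squares` are hypothetical, and the data pairs are transcribed from the table of 16 bows.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# (year made, price in U.S. dollars) for the 16 bows.
PAIRS = [(1957, 1874), (1935, 2436), (1920, 7498), (1945, 1142),
         (1890, 1935), (1900, 1759), (1950, 5278), (1920, 4905),
         (1920, 7994), (1926, 2543), (1935, 1769), (1960, 1592),
         (1935, 3716), (1925, 2477), (1930, 2654), (1935, 3362)]

t_values = [(year - 1930) / 10 for year, _ in PAIRS]  # decades from 1930
prices = [float(price) for _, price in PAIRS]
X = [[1.0, t, t * t] for t in t_values]

b0, b1, b2 = least_squares(X, prices)
residuals = [y - (b0 + b1 * t + b2 * t * t) for t, y in zip(t_values, prices)]
print(f"b0 = {b0:.1f}, b1 = {b1:.1f}, b2 = {b2:.1f}")
```

A negative b2 would correspond to prices peaking for bows made near the middle of the range; whether the coefficient is distinguishable from zero needs the usual t test on the fitted model, which this sketch does not perform.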

Price and Weight of the Bow

• Is there any trend with respect to the weight?

[Figure: Violin Bows - Price and Weight in Grams. Price (1000 to 8000) plotted against Weight (55 to 63 grams).]
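
The question of a trend in weight can be put numerically with a simple linear regression of price on weight. The following is a sketch using the textbook formulas slope = Sxy/Sxx and intercept = ȳ − slope·x̄; the variable names are illustrative, and the data pairs are transcribed from the table of 16 bows.

```python
# (weight in grams, price in U.S. dollars) for the 16 bows.
PAIRS = [(59.0, 1874), (62.0, 2436), (62.0, 7498), (59.5, 1142),
         (57.5, 1935), (56.0, 1759), (57.0, 5278), (58.0, 4905),
         (60.0, 7994), (62.5, 2543), (61.0, 1769), (61.0, 1592),
         (55.0, 3716), (59.0, 2477), (58.0, 2654), (58.0, 3362)]

n = len(PAIRS)
x_bar = sum(x for x, _ in PAIRS) / n
y_bar = sum(y for _, y in PAIRS) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - x_bar) ** 2 for x, _ in PAIRS)
syy = sum((y - y_bar) ** 2 for _, y in PAIRS)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in PAIRS)

slope = sxy / sxx                    # dollars per gram
intercept = y_bar - slope * x_bar
r = sxy / (sxx * syy) ** 0.5         # sample correlation

residuals = [y - (intercept + slope * x) for x, y in PAIRS]
print(f"slope = {slope:.1f} dollars per gram, r = {r:.3f}")
```

A correlation near zero would support the impression of no clear trend in the scatter plot, although a single-variable fit ignores the other attributes of each bow.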

Octagonal vs. Round Bows

• No apparent trend

[Figure: Violin Bows - Price and Shape (1 = round, 0 = octagonal). Shape indicator plotted against Price (1000 to 8000).]

The Gold Standard?

• The presence of gold on a bow generally makes it more expensive

[Figure: Violin Bows - Price and Gold Accessories (1 = present, 0 = absent). Gold indicator plotted against Price (1000 to 8000).]

Tortoise Shell Frogs

• Some evidence of added expense for tortoise shell

[Figure: Violin Bows - Price and Tortoise Shell Frogs (1 = present, 0 = absent). Frog indicator plotted against Price (1000 to 8000).]

Price and Pearl Accessories

• No apparent effect

[Figure: Violin Bows - Price and Pearl Accessories (1 = present, 0 = absent). Pearl indicator plotted against Price (1000 to 8000).]
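
The visual impressions for the four 0/1 attributes (shape, gold, tortoise-shell frog, pearl) can be compared by computing the mean price with and without each attribute. This is a sketch under the coding used in the plots (1 = round for shape, 1 = present otherwise); the names `COLUMNS` and `differences` are illustrative, and the rows are transcribed from the table of 16 bows.

```python
# Columns: price, year sold, year made, weight (g), shape, gold, frog, pearl
DATA = """
1874 1997 1957 59.0 O N N N
2436 1997 1935 62.0 R N N N
7498 1997 1920 62.0 R Y Y N
1142 1996 1945 59.5 O N N Y
1935 1996 1890 57.5 R N N N
1759 1996 1900 56.0 O N N N
5278 1996 1950 57.0 O Y Y Y
4905 1995 1920 58.0 R Y N N
7994 1995 1920 60.0 O Y Y Y
2543 1995 1926 62.5 R N N Y
1769 1994 1935 61.0 R N N N
1592 1994 1960 61.0 R N N Y
3716 1994 1935 55.0 O Y Y Y
2477 1994 1925 59.0 R N N Y
2654 1994 1930 58.0 R N N N
3362 1994 1935 58.0 R N Y Y
"""

# Attribute name -> (column index, value coded as 1 in the plots).
COLUMNS = {
    "shape (round)": (4, "R"),
    "gold": (5, "Y"),
    "tortoise-shell frog": (6, "Y"),
    "pearl": (7, "Y"),
}

rows = [line.split() for line in DATA.split("\n") if line.strip()]


def mean(values):
    return sum(values) / len(values)


differences = {}
for name, (col, coded_one) in COLUMNS.items():
    with_attr = [int(r[0]) for r in rows if r[col] == coded_one]
    without = [int(r[0]) for r in rows if r[col] != coded_one]
    differences[name] = mean(with_attr) - mean(without)
    print(f"{name:20s} with: {mean(with_attr):7.1f}  "
          f"without: {mean(without):7.1f}  "
          f"difference: {differences[name]:+8.1f}")
```

Each difference in means equals the least-squares slope from regressing price on that single 0/1 indicator. In these data the differences for gold and for the tortoise-shell frog are several thousand dollars, while those for shape and pearl are a few hundred, in line with the comments on the plots; the attributes overlap, so the one-at-a-time differences are not separate effects.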

Prediction

• Can we use the model built with the current data to predict the future price of a bow?

• Example: some 1999 data from auctions

• 1920 bow, 60.5 g., round with gold and pearl accessories - $4098

• 1933 bow, 61 g., octagonal with pearl accessories only - $2421
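
As an illustration of prediction, the sketch below fits price = b0 + b1·gold + b2·pearl to the 16 bows by least squares and applies it to the two 1999 bows. The choice of regressors is an assumption made here because gold and pearl are the attributes stated for both new bows; it is not the model developed in the course, and it ignores sale year, year made, weight, shape and frog. The helper names `solve` and `least_squares` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# (price, gold, pearl) with 1 = present, 0 = absent, for the 16 bows.
BOWS = [(1874, 0, 0), (2436, 0, 0), (7498, 1, 0), (1142, 0, 1),
        (1935, 0, 0), (1759, 0, 0), (5278, 1, 1), (4905, 1, 0),
        (7994, 1, 1), (2543, 0, 1), (1769, 0, 0), (1592, 0, 1),
        (3716, 1, 1), (2477, 0, 1), (2654, 0, 0), (3362, 0, 1)]

X = [[1.0, float(g), float(p)] for _, g, p in BOWS]
y = [float(price) for price, _, _ in BOWS]
coef = least_squares(X, y)

# 1999 auction results quoted on the slide: (gold, pearl, actual price).
NEW_BOWS = [(1, 1, 4098), (0, 1, 2421)]
predictions = [coef[0] + coef[1] * g + coef[2] * p for g, p, _ in NEW_BOWS]
for (g, p, actual), pred in zip(NEW_BOWS, predictions):
    print(f"gold = {g}, pearl = {p}: predicted {pred:.0f}, actual {actual}")
```

Comparing each prediction with the 1999 sale price gives a first check on how well a model built from the 1994-97 data carries forward; a fuller answer would also attach a prediction interval to each value.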