Exponential Growth — §1.4 21
Modeling Population Growth
Example. Modeling the size of a population.
We would like to build a simple model to predict the size of a population in 10 years.
• A very macro-level question.
Definitions: Let $t$ be time in years, with $t = 0$ now.
$P(t)$ = size of the population at time $t$.
$B(t)$ = number of births between times $t$ and $t + 1$.
$D(t)$ = number of deaths between times $t$ and $t + 1$.
Therefore, $P(t + 1) = P(t) + B(t) - D(t)$.
Definitions imply: $P(4) =$ ___ , $B(\tfrac{1}{2}) =$ ___ , $B(5) - D(5) =$ ___
Assumption: The birth rate and death rate stay constant.
That is, the birth rate $b = B(t)/P(t)$ and the death rate $d = D(t)/P(t)$ are constants.
Assumption: No migration.
Exponential Growth — §1.4 22
Population Growth
Therefore, $P(t + 1) = P(t) \left[ \frac{P(t)}{P(t)} + \frac{B(t)}{P(t)} - \frac{D(t)}{P(t)} \right]$.
Under our assumptions, $P(t + 1) = P(t)\,[1 + b - d]$.
This implies: $P(1) = P(0)(1 + b - d)$, $P(2) = P(0)(1 + b - d)^2$, ...
In general, $P(n) = P(0)(1 + b - d)^n$.
Definition. The growth rate of a population is $r = 1 + b - d$. This constant is also called the Malthusian parameter.
A model for the size of a population is $P(t) = P(0)\,r^t$,
where $P(0)$ and $r$ are constants.
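A minimal sketch of how the year-by-year update relates to the closed-form model. The starting population and the rates below are hypothetical values chosen only for illustration, and the function names are not from the slides.

```python
def simulate_population(p0, b, d, years):
    """Iterate P(t + 1) = P(t) * (1 + b - d); return [P(0), ..., P(years)]."""
    sizes = [p0]
    for _ in range(years):
        sizes.append(sizes[-1] * (1 + b - d))
    return sizes


def closed_form(p0, b, d, t):
    """Evaluate P(t) = P(0) * r**t with growth rate r = 1 + b - d."""
    r = 1 + b - d
    return p0 * r ** t


# Hypothetical inputs: 1000 individuals, birth rate 0.03, death rate 0.01.
p0, b, d = 1000.0, 0.03, 0.01
sizes = simulate_population(p0, b, d, 5)
for t, size in enumerate(sizes):
    print(t, round(size, 2), round(closed_form(p0, b, d, t), 2))
```

Both columns should agree up to floating-point rounding, which is the content of the statement that the recurrence implies $P(n) = P(0)\,r^n$.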
Exponential Growth — §1.4 23
Applying the Malthusian Model
Approximate US Population at: http://www.census.gov/main/www/popclock.html
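Example 1. Suppose that the current US population is 313,100,000. Assume that the birth rate is 0.02 and the death rate is 0.01. What will the population be in 10 years?
Answer. Use $P(t) = P(0)\,r^t$ with $r = 1 + 0.02 - 0.01 = 1.01$: $P(10) = 313{,}100{,}000 \cdot (1.01)^{10} \approx 345{,}900{,}000$.
Refinement. Approx. US Growth Rate at http://www63.wolframalpha.com/input/?i=US+birth+rate
Resource: Wolfram Alpha, which can be integrated directly into Mathematica.
Example 2. How long will it take the population to double?
Answer. Use $P(t) = P(0)\,r^t$: setting $P(t) = 2P(0)$ gives $r^t = 2$, so $t = \ln 2 / \ln r$. With $r = 1.01$, $t \approx 69.7$ years.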
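Both examples reduce to evaluating the model $P(t) = P(0)\,r^t$ with $r = 1 + b - d$. The sketch below does that arithmetic with the numbers stated in Example 1; the variable names are illustrative only.

```python
import math

P0 = 313_100_000        # current population from Example 1
BIRTH_RATE = 0.02
DEATH_RATE = 0.01

r = 1 + BIRTH_RATE - DEATH_RATE   # growth rate (Malthusian parameter)

# Example 1: population after 10 years.
p10 = P0 * r ** 10

# Example 2: doubling time solves r**t = 2, so t = ln(2) / ln(r).
doubling_time = math.log(2) / math.log(r)

print(f"P(10) is about {p10:,.0f}")
print(f"Doubling time is about {doubling_time:.1f} years")
```

Note that the doubling time does not depend on $P(0)$, only on $r$.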
Exponential Growth — §1.4 24
Determining constants of exponential growth
Goal: Given population data, determine model constants.
[Figure: US Population from 1900 to 1970, plotted against year]
• Take the logarithm of both sides of $P(t) = P(0)\,r^t$.
• We have $\ln[P(t)] = \ln[P(0)] + t \ln(r)$.
• A linear fit for $\ln[P(t)]$ vs. $t$ gives values for $\ln[P(0)]$ and $\ln(r)$.
• Exponentiate each value to find the values for $P(0)$ and $r$.
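The log-transform procedure can be sketched in a few lines. The population figures below are approximate decennial census counts in millions; they are an assumption supplied for illustration and are not taken from the slides, and `fit_line` is a hypothetical helper.

```python
import math

# Approximate US census counts in millions (assumed, not from the slides).
years = [1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970]
population = [76.2, 92.2, 106.0, 123.2, 132.2, 151.3, 179.3, 203.3]


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


t = [year - 1900 for year in years]           # t = 0 corresponds to 1900
log_p = [math.log(p) for p in population]     # transform: ln P(t)

slope, intercept = fit_line(t, log_p)         # ln P(t) = intercept + slope * t
p0 = math.exp(intercept)                      # P(0) = e^intercept
r = math.exp(slope)                           # r = e^slope

print(f"ln P(t) = {intercept:.3f} + {slope:.4f} t")
print(f"P(t) = {p0:.1f} * {r:.4f}^t")
```

The line is fitted to the transformed values, so the constants come out as $\ln[P(0)]$ and $\ln(r)$ and have to be exponentiated before they can be used in the model.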
Exponential Growth — §1.4 25
Determining constants of exponential growth
Here we plot ln[P(t)] as a function of t:
[Figure: Transformed Population Data, $\ln[P(t)]$ versus $t$]
The line of best fit is approximately $\ln[P(t)] = 4.4 + 0.0135t$.
Therefore our model says $P(t) \approx e^{4.4 + 0.0135t} = 81.5 \cdot (1.014)^t$.
Analysis:
• History indicates we should split the interval [1900, 1970].
• What might go wrong when trying to extrapolate?
• Important: Transformations distort distances between points, so verification of a fit should always take place on $y$ versus $x$ axes.
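A small sketch of the back-transformation and of why distances on the log plot can mislead. The intercept and slope are the fitted values quoted above; the offset `LOG_OFFSET` is a hypothetical number chosen only to show the effect.

```python
import math

INTERCEPT = 4.4     # from the fitted line ln[P(t)] = 4.4 + 0.0135 t
SLOPE = 0.0135

p0 = math.exp(INTERCEPT)    # multiplicative constant P(0)
r = math.exp(SLOPE)         # yearly growth factor


def model(t):
    """Population predicted by the back-transformed model P(0) * r**t."""
    return p0 * r ** t


# Hypothetical: a point that sits the same vertical distance above the
# fitted line on the log plot at three different times.
LOG_OFFSET = 0.05

gaps = {}
for t in (0, 35, 70):
    fitted = model(t)
    shifted = math.exp(INTERCEPT + SLOPE * t + LOG_OFFSET)
    gaps[t] = shifted - fitted
    print(f"t={t:2d}  fitted={fitted:7.1f}  gap on original axes={gaps[t]:5.1f}")
```

The same gap on the transformed axes corresponds to a growing gap on the original axes, which is one reason to judge the fit on the untransformed data.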
Interpolation and Extrapolation — §2.3.4 and §3.2 26
Interpolation vs. Extrapolation
Suppose you have collected a set of known data points $(x_i, y_i)$, and you would like to estimate the $y$-value for an unknown $x$-value.
The name for such an estimation depends on the placement of the $x$-value relative to the known $x$-values.
Interpolation
Inserting one or more $x$-values between known $x$-values.
[Figure: Height of your sister Susie Q; height (inches) versus age (years), ages 10 to 12]
Extrapolation
Inserting one or more $x$-values outside of the range of known $x$-values.
[Figure: Height of your sister Susie Q; height (inches) versus age (years), ages 10 to 15]
Interpolation and Extrapolation — §2.3.4 and §3.2 27
Interpolation vs. Extrapolation
• The most common method for interpolation is taking a weighted average of the two nearest data points; suppose $x_1 < x < x_2$, then
$f(x) \approx y_1 + \frac{y_2 - y_1}{x_2 - x_1}\,(x - x_1)$.
• In both interpolation and extrapolation, when you have a function $f$ that is a good fit to the data, simply plug in $y = f(x)$.
• Confidence in approximated values depends on confidence in your data and your model.
• Confidence in extrapolated data is higher when closer to the range of known $x$-values.
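Linear interpolation between two neighbouring data points can be sketched as below. The function name and the height values are hypothetical and serve only as an illustration.

```python
def interpolate(x, x1, y1, x2, y2):
    """Estimate f(x) on the straight line through (x1, y1) and (x2, y2)."""
    if not x1 < x2:
        raise ValueError("expected x1 < x2")
    return y1 + (y2 - y1) / (x2 - x1) * (x - x1)


# Hypothetical measurements: 54 inches at age 10 and 60 inches at age 12.
estimate = interpolate(11.0, 10.0, 54.0, 12.0, 60.0)
print(estimate)
```

The same formula also returns a value when $x$ lies outside $[x_1, x_2]$, but that is extrapolation and deserves less confidence.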
Interpolation and Extrapolation — §2.3.4 and §3.2 28
Extrapolation: Running the Mile (p. 162)
Below is a plot of the years in which a record was broken for running a mile and the record-breaking time.
[Figure: Mile-Running Record Times, record time (minutes) versus year, 1880 to 2000]
The data appears to fit the line $T(t) = 15.5639 - 0.00593323t$.
Solve for $T(t) = 0$: You get $t \approx 2623$.
Conclusion: In the year 2623, the record will be zero minutes!
• Note the lack of realistic assumptions behind the data.
• Always be careful when you extrapolate!
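The extrapolation can be reproduced directly from the coefficients of the fitted line quoted above; the names below are illustrative.

```python
INTERCEPT = 15.5639        # minutes, from the fitted line T(t)
SLOPE = -0.00593323        # minutes per year


def record_time(year):
    """Record time in minutes predicted by the fitted line."""
    return INTERCEPT + SLOPE * year


# Solve T(t) = 0 for t.
zero_year = -INTERCEPT / SLOPE
print(f"T(t) = 0 at t = {zero_year:.1f}")

# Inside the data range the line is plausible; far outside it is not.
for year in (1950, 2000, 2623, 3000):
    print(year, round(record_time(year), 3))
```

Beyond the zero crossing the line predicts negative times, which makes the failure of the linear model far from the data obvious.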
Regression — §3.2 29
Avoiding Using Least Squares
Justification for fitting data visually:
• Large simplifications in model development mean that eyeballing a fit is reasonable.
• Mathematical methods do not necessarily imply a better fit!
• You can make objective judgements that computers cannot; you know which data points should be taken more seriously.
• Mathematics gives precise answers, but every answer is fallible.
Regression — §3.2 30
Regression
If we have confidence in our data, we may wish to do a regression, a method for fitting a curve through a set of points by following a goodness-of-fit criterion.
Goal: Formulate mathematically what we do internally: Make the discrepancies between the data and the curve small.
• Make the sum of the set of absolute deviations small. (Pic!)
minimize over all $f$ the sum: $\sum_{(x_i, y_i)} \left| y_i - f(x_i) \right|$
• Make the largest of the set of absolute deviations small.
minimize over all $f$ the value: $\max_{(x_i, y_i)} \left| y_i - f(x_i) \right|$
One or the other might make more sense depending on the situation.
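The two criteria can be compared on a small data set. The four points below are the ones used in the least-squares example elsewhere in these notes; the two candidate lines are hypothetical choices made only to have something to score.

```python
data = [(1.0, 3.6), (2.1, 2.9), (3.5, 2.2), (4.0, 1.7)]


def sum_abs_dev(f, points):
    """Sum of the absolute deviations |y_i - f(x_i)|."""
    return sum(abs(y - f(x)) for x, y in points)


def max_abs_dev(f, points):
    """Largest absolute deviation |y_i - f(x_i)|."""
    return max(abs(y - f(x)) for x, y in points)


# Two hypothetical candidate curves.
def line_a(x):
    return 4.2 - 0.6 * x


def line_b(x):
    return 4.0 - 0.5 * x


for name, f in (("line_a", line_a), ("line_b", line_b)):
    print(name, round(sum_abs_dev(f, data), 3), round(max_abs_dev(f, data), 3))
```

Here one candidate is better under both criteria, but in general the two criteria can rank candidates differently, because the maximum is driven entirely by the worst point.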
Regression — §3.2 31
Least Squares
A regression method often used is called least squares.
minimize over all $f$ the sum: $\sum_{(x_i, y_i)} \left( y_i - f(x_i) \right)^2$
• A middle ground, giving weight to all discrepancies and more weight to those that are further from the curve.
• Easy to analyze mathematically because this is a smooth function.
Calculating minima of smooth functions: (You know how!)
• Differentiate with respect to each variable, and set equal to zero.
• Solve the resulting system of equations.
• Check to see if the solutions are local minima.
[Figure: surface plot of a smooth function of two variables]
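For the special case of a straight line $f(x) = mx + b$ (an assumption made here for concreteness; the criterion itself allows any family of curves), the three steps can be sketched as follows, with $n$ denoting the number of data points.

```latex
\begin{align*}
S(m, b) &= \sum_{i=1}^{n} \bigl( y_i - (m x_i + b) \bigr)^2 \\[4pt]
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} \bigl( y_i - m x_i - b \bigr) = 0
  \quad\Longrightarrow\quad n\,b + m \sum x_i = \sum y_i \\[4pt]
\frac{\partial S}{\partial m} &= -2 \sum_{i=1}^{n} x_i \bigl( y_i - m x_i - b \bigr) = 0
  \quad\Longrightarrow\quad b \sum x_i + m \sum x_i^2 = \sum x_i y_i \\[4pt]
m &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \bigl( \sum x_i \bigr)^2},
\qquad
b = \frac{\sum y_i - m \sum x_i}{n}
\end{align*}
```

The matrix of second derivatives has entries $2n$, $2\sum x_i$, and $2\sum x_i^2$; its determinant $4\left[n \sum x_i^2 - (\sum x_i)^2\right]$ is positive whenever the $x_i$ are not all equal, so the critical point is a minimum.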
Regression — §3.2 32
Least Squares Example
Example. Use the least-squares criterion to fit a line $y = mx + b$ to the data: {(1.0, 3.6), (2.1, 2.9), (3.5, 2.2), (4.0, 1.7)}.
Intuition / Expectations?
Solution. We need to calculate the sum $S = \sum_{(x_i, y_i)} \left[ y_i - (m x_i + b) \right]^2$.
$S = (3.6 - 1.0m - b)^2 + (2.9 - 2.1m - b)^2 + (2.2 - 3.5m - b)^2 + (1.7 - 4.0m - b)^2$
Expanding, $S = 29.1 - 20.8b + 4b^2 - 48.38m + 21.2bm + 33.66m^2$
Calculating the partial derivatives and setting equal to zero:
$\frac{\partial S}{\partial b} = -20.8 + 8b + 21.2m = 0$
$\frac{\partial S}{\partial m} = -48.38 + 21.2b + 67.32m = 0$
Solving the system of equations gives: $b = 4.20332$, $m = -0.605027$.
That is, the line that gives the least-squares fit for the data is
$y = -0.605027x + 4.20332$.
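The hand computation can be checked numerically. This sketch solves the two normal equations for a line directly from the data sums; `least_squares_line` is a hypothetical helper name, not something defined in the notes.

```python
points = [(1.0, 3.6), (2.1, 2.9), (3.5, 2.2), (4.0, 1.7)]


def least_squares_line(data):
    """Return (m, b) minimizing the sum of (y - (m*x + b))**2 over the data."""
    n = len(data)
    sum_x = sum(x for x, _ in data)
    sum_y = sum(y for _, y in data)
    sum_xx = sum(x * x for x, _ in data)
    sum_xy = sum(x * y for x, y in data)
    # Normal equations:
    #   n * b     + sum_x  * m = sum_y
    #   sum_x * b + sum_xx * m = sum_xy
    det = n * sum_xx - sum_x ** 2
    if det == 0:
        raise ValueError("all x-values are equal; the line is not determined")
    m = (n * sum_xy - sum_x * sum_y) / det
    b = (sum_y - m * sum_x) / n
    return m, b


m, b = least_squares_line(points)
residual_sum = sum((y - (m * x + b)) ** 2 for x, y in points)
print(f"m = {m:.6f}, b = {b:.5f}, S = {residual_sum:.5f}")
```

Multiplying the two normal equations by 2 gives exactly the pair of partial-derivative equations in the worked example.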
Regression — §3.2 33
Notes on the Method of Least Squares
• Least squares becomes messy when there are many data points.
• We chose least squares because it was easy. Is it really the “right” method for the job?
• Least squares isn’t always easy, for example: $y = Ce^{kx}$.
• You can use least squares on transformed data, but the result is NOT a least-squares curve for the original data.
• Multivariable least squares can also be done: $w = ax + by + cz + d$ (Would want to minimize: $\sum_i \left[ w_i - (a x_i + b y_i + c z_i + d) \right]^2$.)
• Least squares measures distance vertically. A better measure would probably be perpendicular distance.
• You need to understand the concept of least squares and know how to do least squares by hand for small examples.
• We’ll learn how to use Mathematica to do this for us!
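The point about transformed data can be illustrated with a rough sketch for $y = Ce^{kx}$. The data are hypothetical, and the "direct" fit is only a crude grid search over $k$ (using the fact that, for fixed $k$, the best $C$ has a closed form), not a production optimizer.

```python
import math

# Hypothetical data that roughly follow y = C * e^(k x).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.5, 7.9, 19.0, 56.0]


def sse(c, k):
    """Sum of squared vertical deviations on the original (x, y) axes."""
    return sum((y - c * math.exp(k * x)) ** 2 for x, y in zip(xs, ys))


# Step 1: least-squares line through the transformed points (x, ln y).
n = len(xs)
log_ys = [math.log(y) for y in ys]
mean_x = sum(xs) / n
mean_l = sum(log_ys) / n
k_log = (sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, log_ys))
         / sum((x - mean_x) ** 2 for x in xs))
c_log = math.exp(mean_l - k_log * mean_x)


# Step 2: crude search on the original axes.
def best_c(k):
    """For fixed k, S is quadratic in C, so the minimizing C is explicit."""
    return (sum(y * math.exp(k * x) for x, y in zip(xs, ys))
            / sum(math.exp(2 * k * x) for x in xs))


candidates = [k_log + step * 0.001 for step in range(-200, 201)]
k_direct = min(candidates, key=lambda k: sse(best_c(k), k))
c_direct = best_c(k_direct)

print("log-transformed fit:", round(c_log, 3), round(k_log, 3), round(sse(c_log, k_log), 2))
print("direct search      :", round(c_direct, 3), round(k_direct, 3), round(sse(c_direct, k_direct), 2))
```

The search never does worse than the transformed fit on the original axes, because the transformed fit's own $k$ is among the candidates; in general it does strictly better, which is what "NOT a least-squares curve for the original data" means.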
Examples Using Least Squares — §2.3.3 and §3.2 34
Price – Demand Curve (p. 111–114)
Example. A company is trying to determine how demand for a new product depends on its price and collects the following data:
price p $9 $10 $11
demand d 1200/mo. 1000/mo. 975/mo.
The company has reason to believe that price and demand are inversely proportional, that is, $d = \frac{c}{p}$ for some constant $c$.
→ Use the method of least squares to determine this constant $c$.
[Figure: Demand as a Function of Price]
Examples Using Least Squares — §2.3.3 and §3.2 35
Price – Demand Curve (p. 111–114)
Solution. Since $f(p) = \frac{c}{p}$, the sum is $S = \sum_{(p_i, d_i)} \left[ d_i - \frac{c}{p_i} \right]^2$.
Specifying data points gives
$S = \left[ 1200 - \frac{c}{9} \right]^2 + \left[ 1000 - \frac{c}{10} \right]^2 + \left[ 975 - \frac{c}{11} \right]^2$
Setting the derivative equal to zero gives
$\frac{dS}{dc} = \frac{-2}{9} \left[ 1200 - \frac{c}{9} \right] + \frac{-2}{10} \left[ 1000 - \frac{c}{10} \right] + \frac{-2}{11} \left[ 975 - \frac{c}{11} \right] = 0$
Solving for $c$ gives $c \approx 10518$.
[Figure: Demand as a Function of Price, with the fitted curve]
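Because $S$ is quadratic in $c$, setting $dS/dc = 0$ has the closed-form solution $c = \sum (d_i / p_i) \,/\, \sum (1 / p_i^2)$. A short sketch of that computation with the three data points from the example (function names are illustrative); evaluating it gives a value near 10518.

```python
prices = [9.0, 10.0, 11.0]
demands = [1200.0, 1000.0, 975.0]


def fit_inverse_constant(ps, ds):
    """Least-squares c for the model d = c / p."""
    numerator = sum(d / p for p, d in zip(ps, ds))
    denominator = sum(1 / p ** 2 for p in ps)
    return numerator / denominator


def derivative(c, ps, ds):
    """dS/dc for S = sum of (d_i - c / p_i)**2."""
    return sum(-2 / p * (d - c / p) for p, d in zip(ps, ds))


c = fit_inverse_constant(prices, demands)
print(f"c = {c:.1f}")
print("dS/dc at c:", derivative(c, prices, demands))
for p, d in zip(prices, demands):
    print(p, d, round(c / p, 1))   # observed versus fitted demand
```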
Examples Using Least Squares — §2.3.3 and §3.2 36
New York City Temperature (similar to p. 158)
[Figure: Weekly Average Temperature in NYC, Jan. 2006 to Dec. 2008]
The graph of average weekly temperature in New York City from Jan. 2006 to Dec. 2008 gives the distinct impression of a sine curve.
We need to determine the constants in:
$\mathrm{Temp}(t) = A + B \sin(C(t - D))$.
Let’s simplify our model to only determine amplitude $B$ and vertical shift $A$. We must make assumptions about $C$ and $D$. We can assume that $C = 2\pi$, since $t$ is measured in years and the pattern repeats once per year.
For $D$, find when the sine passes through zero. Since January is cold and July is hot, the zero should occur in April; guess $D \approx \frac{4}{12}$.
Fitting to $\mathrm{Temp}(t) = A + B \sin[2\pi(t - \frac{4}{12})]$ gives: $\mathrm{Temp}(t) = 13.9 + 11.8 \sin[2\pi(t - \frac{4}{12})]$
[Figure: Weekly Average Temperature in NYC, with the fitted curve]
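Once $C$ and $D$ are fixed, the model is linear in $A$ and $B$, so ordinary least squares applies with $\sin(C(t - D))$ as the regressor. The sketch below assumes $C = 2\pi$ and $D = 4/12$ and uses synthetic stand-in data generated from hypothetical values of $A$ and $B$; it does not use the actual NYC measurements.

```python
import math

WEEKS = 156                 # three years of weekly samples
C = 2 * math.pi             # one cycle per year, with t in years
D = 4 / 12                  # zero crossing guessed to fall in April

# Synthetic stand-in data (hypothetical, not the NYC measurements):
# a sine wave plus a small alternating perturbation.
A_TRUE, B_TRUE = 55.0, 22.0
times = [week / 52 for week in range(WEEKS)]
temps = [A_TRUE + B_TRUE * math.sin(C * (t - D)) + 0.5 * (-1) ** i
         for i, t in enumerate(times)]


def fit_shift_and_amplitude(ts, ys):
    """Least-squares A and B for y = A + B * sin(C * (t - D))."""
    s = [math.sin(C * (t - D)) for t in ts]
    n = len(s)
    mean_s = sum(s) / n
    mean_y = sum(ys) / n
    b = (sum((si - mean_s) * (yi - mean_y) for si, yi in zip(s, ys))
         / sum((si - mean_s) ** 2 for si in s))
    a = mean_y - b * mean_s
    return a, b


a_fit, b_fit = fit_shift_and_amplitude(times, temps)
print(f"A = {a_fit:.2f}, B = {b_fit:.2f}")
```

Fitting $C$ or $D$ as well would make the problem nonlinear, which is why the simplification to two unknowns is convenient.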