Advice for applying machine learning: Deciding what to try next (Machine Learning, Andrew Ng)


Page 1

Advice for applying machine learning

Deciding what to try next

Machine Learning

Page 2

Debugging a learning algorithm: Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?

- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing λ
- Try increasing λ

Page 3

Machine learning diagnostic

Diagnostic: A test that you can run to gain insight into what is or isn't working with a learning algorithm, and to gain guidance on how best to improve its performance.

Diagnostics can take time to implement, but doing so can be a very good use of your time.

Page 4

Advice for applying machine learning

Evaluating a hypothesis

Machine Learning

Page 5

Evaluating your hypothesis

[Figure: a hypothesis for price vs. size that fits the training data closely but fails to generalize to new examples not in the training set]

With many features (size of house, no. of bedrooms, no. of floors, age of house, kitchen size, average income in neighborhood, ...) it becomes hard to plot the hypothesis, so we need another way to evaluate it.

Page 6

Evaluating your hypothesis

Dataset:

Size (feet²)   Price ($1000s)
2104           400
1600           330
2400           369
1416           232
3000           540
1985           300
1534           315
1427           199
1380           212
1494           243

Split the data into a training set (e.g. the first 70% of examples) and a test set (the remaining 30%), as in the sketch below.
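A minimal sketch (in Python with NumPy, using the ten examples above) of how such a dataset might be shuffled and split 70%/30% into training and test sets; the variable names and random seed are illustrative, not part of the original slides.

```python
import numpy as np

# The ten (size, price) examples from the table above.
sizes  = np.array([2104, 1600, 2400, 1416, 3000, 1985, 1534, 1427, 1380, 1494], dtype=float)
prices = np.array([400, 330, 369, 232, 540, 300, 315, 199, 212, 243], dtype=float)

rng = np.random.default_rng(0)
order = rng.permutation(len(sizes))      # shuffle before splitting

m_train = int(0.7 * len(sizes))          # 70% training, 30% test
train_idx, test_idx = order[:m_train], order[m_train:]

x_train, y_train = sizes[train_idx], prices[train_idx]
x_test,  y_test  = sizes[test_idx],  prices[test_idx]
print(len(x_train), "training examples,", len(x_test), "test examples")
```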

Page 7

Training/testing procedure for linear regression

- Learn parameter θ from the training data (by minimizing the training error J_train(θ))
- Compute the test set error:
  J_test(θ) = (1 / (2·m_test)) · Σ_{i=1..m_test} (h_θ(x_test^(i)) − y_test^(i))²
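A short sketch of this procedure for a straight-line hypothesis h_θ(x) = θ0 + θ1·x, assuming Python/NumPy and a hand-made split of the housing data; only the J_test computation mirrors the slide, the numbers and names are illustrative.

```python
import numpy as np

def j(theta, X, y):
    """Average squared-error cost: J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    m = len(y)
    residual = X @ theta - y
    return (residual @ residual) / (2 * m)

# Illustrative training and test data: price vs. size, with a bias column of ones.
X_train = np.array([[1, 2104], [1, 1600], [1, 2400], [1, 1416], [1, 3000],
                    [1, 1985], [1, 1534]], dtype=float)
y_train = np.array([400, 330, 369, 232, 540, 300, 315], dtype=float)
X_test  = np.array([[1, 1427], [1, 1380], [1, 1494]], dtype=float)
y_test  = np.array([199, 212, 243], dtype=float)

# Learn theta from the training data (ordinary least squares, i.e. minimizing J_train).
theta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

print("J_train(theta) =", j(theta, X_train, y_train))
print("J_test(theta)  =", j(theta, X_test, y_test))
```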

Page 8

Training/testing procedure for logistic regression

- Learn parameter θ from the training data
- Compute the test set error:
  J_test(θ) = −(1 / m_test) · Σ_{i=1..m_test} [ y_test^(i)·log h_θ(x_test^(i)) + (1 − y_test^(i))·log(1 − h_θ(x_test^(i))) ]
- Misclassification error (0/1 misclassification error):
  err(h_θ(x), y) = 1 if (h_θ(x) ≥ 0.5 and y = 0) or (h_θ(x) < 0.5 and y = 1), and 0 otherwise
  Test error = (1 / m_test) · Σ_{i=1..m_test} err(h_θ(x_test^(i)), y_test^(i))
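A sketch of both error measures, assuming Python/NumPy; the parameter vector and the tiny test set are made up for illustration, and only the two formulas above are taken from the slide.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_test_error(theta, X_test, y_test):
    """Cross-entropy test error J_test(theta) for logistic regression."""
    m = len(y_test)
    h = sigmoid(X_test @ theta)
    return -(y_test @ np.log(h) + (1 - y_test) @ np.log(1 - h)) / m

def misclassification_error(theta, X_test, y_test):
    """0/1 misclassification error: fraction of examples where the
    thresholded prediction (h >= 0.5) disagrees with the label."""
    predictions = (sigmoid(X_test @ theta) >= 0.5).astype(float)
    return np.mean(predictions != y_test)

# Illustrative values only: a "fitted" parameter vector and a tiny test set.
theta  = np.array([-4.0, 2.0])
X_test = np.array([[1, 1.0], [1, 3.0], [1, 2.5], [1, 0.5]], dtype=float)
y_test = np.array([0, 1, 0, 0], dtype=float)

print("J_test(theta) =", logistic_test_error(theta, X_test, y_test))
print("0/1 error     =", misclassification_error(theta, X_test, y_test))
```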

Page 9

Advice for applying machine learning

Model selection and training/validation/test sets

Machine Learning

Page 10

Overfitting example

[Figure: a high-order polynomial hypothesis for price vs. size that passes through the training points]

Once parameters were fit to some set of data (the training set), the error of the parameters as measured on that data (the training error J_train(θ)) is likely to be lower than the actual generalization error.
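A quick illustration of that gap, assuming Python/NumPy: a 4th-degree polynomial fit to a handful of noisy points usually shows a much lower error on the points it was trained on than on fresh data from the same distribution. The data-generating function here is invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented housing-like data: price roughly linear in size (in 1000s of ft^2), plus noise.
def sample(m):
    size = rng.uniform(1.0, 3.0, m)
    price = 150 * size + 50 + rng.normal(0, 30, m)
    return size, price

size_train, price_train = sample(8)      # small training set
size_new,   price_new   = sample(1000)   # houses the model has never seen

# A flexible 4th-degree polynomial can chase the noise in the training set.
coeffs = np.polyfit(size_train, price_train, deg=4)

def half_mse(x, y):
    return np.mean((np.polyval(coeffs, x) - y) ** 2) / 2

print("training error      :", half_mse(size_train, price_train))
print("error on new houses :", half_mse(size_new, price_new))    # typically much larger
```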

Page 11

Model selection

Fit models of increasing polynomial degree d to the training data:
1.  h_θ(x) = θ0 + θ1·x
2.  h_θ(x) = θ0 + θ1·x + θ2·x²
3.  h_θ(x) = θ0 + θ1·x + ... + θ3·x³
    ...
10. h_θ(x) = θ0 + θ1·x + ... + θ10·x^10

Choose the degree whose fitted parameters θ^(d) give the lowest test set error. How well does the chosen model generalize? Report its test set error J_test(θ^(d)). Problem: J_test(θ^(d)) is likely to be an optimistic estimate of the generalization error, i.e. our extra parameter (d = degree of polynomial) has itself been fit to the test set.

Page 12

Evaluating your hypothesis

Dataset:

Size (feet²)   Price ($1000s)
2104           400
1600           330
2400           369
1416           232
3000           540
1985           300
1534           315
1427           199
1380           212
1494           243

This time, split the data three ways: a training set (e.g. 60%), a cross-validation set (20%) and a test set (20%), as sketched below.
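The same kind of split as before, extended to three pieces (a sketch assuming Python/NumPy; the 60/20/20 proportions are a common convention, and the variable names are illustrative).

```python
import numpy as np

sizes  = np.array([2104, 1600, 2400, 1416, 3000, 1985, 1534, 1427, 1380, 1494], dtype=float)
prices = np.array([400, 330, 369, 232, 540, 300, 315, 199, 212, 243], dtype=float)

rng = np.random.default_rng(0)
order = rng.permutation(len(sizes))

m = len(sizes)
n_train, n_cv = int(0.6 * m), int(0.2 * m)        # 60% train, 20% cross-validation, 20% test
train_idx = order[:n_train]
cv_idx    = order[n_train:n_train + n_cv]
test_idx  = order[n_train + n_cv:]

x_train, y_train = sizes[train_idx], prices[train_idx]
x_cv,    y_cv    = sizes[cv_idx],    prices[cv_idx]
x_test,  y_test  = sizes[test_idx],  prices[test_idx]
```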

Page 13

Train/validation/test error

Training error:
  J_train(θ) = (1 / (2m)) · Σ_{i=1..m} (h_θ(x^(i)) − y^(i))²

Cross-validation error:
  J_cv(θ) = (1 / (2·m_cv)) · Σ_{i=1..m_cv} (h_θ(x_cv^(i)) − y_cv^(i))²

Test error:
  J_test(θ) = (1 / (2·m_test)) · Σ_{i=1..m_test} (h_θ(x_test^(i)) − y_test^(i))²

Page 14

Model selection

Fit models of polynomial degree d = 1, ..., 10 on the training set to obtain parameters θ^(1), ..., θ^(10).

Pick the degree d whose model has the lowest cross-validation error J_cv(θ^(d)). Then estimate the generalization error with the test set: J_test(θ^(d)).
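A minimal sketch of this selection loop, assuming Python/NumPy and a synthetic 1-D dataset; np.polyfit stands in for "learn parameters θ^(d)", and the data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic 1-D regression data, split 60/20/20 into train / cross-validation / test.
x = rng.uniform(-1, 1, 100)
y = x ** 3 - x + rng.normal(0, 0.05, 100)
x_tr, y_tr = x[:60], y[:60]
x_cv, y_cv = x[60:80], y[60:80]
x_te, y_te = x[80:], y[80:]

def cost(coeffs, xs, ys):
    """J(theta) = 1/(2m) * sum of squared errors for a fitted polynomial."""
    return np.mean((np.polyval(coeffs, xs) - ys) ** 2) / 2

# 1. Fit one model per polynomial degree d = 1..10 on the *training* set only.
fits = {d: np.polyfit(x_tr, y_tr, deg=d) for d in range(1, 11)}

# 2. Pick the degree with the lowest *cross-validation* error.
best_d = min(fits, key=lambda d: cost(fits[d], x_cv, y_cv))

# 3. Report the *test* error of that single chosen model as the generalization estimate.
print("chosen degree d =", best_d)
print("J_cv   =", cost(fits[best_d], x_cv, y_cv))
print("J_test =", cost(fits[best_d], x_te, y_te))
```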

Page 15

Advice for applying machine learning

Diagnosing bias vs. variance

Machine Learning

Page 16

Bias/variance

[Figure: three fits of price vs. size - high bias (underfit), "just right", and high variance (overfit)]

Page 17

Bias/variance

Training error: J_train(θ)
Cross-validation error: J_cv(θ)

[Figure: J_train(θ) and J_cv(θ) plotted against the degree of polynomial d - training error keeps falling as d grows, while cross-validation error is high for very small d, reaches a minimum, and rises again for large d]

Page 18

Diagnosing bias vs. variance

Suppose your learning algorithm is performing less well than you were hoping (J_cv(θ) or J_test(θ) is high). Is it a bias problem or a variance problem?

[Figure: training error and cross-validation error vs. degree of polynomial d, with the high-bias regime on the left (d too small) and the high-variance regime on the right (d too large)]

Bias (underfit): J_train(θ) will be high, and J_cv(θ) ≈ J_train(θ).
Variance (overfit): J_train(θ) will be low, and J_cv(θ) >> J_train(θ).
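A tiny helper expressing this rule of thumb in code (Python); the 2× cut-off used to decide "much larger" is an arbitrary illustrative threshold, not something from the slides.

```python
def diagnose(j_train, j_cv, target_error):
    """Rough rule of thumb from the slide: compare training and
    cross-validation error against the error you were hoping for.
    The 2x factor below is an arbitrary illustrative threshold."""
    if j_train > target_error and j_cv <= 2 * j_train:
        return "high bias (underfit): J_train is high and J_cv is close to J_train"
    if j_train <= target_error and j_cv > 2 * j_train:
        return "high variance (overfit): J_train is low but J_cv is much larger"
    return "no clear bias/variance problem from these numbers alone"

print(diagnose(j_train=0.9, j_cv=1.0, target_error=0.2))    # -> high bias
print(diagnose(j_train=0.05, j_cv=0.8, target_error=0.2))   # -> high variance
```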

Page 19

Advice for applying machine learning

Regularization and bias/variance

Machine Learning

Page 20

Linear regression with regularization

Model: h_θ(x) = θ0 + θ1·x + θ2·x² + θ3·x³ + θ4·x⁴, fit by minimizing the regularized cost
  J(θ) = (1 / (2m)) · Σ_{i=1..m} (h_θ(x^(i)) − y^(i))² + (λ / (2m)) · Σ_{j=1..n} θ_j²

[Figure: three fits of price vs. size]
- Large λ: high bias (underfit)
- Intermediate λ: "just right"
- Small λ: high variance (overfit)

Page 21

Choosing the regularization parameter λ

Page 22

Choosing the regularization parameter λ

Model: h_θ(x), fit by minimizing the regularized cost J(θ) above.

1. Try λ = 0
2. Try λ = 0.01
3. Try λ = 0.02
4. Try λ = 0.04
5. Try λ = 0.08
   ...
12. Try λ ≈ 10   (roughly doubling λ at each step)

For each candidate λ, minimize J(θ) to get parameters θ^(k), then evaluate the cross-validation error J_cv(θ^(k)). Pick the λ with the lowest cross-validation error, say θ^(5). Report that model's test error: J_test(θ^(5)).
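A sketch of the whole loop, assuming Python/NumPy, a synthetic dataset, polynomial features, and the regularized normal equation as the "minimize J(θ)" step; the exact λ grid and the data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic 1-D data mapped to polynomial features, split into train / CV / test.
def poly_features(x, degree=8):
    return np.column_stack([x ** p for p in range(degree + 1)])   # ones, x, x^2, ...

x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(0, 0.1, 60)
X = poly_features(x)
X_tr, y_tr = X[:36], y[:36]
X_cv, y_cv = X[36:48], y[36:48]
X_te, y_te = X[48:], y[48:]

def fit_regularized(X, y, lam):
    """Regularized normal equation: theta = (X^T X + lam * L)^-1 X^T y,
    where L is the identity with a 0 so the bias term is not penalized."""
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

def cost(theta, X, y):
    return np.mean((X @ theta - y) ** 2) / 2           # unregularized J, for evaluation

# Try lambda = 0, 0.01, 0.02, 0.04, ..., roughly doubling each step.
lambdas = [0] + [0.01 * 2 ** k for k in range(10)]
thetas = {lam: fit_regularized(X_tr, y_tr, lam) for lam in lambdas}

best_lam = min(lambdas, key=lambda lam: cost(thetas[lam], X_cv, y_cv))
print("chosen lambda =", best_lam)
print("J_test =", cost(thetas[best_lam], X_te, y_te))
```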

Page 23

Bias/variance as a function of the regularization parameter λ

[Figure: J_train(θ) and J_cv(θ) plotted against λ - training error grows as λ increases, while cross-validation error is high for very small λ (overfitting) and for very large λ (underfitting), with a minimum in between]

Page 24

Advice for applying machine learning

Learning curves

Machine Learning

Page 25

Learning curves

[Figure: training error J_train(θ) and cross-validation error J_cv(θ) plotted against the training set size m - as m grows, J_train(θ) tends to increase and J_cv(θ) tends to decrease]

Page 26

High bias

[Figure: learning curves for a high-bias model - J_train(θ) and J_cv(θ) both flatten out quickly at a high error as the training set size grows, with only a small gap between them]

If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.

Page 27

High variance (and small λ)

[Figure: learning curves for a high-variance model - J_train(θ) stays low while J_cv(θ) is much higher, and the gap between them narrows as the training set size grows]

If a learning algorithm is suffering from high variance, getting more training data is likely to help.
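A minimal way to trace such curves, assuming Python/NumPy: fit the model on the first m training examples for increasing m, and record training and cross-validation error each time. The synthetic data and the straight-line model are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data; fit straight lines to growing subsets of the training set
# and track training error vs. cross-validation error (a learning curve).
x = rng.uniform(0, 10, 200)
y = 2 * x + 5 + rng.normal(0, 2, 200)
x_tr, y_tr = x[:150], y[:150]
x_cv, y_cv = x[150:], y[150:]

def cost(coeffs, xs, ys):
    return np.mean((np.polyval(coeffs, xs) - ys) ** 2) / 2

for m in [5, 10, 20, 50, 100, 150]:
    coeffs = np.polyfit(x_tr[:m], y_tr[:m], deg=1)     # train on the first m examples only
    print(f"m={m:3d}  J_train={cost(coeffs, x_tr[:m], y_tr[:m]):6.2f}"
          f"  J_cv={cost(coeffs, x_cv, y_cv):6.2f}")
```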

Page 28

Advice for applying machine learning

Deciding what to try next (revisited)

Machine Learning

Page 29

Debugging a learning algorithm: Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?

- Get more training examples       (fixes high variance)
- Try smaller sets of features     (fixes high variance)
- Try getting additional features  (fixes high bias)
- Try adding polynomial features   (fixes high bias)
- Try decreasing λ                 (fixes high bias)
- Try increasing λ                 (fixes high variance)

Page 30

Neural networks and overfitting

"Small" neural network (fewer parameters; more prone to underfitting). Computationally cheaper.

"Large" neural network (more parameters; more prone to overfitting). Computationally more expensive; use regularization (λ) to address the overfitting.
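A rough illustration of this trade-off, assuming Python with scikit-learn (whose MLPClassifier exposes L2 weight regularization through its alpha parameter); the dataset, layer sizes, and alpha value are arbitrary choices for the demo, not recommendations from the slides.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Toy two-class problem.
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# "Small" network: few parameters, cheaper, more prone to underfitting.
small = MLPClassifier(hidden_layer_sizes=(3,), max_iter=2000, random_state=0)

# "Large" network: many parameters, more expensive, more prone to overfitting --
# so give it weight-decay regularization (alpha is scikit-learn's L2 penalty).
large = MLPClassifier(hidden_layer_sizes=(100, 100), alpha=1.0,
                      max_iter=2000, random_state=0)

for name, model in [("small", small), ("large + regularization", large)]:
    model.fit(X_tr, y_tr)
    print(f"{name:24s} train acc={model.score(X_tr, y_tr):.2f}"
          f"  test acc={model.score(X_te, y_te):.2f}")
```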