BakeAgain
Helen Craig, Insight Data Science Fellow, 2015
To improve one’s recipes and encourage people to bake again.
Algorithm
natural language processing
Gaussian Naive Bayes: modifications from comments
TF-IDF + Multinomial Naive Bayes: comment trending topics
2k-9k comments per recipe
Compare recipe reviews to movie reviews
movie review: ‘wonderful/terrible’
food review: ‘chewy/mushy’
The Data
“I did as suggested in other reviews and increased the flour by 1/2 cup and baking soda by 1/2 tablespoon, cut the salt to 1/2 tsp. This made the absolutely best cookies ever!! I made them this afternoon thinking they'd last the rest of the week. No way! Between my husband and two children, they'll be gone by tomorrow night. They're fantastic. I'll make these alot!”
The Complication
“I added 1 teaspoon salt.”
Original amount: 2 teaspoons
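The complication above is that the verb alone ("added") does not give the direction of the change: 1 teaspoon against an original 2 teaspoons is a decrease. A minimal sketch of one way to resolve this is to parse the quoted amount and compare it with the recipe's amount. The unit aliases, the regular expression, and the function names are assumptions made for illustration; the slides do not say how amounts were handled.

```python
import re
from fractions import Fraction

# Hypothetical unit aliases; the slides do not list the units that were handled.
UNIT_ALIASES = {
    "t": "tsp", "tsp": "tsp", "teaspoon": "tsp", "teaspoons": "tsp",
    "tbsp": "tbsp", "tablespoon": "tbsp", "tablespoons": "tbsp",
    "cup": "cup", "cups": "cup",
}

# Matches "1", "1/2" or "1 1/2", then a unit word, then the ingredient name.
AMOUNT_RE = re.compile(
    r"(?P<qty>\d+\s+\d+/\d+|\d+/\d+|\d+)\s*(?P<unit>[a-zA-Z]+)\.?"
    r"\s+(?:of\s+)?(?P<ingredient>[a-zA-Z]+)"
)

def parse_quantity(text):
    """Turn '1', '1/2' or '1 1/2' into an exact Fraction."""
    return sum(Fraction(part) for part in text.split())

def infer_direction(comment, ingredient, original_qty, original_unit):
    """Compare the amount quoted in a comment with the recipe's amount."""
    for match in AMOUNT_RE.finditer(comment.lower()):
        unit = UNIT_ALIASES.get(match.group("unit"))
        if unit != original_unit or match.group("ingredient") != ingredient:
            continue  # different unit or different ingredient: not comparable
        quoted = parse_quantity(match.group("qty"))
        if quoted < original_qty:
            return "decrease"
        if quoted > original_qty:
            return "increase"
        return "unchanged"
    return None  # no comparable amount found in this comment

# The example from the slide: "added" 1 teaspoon, recipe calls for 2 teaspoons.
direction = infer_direction("I added 1 teaspoon salt.", "salt", Fraction(2), "tsp")
print(direction)
```

This sketch only compares amounts given in the same unit and only when the ingredient name directly follows the unit, so comments such as "cut the salt to 1/2 tsp" would need a more flexible pattern.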
Action Words used with Salt in Chewy Chocolate Chip Oatmeal Cookies
“Add”
“Use”
“Instead of”
“Decrease”
Decrease Salt
"I also added less salt than it called for, about 1/4 teaspoon."
"I also only used 1/4 t salt since I always use margarine."
"For those complaining it's too salty-try 1/2 tsp salt instead of 1 tsp salt."
"My only suggestion is, if you are using salted butter, be sure to omit the extra salt."
“Increase”
Gaussian Naive Bayes fit
[Figure: Word Use Frequencies in Comments. Panels: Decrease Butter, Increase Flour, Add Walnuts. Word categories: “Add”, “Use”, “Instead of”, “Decrease”, “Increase”.]
Accuracy: 70%-80%
5 hand-tagged recipes
Leave-one-out cross-validation
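As a rough illustration of the evaluation described here, the sketch below fits a Gaussian Naive Bayes model on action-word frequencies and scores it with leave-one-out cross-validation. Everything concrete in it is an assumption: the feature order, the toy frequency vectors, the variance floor, and the choice to hold out one sample at a time (the slides do not say whether one sample or one whole recipe was held out). The toy data is cleanly separable, so its score says nothing about the 70%-80% reported above.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-3):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance non-zero when a class has identical values.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest Gaussian log posterior."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
    return max(model, key=log_posterior)

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(fit_gaussian_nb(train_X, train_y), X[i]) == y[i]
    return hits / len(X)

# Hypothetical frequencies of [add, use, instead of, decrease, increase]
# for one ingredient; these numbers are made up for illustration only.
X = [
    [0.70, 0.15, 0.05, 0.05, 0.05], [0.65, 0.20, 0.05, 0.05, 0.05],
    [0.60, 0.20, 0.10, 0.05, 0.05], [0.10, 0.15, 0.15, 0.55, 0.05],
    [0.15, 0.10, 0.10, 0.60, 0.05], [0.10, 0.20, 0.10, 0.55, 0.05],
    [0.15, 0.15, 0.05, 0.05, 0.60], [0.10, 0.20, 0.05, 0.05, 0.60],
    [0.20, 0.10, 0.05, 0.10, 0.55],
]
y = ["add"] * 3 + ["decrease"] * 3 + ["increase"] * 3

accuracy = leave_one_out_accuracy(X, y)
model = fit_gaussian_nb(X, y)
print(accuracy, predict(model, [0.10, 0.10, 0.10, 0.65, 0.05]))
```

With only a handful of tagged recipes, leave-one-out uses nearly all the labelled data for training in every fold, which is presumably why it was preferred over a fixed train/test split.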
Helen Craig
screen shots
backup
reduce modification-related words
Algorithm
Classify ingredient-related and modification-related words and adjectives
Find n-grams that contain classified words
Gaussian Naive Bayes to decide the overall meaning of the comments: add, increase, decrease, mix
half, remove, omit
decrease
‘use walnuts’
‘reduce salt’
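The two ideas on this slide, reducing modification-related words to a common label and finding n-grams that contain classified words, could be sketched as follows. The word lists, the window length of four tokens, and the function names are assumptions for illustration; only "half", "remove", "omit", "use walnuts" and "reduce salt" come from the slides, and mapping "reduce" to "decrease" is a guess.

```python
import re

# Reduction of modification-related words to one label (partly assumed).
REDUCED = {"half": "decrease", "remove": "decrease", "omit": "decrease",
           "reduce": "decrease"}
# Hypothetical word lists; the real lists are not given in the slides.
ACTION_WORDS = {"add", "use", "increase", "decrease"} | set(REDUCED)
INGREDIENTS = {"salt", "flour", "butter", "walnuts"}

def tokenize(comment):
    """Lowercase the comment and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", comment.lower())

def action_ngrams(comment, n=4):
    """Pair each action word with the first ingredient inside the n-gram it starts."""
    tokens = tokenize(comment)
    found = []
    for i, token in enumerate(tokens):
        if token not in ACTION_WORDS:
            continue
        # The n-gram starting at the action word covers the next n - 1 tokens.
        for later in tokens[i + 1 : i + n]:
            if later in INGREDIENTS:
                found.append((REDUCED.get(token, token), later))
                break
    return found

pairs = action_ngrams("Use walnuts and reduce salt")
print(pairs)
print(action_ngrams("If you are using salted butter, be sure to omit the extra salt."))
```

There is no stemming here, so "using" and "used" are not matched; a real pipeline would likely normalise word forms before the lookup.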
Multinomial Naive Bayes: food description
Other classifiers
K Nearest Neighbors
Linear SVM
RBF SVM
Decision Tree: reasonable
Random Forest: reasonable
AdaBoost: reasonable
Linear Discriminant Analysis: good
Quadratic Discriminant Analysis: good
Taking the sentiment out of sentiment analysis
pos review: ‘wonderful’
neg review: ‘terrible’
movie review: ‘wonderful/terrible’
pos review: ‘chewy’
neg review: ‘mushy’
[Figure: movie reviews, negative vs. positive]
TF-IDF + Multinomial Naive Bayes
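To make the TF-IDF + Multinomial Naive Bayes combination concrete, here is a small self-contained sketch that weights words by TF-IDF and then trains a multinomial model on those weights. The four training comments, their labels, the smoothing constant `alpha`, and the particular IDF formula are all assumptions for illustration, not the data or settings used in the project.

```python
import math
from collections import Counter, defaultdict

def fit_idf(token_docs):
    """Inverse document frequency for every term seen in training."""
    n_docs = len(token_docs)
    doc_freq = Counter(term for doc in token_docs for term in set(doc))
    return {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}

def tfidf(tokens, idf):
    """TF-IDF weights for one document; terms unseen in training are ignored."""
    known = [t for t in tokens if t in idf]
    return {t: (c / len(known)) * idf[t] for t, c in Counter(known).items()}

def fit_multinomial_nb(token_docs, labels, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    idf = fit_idf(token_docs)
    weight_sums = defaultdict(Counter)
    for tokens, label in zip(token_docs, labels):
        weight_sums[label].update(tfidf(tokens, idf))
    class_counts = Counter(labels)
    model = {}
    for label, sums in weight_sums.items():
        total = sum(sums.values()) + alpha * len(idf)
        log_likelihood = {t: math.log((sums[t] + alpha) / total) for t in idf}
        model[label] = (math.log(class_counts[label] / len(labels)), log_likelihood)
    return idf, model

def classify(tokens, idf, model):
    """Pick the class with the highest log prior plus weighted log likelihood."""
    weights = tfidf(tokens, idf)
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(w * log_likelihood[t] for t, w in weights.items())
    return max(model, key=score)

# Hypothetical food-review snippets; 'chewy' and 'mushy' echo the slide.
comments = ["chewy moist delicious", "soft chewy perfect",
            "mushy bland flat", "dry mushy crumbly"]
labels = ["positive", "positive", "negative", "negative"]

idf, model = fit_multinomial_nb([c.split() for c in comments], labels)
prediction = classify("chewy and soft".split(), idf, model)
print(prediction)
```

The point of the example is that texture words such as "chewy" or "mushy" carry the signal in food reviews, where generic sentiment words like "wonderful" or "terrible" would carry it in movie reviews.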
to-do
Reviews that mention other reviews
word2vec to find more action words
modifications vs. ratings