Chad Mills Program Manager Windows Live Safety Platform Microsoft


Page 3

Training Time: features → training → feature weights

Run Time: features + (feature, weight) pairs → score

Assumption:
◦ Spam words continue to appear in spam messages
◦ Good words continue to appear in good messages

Example
Spam message words: million, dollars, transfer, guardian
Good message words: March, community, social, fellow
(feature, weight) pairs: (dollars, 0.2) (million, 0.1) (transfer, 0.1) (community, -0.01) (social, -0.01) (fellow, -0.01) (guardian, 0.03) (March, -0.08)
Spam message score: 0.37
Good message score: -0.11
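The run-time step above can be sketched as a plain sum of per-word weights. This is a minimal illustration, not the production filter: the function name `content_filter_score` and the choice to treat unknown words as weight 0 are assumptions. With the weights exactly as listed, the four good words sum to -0.11 as shown, while the four spam words sum to 0.43 rather than the 0.37 shown; the sketch does not try to reconcile that difference.

```python
# Feature weights copied from the example above.
FEATURE_WEIGHTS = {
    "dollars": 0.2, "million": 0.1, "transfer": 0.1, "guardian": 0.03,
    "community": -0.01, "social": -0.01, "fellow": -0.01, "March": -0.08,
}


def content_filter_score(words, weights=FEATURE_WEIGHTS):
    """Sum the learned weights of the words present.

    Assumption: words without a learned weight contribute 0.
    """
    return sum(weights.get(word, 0.0) for word in words)


good_words = ["March", "community", "social", "fellow"]
spam_words = ["million", "dollars", "transfer", "guardian"]

good_score = content_filter_score(good_words)  # about -0.11, as in the example
spam_score = content_filter_score(spam_words)  # positive, so it leans spam
```

A positive score leans toward spam and a negative score toward good mail, which is why padding a message with negatively weighted words can pull its score down.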

Page 4

From: "Chelsea Clark" <[email protected]>

Subject: Get PaidFor yourOpinion

<style>…<br Bij board bar atteindre jYST GCS re sonrisa fuse Kiviuq padded />

<br Star Honolulu />

<br Ons apporter />

opens NRSU syringe />

<br Jerusalem comfort HTTPS 2604 confidence Miles />

<br 27 mails Qty backwards Meditations bans sedative ect salve <br insightful />

Korean relations header greeting Airllines Phantom CVS Rae 504 1009 perf<br graphiques />

undertaking paced Liquidation reduction />…

Page 5

Overall   Group of words
Good      newsletter peers month select these
Good      late click commissioner media
Good      smoothly off close support before
Good      okay sponsor rock go by ads
Good      none cases text membership

Page 6

Good Message + Spammy Words (Free, Nigeria, Viagra) = Borderline Spam Message

Borderline Spam + Unknown Words (late, click, commissioner) = Good Words (late, click, commissioner): Inbox

Borderline Spam + Unknown Words (newsletter, select, month) = Non-Good Words (newsletter, select, month): Junk Folder

Page 8

Chaff Spam
[spam content] newsletter peers month select these late click commissioner media smoothly off close support before okay sponsor rock go by ads none cases text membership

Legitimate Mail
March is all about the Zune community. This month, you can help create a new feature for The Social, get tips from a fellow Zune user and find out the winners of the Your Zune Your Choice Awards.

Page 11

◦ Sum of weights (content filter score)
◦ Average weight
◦ Standard Deviation
◦ Percent of words that are good
◦ Percent of words that are spam
◦ Number of features
◦ Maximum feature weight
◦ Number of strong spam words
◦ Etc.
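The metafeatures listed above are summary statistics over the per-word weights of a single message. A minimal sketch of how they might be computed follows. The function name, the dictionary keys, the `strong_spam_threshold` cutoff, the use of the sample standard deviation, and both example weight lists are assumptions for illustration, not values from the talk.

```python
import statistics


def extract_metafeatures(weights, strong_spam_threshold=0.15):
    """Summarize the per-word weights of one message.

    `weights` holds one learned weight per word found in the message.
    `strong_spam_threshold` is an assumed cutoff for a "strong spam word".
    """
    n = len(weights)
    if n == 0:
        raise ValueError("message has no known features")
    return {
        "sum": sum(weights),
        "average": sum(weights) / n,
        # Sample standard deviation; it needs at least two weights.
        "std_dev": statistics.stdev(weights) if n > 1 else 0.0,
        "pct_good": sum(w < 0 for w in weights) / n,
        "pct_spam": sum(w > 0 for w in weights) / n,
        "num_features": n,
        "max_weight": max(weights),
        "num_strong_spam": sum(w >= strong_spam_threshold for w in weights),
    }


# Hypothetical weights: a few strong spam words padded with good-word chaff.
chaff_spam = [0.2, 0.1, 0.1, -0.08, -0.08, -0.08, -0.08, -0.08]
# Hypothetical weights for an ordinary legitimate message.
legit_mail = [-0.08, -0.01, -0.01, -0.01]

chaff_meta = extract_metafeatures(chaff_spam)
legit_meta = extract_metafeatures(legit_mail)
```

In this made-up case the chaff message's weights sum to roughly zero, so a sum-only filter would treat it as borderline, yet its standard deviation and maximum weight are much larger than those of the legitimate message.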

Page 12

Training Time:
features → training → feature weights
features + (feature, weight) pairs → metafeature extraction → Metafeatures → training → Metafeature weights

Run Time:
features + (feature, weight) pairs → metafeature extraction → Metafeatures
Metafeatures + (Metafeature, weight) pairs → score

Example
Spam message words: million, dollars, transfer, guardian
Good message words: March, community, social, fellow
(feature, weight) pairs: (dollars, 0.2) (million, 0.1) (transfer, 0.1) (community, -0.01) (social, -0.01) (fellow, -0.01) (guardian, 0.03) (March, -0.08)
Spam message metafeatures: Sum: 0.37, σ: 0.09, Max: 0.2
Good message metafeatures: Sum: -0.11, σ: 0.04, Max: -0.1
Spam message (Metafeature, weight) pairs: (Sum: 0.37, 1.0) (σ: 0.09, 0.8) (Max: 0.2, 0.1)
Good message (Metafeature, weight) pairs: (Sum: -0.11, -0.8) (σ: 0.04, -0.6) (Max: -0.1, -0.3)
Spam message score: 1.9
Good message score: -1.7
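The second stage in the example above can be read as the same weighted sum applied to metafeatures instead of words. The sketch below reproduces the two final scores from the listed (Metafeature, weight) pairs. Keying each weight by the exact metafeature value is an assumption made to keep the illustration small; the slides do not say how continuous values are mapped to learned weights, and the names used here are hypothetical.

```python
# (metafeature, weight) pairs copied from the example above.
# "sigma" stands for the standard deviation (σ) metafeature.
METAFEATURE_WEIGHTS = {
    ("Sum", 0.37): 1.0, ("sigma", 0.09): 0.8, ("Max", 0.2): 0.1,
    ("Sum", -0.11): -0.8, ("sigma", 0.04): -0.6, ("Max", -0.1): -0.3,
}


def metafeature_score(metafeatures, weights=METAFEATURE_WEIGHTS):
    """Second-stage score: sum the weights of the message's metafeatures.

    Assumption: a (name, value) pair without a learned weight contributes 0.
    """
    return sum(weights.get(pair, 0.0) for pair in metafeatures.items())


spam_meta = {"Sum": 0.37, "sigma": 0.09, "Max": 0.2}
good_meta = {"Sum": -0.11, "sigma": 0.04, "Max": -0.1}

spam_final = metafeature_score(spam_meta)  # about 1.9
good_final = metafeature_score(good_meta)  # about -1.7
```

The two messages are separated much more clearly by the second-stage scores (1.9 versus -1.7) than by the first-stage sums (0.37 versus -0.11).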

Page 13

Hotmail Feedback Loop
◦ Messages classified by recipients

Training Set: 1,800,000 messages
◦ Ending on 5/20/07

Evaluation Set: 50,000 messages
◦ Data from 5/21/07
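The setup above evaluates on mail received after the training window closes, so the filter is tested on messages it could not have seen. A minimal sketch of such a date-based split follows; the tuple layout `(received_date, is_spam, text)`, the function name, and the sample messages are assumptions for illustration.

```python
from datetime import date

TRAIN_END = date(2007, 5, 20)  # last day of training data
EVAL_DAY = date(2007, 5, 21)   # evaluation data comes from this day


def split_by_date(messages):
    """Split (received_date, is_spam, text) tuples into training and evaluation sets."""
    train = [m for m in messages if m[0] <= TRAIN_END]
    evaluation = [m for m in messages if m[0] == EVAL_DAY]
    return train, evaluation


# Hypothetical labelled messages.
sample = [
    (date(2007, 5, 18), True, "million dollars transfer"),
    (date(2007, 5, 20), False, "March community social"),
    (date(2007, 5, 21), True, "transfer guardian newsletter peers"),
    (date(2007, 5, 22), False, "outside both windows"),
]

train_set, eval_set = split_by_date(sample)
```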

Page 14

45% improvement in true positives (TP) at low false positive (FP) levels
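One way to read this result is as a relative gain in the true positive rate when both filters are held to the same low false positive rate. The sketch below shows that calculation under that reading. The function names, the threshold search, and the scores and labels are all hypothetical, and the toy data is not meant to reproduce the 45% figure.

```python
def tp_rate_at_fp_rate(scores, labels, max_fp_rate):
    """Best true positive rate achievable with FP rate <= max_fp_rate.

    scores: filter scores, higher means more spam-like.
    labels: True for spam, False for good mail.
    """
    num_spam = sum(labels)
    num_good = len(labels) - num_spam
    best_tp_rate = 0.0
    # Try each observed score as the "flag at or above this" threshold.
    for threshold in sorted(set(scores)):
        flagged = [label for s, label in zip(scores, labels) if s >= threshold]
        tp = sum(flagged)
        fp = len(flagged) - tp
        if fp / num_good <= max_fp_rate:
            best_tp_rate = max(best_tp_rate, tp / num_spam)
    return best_tp_rate


def relative_improvement(new_tp_rate, old_tp_rate):
    """Relative gain of the new filter over the old one."""
    return (new_tp_rate - old_tp_rate) / old_tp_rate


# Hypothetical data: four spam messages followed by four good messages.
labels = [True, True, True, True, False, False, False, False]
base_scores = [0.9, 0.8, 0.1, 0.0, 0.2, -0.3, -0.5, -0.6]
meta_scores = [0.9, 0.8, 0.7, 0.0, 0.2, -0.3, -0.5, -0.6]

base_tp = tp_rate_at_fp_rate(base_scores, labels, max_fp_rate=0.0)
meta_tp = tp_rate_at_fp_rate(meta_scores, labels, max_fp_rate=0.0)
gain = relative_improvement(meta_tp, base_tp)
```

Comparing at a fixed false positive rate matters because either filter can catch more spam simply by lowering its threshold and misclassifying more good mail.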

Page 15

At a reasonable False Positive rate:
◦ 98% of unique catches are chaff spam
◦ Caught 99.5% of chaff spam missed by regular content filter
◦ Similar types of False Positives as regular filter

Challenges Remaining
◦ Primarily just helped on spam with chaff
◦ Relies on base content filter to detect spam with obfuscated content (e.g. v1agra) or naïve spam without any chaff

Page 16

Spam messages with good word chaff have unnatural weight distributions

Metafeatures are able to identify and catch these messages

This resulted in a 45% improvement in TP

Gains were limited to spam with good word chaff