Modeling Identity in Archival Collections of Email: A Preliminary Study
Tamer Elsayed and Douglas W. Oard
Conference on Email and Anti-Spam (CEAS), July 28th, 2006
Department of Computer Science
College of Information Studies
Institute for Advanced Computer Studies
Real Problem
Clinton White House
Tobacco Policy search request
hired 25 persons for 6 months
32 million emails
200,000
80,000
Email Search
Meaning: Modeling Content
People: Modeling Identity
Searcher
Identity
Roles in an email: Sender, Receivers, Mentioned
Relations: sent email to, sent, received, mentions, mentioned, mentioned to
Each identity carries: Email Address, Name, Nickname
Outline
Problem
Identity Resolution Architecture
Evaluation
Conclusion
Entity [email protected] Bruce
Bob
Robert E. Bruce
Senior Counsel
Enron North America Corp.
T (713) 345-7780
F (713) [email protected] Signature (140)
Evidence counts: Main Headers (915), Quoted Headers (8), Salutations (7), Free Signatures (9)
Identity attributes: Name, Email Address, Nickname, Signature Block
Enron Collection
Example of a large organizational collection
CMU version
about half a million emails
133,581 unique email addresses
~52% of emails are duplicates (same address, subject, body)
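The duplicate criterion named on this slide (same address, subject, body) can be sketched as a keyed filter. This is a minimal sketch, not the authors' implementation: the dictionary field names, the choice of SHA-1, and the case and whitespace normalization are all assumptions made for illustration.

```python
import hashlib


def dedup_key(sender, subject, body):
    """Build a key from the three fields the slide names: address, subject, body."""
    # Assumption: lowercase the address and collapse whitespace in the body.
    # The slide does not say whether any normalization was applied.
    parts = [sender.strip().lower(), subject.strip(), " ".join(body.split())]
    return hashlib.sha1("\x00".join(parts).encode("utf-8")).hexdigest()


def unique_emails(emails):
    """Keep the first email seen for each (address, subject, body) key."""
    seen = set()
    unique = []
    for email in emails:
        key = dedup_key(email["from"], email["subject"], email["body"])
        if key not in seen:
            seen.add(key)
            unique.append(email)
    return unique
```

Keeping the first copy preserves the original order of the collection, which is convenient when later stages need stable message positions.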
Typical Enron Email
Labeled parts: Message Header, Main Body, Salutation, Signature Block, Quoted Header, Quoted Text, Message Body, Quoted Signature, Quoted Main Body

-----Original Message-----
From: [email protected]@ENRON
Sent: Monday, July 30, 2001 2:24 PM
To: Sager, Elizabeth; Murphy, Harlan; [email protected]; [email protected]: [email protected]:Shhhh.... it's a SURPRISE !

Message-ID:
Date: Mon, 30 Jul 2001 12:40:48 -0700 (PDT)
From: [email protected]: [email protected]: RE: Shhhh.... it's a SURPRISE !
X-From: Sager, Elizabeth
X-To: '[email protected]@ENRON'

Hope all is well.
Count me in for the group present.
See ya next week if not earlier
Please call me (713) 207-5233
Liza
Elizabeth Sager
713-853-6349
Hi Shari
Thanks!
Shari
Identity Resolution Architecture
Components: Duplicate Detection; Extraction from Main Header; Extraction from Quoted Header; Body and Quoted Text Separation; Signature Line Detection; Salutation Line Detection; Nickname Extraction; Clustering Associations
Intermediate data: Unique emails; Quoted headers; Main body; Salutation lines; Signature lines
Associations: Address-Nickname Associations; Address-Name Associations; Address-Address Associations
Output: Entities
Extraction From Main Headers
Message-ID:
Date: Wed, 26 Sep 2001 09:25:19 -0700 (PDT)
From: [email protected]: [email protected], [email protected], [email protected], o'[email protected], [email protected]: New Email Address
X-From: Jim Mathes
X-To: Vandini, Mark , Urbon Steve , Tony Sapienza , Tom O'Rourke , Tom Lyons , Tom Hodgson
X-cc:
X-bcc:

We have just launched our "New & Improved Website", www.newbedfordchamber.com and I have a new email address:

Please make the appropriate changes in your email address book.

Thank you,

Jim Mathes, President
New Bedford Area Chamber of Commerce

Callouts: Name-Address Association; Name-Address Association; Address-Address Association
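The name-address callout for the sender can be illustrated by pairing the From address with the X-From display name. This is a hedged sketch using Python's standard `email` package. The function name is hypothetical, and treating X-From as the sender's display name is an assumption drawn from the example on this slide. It does not attempt the recipient pairing (To with X-To) or the address-address association.

```python
import re
from email import message_from_string
from email.utils import getaddresses


def sender_name_address(raw_message):
    """Return [(address, name)] pairing the From address with the X-From name."""
    msg = message_from_string(raw_message)
    # getaddresses yields (display name, address) pairs for each header value.
    addresses = [addr for _, addr in getaddresses(msg.get_all("From", [])) if addr]
    # Assumption: drop any <...> fragment so only the display name remains.
    display = re.sub(r"<[^>]*>", "", msg.get("X-From", "")).strip()
    if len(addresses) == 1 and display:
        return [(addresses[0].lower(), display)]
    return []
```

Returning an empty list when either header is missing keeps the extractor conservative: it emits an association only when both pieces of evidence appear in the same header block.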
Extraction From Quoted Headers
Hi Jeff,

Did you get our registration packet? If not, stop by and pick one up because you need it. Make sure you get the one for new students.

Shawn

On Wednesday, November 03, 1999 11:18 AM, Jeff Dasovich[SMTP:[email protected]] wrote:
>
>
> ok, don't shoot me, but what's the deadline for scheduling for classes?
>
> signed,
> clueless

Name-Address Association

---------------------- Forwarded by Elizabeth Sager/HOU/ECT on 02/09/2000 12:02 PM ---------------------------

"Patricia Young" on 02/09/2000 08:50:59 AM
To: Elizabeth Sager/HOU/ECT@ECT
cc:
Subject: If possible, would you forward your resume to me electronically? Thanks.

If possible, would you forward your resume to me electronically? Thanks.

Name-Address Association
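The first quoted-header style on this slide ("On <date>, Name[SMTP:address] wrote:") lends itself to a pattern match. The sketch below covers only that one style. The regular expression and function name are assumptions written for this example, not the extraction rules used in the study, and the "Forwarded by" style would need a pattern of its own.

```python
import re

# Matches lines such as:
#   On Wednesday, November 03, 1999 11:18 AM, Jeff Dasovich[SMTP:jeff@example.com] wrote:
# The name may not contain commas, so the date portion cannot leak into it.
QUOTED_HEADER = re.compile(
    r"^On .+?, (?P<name>[^\[\]<>,]+?)\s*\[SMTP:(?P<addr>[^\]\s]+)\]\s+wrote:",
    re.MULTILINE,
)


def quoted_header_associations(body):
    """Return (address, name) pairs found in 'On ... wrote:' quoted headers."""
    return [
        (match.group("addr").lower(), match.group("name").strip())
        for match in QUOTED_HEADER.finditer(body)
    ]
```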
Signature & Salutation Detection
From: [email protected] kiddies are going back to school already so now would be a good time to plan a trip to D.C. at last. Maybe early Sept?
Also I'd be game for a girls' trip to Destin.

Time to work!
Love,
-Sooz

Procurement, Logistics, and Contracts
Enron Broadband Services, Inc.
1400 Smith, Suite EB-4573A
Houston, TX 77002

The week is going OK. All the tennis and swimming has left me with sore muscles so this is my night off. Am planning to do some more house chores so I do not end up with another weekend like the last.

I'm still planning on coming to Austin next weekend, I'm just not sure when, but I'll let you know.

Call if you get lonely!

Love,
Sooz

Procurement, Logistics, and Contracts
Enron Broadband Services, Inc.
1400 Smith, Suite EB-4573A
Houston, TX 77002

Had another sleepless night Sun. and finally took some Unisom and had a good night's sleep last night. What a relief. I have really never had this problem before. It's good to have a lot of energy, but you have to shut down sometime.

Am sending you my travel schedule for next week. The following week (May 29 - June 2) I'm planning to be in SF also, but I'm not sure I'll actually have to be there that long.

Have a good afternoon!

love,
sooz

Procurement, Logistics, and Contracts
Enron Broadband Services, Inc.
1400 Smith, Suite EB-4573A
Houston, TX 77002
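The three messages above end with the same four-line block, which suggests one simple way to find a signature block: look for trailing lines shared by every message from one sender. The sketch below illustrates that idea only. The slides do not describe the actual detection method, so the exact-match rule and the function name are assumptions.

```python
def common_trailing_lines(bodies):
    """Return the longest run of trailing non-empty lines shared by all bodies."""
    line_lists = [
        [line.strip() for line in body.strip().splitlines() if line.strip()]
        for body in bodies
    ]
    if len(line_lists) < 2:
        return []  # one message gives no evidence of repetition
    shared = []
    # Walk all messages from the bottom up, stopping at the first mismatch.
    for lines in zip(*(reversed(lines) for lines in line_lists)):
        if len(set(lines)) != 1:
            break
        shared.append(lines[0])
    return shared[::-1]
```

On the three messages in this slide, the walk stops at the sign-off line ("-Sooz", "Sooz", "sooz"), because those lines differ in case and punctuation, leaving only the four-line organizational block.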
Nickname Extraction
3,151 address-nickname associations

From: [email protected]

Had another sleepless night Sun. and finally took some Unisom and had a good night's sleep last night. What a relief. I have really never had this problem before. It's good to have a lot of energy, but you have to shut down sometime.

Am sending you my travel schedule for next week. The following week (May 29 - June 2) I'm planning to be in SF also, but I'm not sure I'll actually have to be there that long.

Have a good afternoon!

love,
sooz (callout: nickname)

Procurement, Logistics, and Contracts
Enron Broadband Services, Inc.
1400 Smith, Suite EB-4573A
Houston, TX 77002
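In the example above the nickname sits on the line just above the signature block. A minimal sketch of that reading follows. It assumes the signature block lines are already known, and it accepts only a single-word candidate. Both choices, like the function name, are assumptions made for illustration rather than the study's rules.

```python
import re


def nickname_above_signature(body, signature_block):
    """Return the single-word line directly above the signature block, if any."""
    lines = [line.strip() for line in body.strip().splitlines() if line.strip()]
    n = len(signature_block)
    # Remove the known signature block from the end of the message.
    if n and lines[-n:] == list(signature_block):
        lines = lines[:-n]
    if not lines:
        return None
    # Strip leading and trailing punctuation such as the dash in "-Sooz".
    candidate = re.sub(r"^[\W_]+|[\W_]+$", "", lines[-1])
    # Assumption: a nickname is one token; longer lines are ordinary text.
    if candidate and len(candidate.split()) == 1:
        return candidate.lower()
    return None
```

Pairing the returned nickname with the From address of the same message would give one address-nickname association.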
Identifying [email protected] Bruce
Bob
Robert E. Bruce
Senior Counsel
Enron North America Corp.
T (713) 345-7780
F (713) [email protected] Signature (140)
Evidence counts: Main Headers (915), Quoted Headers (8), Salutations (7), Free Signatures (9)
Identity attributes: Name, Email Address, Nickname, Signature [email protected] Address
Robert (Name): Quoted Headers (5), Main Headers (7)
Totals: 82,084 addr-name; 3,151 addr-nickname; 19,708 addr-addr; 66,715 entities
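One plausible way to turn the three association types into entities is to treat address-address associations as links and collect the names and nicknames attached to each connected group of addresses. The sketch below uses a union-find structure for that. The slides do not state the clustering method, so connected components are an assumption here, and all names in the code are hypothetical.

```python
from collections import defaultdict


def cluster_entities(addr_addr, addr_name, addr_nick):
    """Group addresses linked by addr-addr pairs; attach names and nicknames."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in addr_addr:
        union(a, b)
    # Register addresses that appear only in name or nickname associations.
    for a, _ in list(addr_name) + list(addr_nick):
        find(a)

    entities = defaultdict(
        lambda: {"addresses": set(), "names": set(), "nicknames": set()}
    )
    for a in list(parent):
        entities[find(a)]["addresses"].add(a)
    for a, name in addr_name:
        entities[find(a)]["names"].add(name)
    for a, nick in addr_nick:
        entities[find(a)]["nicknames"].add(nick)
    return list(entities.values())
```

Under this sketch, an address with no address-address link forms an entity of its own.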
Outline
Problem
Identity Resolution Architecture
Evaluation
Conclusion
Future Work
Stratified Sampling
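As a generic illustration of stratified sampling, the sketch below groups items by a stratum key and draws a fixed number from each group. The slide gives only the title, so the choice of strata, the per-stratum sample size, and every name in the code are assumptions and not the sampling design used in the study.

```python
import random
from collections import defaultdict


def stratified_sample(items, stratum_of, per_stratum, seed=0):
    """Draw up to per_stratum items at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = {}
    for key, members in strata.items():
        k = min(per_stratum, len(members))  # small strata are taken whole
        sample[key] = rng.sample(members, k)
    return sample
```

Sampling each stratum separately ensures that rare groups, for example associations with little supporting evidence, still contribute judged examples.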
Judgment [email protected] "home email"
[email protected] "alexis james-petty"
[email protected] june [email protected] [email protected] "terrie covarrubias"
[email protected] "randy"
[email protected] "phyllis"
[email protected] "tom"
Judgment categories: Incorrect; Correct but not informative; Correct and somewhat informative; Correct and very informative
Evaluation Measures
Judged Associations
Correct
Informative
Very Informative
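The nesting of these four sets suggests three ratios. The sketch below assumes that accuracy is the share of correct associations among judged ones (leaving out any "can't tell" judgments), and that the two informativeness measures are shares of the correct associations. Those denominators, the dictionary keys, and the counts in the usage note are illustrative assumptions.

```python
def evaluation_measures(counts):
    """Compute percent accuracy, informative, and very informative."""
    incorrect = counts.get("incorrect", 0)
    not_informative = counts.get("correct_not_informative", 0)
    somewhat = counts.get("correct_somewhat_informative", 0)
    very = counts.get("correct_very_informative", 0)

    correct = not_informative + somewhat + very
    judged = correct + incorrect  # assumption: "can't tell" is excluded
    if judged == 0 or correct == 0:
        return None  # the ratios are undefined without judged, correct items
    return {
        "accuracy": 100.0 * correct / judged,
        "informative": 100.0 * (somewhat + very) / correct,
        "very_informative": 100.0 * very / correct,
    }
```

For example, with 1 incorrect, 20 correct but not informative, 1 somewhat informative, and 26 very informative judgments, accuracy is 47/48 (about 97.9%), percent informative is 27/47 (about 57.4%), and percent very informative is 26/47 (about 55.3%).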
Accuracy
Charts: Address-Name Associations; Address-Nickname Associations; Address-Address Associations
100% accuracy with multiple sources of evidence.
Address-name association was nearly perfect.
80% minimum accuracy in address-nickname.
96.7% entity accuracy
[Chart: Address-Name Associations, Percent Accuracy (Weakest evidence / Average evidence / Stronger evidence)]
Main Headers: 97.92 / 97.94 / 97.96
Quoted Headers: 100 / 100 / 100
Both: 100
Overall: 98.47
[Chart: Address-Nickname Associations, Percent Accuracy (Weakest evidence / Average evidence / Stronger evidence)]
Salutations: 91.84 / 90.68 / 90.00
Signatures: 80.00 / 90.94 / 92.00
Both: 100
Overall: 92.29
[Chart: Address-Address Associations, Percent Accuracy (Weakest evidence / Average evidence / Good evidence)]
Main Headers: 100 / 100 / 100
Informativeness
Charts: Address-Name Associations; Address-Nickname Associations; Address-Address Associations
[Chart: Address-Name Associations, Percent Informative (Weakest evidence / Average evidence / Stronger evidence)]
Main Headers: 57.45 / 56.83 / 56.25
Quoted Headers: 67.39 / 68.79 / 71.74
Both: 76
Overall: 60.73
Chart16
44.444444444450.052766470753.3333333333
52.777777777846.281775086145.652173913
Both16Both
Overall42.4547839272Overall
Weakest evidence
Average evidence
Stronger evidence
Percent Informative
addr-addr: email address to email address associations (Main Headers)

Strength | Email Address | Email Address | Judgment | jud type | count | Comment
So Weak | [email protected] | @nationaljournal.com | Can't tell | 0 | 1 | no evidence
So Weak | [email protected] | @cmsenergy.com | Correct and somehow helpful | 1 | 0 |
So Weak | [email protected] | @aol.com | Correct and somehow helpful | 2 | 0 |
So Weak | [email protected] | @ymcahouston.org | Correct and so helpful | 3 | 2 |
So Weak | [email protected] | @conedsolutions.com | Correct and so helpful | 4 | 47 |
Not So Weak | [email protected] | @sfx.com | Can't tell | 0 | 1 | no evidence
Not So Weak | [email protected] | @enron.com | Correct and somehow helpful | 1 | 0 |
Not So Weak | [email protected] | @pacificorp | Correct and so helpful | 2 | 0 | addr not complete
Not So Weak | [email protected] | @aol.com | Correct and so helpful | 3 | 1 |
Not So Weak | [email protected] | @hotmail.com | Correct and so helpful | 4 | 48 |

Strength | freq | jud difficulty | Accuracy (ignoring can't tell) | Informativeness | Surprising
So Weak | 6514 | 2 | 100 | 100 | 95.92
Not So Weak | 4194 | 2 | 100 | 100 | 97.96
Weighted average | | | 100 | 100 | 96.72

Charts on this sheet (by evidence strength: weakest, average, good): Accuracy, Percent Informative, and Percent Very Informative for addr-addr, addr-name, and addr-nick. Judgment Difficulty (%), weakest / good: 2 / 1 and 4 / 4.
addr-name: email address to name associations

Source | Strength | Email Address | Name (as extracted) | Judgment | jud type | count
Main Headers | So Weak | [email protected] | - | Can't tell | 0 | 2
Main Headers | So Weak | [email protected] | james-petty | Incorrect | 1 | 1
Main Headers | So Weak | [email protected] | hogan | Correct but not helpful | 2 | 20
Main Headers | So Weak | [email protected] | hauk | Correct and somehow helpful | 3 | 1
Main Headers | So Weak | [email protected] | brookman | Correct and so helpful | 4 | 26
Main Headers | Not So Weak | [email protected] | - | Can't tell | 0 | 1
Main Headers | Not So Weak | [email protected] | email | Incorrect | 1 | 1
Main Headers | Not So Weak | [email protected] | - | Correct but not helpful | 2 | 21
Main Headers | Not So Weak | [email protected] | pepper | Correct and somehow helpful | 3 | 2
Main Headers | Not So Weak | [email protected] | de la rosa | Correct and so helpful | 4 | 25
Quoted Headers | So Weak | [email protected] | maguire | Can't tell | 0 | 4
Quoted Headers | So Weak | [email protected] | sidler | Correct but not helpful | 1 | 0
Quoted Headers | So Weak | [email protected] | smith | Correct but not helpful | 2 | 15
Quoted Headers | So Weak | [email protected] | marcum | Correct and somehow helpful | 3 | 3
Quoted Headers | So Weak | [email protected] | h. vederman | Correct and so helpful | 4 | 28
Quoted Headers | Not So Weak | [email protected] | ellis | Can't tell | 0 | 4
Quoted Headers | Not So Weak | [email protected] | to usa | Incorrect | 1 | 0
Quoted Headers | Not So Weak | [email protected] | gaunder | Correct but not helpful | 2 | 13
Quoted Headers | Not So Weak | [email protected] | renz | Correct and somehow helpful | 3 | 2
Quoted Headers | Not So Weak | [email protected] | williams | Correct and so helpful | 4 | 31
Both | | [email protected] | cash | Correct but not helpful | 0 | 0
Both | | [email protected] | larrea | Correct but not helpful | 1 | 0
Both | | [email protected] | rivera | Correct but not helpful | 2 | 12
Both | | [email protected] | crownover | Correct and somehow helpful | 3 | 3
Both | | [email protected] | diers | Correct and so helpful | 4 | 35

Source | Strength | freq | jud difficulty | Accuracy (ignoring can't tell) | Informativeness | Surprising
Main Headers | So Weak | 29677 | 2 | 97.92 | 57.45 | 55.32
Main Headers | Not So Weak | 31248 | 1 | 97.96 | 56.25 | 52.08
Main Headers | Weighted average | 60925 | | 97.94 | 56.83 | 53.66
Quoted Headers | So Weak | 8042 | 4 | 100 | 67.39 | 60.87
Quoted Headers | Not So Weak | 3828 | 4 | 100 | 71.74 | 67.39
Quoted Headers | Weighted average | 11870 | | 100 | 68.79 | 62.97
Both | | 9289 | 0 | 100 | 76 | 70
total avg (addr-name) | | | | 98.47 | 60.73 | 56.86

Charts on this sheet (by evidence strength: weakest, average, good or stronger): Accuracy (%), Percent Informative, Percent Very Informative, and Unjudged Associations for Addr-Name, Addr-Nick, and Addr-Addr.
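addr-nick: email address to nickname associations

Source | Strength | Email Address | Name/Nick Name (as extracted) | Judgment | jud type | count
Salutations | So Weak | [email protected] | um | Can't tell | 0 | 1
Salutations | So Weak | [email protected] | report | Incorrect | 1 | 4
Salutations | So Weak | [email protected] | - | Correct but not helpful | 2 | 25
Salutations | So Weak | [email protected] | - | Correct and somehow helpful | 3 | 4
Salutations | So Weak | [email protected] | - | Correct and so helpful | 4 | 16
Salutations | Not So Weak | [email protected] | bauer | Incorrect | 0 | 0
Salutations | Not So Weak | [email protected] | message | Incorrect | 1 | 5
Salutations | Not So Weak | [email protected] | - | Correct but not helpful | 2 | 21
Salutations | Not So Weak | [email protected] | - | Correct and somehow helpful | 3 | 7
Salutations | Not So Weak | [email protected] | - | Correct and so helpful | 4 | 17
Signatures | So Weak | [email protected] | - | Can't tell | 0 | 5
Signatures | So Weak | [email protected] | - | Correct but not helpful | 2 | 17
Signatures | So Weak | [email protected] | - | Correct and somehow helpful | 3 | 3
Signatures | So Weak | [email protected] | - | Correct and so helpful | 4 | 16
Signatures | Not So Weak | [email protected] | - | Correct but not helpful | 2 | 25
Signatures | Not So Weak | [email protected] | - | Correct and somehow helpful | 3 | 12
Signatures | Not So Weak | [email protected] | - | Correct and so helpful | 4 | 9
Both | | [email protected] | - | Correct but not helpful | 0 | 0
Both | | [email protected] | - | Correct but not helpful | 1 | 0
Both | | [email protected] | - | Correct but not helpful | 2 | 42
Both | | [email protected] | - | Correct and somehow helpful | 3 | 6
Both | | [email protected] | - | Correct and so helpful | 4 | 2

Source | Strength | freq | jud difficulty | Accuracy | Informativeness | Surprising
Salutations | So Weak | 272 | 2 | 91.84 | 44.44 | 35.56
Salutations | Not So Weak | 465 | 0 | 90.00 | 53.33 | 37.78
Salutations | Weighted average | 737 | | 90.68 | 50.05 | 36.96
Signatures | So Weak | 170 | 10 | 80.00 | 52.78 | 44.44
Signatures | Not So Weak | 1754 | 0 | 92.00 | 45.65 | 19.57
Signatures | Weighted average | 1924 | | 90.94 | 46.28 | 21.76
Both | | 490 | 0 | 100.00 | 16.00 | 4.00
total avg (addr-nick) | | | | 92.29 | 42.45 | 22.55

Charts on this sheet (by evidence strength: weakest, average, stronger): Percent Accuracy, Percent Informative, Percent Very Informative.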
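The per-bucket rates and the frequency-weighted averages in these sheets can be reproduced from the judgment counts. The sketch below is an illustration, not the authors' spreadsheet logic: the formulas are inferred from the numbers (accuracy ignores "Can't tell" judgments, as the column header says; informativeness and "surprising" appear to be taken over correct judgments only), and the names `Bucket`, `rates`, and `weighted_rates` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Judgment counts for one evidence-strength bucket (hypothetical layout)."""
    freq: int              # associations in the bucket, used as the weight
    cant_tell: int         # jud type 0
    incorrect: int         # jud type 1
    not_helpful: int       # jud type 2: correct but not helpful
    somehow_helpful: int   # jud type 3: correct and somehow helpful
    so_helpful: int        # jud type 4: correct and so helpful


def rates(b: Bucket) -> tuple[float, float, float]:
    """Return (accuracy, informativeness, surprising) in percent.

    Assumed formulas, inferred from the sheet values:
    accuracy = correct / (correct + incorrect), ignoring "Can't tell";
    informativeness = (type 3 + type 4) / correct;
    surprising = type 4 / correct.
    """
    correct = b.not_helpful + b.somehow_helpful + b.so_helpful
    judged = correct + b.incorrect
    accuracy = 100.0 * correct / judged
    informative = 100.0 * (b.somehow_helpful + b.so_helpful) / correct
    surprising = 100.0 * b.so_helpful / correct
    return accuracy, informative, surprising


def weighted_rates(buckets: list[Bucket]) -> tuple[float, ...]:
    """Average the per-bucket rates, weighting each bucket by its freq."""
    total = sum(b.freq for b in buckets)
    per_bucket = [rates(b) for b in buckets]
    return tuple(
        sum(b.freq * r[i] for b, r in zip(buckets, per_bucket)) / total
        for i in range(3)
    )


# addr-name, Main Headers: counts and frequencies taken from the sheet.
main_so_weak = Bucket(29677, 2, 1, 20, 1, 26)
main_not_so_weak = Bucket(31248, 1, 1, 21, 2, 25)

print([round(x, 2) for x in rates(main_so_weak)])
print([round(x, 2) for x in weighted_rates([main_so_weak, main_not_so_weak])])
```

With the Main Headers counts, the sketch gives roughly 97.92 / 57.45 / 55.32 for the So Weak bucket and 97.94 / 56.83 / 53.66 for the weighted average, which agrees with the sheet.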
Chart 21: Percent Informative, addr-addr, by evidence strength (weakest / average / good): 100 / 100 / 100.
Chart 9: Percent Very Informative, addr-name, by evidence strength (weakest / average / stronger). Main Headers: 55.32 / 53.66 / 52.08. Quoted Headers: 60.87 / 62.97 / 67.39. Both: 70. Overall: 56.86.
Chart 14: Percent Very Informative, addr-nick, by evidence strength (weakest / average / stronger). Salutations: 35.56 / 36.96 / 37.78. Signatures: 44.44 / 21.76 / 19.57. Both: 4. Overall: 22.55.
Chart 22: Percent Very Informative, addr-addr, by evidence strength (weakest / average / good): 95.92 / 96.72 / 97.96.
Chart 19: Percent Accuracy, addr-nick, by evidence strength (weakest / average / stronger). Salutations: 91.84 / 90.68 / 90. Signatures: 80 / 90.94 / 92. Both: 100. Overall: 92.29.