62
Crowd++: Unsupervised Speaker Count with Smartphones Chenren Xu , Sugang Li, Gang Liu, Yanyong Zhang, Emiliano Miluzzo, Yih-Farn Chen, Jun Li, Bernhard Firner ACM UbiComp’13 September 10 th , 2013

Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Crowd++: Unsupervised Speaker Count with Smartphones

Chenren Xu, Sugang Li, Gang Liu, Yanyong Zhang, Emiliano Miluzzo, Yih-Farn Chen, Jun Li, Bernhard Firner

ACM UbiComp’13 September 10th, 2013

Page 2: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Scenario 1: Dinner time, where to go?

2

I will choose this one!

Page 3: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Scenario 2: Is your kid social?

3

Page 4: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Scenario 2: Is your kid social?

4

Page 5: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Scenario 3: Which class is attractive?

5

She will be my choice!

Page 6: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Solutions?

q  Speaker count!

q Dinner time, where to go?

q  Find the place where has most people talking!

q  Is your kid social?

q  Find how many (different) people they talked with!

q Which class is more attractive?

q  Check how many students ask you questions!

Microphone + microcomputer

6

Page 7: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

The era of ubiquitous listening

7

Glass Necklace

Watch

Phone

Brace

Page 8: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

What we already have

8

Next thing: Voice-centric sensing

Page 9: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Voice-centric sensing

9

Stressful

Bob Family life Alice

Speech recognition

Emotion detection

2

3

Speaker identification

Speaker count

Page 10: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

How to count?

10

Challenge No prior knowledge of speakers

Background noise

Speech overlap

Energy efficiency

Privacy concern

Page 11: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

How to count?

11

Challenge Solution No prior knowledge of speakers Unique features extraction

Background noise Frequency-based filter

Speech overlap Overlap detection

Energy efficiency Coarse-grained modeling

Privacy concern On-device computation

Page 12: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Overview of Crowd++

12

Page 13: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Speech detection

q  Pitch-based filter

q Determined by the vibratory frequency of the vocal folds

q Human voice statistics: spans from 50 Hz to 450 Hz

13

f (Hz) 50 450

Human voice

Page 14: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Speaker features

q  MFCC

q Speaker identification/verification

q  Alice or Bob, or else?

q Emotion/stress sensing

q  Happy, or sad, stressful, or fear, or anger?

q Speaker counting

q  No prior information

14

Supervised

Unsupervised

Page 15: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Speaker features

q  Same speaker or not?

q MFCC + cosine similarity distance metric

15

θ MFCC 1

MFCC 2

We use the angle θ to capture the distance between speech segments.

Page 16: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Speaker features

q  MFCC + cosine similarity distance metric

16

θs Bob’s MFCC in speech segment 1

Bob’s MFCC in speech segment 2

Alice’s MFCC in speech segment 3

θd

θd > θs

Page 17: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Speaker features

q  MFCC + cosine similarity distance metric

17

histogram of θs histogram of θd

1 second speech segment

2-second speech segment

3-second speech segment

10-second speech segment

.

.

.

10 seconds is not natural in conversation!

We use 3-second for basic speech unit.

Page 18: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Speaker features

q  MFCC + cosine similarity distance metric

18

3-second speech segment

30 15

Page 19: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Speaker features

q  Pitch + gender statistics

19

(Hz)

Page 20: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Same speaker or not?

20

Same speaker

Different speakers

Not sure

IF MFCC cosine similarity score < 15

AND

Pitch indicates they are same gender

ELSEIF MFCC cosine similarity score > 30

OR

Pitch indicates they are different genders

ELSE

Page 21: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Conversation example

21

Speaker A

Speaker B

Speaker C

Conversation

Overlap Overlap Silence

Time

Page 22: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

q  Phase 1: pre-clustering

q Merge the speech segments from same speakers

Unsupervised speaker counting

22

Page 23: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

23

Ground truth

Segmentation

Pre-clustering

Same speaker

Page 24: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

24

Ground truth

Segmentation

Pre-clustering

Merge

Page 25: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

25

Ground truth

Segmentation

Pre-clustering

Keep going …

Page 26: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

26

Ground truth

Segmentation

Pre-clustering

Same speaker

Page 27: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

27

Ground truth

Segmentation

Pre-clustering

Merge

Page 28: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

28

Ground truth

Segmentation

Pre-clustering

Keep going …

Page 29: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

29

Ground truth

Segmentation

Pre-clustering

Same speaker

Page 30: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

30

Ground truth

Segmentation

Pre-clustering

Merge

Page 31: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

31

Ground truth

Segmentation

Pre-clustering

Merge

Pre-clustering

.

.

.

Page 32: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

q  Phase 1: pre-clustering

q Merge the speech segments from same speakers

q  Phase 2: counting

q Only admit new speaker when its speech segment is

different from all the admitted speakers.

q Dropping uncertain speech segments.

Unsupervised speaker counting

32

Page 33: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

33

Counting: 0

Page 34: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

34

Counting: 1

Admit first speaker

Page 35: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

35

Counting: 1

Counting: 1 Not sure

Page 36: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

36

Counting: 1

Counting: 1 Drop it

Page 37: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

37

Counting: 1

Counting: 1 Different speaker

Page 38: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

38

Counting: 2

Counting: 1 Admit new speaker!

Page 39: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

39

Counting: 2

Counting: 2

Counting: 1

Same speaker

Page 40: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

40

Counting: 2

Counting: 2

Counting: 1

Different speaker

Same speaker

Page 41: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

41

Counting: 2

Counting: 2

Counting: 1

Merge

Page 42: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

42

Counting: 2

Counting: 2

Counting: 1

Not sure

Page 43: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

43

Counting: 2

Counting: 2

Counting: 1

Not sure

Not sure

Page 44: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

44

Counting: 2

Counting: 2

Counting: 1

Drop

Page 45: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

45

Counting: 2

Counting: 2

Counting: 1

Different speaker

Page 46: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

46

Counting: 2

Counting: 2

Counting: 1

Different speaker

Different speaker

Page 47: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

47

Counting: 2

Counting: 3

Counting: 1

Admit new speaker!

Page 48: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Unsupervised speaker counting

48

Counting: 2

Counting: 3

Counting: 1

Ground truth

We have three speakers in this conversation!

Page 49: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Evaluation metric

q  Error count distance

q The difference between the estimated count and the

ground truth.

49

Page 50: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Benchmark results

50

Page 51: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Benchmark results

51

Phone’s position on the table does not matter much

Page 52: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Benchmark results

52

Phones on the table are better than inside the pockets

Page 53: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Benchmark results

53

Collaboration among multiple phones provide better results

Page 54: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Large scale crowdsourcing effort

q  120 users from university and industry contribute

109 audio clips of 1034 minutes in total.

54

Private indoor Public indoor Outdoor

Page 55: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Large scale crowdsourcing results

55

Sample number

Error count distance

Private indoor 40 1.07

Public indoor 44 1.35

Outdoor 25 1.83

The error count results in all environments are reasonable.

Page 56: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Computational latency

56

Latency (msec)

HTC EVO 4g

Samsung Galaxy S2

Samsung Galaxy S3

Google Nexus 4

Google Nexus 7

MFCC 42.90 36.71 24.41 22.86 23.14

Pitch 102.71 80.36 58.11 47.93 58.33

Count 175.16 150.47 89.01 83.53 70.23

Total 320.77 267.54 171.53 154.32 151.7 It takes less than 1 minute to process a 5-minute conversation.

Page 57: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Energy efficiency

57

Page 58: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Use case 1: Crowd estimation

58

Group 1 Group 2 Group 1

Group 2 Group 3

Page 59: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Use case 2: Social log

59

Home

Class

Meeting

Lunch

Research Research

Sleeping Research Dinner

Ph.D student life snapshot before UbiComp’13 submission

His paper is accepted!

Page 60: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Use case 3: Speaker count patterns

60

Different talks show very different speaker count patterns as time goes.

Page 61: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Conclusion

q  Smartphones can count the number of speakers

with reasonable accuracies in different

environments.

q  Crowd++ can enable different social sensing

applications.

61

Page 62: Crowd++: Unsupervised Speaker Count with Smartphoneseceweb1.rutgers.edu/~yyzhang/talks/ubicomp13-xu.pdfSamsung Galaxy S2 Samsung Galaxy S3 Google Nexus 4 Google Nexus 7 MFCC 42.90

Chenren Xu [email protected]

Thank you

62

Chenren Xu WINLAB/ECE

Rutgers University

Sugang Li WINLAB/ECE

Rutgers University

Gang Liu CRSS

UT Dallas

Yanyong Zhang WINLAB/ECE

Rutgers University

Emiliano Miluzzo AT&T Labs Research

Yih-Farn Chen AT&T Labs Research

Jun Li Interdigital

Communication

Bernhard Firner WINLAB/ECE

Rutgers University