Wei-Ta Chu 2010/9/30 Video Syntax Analysis 1 Multimedia Content Analysis, CSIE, CCU

Lecture 3 Video Syntax Analysis - National Chung Cheng


Page 1: Lecture 3 Video Syntax Analysis - National Chung Cheng

Wei-Ta Chu

2010/9/30

Video Syntax Analysis

Multimedia Content Analysis, CSIE, CCU

Page 2: Lecture 3 Video Syntax Analysis - National Chung Cheng

Types of Shot Change


Abrupt change (hard cut)
  A cut occurs in a single frame when stopping and restarting the camera
Gradual transition
  Fade-in: gradual increase in intensity starting from a black frame
  Fade-out: gradual decrease in intensity resulting in a black frame
  Dissolve: transition from the end of one clip to the beginning of another
  Wipe: one image is replaced by another with a distinct edge that forms a shape
  …

Page 3: Lecture 3 Video Syntax Analysis - National Chung Cheng

Examples of Shot Changes


Li and Lee. “Effective detection of various wipe transitions” IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, no. 6, pp. 663-673, 2007.

[Figure: example frame sequences showing a cut, a dissolve, and a wipe]

Page 4: Lecture 3 Video Syntax Analysis - National Chung Cheng

Examples of Fade


Cernekova, et al., “Information theory-based shot cut/fade detection and video summarization” IEEE Trans. on Circuits and Systems for Video Technology, vol. 16, no. 1, pp. 82-91, 2006.

[Figure: example frame sequences showing a fade-out and a fade-in]

Page 5: Lecture 3 Video Syntax Analysis - National Chung Cheng

Different Types of Wipe

Li and Lee. “Effective detection of various wipe transitions” IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, no. 6, pp. 663-673, 2007.

Video example: http://en.wikipedia.org/wiki/Wipe_%28transition%29

Page 6: Lecture 3 Video Syntax Analysis - National Chung Cheng

Detection Process


Input: Video
  1. Extract features
  2. Similarity calculation
  3. Boundary decision
Output: Shot 1, Shot 2, Shot 3, Shot 4

Page 7: Lecture 3 Video Syntax Analysis - National Chung Cheng

Features


Pixel difference
Statistical difference
Histograms
Compression differences
Edge
Motion

Page 8: Lecture 3 Video Syntax Analysis - National Chung Cheng

Pixel Difference


Count the number of pixels that change in value by more than some threshold.

May be sensitive to camera motion.
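The pixel-difference measure can be sketched as follows. This is a minimal illustration, assuming frames are lists of rows of grey values; the function names and both threshold values are hypothetical choices, not values from the lecture.

```python
def changed_pixel_fraction(frame_a, frame_b, pixel_threshold=20):
    """Fraction of pixels whose grey value changes by more than pixel_threshold."""
    total = 0
    changed = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def is_cut(frame_a, frame_b, pixel_threshold=20, fraction_threshold=0.5):
    """Declare a cut when enough pixels changed (both thresholds are assumptions)."""
    return changed_pixel_fraction(frame_a, frame_b, pixel_threshold) > fraction_threshold


dark = [[10, 12], [11, 9]]
bright = [[200, 12], [210, 220]]
print(changed_pixel_fraction(dark, bright))  # 3 of 4 pixels change: 0.75
```

Because every pixel is compared with the pixel at the same position, a camera pan shifts the whole image and can push the fraction over the threshold without any shot change.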

Page 9: Lecture 3 Video Syntax Analysis - National Chung Cheng

1. Pair-wise comparison


Compare the corresponding pixels in two frames.

Problems: sensitive to camera movement
  E.g. camera panning
Improvement: smoothing by a 3x3 window before comparison

Zhang, et al., “Automatic partitioning of full-motion video” Multimedia Systems Journal, vol. 1, pp. 10-28, 1993.

Page 10: Lecture 3 Video Syntax Analysis - National Chung Cheng

2. Histogram Comparison


Less sensitive to object motion, since it ignores the spatial changes in a frame.

H_i(j): the histogram value for the ith frame, where j is one of the G grey levels.
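The comparison formula on this slide did not survive extraction. A common choice, assumed here, is the sum of absolute bin differences between the histograms of two frames. The sketch below uses that assumption; the function names are hypothetical.

```python
def grey_histogram(frame, levels=256):
    """H(j): number of pixels with grey level j, for j in 0..levels-1."""
    hist = [0] * levels
    for row in frame:
        for value in row:
            hist[value] += 1
    return hist


def histogram_difference(frame_a, frame_b, levels=256):
    """Sum over j of |H_a(j) - H_b(j)| (assumed form of the measure)."""
    h_a = grey_histogram(frame_a, levels)
    h_b = grey_histogram(frame_b, levels)
    return sum(abs(a - b) for a, b in zip(h_a, h_b))


frame1 = [[0, 0], [255, 255]]
frame2 = [[255, 255], [0, 0]]  # same grey values, moved inside the frame
frame3 = [[0, 0], [0, 0]]      # different content
print(histogram_difference(frame1, frame2))  # 0
print(histogram_difference(frame1, frame3))  # 4
```

The first comparison shows why the measure tolerates object motion: rearranging pixels inside the frame leaves the histogram unchanged.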

Page 11: Lecture 3 Video Syntax Analysis - National Chung Cheng

2. Histogram Comparison–Example

Example video sequence

The intensity histogram of the first three frames

Page 12: Lecture 3 Video Syntax Analysis - National Chung Cheng

2. Histogram Comparison


Color histogram difference

p_i(r,g,b) is the number of pixels of color (r,g,b) in frame I_i, which has N pixels. Each color component is discretized to 2^B different values.
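The formula itself is missing from the extracted slide. The sketch below assumes the color histogram difference is the sum of absolute differences between the color counts of two frames, divided by N. Frames are assumed to be lists of rows of 8-bit (r, g, b) tuples, and all names are hypothetical.

```python
from collections import Counter


def color_histogram(frame, bits=2):
    """p(r,g,b): pixel counts after quantizing each 8-bit component to 2**bits values."""
    shift = 8 - bits
    return Counter(
        (r >> shift, g >> shift, b >> shift) for row in frame for (r, g, b) in row
    )


def color_histogram_difference(frame_a, frame_b, bits=2):
    """(1/N) * sum over colors of |p_a - p_b| (assumed normalization)."""
    p_a = color_histogram(frame_a, bits)
    p_b = color_histogram(frame_b, bits)
    n = sum(p_a.values())
    colors = set(p_a) | set(p_b)
    return sum(abs(p_a[c] - p_b[c]) for c in colors) / n


reddish = [[(255, 0, 0), (250, 5, 5)]]
mixed = [[(0, 0, 255), (250, 0, 0)]]
print(color_histogram_difference(reddish, reddish))  # 0.0
print(color_histogram_difference(reddish, mixed))    # 1.0
```

With this normalization the value ranges from 0 (identical histograms) to 2 (no color in common).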

Page 13: Lecture 3 Video Syntax Analysis - National Chung Cheng

3. Likelihood Ratio


Compare corresponding regions (blocks) in two successive frames based on second-order statistical characteristics of their intensity values.

Then a camera break can be declared whenever the total number of sample areas whose likelihood ratio exceeds the threshold is sufficiently large.

Raises the tolerance of slow and small object motion from frame to frame.

m_i: mean intensity value for a given region
S_i: variance for a given region
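The likelihood-ratio formula is not present in the extracted text. The sketch below assumes the form usually given for this test: the square of (S_i + S_{i+1})/2 + ((m_i - m_{i+1})/2)^2, divided by S_i * S_{i+1}. Blocks are assumed to be flat lists of intensities; the thresholds, the small floor `eps` for flat blocks, and the function names are hypothetical.

```python
def mean_and_variance(values):
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance


def likelihood_ratio(block_a, block_b, eps=1e-6):
    """Assumed form: ((S_a + S_b)/2 + ((m_a - m_b)/2)**2)**2 / (S_a * S_b)."""
    m_a, s_a = mean_and_variance(block_a)
    m_b, s_b = mean_and_variance(block_b)
    numerator = ((s_a + s_b) / 2 + ((m_a - m_b) / 2) ** 2) ** 2
    # eps avoids division by zero for perfectly flat blocks (an added assumption).
    return numerator / max(s_a * s_b, eps)


def is_camera_break(blocks_a, blocks_b, ratio_threshold=3.0, count_threshold=2):
    """Declare a break when enough corresponding blocks exceed the ratio threshold."""
    changed = sum(
        1 for a, b in zip(blocks_a, blocks_b)
        if likelihood_ratio(a, b) > ratio_threshold
    )
    return changed >= count_threshold


block = [10, 12, 14, 16]
brighter = [110, 112, 114, 116]
print(likelihood_ratio(block, block))  # 1.0 for statistically identical blocks
print(is_camera_break([block, block, block], [brighter, brighter, block]))  # True
```

A small object moving inside a block changes its mean and variance only slightly, so the ratio stays near 1 and the block is not counted.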

Page 14: Lecture 3 Video Syntax Analysis - National Chung Cheng

4. Edge Change Ratio


Zabih, et al., “A feature-based algorithm for detecting and classifying scene breaks” Proc. of ACM Multimedia, pp. 189-200, 1995.

Page 15: Lecture 3 Video Syntax Analysis - National Chung Cheng

4. Edge Change Ratio


Page 16: Lecture 3 Video Syntax Analysis - National Chung Cheng

4. Edge Change Ratio

Edge change ratio

Page 17: Lecture 3 Video Syntax Analysis - National Chung Cheng

5. Motion Vectors

Use the direction of motion prediction as a cue for shot change detection.

Pei, et al., “Scene-effect detection and insertion MPEG encoding scheme for video browsing and error concealment” IEEE Trans. on Multimedia, vol. 7, no. 4, pp. 606-614, 2005.

Page 18: Lecture 3 Video Syntax Analysis - National Chung Cheng

5. Motion Vectors


Using motion vector information to filter out false positives

Zhang, et al., “Automatic partitioning of full-motion video” Multimedia Systems Journal, vol. 1, pp. 10-28, 1993.

Page 19: Lecture 3 Video Syntax Analysis - National Chung Cheng

6. Differences in DCT domain


Discrete Cosine Transform (DCT) coefficients
  1. Select a subset of blocks
  2. Select a subset of the DCT coefficients of these blocks
  3. Concatenate the selected coefficients of the selected blocks as a vector
  4. Calculate the similarity of two coefficient vectors
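The four steps above can be sketched as follows. The sketch assumes each frame is available as a list of blocks, each block a flat list of DCT coefficients, and it uses one minus the normalized inner product as the difference. The normalization, the handling of all-zero vectors, and the function names are assumptions.

```python
import math


def coefficient_vector(blocks, block_indices, coeff_indices):
    """Steps 1-3: pick blocks, pick coefficients, concatenate into one vector."""
    return [blocks[b][c] for b in block_indices for c in coeff_indices]


def dct_difference(vec_a, vec_b):
    """Step 4: 1 - |a . b| / (|a| |b|); 0 means the same direction."""
    dot = sum(a * b for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(a * a for a in vec_a))
    norm_b = math.sqrt(sum(b * b for b in vec_b))
    if norm_a == 0 or norm_b == 0:
        # Degenerate (e.g. black) frames: two empty vectors count as identical.
        return 0.0 if norm_a == norm_b else 1.0
    return 1.0 - abs(dot) / (norm_a * norm_b)


# Toy frames with two blocks of four coefficients each.
frame_a = [[8.0, 1.0, 0.0, 0.0], [6.0, 0.0, 2.0, 0.0]]
frame_b = [[16.0, 2.0, 0.0, 0.0], [12.0, 0.0, 4.0, 0.0]]  # frame_a scaled by 2
frame_c = [[0.0, 9.0, 0.0, 0.0], [0.0, 0.0, 0.0, 5.0]]    # different content

v_a = coefficient_vector(frame_a, [0, 1], [0, 1, 2])
v_b = coefficient_vector(frame_b, [0, 1], [0, 1, 2])
v_c = coefficient_vector(frame_c, [0, 1], [0, 1, 2])
print(round(dct_difference(v_a, v_b), 6))  # 0.0
print(dct_difference(v_a, v_c) > 0.5)      # True
```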

Arman, et al., “Image processing on encoded video sequences” Multimedia Systems Journal, vol. 1, no. 5, pp. 211-219, 1994.

Page 20: Lecture 3 Video Syntax Analysis - National Chung Cheng

Gradual Transition Detection


Cuts or abrupt change

Gradual transition

Page 21: Lecture 3 Video Syntax Analysis - National Chung Cheng

1. Twin-Comparison Approach


Zhang, et al., “Automatic partitioning of full-motion video” Multimedia Systems Journal, vol.1, pp. 10-28, 1993.

Page 22: Lecture 3 Video Syntax Analysis - National Chung Cheng

2. Edge Change Ratio

Lienhart, R., “Comparison of automatic shot boundary detection algorithms” Proc. of SPIE Storage and Retrieval for Image and Video Databases VII, vol. 3656, pp. 290-301, 1999.

Page 23: Lecture 3 Video Syntax Analysis - National Chung Cheng

2. Edge Change Ratio

Page 24: Lecture 3 Video Syntax Analysis - National Chung Cheng

3. Characterizing a Wipe Transition


Page 25: Lecture 3 Video Syntax Analysis - National Chung Cheng

Evaluation


Precision: the percentage of retrieved items that are desired items.

Recall: the percentage of desired items that are retrieved.

$$\text{Precision} = \frac{\#\ \text{correctly retrieved items}}{\#\ \text{all retrieved items}} = \frac{\#\ \text{correctly retrieved items}}{\#\ \text{correctly retrieved items} + \#\ \text{falsely retrieved items}}$$

$$\text{Recall} = \frac{\#\ \text{correctly retrieved items}}{\#\ \text{all relevant items}} = \frac{\#\ \text{correctly retrieved items}}{\#\ \text{correctly retrieved items} + \#\ \text{items that are not retrieved}}$$
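As a small worked example of these two definitions, the sketch below scores a set of detected shot boundaries against ground truth. Representing boundaries as sets of frame indices and requiring exact matches are simplifying assumptions; the function name is hypothetical.

```python
def precision_recall(detected, ground_truth):
    """Return (precision, recall) for two sets of boundary frame indices."""
    correct = len(detected & ground_truth)   # correctly retrieved items
    false_hits = len(detected - ground_truth)  # falsely retrieved items
    missed = len(ground_truth - detected)    # items that are not retrieved
    precision = correct / (correct + false_hits) if detected else 0.0
    recall = correct / (correct + missed) if ground_truth else 0.0
    return precision, recall


detected = {120, 340, 500, 610}
ground_truth = {120, 340, 720}
precision, recall = precision_recall(detected, ground_truth)
print(precision)         # 0.5  (2 of 4 detections are correct)
print(round(recall, 3))  # 0.667 (2 of 3 true boundaries are found)
```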

Page 26: Lecture 3 Video Syntax Analysis - National Chung Cheng

Evaluation–Other Terms


Miss: # items that are not retrieved
True positive (TP): # correctly retrieved items
False positive (FP): # falsely retrieved items
True negative (TN): # correctly missed items
False negative (FN): # items that are not retrieved

                      Actual positive   Actual negative
Predicted positive    TP                FP
Predicted negative    FN                TN

Page 27: Lecture 3 Video Syntax Analysis - National Chung Cheng

Evaluation


                      Actual positive   Actual negative
Predicted positive    TP                FP
Predicted negative    FN                TN

[Figure: Venn diagram of the detected (retrieved) set and the relevant (ground truth) set, with regions labeled TP, FP, FN, and TN]

Page 28: Lecture 3 Video Syntax Analysis - National Chung Cheng

Relationship between Precision & Recall


Precision-Recall (PR) curve

Page 29: Lecture 3 Video Syntax Analysis - National Chung Cheng

Relationship between True Positive and False Positive

Receiver Operating Characteristic (ROC) curve

Page 30: Lecture 3 Video Syntax Analysis - National Chung Cheng

Using PR or ROC Curves?


ROC curves can present an overly optimistic view of an algorithm’s performance if there is a large skew in the class distribution.

When the number of true negative examples greatly exceeds the number of positive examples, a large change in the number of false positives leads to only a small change in the false positive rate.

Precision compares false positives to true positives rather than to true negatives, and so better captures the algorithm’s performance.
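The effect of class skew can be checked with a little arithmetic. The numbers below are invented for illustration: 100 true shot boundaries among 100,000 frame positions, and two hypothetical detectors that find the same 90 boundaries but raise 100 and 1,000 false alarms respectively.

```python
def rates(tp, fp, fn, tn):
    """True positive rate, false positive rate and precision from a confusion matrix."""
    return {
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
        "precision": tp / (tp + fp),
    }


# Hypothetical detectors on the same skewed data (100 positives, 99,900 negatives).
detector_a = rates(tp=90, fp=100, fn=10, tn=99800)
detector_b = rates(tp=90, fp=1000, fn=10, tn=98900)

for name, r in (("A", detector_a), ("B", detector_b)):
    print(name, round(r["tpr"], 3), round(r["fpr"], 4), round(r["precision"], 3))
# A 0.9 0.001 0.474
# B 0.9 0.01 0.083
```

Ten times as many false alarms move the ROC point by less than 0.01 along the false-positive axis, while precision drops from about 0.47 to about 0.08.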

Davis, et al., “The relationship between precision-recall and ROC curves” Proc. of International Conference on Machine Learning, pp. 233-240, 2006.

Page 31: Lecture 3 Video Syntax Analysis - National Chung Cheng

Comparison of Shot Boundary Detection Techniques

Methods
  Histograms, region histograms, running histograms, motion-compensated pixel differences, DCT coefficient differences

Evaluation data

Video type    # Frames   Cuts   Gradual transitions
TV              133204    831                    42
News             81595    293                    99
Movie           142507    564                    95
Commercial       51733    755                   254
Misc.            10706     64                    16
Total           419745   2507                   506

Page 32: Lecture 3 Video Syntax Analysis - National Chung Cheng

Methods Compared


Histogram (64-bin gray-level) difference, single threshold
Region (block) histogram
  16 blocks, 64-bin gray-scale histograms, a difference threshold for each block, and a count threshold for changed blocks
Running histogram (twin method)
  64-bin gray-scale histogram for each frame, twin thresholds
  Compute motion vectors. If there is excessive motion, reject gradual changes
Motion-compensated pixel difference
  12 blocks per frame, a motion vector for each block
  Compute average residual errors; if larger than a high threshold, a cut is detected
  Use cumulative errors to detect gradual changes (similar to above)
  Use motion vectors to reject false gradual changes
DCT difference
  Concatenate 15 coefficients from the same locations in different blocks to form a vector
  Compute (1 - inner product of two vectors from consecutive frames)
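The twin-threshold idea used by the running-histogram method can be sketched as below. This is a simplified reading, not the exact algorithm from the cited papers: a difference above the high threshold is a cut, and a run of differences above the low threshold is a gradual transition if their accumulated sum reaches the high threshold. Accumulating consecutive differences, the threshold values, and the function name are assumptions.

```python
def twin_comparison(differences, t_high, t_low):
    """Return (cuts, gradual): cut indices and (start, end) index pairs."""
    cuts, gradual = [], []
    start = None
    accumulated = 0.0
    for i, d in enumerate(differences):
        if d >= t_high:
            cuts.append(i)
            start, accumulated = None, 0.0    # a cut cancels any open candidate
        elif d >= t_low:
            if start is None:
                start, accumulated = i, 0.0   # potential start of a gradual change
            accumulated += d
        else:
            if start is not None and accumulated >= t_high:
                gradual.append((start, i - 1))
            start, accumulated = None, 0.0
    if start is not None and accumulated >= t_high:
        gradual.append((start, len(differences) - 1))
    return cuts, gradual


frame_differences = [1, 2, 50, 1, 6, 7, 8, 6, 1, 2]
print(twin_comparison(frame_differences, t_high=20, t_low=5))
# ([2], [(4, 7)])
```

The motion-vector check listed above would be applied afterwards to reject candidates caused by camera motion; it is not part of this sketch.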

Page 33: Lecture 3 Video Syntax Analysis - National Chung Cheng

PR Curve for TV program


Page 34: Lecture 3 Video Syntax Analysis - National Chung Cheng

PR Curve for News program


Page 35: Lecture 3 Video Syntax Analysis - National Chung Cheng

PR Curve for Movie Videos


Page 36: Lecture 3 Video Syntax Analysis - National Chung Cheng

PR Curve for Commercials


Page 37: Lecture 3 Video Syntax Analysis - National Chung Cheng

PR Curve for All Data


Page 38: Lecture 3 Video Syntax Analysis - National Chung Cheng

PR Curve for All Data–Cut Only


Page 39: Lecture 3 Video Syntax Analysis - National Chung Cheng

Observations


Histogram-based method is consistent
  Produced the first or second best precision
  Simple and straightforward
Region algorithm seems to be the best
  Where recall is not the highest priority
Running algorithm seems to be the best
  Where recall is important
  Motion vectors help reduce false positives
DCT is the worst
  Large number of false positives in black frames

Page 40: Lecture 3 Video Syntax Analysis - National Chung Cheng

References


J.S. Boreczky, et al., "Comparison of video shot boundary detection techniques" Proc. of SPIE Conference on Storage and Retrieval for Image and Video Databases, vol. 2670, 1996. (must read)

R. Lienhart, "Comparison of automatic shot boundary detection algorithms" Proc. of SPIE Storage and Retrieval for Image and Video Databases VII, vol. 3656, pp. 290-301, 1999.

J. Yuan, et al., "A formal study of shot boundary detection" IEEE Trans. on Circuits and Systems for Video Technology, vol. 17, no. 2, pp. 168-186, 2007.

A. Hanjalic, "Shot-boundary detection: unraveled or resolved?" IEEE Trans. on Circuits and Systems for Video Technology, vol. 12, no. 2, pp. 90-105, 2002.

Page 41: Lecture 3 Video Syntax Analysis - National Chung Cheng

Edge

Page 42: Lecture 3 Video Syntax Analysis - National Chung Cheng

Edge

An edge is a set of connected pixels that lie on the boundary between two regions.

Chapter 10 of “Digital Image Processing” by R.C. Gonzalez and R.E. Woods, Prentice Hall, 2nd edition, 2001

Page 43: Lecture 3 Video Syntax Analysis - National Chung Cheng

Edge


Page 44: Lecture 3 Video Syntax Analysis - National Chung Cheng

Gradient Operators

Roberts cross-gradient operators:

Prewitt operators:

Sobel operators:
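The operator masks were images on the original slide and are not in the extracted text. As an illustration, the sketch below applies the standard 3x3 Sobel masks to a grey-level image stored as a list of rows. Leaving border pixels at zero and the function name are assumptions.

```python
import math

# Standard 3x3 Sobel masks for the horizontal and vertical derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude at each interior pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# A vertical step edge between columns 2 and 3.
step = [[0, 0, 0, 100, 100] for _ in range(4)]
print(sobel_magnitude(step)[1])  # [0.0, 0.0, 400.0, 400.0, 0.0]
```

Thresholding the magnitude image gives a binary edge map of the kind used by the edge change ratio.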

Page 45: Lecture 3 Video Syntax Analysis - National Chung Cheng

Edge Examples


Page 46: Lecture 3 Video Syntax Analysis - National Chung Cheng

Edge Examples–after smoothing


Page 47: Lecture 3 Video Syntax Analysis - National Chung Cheng

Edge Examples


Page 48: Lecture 3 Video Syntax Analysis - National Chung Cheng

Canny Edge Detectors

Step 1: the image is smoothed by Gaussian convolution.
Step 2: a 2D first derivative operator is applied to the smoothed image.
Step 3: non-maximal suppression.
  Edges give rise to ridges in the gradient magnitude image. The algorithm tracks along the top of these ridges and sets to zero all pixels that are not actually on the ridge.

http://homepages.inf.ed.ac.uk/rbf/HIPR2/canny.htm

Page 49: Lecture 3 Video Syntax Analysis - National Chung Cheng

Very Brief Introduction of Discrete Cosine Transform

Page 50: Lecture 3 Video Syntax Analysis - National Chung Cheng

Spatial Frequency and DCT


Page 51: Lecture 3 Video Syntax Analysis - National Chung Cheng

Definition of DCT


Page 52: Lecture 3 Video Syntax Analysis - National Chung Cheng

2D DCT


Page 53: Lecture 3 Video Syntax Analysis - National Chung Cheng

1D DCT

Page 54: Lecture 3 Video Syntax Analysis - National Chung Cheng

DCT Basis


Page 55: Lecture 3 Video Syntax Analysis - National Chung Cheng

DCT Basis


Page 56: Lecture 3 Video Syntax Analysis - National Chung Cheng

Example


Page 57: Lecture 3 Video Syntax Analysis - National Chung Cheng

Example


Page 58: Lecture 3 Video Syntax Analysis - National Chung Cheng

Example


Page 59: Lecture 3 Video Syntax Analysis - National Chung Cheng

Example


Page 60: Lecture 3 Video Syntax Analysis - National Chung Cheng

Discrete Cosine Transform


DCT converts a block of pixels into a block of transform coefficients, which represent the spatial frequency.

Each coefficient is a weight applied to an appropriate basis function.

Any gray-scale 8x8 pixel block can be fully represented by a weighted sum of these 64 basis functions.

[Figure: the 64 DCT basis functions, with horizontal frequency increasing along one axis, vertical frequency increasing along the other, and the “DC” basis function marked]
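The claim that an 8x8 block is fully represented by 64 weights can be checked numerically. The sketch below implements a direct (slow) orthonormal 2D DCT-II and its inverse. The definition on the original slides was an image, so the orthonormal scaling used here is an assumption, and the function names are hypothetical.

```python
import math

N = 8  # block size


def alpha(k):
    """Orthonormal scaling factor (assumed normalization)."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def basis(x, y, u, v):
    """Value at pixel (x, y) of the basis function with frequencies (u, v)."""
    return (math.cos((2 * x + 1) * u * math.pi / (2 * N))
            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))


def dct_2d(block):
    """Pixels -> coefficients: each coefficient is the weight of one basis function."""
    return [[alpha(u) * alpha(v) * sum(block[x][y] * basis(x, y, u, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Coefficients -> pixels: weighted sum of the 64 basis functions."""
    return [[sum(alpha(u) * alpha(v) * coeffs[u][v] * basis(x, y, u, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


flat = [[100] * N for _ in range(N)]
ramp = [[x * 8 + y for y in range(N)] for x in range(N)]

print(round(dct_2d(flat)[0][0], 6))  # 800.0: a flat block has only a DC component
rebuilt = idct_2d(dct_2d(ramp))
print(max(abs(rebuilt[x][y] - ramp[x][y]) for x in range(N) for y in range(N)) < 1e-9)  # True
```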

Page 61: Lecture 3 Video Syntax Analysis - National Chung Cheng

Intra-Frame Encoding (JPEG Compression)


Page 62: Lecture 3 Video Syntax Analysis - National Chung Cheng

Scene Transition Graph

Yeung, et al. “Segmentation of video by clustering and graph analysis” Computer Vision and Image Understanding, vol. 71, no. 1, pp. 94-109, 1998.

Page 63: Lecture 3 Video Syntax Analysis - National Chung Cheng

Observations


Shots in a scene are often repetitive. We are able to classify shots by grouping shots of similar visual contents.

Often, a scene is made up of temporally adjacent shots indicating their interrelationships.

Page 64: Lecture 3 Video Syntax Analysis - National Chung Cheng

Similarity of Video Shots


D(.,.) measures the dissimilarity between two image frames.

Page 65: Lecture 3 Video Syntax Analysis - National Chung Cheng

Similarity of Video Shots

Dissimilarity based on color histogram intersection

Dissimilarity based on luminance projection

Yeung and Liu, “Efficient matching and clustering of video shots” Proc. of IEEE International Conference on Image Processing, vol. 1, pp. 338-341, 1995.

Page 66: Lecture 3 Video Syntax Analysis - National Chung Cheng

Representative Image Set for a Video Shot

Selection of the representative set is achieved by nonlinear temporal sampling.

Page 67: Lecture 3 Video Syntax Analysis - National Chung Cheng

Representative Image Set for a Video Shot

Only 2 to 5% of frames are needed in comparison to achieve good matching results.

In addition to temporal subsampling, spatial subsampling can also be used to improve matching efficiency.

Page 68: Lecture 3 Video Syntax Analysis - National Chung Cheng

Clustering of Video Shots


Shots in the same cluster are similar.

Any shot outside of the cluster must have a dissimilarity greater than the dissimilarity between any two shots in the cluster.

C_i: the ith cluster

Page 69: Lecture 3 Video Syntax Analysis - National Chung Cheng

Clustering of Video Shots


Dissimilarity between two clusters:

Use the shot pair, with one shot from each of the two clusters, that has the largest dissimilarity value.

The dissimilarity between two clusters should be updated at each iteration.
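Taking the largest pairwise dissimilarity as the cluster distance corresponds to complete-link agglomerative clustering. The sketch below is one straightforward way to run it, assuming a precomputed symmetric shot dissimilarity matrix and a stopping threshold `delta`; the merge loop and function name are illustrative and are not taken from the cited paper.

```python
def complete_link_clustering(dissimilarity, delta):
    """Merge the closest clusters until the smallest cluster distance reaches delta."""
    clusters = [[i] for i in range(len(dissimilarity))]

    def cluster_distance(a, b):
        # Largest dissimilarity over all shot pairs with one shot in each cluster.
        return max(dissimilarity[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q])  # recomputed each iteration
                if best is None or d < best[0]:
                    best = (d, p, q)
        if best[0] >= delta:
            break
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical dissimilarities between four shots.
D = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.90, 0.20, 0.00],
]
print(complete_link_clustering(D, delta=0.5))   # [[0, 1], [2, 3]]
print(complete_link_clustering(D, delta=0.15))  # [[0, 1], [2], [3]]
```

Because a merge happens only when the largest pairwise dissimilarity is below `delta`, every pair of shots inside a cluster stays below `delta`.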

Page 70: Lecture 3 Video Syntax Analysis - National Chung Cheng

Clustering of Video Shots


Page 71: Lecture 3 Video Syntax Analysis - National Chung Cheng

Time-Constrained Clustering

Two shots that are far apart in time, even if they share similar visual contents, potentially represent different contents or occur in different scenes.

Temporal distance between two shots: the distance in number of frames from the end of the earlier shot to the beginning of the later one.
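One way to apply the time constraint, assumed here, is to set the dissimilarity of two shots to infinity whenever their temporal distance exceeds a window T, so that they can never be merged. Shots are represented as hypothetical (first_frame, last_frame) pairs, and the 30 frames per second used in the example is also an assumption.

```python
import math


def temporal_distance(shot_a, shot_b):
    """Frames from the end of the earlier shot to the beginning of the later one."""
    earlier, later = sorted([shot_a, shot_b])
    return max(0, later[0] - earlier[1])


def time_constrained_dissimilarity(visual_dissimilarity, shot_a, shot_b, window):
    """Keep the visual dissimilarity inside the window, otherwise forbid merging."""
    if temporal_distance(shot_a, shot_b) > window:
        return math.inf
    return visual_dissimilarity


b1, b2, b3 = (0, 100), (200, 300), (900, 1000)
window = 20 * 30  # T = 20 s at an assumed 30 frames per second
print(time_constrained_dissimilarity(0.1, b1, b2, window))  # 0.1
print(time_constrained_dissimilarity(0.1, b1, b3, window))  # inf
```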

Page 72: Lecture 3 Video Syntax Analysis - National Chung Cheng

Scene Transition Graph


A scene transition graph (STG) is a directed graph G = (V, E, F):
  V: each node represents a cluster of shots.
  E: a directed edge is drawn from node U to node W if there is a shot represented by node U that immediately precedes any shot represented by node W.
  F: a mapping that partitions the set of shots into clusters.

The STG is able to represent compactly the structure of shots and the temporal flow of the story for many video programs.

Page 73: Lecture 3 Video Syntax Analysis - National Chung Cheng

Example of STG


3 scenes of 9 shots

Sample clustering results

Scene transition graph

Page 74: Lecture 3 Video Syntax Analysis - National Chung Cheng

Cut Edges


An edge is a cut edge if removing it results in two disconnected graphs.

Each partitioned STG Gi represents the interactions of shots in a story unit.
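Building the graph and splitting it at cut edges can be sketched as follows. The sketch takes the cluster label of each shot in temporal order, tests each directed edge by removing it and checking connectivity while ignoring edge direction, and returns the shots of each remaining component as a story unit. The connectivity test, the cluster labels, and the function names are assumptions.

```python
def build_stg(shot_clusters):
    """Directed edges (U, W): some shot in U immediately precedes a shot in W."""
    return {(u, w) for u, w in zip(shot_clusters, shot_clusters[1:]) if u != w}


def connected_components(nodes, edges):
    """Components of the graph when edge direction is ignored."""
    neighbours = {n: set() for n in nodes}
    for u, w in edges:
        neighbours[u].add(w)
        neighbours[w].add(u)
    seen, components = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, component = [n], set()
        while stack:
            current = stack.pop()
            if current not in component:
                component.add(current)
                stack.extend(neighbours[current] - component)
        seen |= component
        components.append(component)
    return components


def story_units(shot_names, shot_clusters):
    nodes = list(dict.fromkeys(shot_clusters))  # unique labels, temporal order
    edges = build_stg(shot_clusters)
    base = len(connected_components(nodes, edges))
    cut_edges = {e for e in edges
                 if len(connected_components(nodes, edges - {e})) > base}
    units = connected_components(nodes, edges - cut_edges)
    return [[s for s, c in zip(shot_names, shot_clusters) if c in unit]
            for unit in units]


shots = ["B1", "A1", "B2", "A2", "B3", "A3", "B4", "C1", "D1"]
labels = ["Ba", "A", "Ba", "A", "Bb", "A", "Bb", "C", "D"]  # hypothetical cluster labels
print(story_units(shots, labels))
# [['B1', 'A1', 'B2', 'A2', 'B3', 'A3', 'B4'], ['C1'], ['D1']]
```

Edges that have a reverse edge, such as those between alternating A and B shots, are never cut edges, which is why repeated shot patterns stay in one story unit.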

Page 75: Lecture 3 Video Syntax Analysis - National Chung Cheng

STG After Time Confining and Cut Edges Finding

Page 76: Lecture 3 Video Syntax Analysis - National Chung Cheng

Framework

1. Shot segmentation
2. Time-constrained clustering
3. Building of scene transition graph
4. Scene segmentation

Page 77: Lecture 3 Video Syntax Analysis - National Chung Cheng

Influences of Parameters


Without knowledge of how long each individual scene lasts, T cannot be approximated well.
  If T is too large, shots from different scenes are clustered together.
  If T is too small, shots in the same scene may be separated into different scenes.

It’s less detrimental to have several story units represent a scene than to have one story unit represent several scenes.

Page 78: Lecture 3 Video Syntax Analysis - National Chung Cheng

Influences of Time Constraints

T = 20s. dt(B1,B3) > T

Clustering results are {B1,B2},{A1,A2,A3},{B3,B4},{C1},{D1}

Story unit results are {B1,A1,B2,A2,B3,A3,B4},{C1},{D1}

[Figure: STG with nodes {B1,B2}, {A1,A2,A3}, {B3,B4}, {C1}, {D1}]

{Bi} are not clustered into one cluster because there is at least one pair of shots, one from each cluster, that has a temporal distance dt > T.

Page 79: Lecture 3 Video Syntax Analysis - National Chung Cheng

Influences of Time Constraints

T = 20s.

Clustering results are {B1},{B2,B3},{A1,A2,A3},{B4},{C1},{D1}

Story unit results are {B1},{A1,B2,A2,B3,A3},{B4},{C1},{D1}

[Figure: STG with nodes {B1}, {A1,A2,A3}, {B2,B3}, {B4}, {C1}, {D1}]

Page 80: Lecture 3 Video Syntax Analysis - National Chung Cheng

Refined Analysis


Make the time window more elastic.
  Compute the duration of each story unit and adjust.

Given a story unit, examine the next story unit by relaxing the temporal windows and reclustering the shots in these two units.
  If there exists at least one new cluster that contains shots from both units, the two story units are merged into one.

Page 81: Lecture 3 Video Syntax Analysis - National Chung Cheng

Refined Analysis


Page 82: Lecture 3 Video Syntax Analysis - National Chung Cheng

Example


{B1,B2},{A1,A2,A3},{B3,B4},{C1},{D1}

{B1},{B2,B3},{A1,A2,A3},{B4},{C1},{D1}

Page 83: Lecture 3 Video Syntax Analysis - National Chung Cheng

Results


STG constructed from the sitcom “Friends”. There are 35575 frames, each at a spatial resolution of 320x240. There are 313 shots.

Page 84: Lecture 3 Video Syntax Analysis - National Chung Cheng

Results


Time-constrained clustering of video shots is able to identify individual story units.

The resulting STG permits rapid nonlinear browsing of long video programs.

Page 85: Lecture 3 Video Syntax Analysis - National Chung Cheng

Variations of Clustering Parameters

Smaller delta values result in more clusters and thus more story units. Users often prefer over-segmentation to under-segmentation.

Page 86: Lecture 3 Video Syntax Analysis - National Chung Cheng

Refining the Segmentation Results


The first two story units in Scene 1 are merged into one. The number of story units in Scene 6 is reduced from 4 to 2.

Page 87: Lecture 3 Video Syntax Analysis - National Chung Cheng

Conclusion


Analysis based on time-constrained clustering and scene transition graph analysis has contributed to the extraction of story units.

The building of story structure provides nonlinear access to video contents.

Identification, integration, and application of domain-dependent and semantic features tend to improve segmentation accuracy.