RTSP 2017, EURASIP Tutorial Day, July 10, 2017, Bucharest

4 CAMERAS Unsupervised video orchestration based on Aesthetic features

Alessandro Neri, Federico Colangelo, Federica Battisti, Marco Carli



Outline

01 OUTLINE
02 MOTIVATION
03 VIDEO AESTHETICS
04 MULTI-OBJECTIVE OPTIMIZATION
05 EXPERIMENTAL RESULTS
06 CONCLUSION


Problem statement

Input: N synchronized cameras, shooting the same event E

Output: Editing script for the event E (e.g. from second 2 to 4 use camera 3)
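One possible way to represent such an editing script is sketched below. The `Cut` class and the `validate_script` helper are hypothetical names introduced for illustration only; the slides do not specify a data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cut:
    """One entry of the editing script: show `camera` from `start` to `end` (seconds)."""
    start: float
    end: float
    camera: int  # camera index in 1..N


def validate_script(script, n_cameras, event_duration):
    """Check that the cuts tile [0, event_duration] with no gaps or overlaps
    and that every cut refers to an existing camera."""
    t = 0.0
    for cut in script:
        if cut.start != t or cut.end <= cut.start:
            return False
        if not 1 <= cut.camera <= n_cameras:
            return False
        t = cut.end
    return t == event_duration


# "From second 2 to 4 use camera 3" is the second entry of this script.
script = [Cut(0, 2, 1), Cut(2, 4, 3), Cut(4, 6, 2)]
print(validate_script(script, n_cameras=4, event_duration=6))
```

Storing cuts as contiguous intervals makes it easy to check that exactly one camera is on air at every instant of the event.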


Approach

Key Idea
Combine multiple camera contributions based on their aesthetic value

Video Editing
• Automatic cut based on aesthetic criteria
• Dynamic aspects exploiting the temporal information of the video


Frame-level aesthetic features

1. Simplicity: uncluttered images have more aesthetic value
2. Colorfulness: color distribution
3. Sharpness
4. Pattern: texture and shapes
5. Composition: rule of thirds
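To make two of these features concrete, here is a minimal sketch of a sharpness and a colorfulness measure. These are common stand-ins (variance of the Laplacian, and the Hasler-Süsstrunk colorfulness metric), not necessarily the measures used by the authors. The function names and the plain-list image representation are assumptions made for the example.

```python
from statistics import mean, pvariance


def sharpness(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels of a
    grayscale image given as a list of rows (at least 3x3 pixels)."""
    h, w = len(gray), len(gray[0])
    lap = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(lap)


def colorfulness(rgb):
    """Hasler-Suesstrunk colorfulness of an image given as a flat list of
    (R, G, B) tuples: spread plus 0.3 times the mean of the opponent channels."""
    rg = [r - g for r, g, b in rgb]
    yb = [0.5 * (r + g) - b for r, g, b in rgb]
    sigma = (pvariance(rg) + pvariance(yb)) ** 0.5
    mu = (mean(rg) ** 2 + mean(yb) ** 2) ** 0.5
    return sigma + 0.3 * mu
```

A uniformly gray frame scores zero on both measures, while a high-contrast or strongly colored frame scores higher.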


AESTHETIC SCORE

Use of a bank of classifiers, each tuned to a specific feature group

Aesthetic value should be evaluated taking the context into account
1. Train classifiers per type of content
2. Apply classifiers

CUHKPQ dataset
• 17690 images
• 7 categories of subjects (animals, architecture, human, landscape, night, plant, static)
• Rated as high or low quality based on subjective experiments

[Diagram: five parallel branches, each consisting of Feature Extraction #k, SVM #k and Block Averaging, for k = 1 to 5]
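The block-averaging stage of the diagram can be sketched as follows, assuming each SVM outputs one decision value per frame. How the five branches are merged into a single score is not stated on the slide; the plain mean used in `aesthetic_score` is an assumption, and both function names are hypothetical.

```python
from statistics import mean


def block_average(frame_scores, block_len):
    """Average per-frame scores over consecutive blocks of `block_len` frames.
    A shorter trailing block is averaged over the frames it contains."""
    return [
        mean(frame_scores[i:i + block_len])
        for i in range(0, len(frame_scores), block_len)
    ]


def aesthetic_score(per_classifier_scores, block_len):
    """Combine the block-averaged outputs of the classifier bank.
    Taking the plain mean across classifiers is an assumption."""
    averaged = [block_average(s, block_len) for s in per_classifier_scores]
    return [mean(block) for block in zip(*averaged)]


# Hypothetical decision values from two of the five SVMs, four frames each.
scores = [[1.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
print(aesthetic_score(scores, block_len=2))
```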


Orchestration

1. Partition the timeline into fixed time-length intervals (e.g. 2 sec)
   Problem: trade-off between dynamism and user confusion

2. For each interval, select the camera with the highest aesthetic value
   What if there is a tie between cameras?

3. Resolve ties based on the temporal information

$$\mathrm{TI} = \max_{\text{time}} \left\{ \sigma_{\text{space}} \left[ F_n(i,j) - F_{n-1}(i,j) \right] \right\}$$

   Key idea: more dynamic shots tend to be less boring for the viewers
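The selection rule and the TI tie-break can be sketched as below. Frames are represented as lists of rows of luminance values, and the function names are hypothetical. The sketch treats only exactly equal scores as ties, whereas a practical system would probably use a tolerance.

```python
from statistics import pstdev


def temporal_information(frames):
    """TI of a shot: the maximum over time of the spatial standard deviation
    of the pixel-wise difference between consecutive frames."""
    ti = 0.0
    for prev, cur in zip(frames, frames[1:]):
        diff = [c - p
                for row_p, row_c in zip(prev, cur)
                for p, c in zip(row_p, row_c)]
        ti = max(ti, pstdev(diff))
    return ti


def select_camera(aesthetic, ti):
    """Pick the camera index with the highest aesthetic score for one
    interval; ties are resolved in favour of the higher TI."""
    return max(range(len(aesthetic)), key=lambda c: (aesthetic[c], ti[c]))
```

For example, with scores `[0.8, 0.8, 0.5]` and TI values `[1.0, 3.0, 9.0]`, the second camera is selected: it ties with the first on aesthetics and has the higher TI.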


Results

[Video panels: Random Orchestration; Aesthetic Criterion]

                      Randomly orchestrated   Aesthetic-criterion based orchestration
MOS                   3                       4.1
Confidence interval   0.7                     0.5


MULTI-OBJECTIVE ORCHESTRATION

Problem:

• The aesthetic score of a shot can consistently outclass the others
  ▪ E.g. the more expert the operator … the better the view of the scene …
  ▪ The video can become too static, favoring a single camera
• Can we make temporal segmentation dynamic?
• Can we leverage traditional video editing theory?

Use multi-objective optimization and introduce a second function to penalize camera re-use

Multi-objective optimization through a genetic algorithm


Cutting Patterns

CONVENTIONAL

1. Wide Shot
2. Medium Shot
3. Close-Up

Begins with the wide shot, then cuts to a medium shot and finally to a close-up, working closer towards the subject or character.

Professional directors use a combination of different shot types to keep the audience interested.


Cutting Patterns

REVEAL

1. Close-Up
2. Medium Shot
3. Wide Shot

Start with a tight shot and then use progressively wider shots to supply context, with a variety of angles obtained by moving the camera around the subject.


Video editing

To evaluate the effectiveness of the editing with respect to the storytelling, a set of features is extracted from the shot sequence:

• shot type
• camera angle
  ▪ Eye-level angle: the point of view is on the same level as the subject
  ▪ High angle: the camera looks down upon the subject
  ▪ Low angle: the camera looks up at the subject
  ▪ Bird’s-eye view: the camera provides an elevated view of the subject from above
  ▪ Worm’s-eye view: the camera gives a view of the subject from below
• camera position
  ▪ frontal view
  ▪ three-quarter view
  ▪ side view
  ▪ back view
• camera movement


Markov chains

Markovian approach:

1. Model the state of the system as a combination of camera attributes
   ▪ Shot type, camera angle, camera position, camera movement
2. Assign transition probabilities according to video editing principles
   • Smooth step between shot sizes
   • Change position as well as shot size (30 degrees rule)
   • Shoot in opposite directions
   • Stay on one side of an imaginary line
   • Vary pacing to create moods or atmospheres
   • Select shot length by how much information it conveys

[State diagram: nine states combining shot type (WIDE, MEDIUM, CLOSE-UP) with camera angle (0°, 30°, 60°); transition probabilities shown: 0.01, 0.45, 0.15, 0.14, 0.15, 0.10]
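A Markov model of this kind can be used to score how well a shot sequence follows the editing principles. The sketch below is illustrative only: the transition values in `TRANSITIONS` are placeholders, not the ones in the diagram, and the state is reduced to (shot type, angle) for brevity.

```python
import math

# Hypothetical transition probabilities between (shot type, angle) states.
# Placeholder values for illustration; each row sums to 1.
TRANSITIONS = {
    ("WIDE", 0): {("MEDIUM", 30): 0.9, ("WIDE", 0): 0.1},
    ("MEDIUM", 30): {("CLOSE-UP", 60): 0.5, ("WIDE", 0): 0.4, ("MEDIUM", 30): 0.1},
    ("CLOSE-UP", 60): {("MEDIUM", 30): 0.7, ("WIDE", 0): 0.3},
}


def log_likelihood(sequence, transitions=TRANSITIONS, floor=1e-6):
    """Sum of log transition probabilities along a shot sequence.
    Transitions missing from the table get the small probability `floor`."""
    total = 0.0
    for src, dst in zip(sequence, sequence[1:]):
        total += math.log(transitions.get(src, {}).get(dst, floor))
    return total


conventional = [("WIDE", 0), ("MEDIUM", 30), ("CLOSE-UP", 60)]
repetitive = [("WIDE", 0), ("WIDE", 0), ("WIDE", 0)]
print(log_likelihood(conventional) > log_likelihood(repetitive))
```

With these placeholder values, the conventional wide-medium-close sequence scores higher than staying on the same wide shot, which is the behaviour the editing principles call for.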


MULTI-OBJECTIVE ORCHESTRATION

Pipeline stages: Content Analysis, Aesthetics Score Assignment, Multi-Objective Optimization, Fine Pacing Tuning

• Randomly select the starting population (i.e. a set of random edits)
• Determine the fitness based on the diversity and the aesthetic value
• Obtain a set of Pareto-optimal proposed edits

Notes:
• Re-use penalty: Kullback-Leibler divergence between the empirical probability distribution induced by the proposed camera selection and the uniform distribution
• Use a different set of aesthetic features to achieve better performance
  ▪ Lo, K. Y., Liu, K. H., & Chen, C. S. (2012, November). Assessment of photo aesthetics with efficiency. In Pattern Recognition (ICPR), 2012 21st International Conference on (pp. 2186-2189). IEEE.
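The re-use penalty and the Pareto selection can be sketched as follows. An edit is represented as the list of camera indices chosen for each interval, and each candidate is summarized by a hypothetical pair (aesthetic value, re-use penalty). The function names are assumptions, and the genetic operators (crossover, mutation) are not shown.

```python
import math
from collections import Counter


def reuse_penalty(selection, n_cameras):
    """KL divergence D(p || u) between the empirical camera-usage
    distribution p of an edit and the uniform distribution u over cameras.
    It is 0 when all cameras are used equally and log(n_cameras) when a
    single camera is used throughout."""
    counts = Counter(selection)
    total = len(selection)
    return sum(
        (c / total) * math.log((c / total) * n_cameras)
        for c in counts.values()
    )


def pareto_front(candidates):
    """Keep the candidates not dominated by any other. Each candidate is
    (aesthetic_value, reuse_penalty): higher aesthetic value is better,
    lower penalty is better."""
    def dominates(a, b):
        return a[0] >= b[0] and a[1] <= b[1] and a != b

    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates)]
```

Because the two objectives conflict when one camera consistently outclasses the others, the front usually contains several edits that trade aesthetic value against camera variety.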


MULTI-OBJECTIVE ORCHESTRATION

Pipeline stages: Content Analysis, Aesthetics Score Assignment, Multi-Objective Optimization, Fine Pacing Tuning

• Randomly select the starting population (i.e. a set of random edits)
• Determine the fitness based on the diversity and the aesthetic value
• Obtain a set of Pareto-optimal proposed edits
• Estimate the content density of each shot through edge density
• Apply Fine Pacing Tuning: subtract time from the less dense shots to give more camera time to richer scenes
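The slide does not say how much time is moved or between which shots, so the following is only one possible reading of Fine Pacing Tuning. The `step` and `min_len` parameters, the choice of the densest shot as the recipient, and the simple horizontal-gradient edge measure are all assumptions introduced for this sketch.

```python
def edge_density(gray, threshold=30):
    """Fraction of horizontally adjacent pixel pairs whose absolute
    difference exceeds `threshold` (a crude stand-in for an edge detector)."""
    pairs = [(a, b) for row in gray for a, b in zip(row, row[1:])]
    return sum(abs(a - b) > threshold for a, b in pairs) / len(pairs)


def fine_pacing(durations, densities, step=0.5, min_len=1.0):
    """Move up to `step` seconds from each shot whose density is below the
    mean to the densest shot, never shrinking a shot below `min_len`.
    The total duration is preserved."""
    mean_density = sum(densities) / len(densities)
    richest = max(range(len(densities)), key=densities.__getitem__)
    tuned = list(durations)
    for i, d in enumerate(densities):
        if d < mean_density:
            moved = min(step, max(0.0, tuned[i] - min_len))
            tuned[i] -= moved
            tuned[richest] += moved
    return tuned


print(fine_pacing([2.0, 2.0, 2.0], [0.1, 0.5, 0.2]))
```

Preserving the total duration keeps the edited video aligned with the length of the event, while the minimum length guards against cuts that are too short to read.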


Aesthetic features (revised)

1. Colorfulness (f1)
2. Layout Composition (f2-f5): distances of H, S, V, H+S+V distributions from templates
3. Edge Composition (f6-f9)
4. Global Texture (f10-f17)
5. General features (f18-f24): blur, contrast, non-zero elements of HSV histogram
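Two of the general features can be illustrated with the standard library alone. This is a simplified sketch and not the exact definition from the cited work: contrast is taken as the RMS contrast of the luminance, and only the hue channel of the HSV histogram is counted. The function names and the number of bins are assumptions.

```python
import colorsys
from statistics import pstdev


def rms_contrast(gray):
    """Standard deviation of luminance values in [0, 255], scaled to [0, 1]."""
    pixels = [v / 255 for row in gray for v in row]
    return pstdev(pixels)


def nonzero_hue_bins(rgb, bins=16):
    """Number of non-empty bins in the hue histogram of a list of
    (R, G, B) tuples with components in [0, 255]."""
    used = set()
    for r, g, b in rgb:
        h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        used.add(min(int(h * bins), bins - 1))
    return len(used)
```

Features of this kind need a single pass over the pixels, which fits the stated aim of using cheaper features.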


Results

[Video panels: Random Orchestration; Aesthetic Criterion]


Future works

• Neural style transfer uses deep neural networks to transform a picture according to a given style
• Train a deep neural network to transfer the editing style onto the video, based on videos from Masters


Future works

• Imposing a constraint on the maximum duration

▪ Emphasis on VIDEO SUMMARIZATION

▪ Jointly MAXIMIZE

o Aesthetics

o Representativeness

o Diversity

o Interestingness

o Importance


Conclusions

• The experimental results can be considered the basis for an in-depth analysis of the proposed approach.
• On the other hand, the computational cost of feature vector extraction limits the use of this system to non-real-time scenarios.
• Future versions of the system will focus on obtaining automatic segmentation using boundary detection algorithms and motion-based summarization, as well as on using feature sets with limited computational overhead, to allow a real-time implementation.


THANKS

The End


References

• [1] E. Mavridaki and V. Mezaris, “A comprehensive aesthetic quality assessment method for natural images using basic rules of photography,” in Proceedings of the IEEE International Conference on Image Processing (ICIP 2015), 2015, pp. 887–891.

• [2] K. Y. Lo, K. H. Liu, and C. S. Chen, “Assessment of photo aesthetics with efficiency,” in 21st International Conference on Pattern Recognition (ICPR), November 2012, pp. 2186–2189.

• [3] B. Gong, W. Chao, K. Grauman, and F. Sha, “Diverse sequential subset selection for supervised video summarization,” in Proceedings of the Neural Information Processing Systems (NIPS), 2014.

• [4] R. Kaiser, P. Torres, and M. Höffernig, “The interaction ontology: Low-level cue processing in real-time group conversations,” in 2nd ACM International Workshop on Events in Multimedia (EiMM ’10), ACM.

• [5] W. Taylor and F. Z. Qureshi, “Automatic video editing for sensor-rich videos,” in 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), March 2016.

• [6] E. S. de Lima, B. Feijó, A. L. Furtado, A. Ciarlini, and C. Pozzer, “Automatic video editing for video-based interactive storytelling,” in 2012 IEEE International Conference on Multimedia and Expo, July 2012, pp. 806–811.

• [7] K. Dancyger, The Technique of Film and Video Editing: History, Theory, and Practice.

• [8] A. K. Moorthy, P. Obrador, and N. Oliver, “Towards computational models of the visual aesthetic appeal of consumer videos,” in Proc. of Computer Vision, ECCV 2010.