46
An open source platform for multimedia processing http://marsyas.sourceforge.net Nov 2010 Saturday, 20 November, 2010

Marsyas

Embed Size (px)

Citation preview

An open source platform for multimedia processing

http://marsyas.sourceforge.net

Nov 2010

Saturday, 20 November, 2010

Notice

This work is licensed under the Creative Commons Attribution-Share Alike 2.5 Portugal License. To view a copy of this license, visit

http://creativecommons.org/licenses/by-sa/2.5/pt/

or send a letter to

Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Saturday, 20 November, 2010

Marsyas Overview

Users & Applications

Usage Scenarios

Architecture

Interoperability

MarsyasX

Future Work

Summary

http://marsyas.sourceforge.net

Saturday, 20 November, 2010

Marsyas OverviewSoftware framework for media analysis, synthesis and retrieval

Open source (GPL license)

Efficient and extensible framework design

original emphasis on Music Information Retrieval (MIR) ; now Multimedia!

C++, OOP

Multiplatform (Linux, MacOSX®, MS Windows®, …)

Provides a variety of building blocks for performing common audio tasks:

sound/video file IO, audio/video IO, signal processing and machine learning modules

blocks can be combined into data flow networks that can be modified and controlled dynamically while they process data in soft real-time.

WAV source

FFT

GMM

KNNLPC

WAV source

Hanning FFT

http://www.fsf.org

Saturday, 20 November, 2010

Marsyas OverviewMarsyas Brief History

1998 ~2000

Created by George Tzanetakis during his PhD activities at Princeton

2000 ~2002

Marsyas 0.1

First stable revisions of the toolkit

Distributions hosted at SourceForge

Creation of a developer community

User and Developer Mailing lists

2002 ~ …

Marsyas 0.2

Major framework revision

SourceForge SubVersion

http://marsyas.sourceforge.net/ http://www.cs.princeton.edu/~gtzan

Saturday, 20 November, 2010

Marsyas OverviewMarsyas Core Developers (Present and past...)

George Tzanetakis

Main culprit (also main designer and developer)

Luís Gustavo Martins

Refactoring freak

Graham Percieval

Documentation overlord

Luís Filipe Teixeira

Marsyas X motor-head

Mathieu Lagrange

Biggest network award

Steven Ness

Ruby and Web guy

Check updated list at the website!

http://marsyas.info/community/people

Saturday, 20 November, 2010

Marsyas Community

Google Analytics

Saturday, 20 November, 2010

Marsyas Community

Google Analytics(Nov 2010)

Saturday, 20 November, 2010

Marsyas CommunityMailing lists

Developer

https://lists.sourceforge.net/lists/listinfo/marsyas-developers

User

https://lists.sourceforge.net/lists/listinfo/marsyas-users

Tracker

Feature Requests

Saturday, 20 November, 2010

Marsyas Community

SourceForge Community Choice Awards 2009 Finalisthttp://community.sourceforge.net/cca_winners/bestacademia.html

Saturday, 20 November, 2010

Marsyas Project AnalysisMarsyas value?

Ohloh Analysis

http://www.ohloh.net/p/marsyas

Saturday, 20 November, 2010

Users & ApplicationsMusicream (Masataka Goto)

Music playback system with similarity capabilities

Uses Marsyas as its music similarity engine

http://staff.aist.go.jp/m.goto/Musicream/

Saturday, 20 November, 2010

Users & ApplicationsSndPeek (Princeton)

Uses Marsyas 0.1 for:

FFT magnitude spectrum

real-time spectral feature extraction

centroid

rms

flux

rolloff

http://www.cs.princeton.edu/sound/software/sndpeek/

Saturday, 20 November, 2010

Users & ApplicationsMarsyas @ INESC Porto

Audio Analysis Software prototypes:

Feature Extraction

Audio segmentation/classification

Audio fingerprinting

Speaker Segmentation

Music and Auditory Scene Analysis

Video Analysis Software prototypes:

Background modeling and subtraction

Local feature extraction for object matching

http://www.inescporto.pt/~lfpthttp://www.inescporto.pt/~lmartins

http://www.inescporto.pt

Saturday, 20 November, 2010

Users & ApplicationsDesert Island

Undergraduate work at the Univ. Missouri

Kansas Jared Hoberock

Dan Kelly Ben Tietgen

Saturday, 20 November, 2010

Related WorkOpen Source frameworks

CLAM (http://clam.iua.upf.edu/)

STK (http://ccrma.stanford.edu/software/stk/)

Chuck (http://chuck.cs.princeton.edu/)

PureData (Pd) (http://crca.ucsd.edu/~msp/software.html)

Open Sound Control (OSC) (http://cnmat.berkeley.edu/OpenSoundControl/)

FAUST (http://faudiostream.sourceforge.net/)

EyesWeb (http://www.infomus.dist.unige.it/EywMain.html)

...

Commercial toolkits

MAX/MSP® (http://www.cycling74.com/)

MATLAB® Simulink® (http://www.mathworks.com/products/simulink/)

LabView® (http://www.ni.com/labview/)

DirectShow® GraphEdit (http://www.microsoft.com)

...

Saturday, 20 November, 2010

Usage ScenariosMarsyas command line tools

Demonstrate key capabilities of the framework

Some are actually research tools

Efficient and can execute in real-time

ANSI C++ only core

several optional libraries

MP3 reading/writing (libMad)

Tools and examples:sfplay

bextract

phasevocoder

sfplugin

Saturday, 20 November, 2010

Usage ScenariosPlaying audio files

sfplay

-s : start time for playback

-l : length of playback

-r : repeat times

-g : volume (gain) value

-p : playback.mpl (save playback network as a .mpl plugin file)

> sfplay foo.wav> sfplay –s 10.0 –l 3.2 –r 2.5 –g 0.5 foo1.wav foo2.au –f output.wav> sfplay –l 3.0 foo.wav> sfplay foo.wav –p playback.mpl>

Saturday, 20 November, 2010

Usage ScenariosMachine Learning: feature extraction and training of classifiers

bextract

-e STFTMFCC

extracts spectral and MFCC features

music.mf, speech.mf

lists of sound files (collections)

-w myWeka.arff

WEKA file with extracted features

-p ms.mpl

“trained” Marsyas plug-in for realtime music/speech classification

> bextract -e STFTMFCC music.mf speech.mf -p ms.mpl -w myweka.arff>

Saturday, 20 November, 2010

Usage ScenariosMarsyas plugins (.mpl files)

Allow to dynamically recreate a processing network in runtime

sfplugin

e.g. Audio playback

e.g. Realtime audio classification

> sfplugin –p ms.mpl unknownAudioSignal.wav>

> sfplugin –p playback.mpl foo.wav>

Saturday, 20 November, 2010

Usage ScenariosDigital Signal Processing

e.g. phasevocoder

e.g. onset detector

> phasevocoder –p 1.4 -s 100>

> mudbox -t onsets myMusic.wav>

http://eceserv0.ece.wisc.edu/~sethares/vocoders/phasevocoder.html

Saturday, 20 November, 2010

ArchitectureMarsyas 0.2

New dataflow model of audio computation

hierarchical messaging system used to control the dataflow network

inspired on Open Sound Control (OSC)

general matrices instead of 1-D arrays as data

Saturday, 20 November, 2010

ArchitectureMarSystem Slices

Separating things that happen at the same time from things that happen in different times

Correct semantics for

spectral processing

Saturday, 20 November, 2010

Implicit Patching VS Explicit Patching

Architecture

# IMPLICIT PATCHINGsource, F1, F2, F3, destinationFanout(F1, F2, F3)Series(source, Fanout, destination);

# EXPLICIT PATCHINGsource, F1, F2, F3, destination# connect the appropriate in/out portsconnect(source, F1);connect(source, F2);connect(source, F3);connect(F1, destination);connect(F2, destination);connect(F3, destination);

Saturday, 20 November, 2010

ArchitectureMarSystem Composites

Series

Fanout

Fanin

Parallel

Accumulator

Saturday, 20 November, 2010

Implicit VS Explicit Patching :: Neural Network

Architecture

#IMPLICIT PATCHINGfanoutLayer1(ANN_Node11, …, ANN_Node1N)…fanoutLayerM(ANN_NodeM1, …,ANN_NodeMN)ANN_Series(fanoutLayer1, …, fanoutLayerM)

Saturday, 20 November, 2010

ArchitectureTypical feature extraction

Time-frequency representation

Frequency summarization (MFCC)

Time summarization (Texture features)

Saturday, 20 November, 2010

ArchitectureFeature Extraction using Implicit Patching

MarSystemManager mng;

MarSystem* Series1 = mng.create("Series", “Series1");

MarSystem* Fanout1 = mng.create(“Fanout", “Fanout1");

MarSystem* Series2 = mng.create("Series", “Series2");

MarSystem* Fanout2 = mng.create(“Fanout”, “Fanout2”);

MarSystem* Fanout3 = mng.create(“Fanout”, “Fanout3”);

Fanout3->addMarSystem(mng.create(“Mean”, “Mean”));

Fanout3->addMarSystem(mng.create(“Variance”, “Variance”));

Fanout2->addMarSystem(mng.create(“Centroid”, “Centroid”));

Fanout2->addMarSystem(mng.create(“RollOff”, “Rolloff”));

Fanout2->addMarSystem(mng.create(“Flux”, “Flux”);

Series2->addMarSystem(mng.create(“Spectrum”, “Spectrum”);

Series2->addMarSystem(Fanout2);

Fanout1->addMarSystem(mng.create(“ZeroCrossings”, “ZeroCrossings”);

Fanout1->addMarSystem(Series2);

Series1->addMarSystem(mng.create("SoundFileSource",“Source"));

Series1->addMarSystem(Fanout1);

Series1->addMarSystem(mng.create(“Memory”, “TextureMemory”));

Series1->addMarSystem(Fanout3);

Series1->addMarSystem(mng.create(“classifier”, “Classifier”));

Saturday, 20 November, 2010

InteroperabilityMarsyas Audio and MIDI I/0

RtAudio

Multiplatform C++ API for realtime audio input/output

Linux (native ALSA, JACK, and OSS)

MacOSX®

Windows® (DirectSound® and ASIO®)

SGI®

RtMIDI

Multiplatform C++ API for realtime MIDI input/output

Linux (ALSA)

MacOSX®

Windows® (Multimedia Library)

SGI®

http://www.music.mcgill.ca/~gary/rtaudio/http://www.music.mcgill.ca/~gary/rtmidi/

Saturday, 20 November, 2010

InteroperabilityMarsyas & WEKA

WEKA: Data Mining Software in Java

Marsyas already includes some machine learning blocks

Marsyas outputs extracted features as .arff files (WEKA)

features can be opened in WEKA for further evaluation and data modeling

http://www.cs.waikato.ac.nz/ml/weka/

Saturday, 20 November, 2010

InteroperabilityCalling MATLAB® from C++ Marsyas code:

MATLAB® engine API

exchange data (i.e. matrices) in run-time between C++ and MATLAB®

remotely execute commands in MATLAB® from a C++ routine

Access to all MATLAB® toolboxes, algorithms and available routines

Algorithmic validation of C++ routines

Quick and easy evaluation of proof of concepts

May not allow real-time operation…

Not such a big problem when evaluating or developing algorithms

http://www.mathworks.com

Saturday, 20 November, 2010

InteroperabilityCalling MATLAB® from C++ Marsyas code:

Marsyas::MATLABengine class

Utility class

Wraps MATLAB® engine calls for most POD types and Marsyas data types

Easy to send/receive data to/from MATLAB® from anywhere in the code

// create a std::vector of real numbers std::vector<double> vector_real(4); vector_real[0] = 1.123456789; vector_real[1] = 2.123456789; vector_real[2] = 3.123456789; vector_real[3] = 4.123456789;

// send a std::vector<double> to MATLAB PUTVAR(vector_real, "vector_real");

// do some dummy math in MATLAB EVALUATE("mu = mean(vector_real);"); EVALUATE("sigma = std(vector_real);"); EVALUATE("vector_real = vector_real/max(vector_real);");

// get values from MATLAB double m, s; GETVAR(m, "mu"); GETVAR(s, "sigma"); GETVAR(vector_real, "vector_real");

Saturday, 20 November, 2010

InteroperabilityPython™ Bindings

easily create scripts for rapid testing and prototyping of data flow networks

would require much more development effort in C++

bonus: no compiling overheads

can also be embedded in C++ code, similarly to MATLAB® (TBD)

less tools for signal processing in general, but can be used for many other purposes (“batteries included”)

less licensing headaches

http://www.python.org

Saturday, 20 November, 2010

InteroperabilityMarsyas and Trolltech Qt4®

Qt® Core features optionally used by Marsyas

Multi-platform signal/slot architecture

Multi-platform threads (multithreaded processing)

Multi-platform database access

Multi-platform XML I/O

Qt® GUI Features optionally

used by Marsyas

Multi-platform Widgets

Multi-platform OpenGL

Qt4® is available as GPL open source code for all platforms

http://www.trolltech.com

Saturday, 20 November, 2010

MarsyasXWhat is MarsyasX?

Next step for Marsyas

Evolution rather than an alternative implementation

Generalization of the framework to support processing different media

Key points:

crossmodal processing

expandable support

interoperability

MPEGSink

GMM

MotionContour

WAV source

Hanning FFT AV

source

AV source

ContourMPEGSink

Saturday, 20 November, 2010

MarsyasXMarsyasX brief history:

MarsyasX started as a patch to Marsyas...

...add visual processing support

However, the previous architecture has some restrictions:

Some design options are audio-oriented

No in-place processing - for visual processing this can be too much of a burden

Representing complex data in a single buffer requires inefficient hacks

The slice-based processing paradigm is an elegant solution, but can be very difficult to adapt to a more generic framework

Saturday, 20 November, 2010

sink

MarsyasX - architectureFirst step was the creation of a generic data handling between modules:

Data carried in payloads

Flows define a coherent stream of data and are identified by type and name

Modules are explicitly associated to flows and can handle multiple flows with different behaviours (in, inout, out)

source sinkfiltersource filter

Saturday, 20 November, 2010

MarsyasX - architecturePayload Mechanism:

Payloads are created in factories, associated to a source;

Payloads are carried between modules through channels (created automatically by implicit patching);

When payloads are no longer necessary, they are sent back to its source to be reutilized -> memory is allocated only once.

Saturday, 20 November, 2010

MarsyasX - architectureControls are formally related to the flows and changes are propagated automatically

e.g. visual flow is characterized by width, height and colour space

Saturday, 20 November, 2010

MarsyasX - architectureTiming?

each “tick” corresponds to an elapsed time interval

synchronization is assured by the timing information associated to each payload

How is a module created?

define how many and what type of flows it supports

factories for outputs and flow-related controls are added implicitly

add specific controls, if needed

create a process function

input and output data structures are automatically passed as arguments

Saturday, 20 November, 2010

MarsyasX - architectureModules are now loaded from extensions

implemented separately

dynamically loaded

visual

core

application

audio learning avother

specific

marsyas

Saturday, 20 November, 2010

MarsyasX - interoperabilityPython

Scripting language plays a more central role

Complete bindings to create networks and use core functionalities

Marsyas 0.2

Legacy layer was created to support modules from Marsyas 0.2 transparently

A network can be an hybrid, containing MarsyasX and Marsyas 0.2 modules

MATLAB

Saturday, 20 November, 2010

MarsyasX - statusStatus of development:

pre-alpha

release 0.1 in mid-2011 (tentative)

feature status:

payload mechanism [done]

flow management [done]

definition of API [partly done]

python bindings [partly done]

legacy layer [partly done]

processing modules [needs a lot more]

event management (expressions, GUI) [not fully defined]

Saturday, 20 November, 2010

Main development trunk of the Marsyas project

Larger community:

associating the current users and developers of Marsyas with many others coming from different areas

Distributed MarsyasX

Data and events exchanged transparently

MarsyasX - future

Saturday, 20 November, 2010

Thank you!(Lifetime) Future work…

http://marsyas.sourceforge.net

Saturday, 20 November, 2010