Multimedia Data Mining
Arvind Balasubramanian
arvind@utdallas.edu
Multimedia Lab (ECSS 4.416)
The University of Texas at Dallas


Me and My Research

• Research Interests:
  – Machine Learning
  – Data Mining
  – Statistical Analysis
  – Applications of the above in Multimedia
• I am currently working on
  – developing a clustering algorithm guided by statistical analysis
  – deriving a composite grading scale for speech and language disorders, in collaboration with the UTD Callier Center


Data Mining and Multimedia

• Uncovering hidden information from data.
• Exploiting data to obtain new knowledge and interpret results.
• Immense applications in Multimedia.


Data Mining Techniques

• Classification
• Prediction
• Cluster Analysis & Class Discovery
• Extraction and Retrieval
• Statistical Analysis


Ideas for Projects: Text Mining

• Information Extraction from Domain-specific documents
  – involves extracting data from free text pieces and populating a database
  – serves to organize required information available in unorganized form
  – not enough in itself; combine with class discovery
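
The extraction step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the snippet texts, the "Title/Artist/Year" layout, and the table name `media` are all hypothetical, and a real domain would likely need a more robust extractor than a single regular expression.

```python
import re
import sqlite3

# Hypothetical free-text snippets; the "Title/Artist/Year" layout is assumed.
SNIPPETS = [
    "Title: Morning Raga; Artist: A. Example; Year: 2004",
    "Uploaded by a user last week, no further details.",
    "Title: Night Drive; Artist: B. Sample; Year: 1999",
]

PATTERN = re.compile(
    r"Title:\s*(?P<title>[^;]+);\s*Artist:\s*(?P<artist>[^;]+);\s*Year:\s*(?P<year>\d{4})"
)

def extract_records(texts):
    """Return (title, artist, year) tuples for every snippet that matches."""
    records = []
    for text in texts:
        match = PATTERN.search(text)
        if match:  # snippets without the expected fields are skipped
            records.append(
                (match["title"].strip(), match["artist"].strip(), int(match["year"]))
            )
    return records

def populate(conn, records):
    """Create the (hypothetical) media table and insert the extracted rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS media (title TEXT, artist TEXT, year INTEGER)"
    )
    conn.executemany("INSERT INTO media VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
populate(conn, extract_records(SNIPPETS))
rows = conn.execute("SELECT title, artist, year FROM media ORDER BY year").fetchall()
```

Once the data sits in a table, it can be queried like any other structured source, which is where class discovery could take over.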


Ideas for Projects: Text Mining

• New Class Discovery using Clustering techniques
  – identifying groups of keywords that do not fall into known categories
  – creating new categories and validating them
  – possibly employ clustering algorithms with a proper similarity measure or distance function
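
One possible way to sketch this idea: represent each keyword by the set of documents it occurs in, use Jaccard similarity as the similarity measure, and group keywords with a simple single-link rule. The keywords, document ids, threshold, and the `KNOWN` category map below are assumptions made up for illustration; other similarity measures and clustering algorithms would fit the slide equally well.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 = identical, 0.0 = disjoint)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(occurrences, threshold=0.5):
    """Single-link clustering: a keyword joins (and merges) every cluster
    that contains at least one sufficiently similar keyword."""
    clusters = []
    for word in sorted(occurrences):
        linked = [
            c for c in clusters
            if any(jaccard(occurrences[word], occurrences[o]) >= threshold for o in c)
        ]
        merged = {word}
        for c in linked:
            merged |= c
            clusters.remove(c)
        clusters.append(merged)
    return clusters

# Hypothetical keyword -> ids of the documents in which it occurs.
OCCURRENCES = {
    "guitar": {1, 2, 3},
    "drums": {1, 2},
    "tempo": {2, 3},
    "pixel": {4, 5},
    "codec": {4, 5, 6},
}
KNOWN = {"guitar": "music", "drums": "music"}  # keywords already categorized

clusters = cluster(OCCURRENCES)
# Clusters without any known keyword are candidates for a new category.
candidates = [c for c in clusters if not any(word in KNOWN for word in c)]
```

The candidate clusters would still need validation, for example by a domain expert, before being adopted as new categories.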


Ideas for Projects: Text Mining (contd.)

• Query-based document retrieval system
  – employ one of several base models such as a probabilistic model or a vector space model
  – design an efficient indexing system
  – include a relevance ranking feature
  – possibly make the system intelligent using machine learning techniques
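
As a rough sketch of the vector space option, the following Python builds an inverted index with TF-IDF weights and ranks documents by cosine-style similarity to the query. The toy corpus, the whitespace tokenizer, and the plain `log(N/df)` weighting are assumptions chosen for brevity, not the only reasonable choices.

```python
import math
from collections import Counter, defaultdict

DOCS = {  # hypothetical toy corpus
    "d1": "jazz piano concert recording",
    "d2": "piano lesson video for beginners",
    "d3": "wildlife documentary video",
}

def tokenize(text):
    return text.lower().split()

def build_index(docs):
    """Return (inverted index: term -> {doc: weight}, doc norms, idf table)."""
    tf = {d: Counter(tokenize(t)) for d, t in docs.items()}
    df = Counter(term for counts in tf.values() for term in counts)
    idf = {t: math.log(len(docs) / df[t]) for t in df}
    index = defaultdict(dict)
    norms = {}
    for d, counts in tf.items():
        weights = {t: c * idf[t] for t, c in counts.items()}
        norms[d] = math.sqrt(sum(w * w for w in weights.values()))
        for t, w in weights.items():
            index[t][d] = w
    return index, norms, idf

def search(query, index, norms, idf):
    """Rank documents by dot product with the query, normalized by document
    length. The query norm is the same for every document, so it is omitted."""
    scores = defaultdict(float)
    for term, count in Counter(tokenize(query)).items():
        if term not in index:
            continue  # unseen terms contribute nothing
        for d, w in index[term].items():
            scores[d] += count * idf[term] * w
    ranked = [(d, s / norms[d]) for d, s in scores.items() if norms[d] > 0]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

index, norms, idf = build_index(DOCS)
results = search("piano video", index, norms, idf)
```

Only documents that share at least one term with the query are touched, which is the main efficiency benefit of the inverted index.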


Ideas for Projects: Pattern Recognition in Multimedia Data

• Scope
  – analyze and identify interrelationships within Multimedia data sets
  – derive a composite score from several different sub-scores
• Methods
  – classic techniques like Principal Component Analysis (PCA) and Factor Analysis (FA)
  – statistical methods such as Regression analysis


Ideas for Projects: Pattern Recognition in Multimedia Data (contd.)

• Methods
  – Principal Component Analysis (PCA)
    (a) Dimensionality Reduction
    (b) Efficient Storage and Retrieval of Media data
    (c) Applications in any multi-dimensional media: Images (noise reduction), Video (content analysis), Audio (Voice Signature recognition)
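
To make the dimensionality-reduction use concrete, here is a minimal sketch that finds the first principal component by power iteration on the covariance matrix and projects the data onto it. The four 2-D points are hypothetical, and power iteration is just one simple way to obtain the leading eigenvector; in practice a library routine (for example in Matlab) would normally be used.

```python
import math

def mean_center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already mean-centered rows."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

def first_component(cov, iterations=200):
    """Power iteration: converges to the eigenvector with the largest
    eigenvalue, assuming the start vector is not orthogonal to it."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, component):
    """Reduce each row to a single coordinate along the component."""
    return [sum(x * c for x, c in zip(row, component)) for row in rows]

DATA = [[2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]  # hypothetical features
centered = mean_center(DATA)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)  # one number per sample instead of two
```

Each entry of `scores` summarizes a two-feature sample with one number, which is the same mechanism that could turn several sub-scores into a single composite score.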


Ideas for Projects: Pattern Recognition in Multimedia Data (contd.)

• Methods
  – Factor Analysis (FA)
    (a) Minimize data redundancy
    (b) Reveal hidden patterns
    (c) Combine attributes to form a single attribute by determining the importance and contribution of each attribute
    (d) Applications: Medical analysis, IQ tests, Personality tests, Software measurement, Multimedia content analysis, Motion Capture Data analysis


Ideas for Projects: Pattern Recognition in Multimedia Data (contd.)

• Methods
  – Statistical Analysis
    (a) Correlation analysis to bring out interrelationships between data attributes
    (b) Regression analysis to analyze the ability of a set of data attributes to predict other data attributes
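
Both analyses can be illustrated with two attributes. The sketch below computes the Pearson correlation coefficient and fits a one-predictor least-squares line; the attribute values are made up, and a project with several predictors would need multiple regression rather than this simple form.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx

def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical attribute values for five media items.
attr_a = [1.0, 2.0, 3.0, 4.0, 5.0]
attr_b = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson(attr_a, attr_b)
slope, intercept = fit_line(attr_a, attr_b)
r2 = r_squared(attr_a, attr_b, slope, intercept)
```

With a single predictor, `r2` equals `r` squared, so the correlation already tells how well one attribute predicts the other.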


Ideas for Projects: Prediction and Suggestion Systems

• An intelligent media hosting application that
  – learns from user queries and requests, and accordingly suggests other media items
  – retrieves suggested items by querying on the media items' features and metadata
  – Examples: Esnips music hosting
  – Many machine learning techniques could be employed: Bayesian reasoning and classification algorithms
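
One way such a system might apply Bayesian reasoning is a naive Bayes classifier over metadata tags: learn from items the user liked or skipped, then suggest the catalog items most likely to be liked. Everything concrete below (the tags, the history, the catalog, and the function names) is hypothetical, and the sketch assumes the history contains at least one liked and one disliked item.

```python
import math
from collections import Counter

def train(history):
    """history: list of (tags, liked) pairs. Returns counts for naive Bayes."""
    class_counts = Counter()
    tag_counts = {True: Counter(), False: Counter()}
    for tags, liked in history:
        class_counts[liked] += 1
        tag_counts[liked].update(tags)
    vocab = {t for tags, _ in history for t in tags}
    return class_counts, tag_counts, vocab

def log_posterior(tags, liked, model):
    """Unnormalized log P(liked | tags) with add-one (Laplace) smoothing."""
    class_counts, tag_counts, vocab = model
    score = math.log(class_counts[liked] / sum(class_counts.values()))
    denom = sum(tag_counts[liked].values()) + len(vocab)
    for t in tags:
        if t in vocab:  # tags never seen in training are ignored
            score += math.log((tag_counts[liked][t] + 1) / denom)
    return score

def suggest(catalog, model, k=2):
    """Return the k items whose tags favor 'liked' over 'disliked' the most."""
    def margin(item):
        tags = catalog[item]
        return log_posterior(tags, True, model) - log_posterior(tags, False, model)
    return sorted(catalog, key=margin, reverse=True)[:k]

# Hypothetical listening history and catalog.
HISTORY = [
    ({"jazz", "piano"}, True),
    ({"jazz", "live"}, True),
    ({"metal", "live"}, False),
    ({"metal", "guitar"}, False),
]
CATALOG = {
    "track_a": {"jazz", "guitar"},
    "track_b": {"metal", "piano"},
    "track_c": {"jazz", "piano", "live"},
}

model = train(HISTORY)
suggestions = suggest(CATALOG, model)
```

Retraining the model as new requests arrive would let the suggestions track the user's changing interests.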


Ideas for Projects

• Ideas for alternative projects having to do with applications of machine learning, data mining and statistical analysis in the domain of multimedia are welcome.

• Tools: Weka, Matlab, statistical software packages (even Excel helps a lot!).


Thank You