Practical Artificial Intelligence and Machine Learning
arturo.servin_at_gmail.com
http://arturo.servin.googlepages.com/
About this presentation
๏ Some theory on AI and ML
๏ Some practical ideas and simple how-tos
๏ What's out there using AI
๏ Resources, Kits and Data
Artificial Intelligence
๏ Machine Learning
๏ Natural Language Processing
๏ Knowledge representation
๏ Planning
๏ Multi-Agent Systems
๏ and some other stuff, depending on the author of the book
Machine Learning
๏ A program learns from experience E with respect to a task T and a performance measure P if its performance at T, as measured by P, improves with experience E (T. Mitchell, Machine Learning, 1997)
Machine Learning Flavours
๏ Supervised Learning
- Programs learn a concept/hypothesis by means of labelled examples
- Examples: Artificial Neural Networks, Bayesian Methods, Decision Trees
๏ Unsupervised Learning
- Programs learn to categorise unlabelled examples
- Examples: Non-negative matrix factorization and self-organising maps
More flavours
๏ Reinforcement Learning
- Programs learn by interacting with the environment: they execute actions and observe the feedback in the form of positive or negative rewards
- Examples: SARSA, Q-Learning
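The Q-Learning idea can be sketched on a toy problem. This is a minimal illustration, assuming a hypothetical five-state corridor where the agent starts on the left and gets a reward of +1 on reaching the right end; the environment, the reward and the values of ALPHA, GAMMA and EPSILON are assumptions made for the example, not part of the slides.

```python
import random

N_STATES = 5                 # hypothetical corridor: states 0..4
GOAL = N_STATES - 1          # reaching state 4 ends the episode
ACTIONS = (-1, 1)            # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # assumed learning rate, discount, exploration

def step(state, action):
    """Apply an action; the reward is +1 on reaching the goal, 0 otherwise."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def choose_action(q, state, rng):
    """Epsilon-greedy choice; ties between equal Q-values are broken at random."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])

def q_learning(episodes=200, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q, state, rng)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted value of the best next action
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q

q_table = q_learning()
# Greedy policy for the non-terminal states: the learned action in each state
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)]
```

After training, the greedy policy moves right in every non-terminal state. SARSA differs only in the target: it uses the Q-value of the action actually taken next instead of the best one.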
Training Examples
๏ Continuous
๏ Discrete
๏ Inputs known as Vectors or Features
๏ Example in Wine Classification: Alcohol level, Malic acid, Ash, Alcalinity of ash, etc.
๏ Linear and non-linear feature relations
source: Oracle Data Mining Concepts
More complex feature relations
Decision Trees
๏ Easy to understand and to interpret
๏ Hierarchical structure
๏ They use entropy or Gini impurity to choose the splits that create groups
๏ Disadvantage: It's an off-line method
๏ Examples: ID3, C4.5
Source: http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/c4.5
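Entropy and Gini impurity can be computed in a few lines. This is a minimal sketch, not the C4.5 implementation: the function names are made up for the example, and the three rows are the ones from the golf table in these slides. ID3 picks the attribute with the highest information gain; C4.5 refines this with a gain ratio, which is not shown here.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gini_impurity(labels):
    """Probability that two labels drawn at random (with replacement) differ."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by splitting on one discrete feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Three rows from the golf example: (outlook, windy) -> decision
rows = [("sunny", True), ("overcast", False), ("rain", False)]
labels = ["Don't Play", "Play", "Play"]

base_entropy = entropy(labels)
base_gini = gini_impurity(labels)
gain_windy = information_gain(rows, labels, 1)
```

With only three rows, splitting on "windy" already separates the classes perfectly, so the gain equals the full entropy of the labels.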
Decision Trees, example
๏ Create .names and .data files with training data
๏ Generate tree and rules (c4.5 -f <file> and c4.5rules -f <file>)
Outlook   Temperature  Humidity  Windy  Play or Don't Play
Sunny     80           90        true   Don't Play
Overcast  83           78        false  Play
Rain      70           96        false  Play
๏ Categorize new data (consult, consultr). Use GPS/Geocoding, Google Maps and Yahoo Weather APIs to enhance
aservin@turin:~/Projects/C45: consult -f golf
C4.5 [release 8] decision tree interpreter Sat Jan 17 00:05:16 2009
------------------------------------------
outlook: sunny
humidity: 80
Decision:
Don't Play CF = 1.00 [ 0.63 - 1.00 ]
Bayesian Classifiers
๏ Bayes Theorem: P(h|D) = P(D|h) P(h) / P(D)
๏ P(h) = prior probability of hypothesis h
๏ P(D) = prior probability of training data D
๏ P(h|D) = probability of h given D (the posterior)
๏ P(D|h) = probability of D given h (the likelihood)
๏ Naive Bayes Classifier, Fisher Classifier
๏ Commonly used in SPAM filters
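A worked example of Bayes Theorem for a spam filter, as a minimal sketch. All the probabilities below are assumed numbers chosen for illustration, not measured values: h is "the message is spam" and D is "the message contains a given word".

```python
def posterior(p_d_given_h, p_h, p_d):
    """Bayes Theorem: P(h|D) = P(D|h) P(h) / P(D)."""
    return p_d_given_h * p_h / p_d

# Assumed numbers, for illustration only
p_spam = 0.4               # P(h): prior probability that a message is spam
p_word_given_spam = 0.5    # P(D|h): the word appears in a spam message
p_word_given_ham = 0.05    # the word appears in a legitimate message

# P(D) by the law of total probability over spam and non-spam
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

p_spam_given_word = posterior(p_word_given_spam, p_spam, p_word)
print(round(p_spam_given_word, 2))
```

With these numbers, seeing the word raises the probability of spam from the prior of 0.4 to about 0.87.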
Classifying your RSS feeds
๏ Use the unofficial Google Reader API http://blog.gpowered.net/2007/08/google-reader-api-functions.html
๏ Some Python Code (Programming Collective Intelligence, Chapter 6)
๏ Tag interesting and non-interesting items
๏ Train using Naive Bayes or Fisher classifier
๏ >> cl.train('Google changes favicon','bad')
๏ >> cl.train('SearchWiki: make search your own','good')
๏ New items are tagged as interesting or not
๏ >> cl.classify('Ignite Leeds Today')
๏ Good
๏ You can re-train online
๏ Add more features, try with e-mail
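The train/classify calls above can be imitated with a small stand-alone classifier. This is a simplified sketch, not the code from Programming Collective Intelligence: the NaiveBayes class, its word-splitting rule and the add-one smoothing are assumptions made for the example, and only the words present in the item are scored.

```python
import math
import re
from collections import defaultdict

class NaiveBayes:
    """Tiny Naive Bayes text classifier using word presence as features."""

    def __init__(self):
        self.word_counts = defaultdict(lambda: defaultdict(int))  # category -> word -> items
        self.doc_counts = defaultdict(int)                        # category -> items seen

    @staticmethod
    def features(text):
        # Lower-case words, each counted once per item
        return set(re.findall(r"[a-z]+", text.lower()))

    def train(self, text, category):
        self.doc_counts[category] += 1
        for word in self.features(text):
            self.word_counts[category][word] += 1

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for category, n_docs in self.doc_counts.items():
            # log P(category) + sum of log P(word | category), with add-one smoothing
            score = math.log(n_docs / total_docs)
            for word in self.features(text):
                count = self.word_counts[category].get(word, 0)
                score += math.log((count + 1) / (n_docs + 2))
            scores[category] = score
        return max(scores, key=scores.get)

cl = NaiveBayes()
cl.train('Google changes favicon', 'bad')
cl.train('SearchWiki: make search your own', 'good')

first = cl.classify('make your own search engine')
second = cl.classify('new favicon for Google')
```

Because train only updates counters, the classifier can be re-trained online as new items are tagged. With two training items the result is only as good as the word overlap, so expect to tag many more items before trusting it.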
Finding Similarity
๏ Euclidean Distance, Pearson Correlation Score, Manhattan Distance
๏ Document Clustering
๏ Price Prediction
๏ Item similarity
๏ k-Nearest Neighbors, k-means, Hierarchical Clustering, Support-Vector Machines, Kernel Methods
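The three similarity measures can be written directly from their definitions. A minimal sketch, assuming items are represented as equal-length lists of numbers; returning 0.0 for a constant vector in pearson is a choice made for this example, since the correlation is undefined in that case.

```python
from math import sqrt

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences along each feature."""
    return sum(abs(x - y) for x, y in zip(a, b))

def pearson(a, b):
    """Pearson correlation: +1 same trend, -1 opposite trend, 0 no linear relation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumed convention: correlation is undefined for a constant vector
    return cov / sqrt(var_a * var_b)
```

Distances get smaller as items get more similar, while the Pearson score gets larger, so they cannot be swapped without flipping the comparison. Pearson also ignores differences in scale, which helps when one user rates everything higher than another.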
Similar items
Source: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html
Artificial Neural Networks
๏ Mathematical/Computational model based on biological neural networks
๏ Many types. The most common use the Backpropagation algorithm for training and a Feedforward pass to compute the outputs, both during training and when getting results
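The feedforward pass is the simpler half and fits in a few lines. A minimal sketch, assuming sigmoid neurons and a network stored as a list of layers, each layer being a list of (weights, bias) pairs; this layout and the weight values are hypothetical, chosen for the example. Backpropagation, which adjusts those weights from the output error, is not shown.

```python
from math import exp

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + exp(-x))

def feedforward(inputs, layers):
    """Propagate the inputs through each layer in turn and return the outputs."""
    activations = list(inputs)
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations

# Hypothetical 2-2-1 network: two inputs, two hidden neurons, one output
network = [
    [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)],  # hidden layer
    [([1.0, -1.0], 0.0)],                      # output layer
]
output = feedforward([1.0, 0.0], network)
```

Each layer only needs the activations of the previous one, which is why the pass runs strictly forward from inputs to outputs.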
Artificial Neural Networks
๏ Input is high-dimensional, discrete or real-valued (e.g. raw sensor input)
๏ Output is discrete or real-valued
๏ Output is a vector of values
๏ Perceptron, linear
๏ Sigmoid, non-linear and multi-layer
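A single perceptron can be trained with the classic perceptron rule. A minimal sketch, assuming a step activation and the logical AND function as training data; the function names, the learning rate and the number of epochs are assumptions made for the example.

```python
def predict(weights, bias, x):
    """Step activation: fire (1) when the weighted sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, epochs=10, rate=1):
    """Perceptron rule: nudge the weights by the error on each sample."""
    weights, bias = [0] * len(samples[0][0]), 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
            bias += rate * error
    return weights, bias

# Logical AND is linearly separable, so one perceptron can learn it
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_GATE)
outputs = [predict(weights, bias, x) for x, _ in AND_GATE]
```

The same rule never settles on XOR, which is not linearly separable; that is where sigmoid units arranged in several layers come in.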
Example, finding the best price
๏ Create training data using Amazon/Ebay API
๏ Laptop prices. Use price, screen size as features
๏ Use an ANN library, e.g. Fast Artificial Neural Network (FANN)
๏ struct fann *ann = fann_create_standard(num_layers, num_input, num_neurons_hidden, num_output); // C
๏ ann = fann.create(connection_rate, (num_input, num_neurons_hidden, num_output)) #Python
๏ $ann = fann_create(array(2, 4, 1),1.0,0.7); // PHP
๏ You can also try k-Nearest Neighbours
๏ Try it!
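The k-Nearest Neighbours alternative needs no library at all. A minimal sketch, assuming each laptop is described by screen size and RAM; the listings and prices below are made-up illustrative values, not data pulled from the Amazon or Ebay APIs, and RAM as a second feature is an assumption.

```python
from math import sqrt

# Hypothetical listings: (screen size in inches, RAM in GB) -> price
LAPTOPS = [
    ((13.0, 2.0), 900.0),
    ((13.0, 4.0), 1100.0),
    ((15.0, 2.0), 700.0),
    ((15.0, 4.0), 850.0),
    ((17.0, 4.0), 1000.0),
]

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_price(query, data, k=3):
    """Estimate a price as the average price of the k most similar listings."""
    nearest = sorted(data, key=lambda item: euclidean(item[0], query))[:k]
    return sum(price for _, price in nearest) / k

estimate = knn_price((15.0, 3.0), LAPTOPS, k=2)
```

Here the two closest listings are the 15-inch ones, so the estimate is the average of their prices. With real data the features should be rescaled first, otherwise the one with the largest numeric range dominates the distance.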
Resources 1
๏ Books
- Practical Artificial Intelligence Programming in Java, Mark Watson http://www.markwatson.com/opencontent/ (There is a Ruby one as well)
- Programming Collective Intelligence, Toby Segaran; O'Reilly
- Artificial Intelligence: A Modern Approach, S. Russell, P. Norvig, J. Canny; Prentice Hall
- Machine Learning, Tom Mitchell; McGraw-Hill
๏ Online Stuff
- ML course at Stanford http://www.stanford.edu/class/cs229/materials.html
- Statistical ML http://bengio.abracadoudou.com/lectures/
Resources 2
๏ Code
- FANN http://leenissen.dk/fann/index.php
- NLP http://opennlp.sourceforge.net/
- C4.5 http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/c4.5/tutorial.html
- ML and Java http://www.developer.com/java/other/article.php/10936_1559871_1
๏ Data
- UC Irvine Machine Learning Repository http://archive.ics.uci.edu/ml/
- Amazon Public Datasets http://aws.amazon.com/publicdatasets/
More info
๏ For questions, projects and job offers:
- arturo.servin _(at)_ gmail.com
- http://twitter.com/the_real_r2d2
- http://arturo.servin.googlepages.com/