Upload
others
View
7
Download
0
Embed Size (px)
Citation preview
1© 2017 The MathWorks, Inc.
Machine Learning for Financial
Applications
Dr. Alexander Diethert, Application Engineer
2
Challenges for Analysts
How to work with your data to .
… analyze for insights?
… identify patterns?
… develop predictive models?
…deploy to collaborators?
… innovate?
Opportunities
3
Customer Examples
4
Agenda
Introduction
Unsupervised Learning: Clustering Bond Data
Supervised Learning: Create Rating System
Neural Network Based Time Series Forecasting
Outlook: Distributing MATLAB Applications
Summary
5
Computational Finance Workflow
Research and Quantify
Data Analysis
& Visualization
Financial
Modeling
Application
Development
Reporting
Applications
Production
Share
Automate
Files
Databases
Datafeeds
Access
6
Machine LearningCharacteristics and Examples
Characteristics
– Lots of data (many variables)
– System too complex to know
the governing equation(e.g., black-box modeling)
Examples
– Pattern recognition (speech, images)
– Financial algorithms (credit scoring, algo trading)
– Energy forecasting (load, price)
– Biology (tumor detection, drug discovery)
93.68%
2.44%
0.14%
0.03%
0.03%
0.00%
0.00%
0.00%
5.55%
92.60%
4.18%
0.23%
0.12%
0.00%
0.00%
0.00%
0.59%
4.03%
91.02%
7.49%
0.73%
0.11%
0.00%
0.00%
0.18%
0.73%
3.90%
87.86%
8.27%
0.82%
0.37%
0.00%
0.00%
0.15%
0.60%
3.78%
86.74%
9.64%
1.84%
0.00%
0.00%
0.00%
0.08%
0.39%
3.28%
85.37%
6.24%
0.00%
0.00%
0.00%
0.00%
0.06%
0.18%
2.41%
81.88%
0.00%
0.00%
0.06%
0.08%
0.16%
0.64%
1.64%
9.67%
100.00%
AAA AA A BBB BB B CCC D
AAA
AA
A
BBB
BB
B
CCC
D
7
Challenges – Machine Learning
Significant technical expertise required
No “one size fits all” solution
Locked into Black Box solutions
Time required to conduct the analysis
8
Overview – Machine Learning
Machine
Learning
Supervised
Learning
Classification
Regression
Unsupervised
LearningClustering
Group and interpretdata based only
on input data
Develop predictivemodel based on bothinput and output data
Type of Learning Categories of Algorithms
9
Agenda
Introduction
Unsupervised Learning: Clustering Bond Data
Supervised Learning: Create Rating System
Neural Network Based Time Series Forecasting
Outlook: Distributing MATLAB Applications
Summary
10
Unsupervised Learning
Clustering
k-Means,
Fuzzy C-Means
Hierarchical
Neural
Networks
Gaussian
Mixture
Hidden Markov
Model
11
Unsupervised Learning
Clustering
k-Means,
Fuzzy C-Means
Hierarchical
Neural
Networks
Gaussian
Mixture
Hidden Markov
Model
12
ClusteringOverview
What is clustering?
– Segment data into groups,
based on data similarity
Why use clustering?
– Identify outliers
– Resulting groups may be
the matter of interest
How is clustering done?
– Can be achieved by various algorithms
– It is an iterative process (involving trial and error)
-0.1 0 0.1 0.2 0.3 0.4 0.5 0.60
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
13
Example – Clustering Corporate Bonds
Goal:
– Cluster similar corporate bonds
together
Approach:
– Cluster the bonds data using distance-
based and probability-based
techniques
14
Example – Clustering Corporate Bonds
Numerous clustering functions with
rich documentation
Interactive visualizations to aid
discovery
Viewable source; not a black box
Rapid exploration & development
15
Classification Learner App
16
Agenda
Introduction
Unsupervised Learning: Clustering Bond Data
Supervised Learning: Create Rating System
Neural Network Based Time Series Forecasting
Outlook: Distributing MATLAB Applications
Summary
17
Supervised Learning
Regression
Non-linear Reg.
(GLM, Logistic)
Linear
RegressionDecision Trees
Ensemble
Methods
Neural
Networks
Classification
Nearest
Neighbor
Discriminant
AnalysisNaive Bayes
Support Vector
Machines
18
Supervised Learning - Workflow
Known data
Known responses
Model
Train the Model
Model
New Data
Predicted
Responses
Use for Prediction
Measure Accuracy
Select Model
Import Data
Explore Data
Data
Prepare Data
Speed up Computations
19
ClassificationOverview
What is classification?
– Predicting the best group for each point
– “Learns” from labeled observations
– Uses input features
Why use classification?
– Accurately group data never seen before
How is classification done?
– Can use several algorithms to build a predictive model
– Good training data is critical
-0.1 0 0.1 0.2 0.3 0.4 0.5 0.60
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Group1
Group2
Group3
Group4
Group5
Group6
Group7
Group8
20
Example: Calibrating a rating system
Key tasks1. Perform ad hoc data analysis (linear regression)
2. Improve existing analysis (logistic regression)
3. Create a robust classifier: TreeBagger
ID WC_TA RE_TA EBIT_TA MVE_BVTD S_TA Industry
60644 0.049 0.220 0.041 2.400 0.489 6
Ratings
AA
Rating System
21
Regression Learner App
22
Agenda
Introduction
Unsupervised Learning: Clustering Bond Data
Supervised Learning: Create Rating System
Neural Network Based Time Series Forecasting
Outlook: Distributing MATLAB Applications
Summary
23
Neural Network Use Cases
Function Approximation and Nonlinear Regression
Pattern Recognition and Classification
Clustering– Self-Organizing Maps
– Competitive Layers
Time Series and Dynamic Systems– Modeling and Prediction with NARX and Time-Delay Networks
– Creating Simulink Models
24
Neural Network Apps
25
Agenda
Introduction
Unsupervised Learning: Clustering Bond Data
Supervised Learning: Create Rating System
Neural Network Based Time Series Forecasting
Outlook: Distributing MATLAB Applications
Summary
26
Share Programs Outside of MATLAB
Deploy your MATLAB code to people who don’t need MATLAB
27
Application Author
End User
1
2
Sharing Standalone Applications
MATLAB
ExcelAdd-in Hadoop
StandaloneApplication
Toolboxes
MATLAB Compiler
MATLAB
Runtime3
28
MATLAB
MATLAB
Compiler SDK
C/C++ExcelAdd-in JavaHadoop .NET
MATLAB
Compiler
MATLABProduction
Server
StandaloneApplication
Which Product will Fit Your Needs?
MATLAB Compiler for sharing MATLAB programs without integration
programming
MATLAB Compiler SDK provides implementation and platform flexibility for
software developers
MATLAB Production Server provides the most efficient development path
for secure and scalable web and enterprise applications
Python
29
The Range of Application Platforms
30
MATLAB Production Server
Web
Server
Application
Server
Database Server
Warehouse
Application 2
Application 3
Application 1
MATLAB Production Server
MATLAB
Component
Web Applications
Desktop Applications
Trade Execution;
Batch Applications
31
Agenda
Introduction
Unsupervised Learning: Clustering Bond Data
Supervised Learning: Create Rating System
Neural Network Based Time Series Forecasting
Outlook: Distributing MATLAB Applications
Summary
32
Challenges MATLAB Solution
Time (loss of productivity) Rapid analysis and application developmentHigh productivity from data preparation, interactive
exploration, visualizations.
Extract value from data Machine learning, Video, Image, and FinancialDepth and breadth of algorithms in classification, clustering,
and regression
Computation speed Fast training and computationParallel computation, Optimized libraries
Time to deploy & integrate Ease of deployment and leveraging enterprisePush-button deployment into production
Technology risk High-quality libraries and supportIndustry-standard algorithms in use in production
Access to support, training and advisory services when
needed
MATLAB for Machine Learning
33
Learn More: Machine Learning with
MATLABmathworks.com/machine-learning
34
Thank you!