ACENSI, Tour Monge - 22, Place des Vosges - 92 400 Courbevoie - La Défense 5 - www.acensi.fr
Concurrent programming based on dataflow
TPL DATAFLOW
A new approach to Monte Carlo VAR
09/02/2015 Document version
OVERVIEW
Optimization and multithreading
without getting your hands dirty!
SUMMARY
TPL Dataflow Presentation
▬ Why TPL Dataflow?
▬ A natural extension of .NET Framework 4.0
▬ The library
▬ Use cases
Case study: Monte Carlo Value At Risk (VAR)
▬ What is VAR?
▬ Monte Carlo VAR: Basic Approach
▬ Monte Carlo VAR: Dataflow Approach
Conclusion
WHO ARE WE?
Information systems and Microsoft technologies Consulting
Speakers Presentation
Yves Alexandre SIMON: R&D Director
James KOUTHON: Technical Director
Julien LEBOT: .Net Expert
Adina SANDOU: .Net Expert
Presentation
TPL DATAFLOW
TPL DATAFLOW: A NATURAL EXTENSION OF .NET FRAMEWORK 4.0
Promotes actor/agent-oriented designs through primitives for in-process message passing.
Allows developers to create blocks to express computations based on directed dataflow graphs.
TPL DATAFLOW: THE LIBRARY
Overview
▬ TPL Dataflow is in line with Map/Reduce
▬ Can handle large volumes of data
▬ Ideal for long computations
TPL Dataflow: paradigm shift
▬ Tasks are created and linked together as a graph
▬ Each node can receive data as input and/or output data
TPL DATAFLOW: THE LIBRARY
Source blocks (1): act as a source of data
ISourceBlock<TOutput>
Target blocks (2): act as a receiver of data
ITargetBlock<TInput>
Propagator blocks: act as both (1) and (2)
IPropagatorBlock<TInput, TOutput>
TPL DATAFLOW: THE LIBRARY
Basic blocks
▬ BufferBlock: a queue, i.e. a FIFO (First In, First Out) buffer.
▬ ActionBlock: like a “foreach”, it executes a delegate for each input item.
ex: var node = new ActionBlock<string>(s => Console.WriteLine(s));
▬ TransformBlock: acts like a LINQ “Select”.
ex: var node = new TransformBlock<int, int>(p => p * 100);
Advanced blocks
▬ BroadcastBlock: forwards copies of data items as its output.
▬ JoinBlock: collects several inputs and outputs a tuple.
▬ Others
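As a minimal sketch of how the basic blocks fit together, the following chains a BufferBlock, a TransformBlock and an ActionBlock. The class and method names (BasicBlocksSketch, Run) are illustrative assumptions and are not part of the talk's code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, for illustration only.
public static class BasicBlocksSketch
{
    // Sends each input through queue -> multiply -> collect and returns what arrived.
    public static List<int> Run(IEnumerable<int> inputs)
    {
        var received = new List<int>();
        var options = new DataflowLinkOptions { PropagateCompletion = true };

        // BufferBlock: a FIFO queue that simply stores and forwards items.
        var queue = new BufferBlock<int>();

        // TransformBlock: maps each input to one output, like a LINQ Select.
        var multiply = new TransformBlock<int, int>(p => p * 100);

        // ActionBlock: runs a delegate per item. With default options it handles
        // one item at a time, so the list needs no lock here.
        var collect = new ActionBlock<int>(v => received.Add(v));

        queue.LinkTo(multiply, options);
        multiply.LinkTo(collect, options);

        foreach (var i in inputs)
        {
            queue.Post(i);
        }

        // Signal the end of input and wait for the last block to drain.
        queue.Complete();
        collect.Completion.Wait();
        return received;
    }
}
```

Calling BasicBlocksSketch.Run(new[] { 1, 2, 3 }) returns 100, 200, 300 in that order, since each block uses its default options and keeps FIFO order.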
TPL DATAFLOW: THE LIBRARY
Linking
▬ Used to link two blocks together.
▬ Predicates and parallelism options are available.
▬ There’s no limit to what you can link.
Completion status
▬ Each block supports an asynchronous form of completion to propagate its finished state.
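To illustrate linking with predicates and completion, here is a small sketch that routes integers from one source to two targets. LinkingSketch and SplitByParity are hypothetical names chosen for this example; the predicates are written so that every item matches one link, because an item that no target accepts would stay in the source and block it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, for illustration only.
public static class LinkingSketch
{
    public static (List<int> Evens, List<int> Odds) SplitByParity(IEnumerable<int> inputs)
    {
        var evens = new List<int>();
        var odds = new List<int>();

        var source = new BufferBlock<int>();
        var evenSink = new ActionBlock<int>(n => evens.Add(n));
        var oddSink = new ActionBlock<int>(n => odds.Add(n));

        // PropagateCompletion forwards the source's finished (or faulted) state.
        var options = new DataflowLinkOptions { PropagateCompletion = true };

        // Each link only accepts the items for which its predicate returns true.
        source.LinkTo(evenSink, options, n => n % 2 == 0);
        source.LinkTo(oddSink, options, n => n % 2 != 0);

        foreach (var n in inputs)
        {
            source.Post(n);
        }

        source.Complete();
        Task.WaitAll(evenSink.Completion, oddSink.Completion);
        return (evens, odds);
    }
}
```

With the inputs 1 to 6, Evens contains 2, 4, 6 and Odds contains 1, 3, 5.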
WHY TPL DATAFLOW?
TPL Dataflow benefits
Paradigm shift for higher code expressivity
Using multithreading without effort
Boosting performance (optimization) painlessly
Focusing on the 'what' rather than the 'how'
TPL DATAFLOW: USE CASES
Build more complex systems easily
Samples:
Data analysis/mining services
Web crawlers
Image and sound processors
Database engine designs
Financial computation
…
Monte Carlo Value at Risk (VAR)
CASE STUDY
WHAT IS VAR?
What is VAR?
▬ Value at risk (VAR)
▬ Monitors risk in a trading portfolio
▬ Global financial risk indicator
Our use case
▬ Market VAR (VAR on market moves)
▬ Intensive computation (especially for Monte Carlo VAR)
Example
▬ VAR 99/1D: the loss that will not be exceeded over 1 day with 99% probability
VAR calculation methods
▬ Historical VAR (historical data)
▬ Parametric VAR (formula data)
▬ Monte Carlo VAR (Monte Carlo simulation data)
SIMPLE MONTECARLO VAR WORKFLOW
[Workflow diagram, steps 1 to 4: Start -> Portfolios Composition, Market Data, Static Data -> Global Position -> Position Pricing with Monte Carlo Calculus (several in parallel) -> Statistics on Global Distribution (VAR) -> End]
Basic approach
MONTE CARLO VAR
MONTE CARLO VAR: BASIC APPROACH
Pipeline: Start -> Portfolios Composition -> Market Data -> Global Position -> Position Pricing with Monte Carlo Calculus -> Statistics on Global Distribution (VAR) -> End
MONTE CARLO VAR: BASIC APPROACH
Portfolio composition
▬ Fetch portfolios by using the provider
Market data
▬ Get product parameters from the market data provider
Global position
▬ Go through all portfolios, net the transactions per product and get the positions
// Portfolios composition
Portfolios = PortfolioProvider.Portfolios;
// Market data
ProductParameters = ProductParametersProvider.ProductsParameters;
// Global position: sum the transaction positions per product
IEnumerable<KeyValuePair<Product, long>> allTransactions = Portfolios
    .SelectMany(x => x.Transactions)
    .GroupBy(y => y.Product)
    .Select(z => new KeyValuePair<Product, long>(z.Key, z.Sum(x => x.Position)));
Positions = allTransactions.ToDictionary(t => t.Key, t => t.Value);
MONTE CARLO VAR: BASIC APPROACH
Position pricing
▬ For each product, run the Monte Carlo simulation
Statistics on global
▬ Multiply the result by the position value and calculate the loss value
// Position pricing with Monte Carlo calculus
IEnumerable<double> results = StatisticsUtilities.SimulateMonteCarloWithPosition(
    new MonteCarloInput
    {
        Parameters = parameters,
        Position = position,
        Product = product
    }, TotalSimulations);
MONTE CARLO VAR: BASIC APPROACH
Statistics on global distribution (VAR)
▬ Aggregate the loss values for all products
▬ Choose the VAR at 99% for 1 day
IList<double> totals = new List<double>();
// Adds one product's simulated losses to the running totals
Func<IList<double>, string, IList<double>> sumList = (current, key) =>
    Helpers.SumList(current, lostsValuesByProduct[key].ToList());
totals = lostsValuesByProduct.Keys.Aggregate(totals, sumList);
StatisticsUtilities.CalculateVar(totals, 0.99);
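The slides do not show what StatisticsUtilities.CalculateVar does internally. As a hedged sketch of the idea, the VAR at a given confidence level can be read as a quantile of the simulated loss distribution. The version below uses the nearest-rank method and assumes losses are expressed as positive numbers; both choices, and the class name VarSketch, are assumptions made for this example and may differ from the talk's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for StatisticsUtilities.CalculateVar, for illustration only.
public static class VarSketch
{
    // Returns the smallest simulated loss such that at least `confidence`
    // of the scenarios have a loss less than or equal to it (nearest rank).
    public static double CalculateVar(IEnumerable<double> losses, double confidence)
    {
        if (confidence <= 0.0 || confidence > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(confidence));
        }

        // Assumption: larger values mean larger losses.
        var sorted = losses.OrderBy(x => x).ToArray();
        if (sorted.Length == 0)
        {
            throw new ArgumentException("At least one scenario is required.", nameof(losses));
        }

        // Nearest-rank quantile: 1-based rank, converted to a 0-based index.
        var rank = (int)Math.Ceiling(confidence * sorted.Length);
        var index = Math.Min(Math.Max(rank, 1), sorted.Length) - 1;
        return sorted[index];
    }
}
```

For example, with the four scenario losses 4, 1, 3, 2 and a confidence of 0.5, the sketch returns 2. Other quantile conventions (interpolation, for instance) would give slightly different values on small samples.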
Dataflow approach
MONTE CARLO VAR
MONTE CARLO VAR: DATAFLOW APPROACH
DataFlow Graph
[Dataflow graph: Portfolios Composition and Market Data -> Global Position -> Position Pricing with Monte Carlo Calculus (several blocks in parallel) -> Aggregator -> Statistics on Global Distribution (VAR)]
MONTE CARLO VAR: DATAFLOW APPROACH
Chosen approach: parallelize per product
[Diagram: the products are distributed over N threads; for each product, CalculateLoss() runs for M iterations]
MONTE CARLO VAR: DATAFLOW APPROACH
Process overview
[Diagram: TransformBlock (IN: MonteCarloInput, OUT: IEnumerable<double>): takes Price, Mean, Standard Dev and Position, draws from a normal distribution and calculates the losses. ActionBlock "Aggregator" (IN: IEnumerable<double>, OUT: IEnumerable<double>): accumulates the losses into the totals.]
MONTE CARLO VAR: DATAFLOW APPROACH
TransformBlock runs the Monte Carlo simulation
Key points:
▬ Do only one thing
▬ Keep work data local
▬ Fully enumerate returned data
// Position pricing with Monte Carlo calculus
var monteCarlo = new TransformBlock<MonteCarloInput, IEnumerable<double>>(input =>
{
    var normalDistribution = new NormalEnumerable();
    return normalDistribution.Take(TotalSimulations)
        .Select(alea => StatisticsUtilities.CalculateLoss(input, alea))
        .ToList(); // Very important: forces the simulation to run inside this block
}, ExecutionOptions);
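The "fully enumerate returned data" point comes from LINQ's deferred execution: without ToList(), the TransformBlock would only hand over a query, and the actual computation would run later in whichever block enumerates it. The sketch below makes this visible by counting how many elements were computed by the time the block has produced its output. EagerVersusLazySketch and WorkDoneInsideBlock are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, for illustration only.
public static class EagerVersusLazySketch
{
    // Returns how many elements were computed before the block emitted its output.
    public static int WorkDoneInsideBlock(bool materialize)
    {
        var evaluated = 0;

        var block = new TransformBlock<int, IEnumerable<int>>(n =>
        {
            // Deferred query: nothing is computed until someone enumerates it.
            var query = Enumerable.Range(0, n).Select(i =>
            {
                Interlocked.Increment(ref evaluated);
                return i * i;
            });

            // ToList() enumerates the query here, inside the block.
            return materialize ? query.ToList() : query;
        });

        block.Post(5);

        // Receive() blocks until the TransformBlock has produced its output.
        var output = block.Receive();
        var doneInsideBlock = Volatile.Read(ref evaluated);

        // Enumerating now shows where the lazy variant actually pays the cost.
        var count = output.Count();
        Console.WriteLine($"materialize={materialize}, inside block={doneInsideBlock}, items={count}");
        return doneInsideBlock;
    }
}
```

WorkDoneInsideBlock(false) returns 0 because the query has not run yet, while WorkDoneInsideBlock(true) returns 5. In the pipeline above, the lazy variant would push the whole simulation onto the aggregator, which processes one message at a time.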
MONTE CARLO VAR: DATAFLOW APPROACH
ActionBlock aggregates the result
No need to synchronize access to shared data (by default, an ActionBlock processes one message at a time)
// Aggregator
var totals = new List<double>();
var aggregate = new ActionBlock<IEnumerable<double>>(doubles =>
{
    if (!totals.Any())
    {
        // First product: its losses initialize the totals
        totals.AddRange(doubles);
    }
    else
    {
        // Following products: add their losses scenario by scenario
        var losses = doubles.ToList();
        for (var i = 0; i < losses.Count; i++)
        {
            totals[i] += losses[i];
        }
    }
});
MONTE CARLO VAR: DATAFLOW APPROACH
Linking the blocks together
Triggering the data flow chain
Data posted asynchronously
// Global position: post one MonteCarloInput per product
foreach (var portfolio in Portfolios
    .SelectMany(x => x.Transactions)
    .GroupBy(y => y.Product)
    .Select(z => new KeyValuePair<Product, long>(z.Key, z.Sum(x => x.Position))))
{
    var position = portfolio.Value;
    var parameters = ProductParameters.First(x => x.Product.Equals(portfolio.Key));
    monteCarlo.Post(new MonteCarloInput
    {
        Parameters = parameters,
        Position = position
    });
}
// Link the simulation block to the aggregator
monteCarlo.LinkTo(aggregate, DataflowLinkOptions);
MONTE CARLO VAR: DATAFLOW APPROACH
Completing the tasks
Tricky to get right
▬ Can cause deadlocks
▬ Solution: Automatically propagate completion
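// Signal that no more input will be posted, then wait for the last block
monteCarlo.Complete();
aggregate.Completion.Wait();
// Link options used by LinkTo: completion flows from block to block
DataflowLinkOptions = new DataflowLinkOptions
{
    PropagateCompletion = true
};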
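Putting the pieces of this section together, here is a self-contained sketch of the whole chain: a parallel simulation block, a sequential aggregator, linking with PropagateCompletion, posting, and waiting for completion. SimulationInput, PipelineSketch and the toy loss formula (position x price x standard deviation x a normal draw) are assumptions standing in for the talk's MonteCarloInput and StatisticsUtilities.CalculateLoss, which are not shown in the slides.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-in for MonteCarloInput.
public sealed class SimulationInput
{
    public double Price { get; set; }
    public double StdDev { get; set; }
    public long Position { get; set; }
    public int Seed { get; set; }
}

// Hypothetical helper, for illustration only.
public static class PipelineSketch
{
    // Returns, for each scenario index, the loss summed over all inputs.
    public static List<double> Run(IEnumerable<SimulationInput> inputs, int simulations)
    {
        var simulate = new TransformBlock<SimulationInput, List<double>>(input =>
        {
            // Work data stays local: Random is not thread-safe, so each call owns one.
            var random = new Random(input.Seed);
            var losses = new List<double>(simulations);
            for (var i = 0; i < simulations; i++)
            {
                // Box-Muller transform: one standard normal draw.
                var u1 = 1.0 - random.NextDouble(); // in (0, 1], avoids Log(0)
                var u2 = random.NextDouble();
                var z = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);

                // Toy loss model (assumption), not the talk's CalculateLoss.
                losses.Add(input.Position * input.Price * input.StdDev * z);
            }
            return losses; // already materialized
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = Environment.ProcessorCount });

        var totals = new List<double>();
        // Default options: one message at a time, so `totals` needs no lock.
        var aggregate = new ActionBlock<List<double>>(losses =>
        {
            if (totals.Count == 0)
            {
                totals.AddRange(losses);
                return;
            }
            for (var i = 0; i < losses.Count; i++)
            {
                totals[i] += losses[i];
            }
        });

        simulate.LinkTo(aggregate, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var input in inputs)
        {
            simulate.Post(input);
        }

        // Completing the head is enough: completion propagates along the link.
        simulate.Complete();
        aggregate.Completion.Wait();
        return totals;
    }
}
```

The returned list has one entry per scenario and can then be passed to a VAR quantile calculation. Linking before posting, as done here, is a matter of habit; the talk's code posts first and links afterwards, which also works because the TransformBlock buffers its output until a target is linked.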
MONTE CARLO VAR: DATAFLOW APPROACH
Manual completion propagation
Maximizing CPU usage
// Manual completion propagation
monteCarlo.Completion.ContinueWith(t =>
{
    if (t.IsFaulted)
    {
        ((IDataflowBlock)aggregate).Fault(t.Exception); // Pass exception
    }
    else
    {
        aggregate.Complete(); // Mark next completed
    }
});
// Maximizing CPU usage: one simulation per logical processor
ExecutionOptions = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
};
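The manual continuation above forwards a fault to the next block. Linking with PropagateCompletion = true has the same effect, which the following sketch checks: an exception thrown in the first block surfaces when waiting on the last block's Completion. FaultPropagationSketch and FaultReachesTarget are hypothetical names for this illustration.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, for illustration only.
public static class FaultPropagationSketch
{
    // Returns true when the exception thrown upstream is observed downstream.
    public static bool FaultReachesTarget()
    {
        var failing = new TransformBlock<int, int>(n =>
        {
            if (n < 0)
            {
                // An unhandled exception puts the block in the faulted state.
                throw new InvalidOperationException("negative input");
            }
            return n;
        });

        var sink = new ActionBlock<int>(n => { });

        failing.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

        failing.Post(-1);
        failing.Complete();

        try
        {
            sink.Completion.Wait();
            return false; // completed without a fault
        }
        catch (AggregateException ex)
        {
            // The original exception may be nested, so flatten before inspecting.
            return ex.Flatten().InnerExceptions.Any(e => e is InvalidOperationException);
        }
    }
}
```

Waiting only on the last block is therefore enough to observe failures from anywhere upstream, provided every link in the chain propagates completion.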
MONTE CARLO VAR: DATAFLOW APPROACH
Result
[Chart: Benchmark (lower is better). Y axis: milliseconds, 0 to 4000. X axis: CPU (i5-4200U 4 @ 2.30GHz; Intel Celeron G1820 2 @ 2.70GHz; Intel i5-2400 4 @ 3.00GHz; i7-3770K w/ 8 @ 5.09GHz; i7-4790K w/ 8 @ 4.00GHz). Series: Basic, Dataflow.]
What did we learn?
CONCLUSION
CONCLUSION
Performance increase
▬ Faster
▬ Automatically scales to hardware
Paradigm shift
▬ Macro-level optimization
▬ New primitives
Find out more!
github.com/acensi/techdays-2015
msdn.microsoft.com/en-us/library/hh228603(v=vs.110).aspx
github.com/akkadotnet/akka.net
Going further
▬ Experiment with the code
▬ Parallelize data loading
▬ Try new blocks
▬ Come see us at the booth
www.acensi.fr
Let’s keep the conversation going!
Come see us at booth 26