T-BAG: Bootstrap Aggregating the TAGE Predictor
Ibrahim Burak Karsli, Resit Sendag
University of Rhode Island
Bootstrap Aggregating
• Statistical method introduced by Breiman in 1996
• Uses an ensemble of predictors
– Sub-predictors can be the same or different
• Train each predictor slightly differently and independently
• Each predictor is trained on a data set resampled with replacement (bootstrapping)
• Aggregate their predictions
• The idea: many weak learners make a strong learner
• The ensemble is theoretically proven to perform better than a single learner
Offline Bagging
Original training set: x1, x2, x3, x4, x5, x6, x7, x8
Training set 1 (Predictor 1): x2, x7, x8, x3, x7, x6, x3, x1
Training set 2 (Predictor 2): x7, x8, x5, x6, x4, x2, x7, x1
Training set 3 (Predictor 3): x3, x6, x2, x7, x5, x6, x2, x2
Training set 4 (Predictor 4): x4, x5, x1, x4, x6, x4, x3, x8
Test set (same for all predictors)
Weighting or Majority Voting
Final Prediction
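The offline flow above (resample with replacement, train one predictor per resampled set, then vote) can be sketched in a few lines of Python. This is only an illustration: `MajorityLabelPredictor` is a hypothetical toy learner, and the 0/1 labels are made-up stand-ins for x1..x8, not data from the slides.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return rng.choices(data, k=len(data))


def majority_vote(predictions):
    """Return the most common prediction."""
    return Counter(predictions).most_common(1)[0][0]


class MajorityLabelPredictor:
    """Hypothetical weak learner: always predicts the most frequent label it saw."""

    def fit(self, labels):
        self.label = majority_vote(labels)
        return self

    def predict(self):
        return self.label


rng = random.Random(0)
original = [1, 1, 0, 1, 0, 1, 1, 0]  # made-up labels standing in for x1..x8

# One bootstrap training set and one predictor per ensemble member.
training_sets = [bootstrap_sample(original, rng) for _ in range(4)]
predictors = [MajorityLabelPredictor().fit(ts) for ts in training_sets]

# Aggregate the four individual predictions by majority voting.
final_prediction = majority_vote([p.predict() for p in predictors])
print(training_sets, final_prediction)
```

Each training set has the same size as the original, so some samples appear several times and others not at all, which is what makes the four predictors differ.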
Online Bagging
Original sequence: x1, x2, x3, ...
Each of the four predictors (Predictor 1 to Predictor 4) processes the sequence with its own update counts, for example:
• Do not update with x1; update with x2 1 time; update with x3 1 time; ...
• Update with x1 1 time; update with x2 2 times; update with x3 1 time; ...
• Update with x1 1 time; do not update with x2; update with x3 2 times; ...
• Update with x1 3 times; do not update with x2; update with x3 1 time; ...
Test set (can be same as original sequence)
Weighting or Majority Voting
Final Prediction
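A minimal sketch of the online variant follows. The slide does not name the distribution of the update count here; the sketch assumes k ~ Poisson(1), the choice used in Oza and Russell's online bagging. `CountingPredictor` is a hypothetical toy learner and the outcome sequence is made up.

```python
import math
import random


def poisson(lam, rng):
    """Sample k ~ Poisson(lam) with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class CountingPredictor:
    """Hypothetical weak learner: predicts the outcome it was updated with most."""

    def __init__(self):
        self.balance = 0

    def predict(self):
        return self.balance >= 0

    def update(self, outcome):
        self.balance += 1 if outcome else -1


def online_bagging_step(predictors, outcome, rng, lam=1.0):
    """Update every predictor k times with this sample; k is drawn per predictor."""
    counts = []
    for predictor in predictors:
        k = poisson(lam, rng)  # assumption: Poisson(1), not stated on the slide
        for _ in range(k):
            predictor.update(outcome)
        counts.append(k)
    return counts


rng = random.Random(1)
predictors = [CountingPredictor() for _ in range(4)]
for outcome in [True, True, False, True]:  # made-up stand-ins for x1, x2, x3, ...
    print(online_bagging_step(predictors, outcome, rng))
```

Because every sample is consumed as it arrives, no training set needs to be stored, which is what makes this form usable on a stream of branch outcomes.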
TAGE Predictor
• Winner of CBP3
• State-of-the-art branch predictor
• Many parameters to allow variety
T-BAG: Prediction
[Figure: the PC is fed to 32 TAGE predictors (x32); their outputs pass through aggregation to produce the prediction.]
Predictor Aggregation
• Bagging by nature uses tens to hundreds of predictors, so we target the unlimited track
• The submitted predictor uses 32 TAGE predictors
• Keep track of the successes of the last 16 predictions with a sliding window for each predictor
• Aggregate the predictions using a weighted sum
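The sliding-window weighting can be sketched as below. The slides do not give the exact weighting formula, so this sketch assumes each predictor's weight is simply its number of correct predictions in the last 16, and that a non-negative signed sum means "taken". `WindowedVoter` and its method names are hypothetical.

```python
from collections import deque


class WindowedVoter:
    """Weights each sub-predictor by its successes over a sliding window."""

    def __init__(self, num_predictors, window=16):
        # One bounded history per predictor; old entries fall off automatically.
        self.history = [deque(maxlen=window) for _ in range(num_predictors)]

    def weight(self, i):
        """Assumed weight: count of correct predictions in the window."""
        return sum(self.history[i])

    def aggregate(self, predictions):
        """Signed weighted sum: +weight for taken, -weight for not taken."""
        total = 0
        for i, taken in enumerate(predictions):
            w = self.weight(i)
            total += w if taken else -w
        return total >= 0  # ties, including the all-zero start, predict taken

    def record(self, predictions, outcome):
        """After the branch resolves, log whether each predictor was right."""
        for i, taken in enumerate(predictions):
            self.history[i].append(1 if taken == outcome else 0)


voter = WindowedVoter(num_predictors=3)
for _ in range(20):
    voter.record([True, False, False], outcome=True)
print(voter.weight(0), voter.aggregate([False, True, True]))
```

In the usage lines, only the first predictor has been right recently, so its single vote outweighs the other two combined.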
T-BAG: Update
[Figure: PC and resolveDir are fed to each of the 32 TAGE predictors (x32); an update count controls each predictor's update.]
Random Update
• Each predictor is updated on each sample k times in a row, where k is a random number drawn from a multinomial distribution
• Max k = 2 (because the ctr width is 3 bits)
• For the submission, each sample is applied 0, 1, or 2 times with probability 20%, 60%, and 20%, respectively
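As a rough sketch, the random update count can be drawn with the 20% / 60% / 20% split given above. The `update(pc, resolve_dir)` interface and the `StubPredictor` class are hypothetical placeholders for a real TAGE sub-predictor.

```python
import random

UPDATE_COUNTS = (0, 1, 2)       # k values; max k = 2 per the slide
UPDATE_PROBS = (0.2, 0.6, 0.2)  # probabilities used for the submission


def draw_update_count(rng):
    """Draw k from {0, 1, 2} with probabilities 20%, 60%, 20%."""
    return rng.choices(UPDATE_COUNTS, weights=UPDATE_PROBS, k=1)[0]


class StubPredictor:
    """Hypothetical stand-in for a TAGE sub-predictor; only counts updates."""

    def __init__(self):
        self.updates = 0

    def update(self, pc, resolve_dir):
        self.updates += 1


def random_update(predictors, pc, resolve_dir, rng):
    """Apply the same sample k times in a row to each predictor, k drawn per predictor."""
    for predictor in predictors:
        for _ in range(draw_update_count(rng)):
            predictor.update(pc, resolve_dir)


rng = random.Random(42)
predictors = [StubPredictor() for _ in range(32)]
random_update(predictors, pc=0x400123, resolve_dir=True, rng=rng)
print([p.updates for p in predictors])
```

The expected number of updates per sample is 0.2 * 0 + 0.6 * 1 + 0.2 * 2 = 1, so on average each predictor still sees every sample once.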
Sub Predictors
• 32 predictors
• Variability in min/max history lengths, number of tables, and use of PC in table indexing
• 3-bit ctr for all
• Each predictor’s size is about 15 MB (submitted predictor: 492 MB)
• Min history varies between 3 and 13
• Max history varies between 1,200 and 100,000
• Number of tables varies between 20 and 38
• 16 predictors use PC, the other 16 do NOT!
– Use of PC in indexing tables for a TAGE-like predictor is not significantly better!
Results
[Figure: misp/KI (y-axis, 1.90 to 2.02) versus number of sub-predictors (x-axis: 4x, 8x, 16x, 32x) for AllSame_RandUpd, AllDifferent, and AllDifferent_RandUpd.]
• AllSame_RandUpd -> 1.952 misp/KI
• AllDifferent -> 1.932 misp/KI
• AllDifferent_RandUpd -> 1.919 misp/KI
Configuration     Baseline TAGE   32x-size TAGE   T-Bag32
AMEAN (misp/KI)   2.007           2.003           1.919
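To put the AMEAN numbers in relative terms, the reduction in misp/KI can be computed directly from the three table values. The helper name `relative_reduction` is just illustrative.

```python
# AMEAN misp/KI values copied from the table above.
amean_mispki = {
    "Baseline TAGE": 2.007,
    "32x-size TAGE": 2.003,
    "T-Bag32": 1.919,
}


def relative_reduction(reference, value):
    """Percentage by which `value` is lower than `reference`."""
    return (reference - value) / reference * 100.0


for name in ("Baseline TAGE", "32x-size TAGE"):
    pct = relative_reduction(amean_mispki[name], amean_mispki["T-Bag32"])
    print(f"T-Bag32 vs {name}: {pct:.2f}% fewer mispredictions per KI")
```

This works out to roughly 4.4% against the baseline TAGE and 4.2% against the 32x-size TAGE, whereas scaling a single TAGE to 32x its size lowers misp/KI by only about 0.2%.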
Conclusion and Future Work
• Simple idea
• Different types of predictors
• Implementation with storage budget
Q&A