it-weekend
by Yaroslav Bunyak
Introduction to Data-Oriented Design
@YaroslavBunyak
Senior Software Engineer, SoftServe
Programming, M**********r
Do you speak it?
Story
Sieve of Eratosthenes
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
Sieve of Eratosthenes
Simple algorithm
Easy to implement
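Sieve of Eratosthenes
// Version 1: one int per number
int array[SIZE];
array[i] = 1;
if (array[i]) ...
// Version 2: one bit per number (unsigned avoids shifting into the sign bit)
unsigned bits[(SIZE + 31) / 32];
bits[i / 32] |= 1u << (i % 32);
if (bits[i / 32] & (1u << (i % 32))) ...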
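A minimal sketch of the two variants as complete functions, assuming the fragments above are the marking and testing steps of a standard sieve. The function names, the use of std::vector, and the choice to count primes are illustrative assumptions, not the author's code. The sketch only shows that both layouts compute the same answer; it does not reproduce any timing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Version 1: one int per number (non-zero means "crossed out").
std::size_t count_primes_array(std::size_t size)
{
    std::vector<int> composite(size, 0);
    std::size_t count = 0;
    for (std::size_t i = 2; i < size; ++i)
    {
        if (composite[i])
            continue;
        ++count; // i was never crossed out, so it is prime
        for (std::size_t j = i * i; j < size; j += i)
            composite[j] = 1;
    }
    return count;
}

// Version 2: one bit per number, 32 numbers per 32-bit word.
std::size_t count_primes_bits(std::size_t size)
{
    std::vector<std::uint32_t> bits((size + 31) / 32, 0u);
    std::size_t count = 0;
    for (std::size_t i = 2; i < size; ++i)
    {
        if (bits[i / 32] & (std::uint32_t{1} << (i % 32)))
            continue;
        ++count;
        for (std::size_t j = i * i; j < size; j += i)
            bits[j / 32] |= std::uint32_t{1} << (j % 32);
    }
    return count;
}

// Both layouts must agree; only the memory footprint differs (32x smaller for bits).
std::size_t count_primes_checked(std::size_t size)
{
    const std::size_t from_array = count_primes_array(size);
    const std::size_t from_bits = count_primes_bits(size);
    assert(from_array == from_bits);
    return from_array;
}
```

Calling count_primes_checked(100) runs both versions over the 1..100 grid shown earlier.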
Sieve of Eratosthenes
Simple algorithm
Easy to implement
But...
unexpected results
Sieve of Eratosthenes
The second implementation (bitset) is 3-5x faster than the first (array)
Even though it actually does more work
Why?!
Fast Forward
...
• Years have passed
• I became a software engineer
• And one day...
This Graph
[Figure: CPU/Memory performance]
Source: Computer Architecture: A Quantitative Approach, by John L. Hennessy, David A. Patterson, Andrea C. Arpaci-Dusseau
This Table
                          1980    Modern PC                    Improvement
Clock speed, MHz          6       3000                         +500x
Memory size, MB           2       2000                         +1000x
Memory bandwidth, MB/s    13      7000 (read), 2000 (write)    +540x, +150x
Memory latency, ns        225     ~70                          +3x
Memory latency, cycles    1.4     210                          -150x
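The last two rows of the table are linked by the clock speed: latency in cycles is latency in nanoseconds times cycles per nanosecond. A small sketch of that arithmetic follows; the helper name is hypothetical and is not part of the original slides.

```cpp
#include <cassert>

// Hypothetical helper: converts a latency in nanoseconds into CPU cycles.
// A clock of f MHz ticks f / 1000 times per nanosecond.
double latency_in_cycles(double latency_ns, double clock_mhz)
{
    assert(latency_ns >= 0.0 && clock_mhz > 0.0);
    return latency_ns * clock_mhz / 1000.0;
}
```

With the table's numbers, 225 ns at 6 MHz is about 1.4 cycles, while ~70 ns at 3000 MHz is about 210 cycles. Latency in nanoseconds improved, yet each access now costs roughly 150 times more cycles.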
Our Programming Model
• High-level languages
• OOP
• everywhere!
• objects scattered throughout the address space
• access patterns are unpredictable
Meet Data-Oriented Design
Ideas
• Programs transform data
• nothing more
• Think about data, not code
• Hardware is not a black box
Program
data -> xform -> data
Your Program
Claim
• Memory latency is king
• CPU cycles are almost free
Memory
• CPU registers
• Cache Level 1
• Cache Level 2
• RAM
• HDD
[Diagram: CPU, L1i cache, L1d cache, L2 cache, RAM, disk]
Distance Metaphor
• L1 cache: it's on your desk, pick it up.
• L2 cache: it's on the bookshelf in your office, get up out of the chair.
• Main memory: it's on the shelf in your garage downstairs, might as well get a snack while you're down there.
• Disk: it's in, um, California. Walk there. Walk back. Really.
http://hacksoflife.blogspot.com/2011/04/going-to-california-with-aching-in-my.html
Advice
• Keep your data closer to registers and cache
• What’s good for memory is good for you
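One common illustration of this advice, offered here as an assumption rather than something taken from the slides, is traversal order over a 2D grid stored row by row. Both hypothetical functions below compute the same sum; the first touches consecutive addresses, the second jumps a full row between accesses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Walks the grid in the order it is laid out in memory (row by row).
long long sum_row_major(const std::vector<int>& grid, std::size_t rows, std::size_t cols)
{
    assert(grid.size() == rows * cols);
    long long sum = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            sum += grid[r * cols + c]; // consecutive addresses
    return sum;
}

// Same result, but each step jumps `cols` elements ahead in memory.
long long sum_column_major(const std::vector<int>& grid, std::size_t rows, std::size_t cols)
{
    assert(grid.size() == rows * cols);
    long long sum = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            sum += grid[r * cols + c]; // stride of `cols` elements
    return sum;
}
```

On large grids the row-major walk tends to make better use of each cache line it loads; how much that matters depends on the hardware and the grid size, so measure before relying on it.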
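Example 1: AoS vs SoA
// Array of Structures (AoS): each flag is interleaved with its pixels
struct Tile
{
    bool ready;
    Data pixels; // big chunk of data
};
Tile tiles[SIZE];
vs
// Structure of Arrays (SoA): all flags are packed together
struct Image
{
    bool ready[SIZE]; // hot data
    Data pixels[SIZE]; // cold data
};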
Example 1: AoS vs SoA
for (int i = 0; i < SIZE; ++i)
{
    if (tiles[i].ready)
        draw(tiles[i].pixels);
}
vs
for (int i = 0; i < SIZE; ++i)
{
    if (image.ready[i])
        draw(image.pixels[i]);
}
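A compilable sketch of the two layouts, under stated assumptions: Data is a hypothetical 4 KB payload, std::vector replaces the fixed arrays, and the loops count ready tiles instead of calling draw, which the slides do not define. A 64-byte cache line is assumed in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the slides' Data: a large payload per tile.
struct Data
{
    unsigned char bytes[4096];
};

// AoS: every flag is followed by 4096 bytes of pixels, so successive
// flags land on different cache lines.
struct Tile
{
    bool ready;
    Data pixels;
};

// SoA: the flags are contiguous, so one 64-byte line holds 64 of them.
struct Image
{
    std::vector<unsigned char> ready; // hot data, 1 = ready
    std::vector<Data> pixels;         // cold data
};

std::size_t count_ready_aos(const std::vector<Tile>& tiles)
{
    std::size_t count = 0;
    for (const Tile& tile : tiles)
        if (tile.ready)
            ++count;
    return count;
}

std::size_t count_ready_soa(const Image& image)
{
    assert(image.ready.size() == image.pixels.size());
    std::size_t count = 0;
    for (unsigned char flag : image.ready)
        if (flag)
            ++count;
    return count;
}
```

Both functions return the same count. The difference is how much memory the flag scan has to pull through the cache: about sizeof(Tile) bytes per flag in the first case, one byte per flag in the second.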
Example 2: Existence
struct Image
{
    bool ready[SIZE];
    Data pixels[SIZE];
};
Image image;
vs
Data ready_pixels[N];
Data no_pixels[M];
// N + M = SIZE
Example 2: Existence
for (int i = 0; i < SIZE; ++i)
{
    if (image.ready[i])
        draw(image.pixels[i]);
}
vs
for (int i = 0; i < N; ++i)
{
    draw(ready_pixels[i]);
}
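One way the two arrays might be produced is a single pass that routes each element by its flag. This is a sketch under assumptions: Data is a hypothetical small stand-in, and Split and split_by_readiness are illustrative names that do not appear in the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the slides' Data.
struct Data
{
    int id;
};

struct Split
{
    std::vector<Data> ready_pixels; // N entries
    std::vector<Data> no_pixels;    // M entries, N + M = SIZE
};

// Membership in a container replaces the per-element `ready` flag.
Split split_by_readiness(const std::vector<unsigned char>& ready,
                         const std::vector<Data>& pixels)
{
    assert(ready.size() == pixels.size());
    Split out;
    for (std::size_t i = 0; i < pixels.size(); ++i)
    {
        if (ready[i])
            out.ready_pixels.push_back(pixels[i]);
        else
            out.no_pixels.push_back(pixels[i]);
    }
    return out;
}
```

After the split, the drawing loop iterates over ready_pixels only and has no branch inside it.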
Example 3: Locality
std::vector<float> numbers;
float sum = 0.0f;
for (float x : numbers) // range-for yields elements, not iterators
    sum += x;
vs
std::list<float> numbers;
float sum = 0.0f;
for (float x : numbers)
    sum += x;
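A self-contained version of the comparison, as a sketch: the same summing loop runs over both containers, so any speed difference comes from memory layout alone. The names sum_of and to_list are hypothetical.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Works for both containers; only the memory layout behind the loop differs.
template <typename Container>
float sum_of(const Container& numbers)
{
    float sum = 0.0f;
    for (float x : numbers)
        sum += x;
    return sum;
}

// Copies the same values into a node-based list (one allocation per element).
std::list<float> to_list(const std::vector<float>& numbers)
{
    std::list<float> nodes(numbers.begin(), numbers.end());
    assert(nodes.size() == numbers.size());
    return nodes;
}
```

std::vector stores its elements contiguously, while std::list follows a pointer to reach each next node, which may live anywhere on the heap.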
A Few Patterns
• A to B transform
• In place transform
• Existence based processing
• Data normalization
• DB design says hello!
• Task, gather, dispatch, and more...
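The first two patterns can be sketched as follows. This is one plausible reading of the pattern names, with hypothetical function names and a scaling operation chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A to B transform: reads `in`, writes a separate `out`; the input is untouched.
void scale_a_to_b(const std::vector<float>& in, std::vector<float>& out, float factor)
{
    assert(in.size() == out.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [factor](float x) { return x * factor; });
}

// In place transform: the same buffer is both input and output.
void scale_in_place(std::vector<float>& data, float factor)
{
    for (float& x : data)
        x *= factor;
}
```

The A to B form keeps the input intact for other readers at the cost of a second buffer; the in-place form uses half the memory but overwrites its input.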
Benefits of DOD
• Maximum performance
• CPU doesn’t wait or starve
• Easy to parallelize
• data is grouped, transforms are separated
• ready for parallel processing, OOP isn’t
• Simpler code
• surprise!
References: Memory
• Ulrich Drepper “What Every Programmer Should Know About Memory”
• Kris Kaspersky “Program Optimization Techniques. Effective Memory Usage” (in Russian)
• Christer Ericson “Memory Optimization”
• Igor Ostrovsky “Gallery of Processor Cache Effects”
References: DOD
• Noel Llopis “Data-Oriented Design”, Game Developer Magazine, September 2009
• Richard Fabian “Data-Oriented Design”, book draft http://www.dataorienteddesign.com/dodmain/
• Tony Albrecht “Pitfalls of Object-Oriented Programming”
• Niklas Frykholm “Practical Examples of Data Oriented Design”
• Mike Acton “Typical C++ Bullshit”
• Data Oriented Design @ Google+
Thank You!
Q?