About Me
• Denis Reznik
• Kyiv, Ukraine
• Data Architect at Intapp, Inc.
• Microsoft Data Platform MVP
• Co-Founder of Ukrainian Data Community Kyiv (PASS Chapter)
• PASS Regional Mentor, Central and Eastern Europe
• Co-author of “SQL Server MVP Deep Dives vol. 2”
Why Do We Need a Query Optimizer?
T1: 23, 4, 9, 6, 112, 8, 1
T2: 1, 9, 4, 4, 112, 112, 112
Complexity?
O(N)
John Dow, John Smith
2, 3, 1, 4, 0
What If?
• We have a non-fixed number of tables
• Data filtering is required
• The data has changed
• We have more complex logic to implement
• The hardware has changed
• Etc…
Simple Select
Id Name
1 Superman
2 Wonder Woman
3 Deadpool
4 Batman
5 Wolverine
6 Spider-Man
7 Darth Vader
SELECT * FROM Users u
Heap
1 .. 100
100 .. 1k
5K .. 6K
1K .. 5K
6K .. 7K
15K .. 21K
12K .. 15K
10K .. 11K
21K .. 22K
22K .. 41K
9K .. 10K
41K .. 51K
7K .. 8K
8K .. 9K
71K .. 1M
51K .. 71K
1M .. 2M
2M .. 3M
Clustered Index
Root page: 1 .. 1M
Intermediate level: 1 .. 2K | 2K+1 .. 4K | … | 1M-2K .. 1M
Leaf level: 1 .. 300 | 301 .. 800 | 801 .. 1.5K | 1.5K+1 .. 2K | …
More Complex Select
Id Name
1 Superman
2 Wonder Woman
3 Deadpool
4 Batman
5 Wolverine
6 Spider-Man
7 Darth Vader
SELECT * FROM Users u
WHERE Name = 'Batman'
Index Seek
Root page: 1 .. 1M
Intermediate level: 1 .. 2K | 2K+1 .. 4K | … | 1M-2K .. 1M
Leaf level: 1 .. 300 | 301 .. 800 | 801 .. 1.5K | 1.5K+1 .. 2K | …
SELECT * FROM Users
WHERE Id = 523
Index Scan
Root page: 1 .. 1M
Intermediate level: 1 .. 2K | 2K+1 .. 4K | … | 1M-2K .. 1M
Leaf level: 1 .. 300 | 301 .. 800 | 801 .. 1.5K | 1.5K+1 .. 2K | …
SELECT * FROM Users
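The difference between the two access paths above can be sketched in a few lines. This is only an illustration, assuming a sorted Python list stands in for the leaf level of an index on Id and a binary search stands in for the B-tree traversal; the names `index_scan` and `index_seek` are hypothetical, not SQL Server internals.

```python
# Hypothetical sorted key column standing in for the leaf level of an index on Id.
ids = list(range(1, 1001))

def index_scan(keys, target):
    """Read every key in order; returns (found, keys_touched)."""
    touched = 0
    found = False
    for key in keys:
        touched += 1
        if key == target:
            found = True
    return found, touched

def index_seek(keys, target):
    """Binary search over the sorted keys; returns (found, probes)."""
    lo, hi, probes = 0, len(keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return (lo < len(keys) and keys[lo] == target), probes

scan_found, scan_cost = index_scan(ids, 523)
seek_found, seek_cost = index_seek(ids, 523)
print("scan:", scan_found, "keys touched:", scan_cost)
print("seek:", seek_found, "probes:", seek_cost)
```

The scan touches all 1,000 keys, while the seek needs at most about log2(1000) probes, which is why a selective predicate such as Id = 523 favors a seek.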
Non-Clustered Index
A .. Z
A .. C | C .. K | … | X .. Z
1 .. 1M
1 .. 2K | 2K+1 .. 4K | … | 1M-2K .. 1M
SELECT * FROM Users
WHERE Name = 'John Dow'
1 .. 2K 2K .. 4K 1M-2K .. 1M
Clustered Index (Id)
Non-Clustered Index (Name)
Heap
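A rough sketch of how a non-clustered index on Name could resolve a query through the clustered index on Id. It assumes plain Python dictionaries in place of B-trees and reuses the sample rows from the earlier slides; `find_by_name` is a hypothetical helper, not a real API.

```python
# Hypothetical Users rows keyed by Id, standing in for the clustered index (Id).
clustered_by_id = {
    1: {"Id": 1, "Name": "Superman"},
    2: {"Id": 2, "Name": "Wonder Woman"},
    3: {"Id": 3, "Name": "Deadpool"},
    4: {"Id": 4, "Name": "Batman"},
    5: {"Id": 5, "Name": "Wolverine"},
    6: {"Id": 6, "Name": "Spider-Man"},
    7: {"Id": 7, "Name": "Darth Vader"},
}

# Stand-in for the non-clustered index (Name): it stores only the key
# and a pointer (here, the Id) back to the full row.
nonclustered_by_name = {}
for user_id, row in clustered_by_id.items():
    nonclustered_by_name.setdefault(row["Name"], []).append(user_id)

def find_by_name(name):
    """Step 1: look up the Name in the non-clustered index.
    Step 2: follow each Id into the clustered index to fetch the full row."""
    matching_ids = nonclustered_by_name.get(name, [])
    return [clustered_by_id[user_id] for user_id in matching_ids]

print(find_by_name("Batman"))
print(find_by_name("John Dow"))
```

The second step is the extra lookup that SELECT * forces, because the non-clustered index alone does not hold every column.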
Statistics
Histogram on Id
Rows per bucket: 500, 1000, 10, 1200, 800
Bucket boundaries: 1, 800, 2000, 2800, 4500, 5400
SELECT * FROM Users
WHERE Id BETWEEN 2100 AND 2500
SELECT * FROM Users
WHERE Id BETWEEN 200 AND 5000
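A minimal sketch of how a histogram could turn these two range predicates into row estimates. It assumes the five row counts on the slide belong to the five intervals between the six boundary values, in order, and that rows are spread evenly inside each bucket. Both are assumptions for illustration; SQL Server's real estimator is more involved.

```python
# Values from the slide; pairing counts with consecutive boundaries is an assumption.
boundaries = [1, 800, 2000, 2800, 4500, 5400]
rows_per_bucket = [500, 1000, 10, 1200, 800]

def estimate_rows(low, high):
    """Estimate rows with low <= Id <= high, assuming uniform spread per bucket."""
    estimate = 0.0
    for i, rows in enumerate(rows_per_bucket):
        bucket_low, bucket_high = boundaries[i], boundaries[i + 1]
        overlap = min(high, bucket_high) - max(low, bucket_low)
        if overlap > 0:
            estimate += rows * overlap / (bucket_high - bucket_low)
    return estimate

narrow = estimate_rows(2100, 2500)   # falls entirely inside the sparse 2000..2800 bucket
wide = estimate_rows(200, 5000)      # spans almost the whole histogram
print("BETWEEN 2100 AND 2500:", narrow)
print("BETWEEN 200 AND 5000:", round(wide))
```

With estimates this far apart, an optimizer can reasonably pick a seek for the first query and a scan for the second.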
Joins – Nested Loops
Users: Id = 1, 2, 3, 4
Badges: UserId = 1, 4, 5, 1, 3, 4
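A nested loops join over the Users and Badges values from this slide can be sketched as follows. The function name and the tuple output are illustrative choices, not engine internals.

```python
users = [1, 2, 3, 4]             # Users.Id values from the slide
badges = [1, 4, 5, 1, 3, 4]      # Badges.UserId values from the slide

def nested_loops_join(outer, inner):
    """For every outer row, walk the whole inner input looking for matches."""
    result = []
    for outer_key in outer:
        for inner_key in inner:
            if outer_key == inner_key:
                result.append((outer_key, inner_key))
    return result

matches = nested_loops_join(users, badges)
print(matches)
print("comparisons:", len(users) * len(badges))
```

The cost grows with the product of the two input sizes, so this join works best when the outer input is small or the inner side can be reached through an index.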
Joins – Hash Join
Id = 1, 2, 3, 4
UserId = 1, 4, 5, 1, 3, 4
Clients
Work
0
1
2
3
0
0
3
3
10
2
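A hash join over the same Id and UserId values can be sketched in two phases. This assumes a Python dictionary as the hash table and that the smaller input is chosen as the build side; the names are hypothetical.

```python
build_input = [1, 2, 3, 4]           # Id values from the slide (build side)
probe_input = [1, 4, 5, 1, 3, 4]     # UserId values from the slide (probe side)

def hash_join(build, probe):
    """Build a hash table on one input, then probe it once per row of the other."""
    # Build phase: group build-side keys into hash buckets.
    table = {}
    for key in build:
        table.setdefault(key, []).append(key)
    # Probe phase: one lookup per probe-side row.
    result = []
    for key in probe:
        for match in table.get(key, []):
            result.append((match, key))
    return result

matches = hash_join(build_input, probe_input)
print(matches)
```

Each input is read once, at the price of memory for the hash table.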
Joins – Merge Join
Users: Id = 1, 2, 3, 4
Badges: UserId = 1, 4, 5, 1, 3, 4
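A merge join needs both inputs ordered on the join key. The sketch below sorts the slide's values first, which is an assumption about how the inputs arrive (an index in key order would make the sort unnecessary), and then advances two cursors in step.

```python
users = [1, 2, 3, 4]             # Users.Id values from the slide
badges = [1, 4, 5, 1, 3, 4]      # Badges.UserId values from the slide

def merge_join(left, right):
    """Join two inputs by walking them in key order with two cursors."""
    left, right = sorted(left), sorted(right)
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            key = left[i]
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(left) and left[i_end] == key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end] == key:
                j_end += 1
            # Emit every pairing within the two runs.
            for _ in range(i, i_end):
                for _ in range(j, j_end):
                    result.append((key, key))
            i, j = i_end, j_end
    return result

matches = merge_join(users, badges)
print(matches)
```

Once the inputs are in order, each one is read a single time.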
Query Plan Alternatives
• 1 Table – 1 option
• 2 Tables – 2 options
• 3 Tables – 6 options
• 4 Tables – 24 options
• …
• 10 Tables – 3,628,800 options
• 1 Table – 1!
• 2 Tables – 2!
• 3 Tables – 3!
• 4 Tables – 4!
• …
• 10 Tables – 10!
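The numbers above are factorials, which a few lines can confirm. `join_orders` is a hypothetical helper name, and the count covers only the ordering of tables, not the choice of join algorithm or access path.

```python
from math import factorial

def join_orders(table_count):
    """Number of distinct orders in which table_count tables can be joined."""
    return factorial(table_count)

for n in (1, 2, 3, 4, 10):
    print(f"{n} tables: {join_orders(n)} orders")
```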
Exploring Search Space
• JOIN(A,B,C,D)
• JOIN(A,B,D,C)
• JOIN(A,C,D,B)
• JOIN(A,D,B,C)
• JOIN(A,D,C,B)
• JOIN(B,A,C,D)
• JOIN(B,A,D,C)
• …
• O(N!)
• JOIN(A,B,C,D)
• A – Optimal Access Path
• B – Optimal Access Path
• … pruning
• (A,B) – Optimal Access Path
• … pruning
• (A,B),(C)
• … pruning
• O(N·2^(N−1))
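The pruning idea above can be sketched as dynamic programming over subsets of tables: keep only the cheapest plan for each subset and extend it by one table at a time. Everything specific here is an assumption made for illustration: the row counts, the uniform join selectivity of 1 in 100, the cost model (sum of intermediate result sizes), and the restriction to left-deep plans.

```python
from itertools import combinations

# Hypothetical row counts per table.
cardinality = {"A": 1000, "B": 10, "C": 100, "D": 10000}
SELECTIVITY_DIVISOR = 100  # hypothetical: each join keeps 1 row in 100

def estimated_rows(subset):
    """Estimated size of the join of all tables in subset."""
    rows = 1
    for table in subset:
        rows *= cardinality[table]
    return rows / (SELECTIVITY_DIVISOR ** (len(subset) - 1))

def best_left_deep_order(tables):
    """Return (cost, order) of the cheapest left-deep join order."""
    # best[subset] holds only the cheapest plan found for that subset (pruning).
    best = {frozenset([t]): (0.0, (t,)) for t in tables}
    for size in range(2, len(tables) + 1):
        for combo in combinations(tables, size):
            subset = frozenset(combo)
            candidates = []
            for last in combo:
                # Extend the best plan for the remaining tables by joining `last`.
                prev_cost, prev_order = best[subset - {last}]
                cost = prev_cost + estimated_rows(subset)
                candidates.append((cost, prev_order + (last,)))
            best[subset] = min(candidates)
    return best[frozenset(tables)]

cost, order = best_left_deep_order(["A", "B", "C", "D"])
print(order, cost)
```

Instead of trying all N! permutations, the loop visits each of the 2^N subsets once and tries at most N ways to finish it.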
Quite Complex Query
SELECT u.Id, u.Name, COUNT(*) FROM Posts p
INNER JOIN Users u
ON p.OwnerUserId = u.Id
WHERE PostedUserName LIKE 'B%'
GROUP BY u.Id, u.Name
HAVING COUNT(*) > 1
ORDER BY u.Name
Thank You!
Denis Reznik
Twitter: @denisreznik
Email: [email protected]
Blog: http://reznik.uneta.com.ua
Facebook: https://www.facebook.com/denis.reznik.5
LinkedIn: http://ua.linkedin.com/pub/denis-reznik/3/502/234