High-Performance Hibernate
Vlad Mihalcea
About me
• @Hibernate Developer
• vladmihalcea.com
• @vlad_mihalcea
• vladmihalcea
Agenda
• Performance and Scaling
• Connection providers
• Identifier generators
• Relationships
• Batching
• Fetching
• Caching
Performance Facts
“More than half of application performance bottlenecks originate in the database”
AppDynamics - http://www.appdynamics.com/database/
Google Ranking
“Like us, our users place a lot of value in speed — that's why we've decided to take site speed into account in our search
rankings.”
https://webmasters.googleblog.com/2010/04/using-site-speed-in-web-search-ranking.html
Performance and Revenue
“It has been reported that every 100ms of latency costs Amazon 1% of profit.”
http://radar.oreilly.com/2008/08/radar-theme-web-ops.html
Response Time and Throughput
• n - number of completed transactions
• t - time interval
$T_{avg} = \frac{t}{n} = \frac{1\,\text{s}}{100} = 10\,\text{ms}$
$X = \frac{n}{t} = \frac{100}{1\,\text{s}} = 100\,\text{TPS}$
Response Time and Throughput
$X = \frac{1}{T_{avg}}$
“The lower the Response Time,
The higher the Throughput”
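As a quick check of the two formulas, the following minimal Java sketch recomputes the slide's numbers (100 transactions completed in 1 s). The class and method names are illustrative and do not come from the talk.

```java
// Illustrative helper, not part of the talk.
final class ThroughputMath {

    // T_avg = t / n, expressed in milliseconds
    static double averageResponseTimeMillis(double intervalMillis, long completedTransactions) {
        return intervalMillis / completedTransactions;
    }

    // X = n / t, expressed in transactions per second
    static double throughputPerSecond(double intervalMillis, long completedTransactions) {
        return completedTransactions / (intervalMillis / 1000.0);
    }

    public static void main(String[] args) {
        double tAvg = averageResponseTimeMillis(1000.0, 100);
        double x = throughputPerSecond(1000.0, 100);
        System.out.println("T_avg = " + tAvg + " ms");
        System.out.println("X = " + x + " TPS");
    }
}
```

Halving the average response time in this model doubles the throughput, which is the relationship the quote summarizes.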
The anatomy of a database transaction
Response Time
• connection acquisition time
• statement submit time
• statement execution time
• result set fetching time
• idle time prior to releasing database connection
$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
Connection Management
Metric   DB_A (ms)   DB_B (ms)   DB_C (ms)   DB_D (ms)   HikariCP (ms)
min      11.174      5.441       24.468      0.860       0.001230
max      129.400     26.110      74.634      74.313      1.014051
mean     13.829      6.477       28.910      1.590       0.003458
p99      20.432      9.944       54.952      3.022       0.010263
$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
Connection Providers
DataSourceConnectionProvider
Connection Provisioning
FlexyPool
• concurrent connections
• concurrent connection requests
• connection acquisition time
• connection lease time histogram
• maximum pool size
• overflow pool size
• retry attempts
• total connection acquisition time
• Java EE
• Bitronix / Atomikos
• Apache DBCP / DBCP2
• C3P0
• BoneCP
• HikariCP
• Tomcat CP
• Vibur DBCP
https://github.com/vladmihalcea/flexy-pool
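To make metrics such as connection acquisition time and maximum pool size more concrete, here is a toy pool written with only the JDK. It is a hypothetical sketch: `TinyPool` and its methods are invented for illustration and are not the FlexyPool or HikariCP API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical toy pool, for illustration only. Production pools also
// validate, time out, and evict connections.
final class TinyPool<T> {
    private final BlockingQueue<T> idle;
    private long totalAcquisitionNanos;
    private long acquisitions;

    TinyPool(int maxPoolSize, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(maxPoolSize);
        for (int i = 0; i < maxPoolSize; i++) {
            idle.add(factory.get()); // creation cost is paid once, up front
        }
    }

    // Waits up to timeoutMillis for a free resource; returns null if none becomes available.
    T acquire(long timeoutMillis) {
        long start = System.nanoTime();
        T resource = null;
        try {
            resource = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        record(System.nanoTime() - start);
        return resource;
    }

    void release(T resource) {
        idle.add(resource); // hand the resource back for reuse
    }

    private synchronized void record(long nanos) {
        totalAcquisitionNanos += nanos;
        acquisitions++;
    }

    synchronized long acquisitionCount() {
        return acquisitions;
    }

    synchronized double meanAcquisitionMillis() {
        return acquisitions == 0 ? 0.0 : totalAcquisitionNanos / 1_000_000.0 / acquisitions;
    }
}
```

When the pool is exhausted, `acquire` blocks until the timeout expires, so an undersized pool shows up directly as a higher mean acquisition time. That is the kind of signal the metrics above are meant to expose.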
FlexyPool – Concurrent connection requests
[Chart: connection requests over sample time (index × 15 s); series: max, mean, p50, p95, p99]
FlexyPool – Pool size growth
[Chart: max pool size over sample time (index × 15 s); series: max, mean, p50, p95, p99]
FlexyPool – Connection acquisition time
[Chart: connection acquisition time (ms) over sample time (index × 15 s); series: max, mean, p50, p95, p99]
FlexyPool – Connection lease time
[Chart: connection lease time (ms) over sample time (index × 15 s); series: max, mean, p50, p95, p99]
JPA Identifier Generators
$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
• IDENTITY
• SEQUENCE
• TABLE
• AUTO
IDENTITY
• In Hibernate, IDENTITY generator disables JDBC batch inserts
• MySQL 5.7 does not offer support for database SEQUENCE
SEQUENCE
• Oracle, PostgreSQL, and even SQL Server 2012
• May use roundtrip optimizers: hi/lo, pooled, pooled-lo
• By default, Hibernate 5 uses the enhanced sequence generators
<property name="hibernate.id.new_generator_mappings" value="true"/>
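The roundtrip optimizers mentioned above all trade one sequence call for a whole range of identifiers. The sketch below simulates that idea in the spirit of pooled-lo, where the sequence value is treated as the low end of the range. It is a simplified illustration and not Hibernate's implementation; the class name and the in-memory "sequence" are assumptions.

```java
// Simplified pooled-lo style allocation; not Hibernate's actual optimizer code.
final class PooledLoSketch {
    private final int incrementSize;
    private long sequenceValue = 1; // stand-in for a database sequence starting at 1
    private long nextId;
    private int remaining = 0;
    private int roundtrips = 0;

    PooledLoSketch(int incrementSize) {
        this.incrementSize = incrementSize;
    }

    long next() {
        if (remaining == 0) {
            // Simulated sequence call: the returned value is the low end of a new range.
            nextId = sequenceValue;
            sequenceValue += incrementSize;
            remaining = incrementSize;
            roundtrips++;
        }
        remaining--;
        return nextId++;
    }

    int roundtrips() {
        return roundtrips;
    }

    // Counts the simulated sequence calls needed to generate 'rows' identifiers.
    static int roundtripsFor(int rows, int incrementSize) {
        PooledLoSketch generator = new PooledLoSketch(incrementSize);
        for (int i = 0; i < rows; i++) {
            generator.next();
        }
        return generator.roundtrips();
    }

    public static void main(String[] args) {
        for (int incrementSize : new int[] {1, 5, 10, 50}) {
            System.out.println(incrementSize + " -> " + roundtripsFor(50, incrementSize));
        }
    }
}
```

For 50 rows, increment sizes of 1, 5, 10, and 50 need 50, 10, 5, and 1 simulated sequence calls respectively, while the generated identifiers stay contiguous.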
SEQUENCE - Pooled optimizer (50 rows)
[Chart: time (ms) vs sequence increment size (1, 5, 10, 50)]
TABLE
• Uses row-level locks and a separate transaction/connection
• May use roundtrip optimizers: hi/lo, pooled, pooled-lo
• By default, Hibernate 5 uses the enhanced table generator
<property name="hibernate.id.new_generator_mappings" value="true"/>
TABLE - Pooled optimizer (50 rows)
[Chart: time (ms) vs table increment size (1, 5, 10, 50)]
IDENTITY vs TABLE (100 rows)
• IDENTITY makes no use of batch inserts
• TABLE generator using a pooled optimizer with an increment size of 100
IDENTITY vs TABLE (100 rows)
[Chart: time (ms) vs thread count (1, 2, 4, 8, 16); series: Identity, Table]
AUTO: IDENTITY vs TABLE?
• Prior to Hibernate 5, AUTO would resolve to IDENTITY if the database supports such a feature
• Hibernate 5 uses TABLE generator if the database does not support sequences
SEQUENCE vs TABLE (100 rows)
• Both benefiting from JDBC batch inserts
• Both using a pooled optimizer with an increment size of 100
SEQUENCE vs TABLE (100 rows)
[Chart: time (ms) vs thread count (1, 2, 4, 8, 16); series: Sequence, Table]
Relationships
$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
Batching
$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
• SessionFactory setting
• Session-level configuration since Hibernate 5.2
Batching - SessionFactory
<property name="hibernate.jdbc.batch_size" value="5"/>
• Switching from non-batching to batching
Batching - Session
doInJPA( this::entityManagerFactory, entityManager -> {
    entityManager.unwrap( Session.class ).setJdbcBatchSize( 10 );
    for ( long i = 0; i < entityCount; ++i ) {
        Person person = new Person( i, String.format( "Person %d", i ) );
        entityManager.persist( person );
        // Flush and clear periodically to keep the persistence context small
        if ( i % batchSize == 0 ) {
            entityManager.flush();
            entityManager.clear();
        }
    }
} );
Batching
DEBUG [main]: n.t.d.l.SLF4JQueryLoggingListener –Name:DATA_SOURCE_PROXY, Time:1, Success:True, Type:Prepared, Batch:True, QuerySize:1, BatchSize:10, Query: ["insert into Person (name, id) values (?, ?)"], Params:[(Person 1, 1), (Person 2, 2), (Person 3, 3), (Person 4, 4), (Person 5, 5), (Person 6, 6), (Person 7, 7), (Person 8, 8), (Person 9, 9), (Person 10, 10)]
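The log entry above shows ten inserts travelling in a single batch. The following JDK-only sketch models that grouping without a database, including the final flush of an incomplete batch. `BatchingSketch` and its methods are hypothetical names, and each "executed batch" stands for one simulated roundtrip.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of statement batching; it does not talk to a database.
final class BatchingSketch {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private final List<Integer> executedBatchSizes = new ArrayList<>();

    BatchingSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Queues one statement and sends the batch once it is full.
    void add(String statement) {
        pending.add(statement);
        if (pending.size() == batchSize) {
            executeBatch();
        }
    }

    // Sends whatever is queued as one simulated roundtrip.
    void executeBatch() {
        if (!pending.isEmpty()) {
            executedBatchSizes.add(pending.size());
            pending.clear();
        }
    }

    List<Integer> executedBatchSizes() {
        return executedBatchSizes;
    }

    // Simulates inserting 'rows' rows and returns the size of every executed batch.
    static List<Integer> insertRows(int rows, int batchSize) {
        BatchingSketch batch = new BatchingSketch(batchSize);
        for (int i = 1; i <= rows; i++) {
            batch.add("insert Person " + i);
        }
        batch.executeBatch(); // flush the incomplete last batch
        return batch.executedBatchSizes();
    }

    public static void main(String[] args) {
        System.out.println(insertRows(25, 10));
        System.out.println(insertRows(5000, 50).size());
    }
}
```

In this model, 5,000 rows need 5,000 roundtrips with a batch size of 1 and 100 roundtrips with a batch size of 50. Actual timings depend on the driver and the database.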
Insert PreparedStatement batching (5k rows)
[Chart: time (ms) vs batch size (1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000); series: DB_A, DB_B, DB_C, DB_D]
Update PreparedStatement batching (5k rows)
[Chart: time (ms) vs batch size (1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000); series: DB_A, DB_B, DB_C, DB_D]
Delete PreparedStatement batching (5k rows)
[Chart: time (ms) vs batch size (1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000); series: DB_A, DB_B, DB_C, DB_D]
Batching - Cascading
<property name="hibernate.order_inserts" value="true"/>
<property name="hibernate.order_updates" value="true"/>
Batching – @Version
<property name="hibernate.jdbc.batch_versioned_data" value="true"/>
• Enabled by default in Hibernate 5
• Disabled in Hibernate 3.x, 4.x, and for Oracle 8i, 9i, and 10g dialects
Fetching
$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
• JDBC fetch size
• JDBC ResultSet size
• DTO vs Entity queries
• Fetching relationships
Fetching – JDBC Fetch Size
• Oracle – Default fetch size is 10
• SQL Server – Adaptive buffering
• PostgreSQL, MySQL – Fetch the whole ResultSet at once
• SessionFactory setting:
<property name="hibernate.jdbc.fetch_size" value="100"/>
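As a rough model of why the fetch size matters, assume every roundtrip returns at most `fetchSize` rows. The sketch below only does that arithmetic; the class name is illustrative, and real drivers add their own buffering behaviour on top.

```java
// Back-of-the-envelope model; real JDBC drivers differ in how they buffer rows.
final class FetchSizeSketch {

    // Number of roundtrips needed when each one returns at most fetchSize rows.
    static int roundtrips(int rowCount, int fetchSize) {
        return (rowCount + fetchSize - 1) / fetchSize; // ceiling division
    }

    public static void main(String[] args) {
        for (int fetchSize : new int[] {1, 10, 100, 1000, 10000}) {
            System.out.println(fetchSize + " -> " + roundtrips(10_000, fetchSize));
        }
    }
}
```

Under this assumption, a 10,000-row result set takes 1,000 roundtrips with a fetch size of 10, but only 100 with a fetch size of 100.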
Fetching - JDBC fetch size
• Query-level hint:
List<PostCommentSummary> summaries = entityManager.createQuery(
    "select new PostCommentSummary( " +
    "    p.id, p.title, c.review ) " +
    "from PostComment c " +
    "join c.post p")
.setHint(QueryHints.HINT_FETCH_SIZE, fetchSize)
.getResultList();
Fetching – JDBC Fetch Size (10k rows)
[Chart: time (ms) vs fetch size (1, 10, 100, 1000, 10000); series: DB_A, DB_B, DB_C, DB_D]
Fetching – Pagination
• JPA / Hibernate API works for both entity and native queries
List<PostCommentSummary> summaries = entityManager.createQuery(
    "select new PostCommentSummary( " +
    "    p.id, p.title, c.review ) " +
    "from PostComment c " +
    "join c.post p")
.setFirstResult(pageStart)
.setMaxResults(pageSize)
.getResultList();
Fetching – 100k vs 100 rows
[Chart: time (ms) for Fetch all vs Fetch limit; series: DB_A, DB_B, DB_C, DB_D]
Fetching – Pagination
• Hibernate uses OFFSET pagination
• Keyset pagination scales better when navigating large result sets
• http://use-the-index-luke.com/no-offset
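The difference between the two pagination styles can be sketched in memory. In the analogy below, a `TreeMap` plays the role of an index on the identifier column: OFFSET pagination walks past the skipped rows, while keyset pagination seeks directly to the first key after the last one seen. The class is hypothetical and says nothing about how a particular database executes either query.

```java
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory analogy only: the TreeMap stands in for an index on the id column.
final class PaginationSketch {
    private final NavigableMap<Long, String> index = new TreeMap<>();

    PaginationSketch(int rowCount) {
        for (long id = 1; id <= rowCount; id++) {
            index.put(id, "row-" + id);
        }
    }

    // OFFSET style: walks past 'offset' rows before collecting the page.
    List<String> offsetPage(int offset, int pageSize) {
        return index.values().stream()
                .skip(offset)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    // Keyset style: seeks straight to the first id greater than the last one seen.
    List<String> keysetPage(long lastSeenId, int pageSize) {
        return index.tailMap(lastSeenId, false).values().stream()
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        PaginationSketch table = new PaginationSketch(1000);
        System.out.println(table.offsetPage(990, 10));
        System.out.println(table.keysetPage(990L, 10));
    }
}
```

Both calls return the same last page (row-991 to row-1000), but the keyset variant does not have to step over the first 990 entries to get there.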
Fetching – Entity vs Projection
• Selecting all columns vs a custom projection
SELECT *
FROM post_comment pc
INNER JOIN post p ON p.id = pc.post_id
INNER JOIN post_details pd ON p.id = pd.id

SELECT pc.version
FROM post_comment pc
INNER JOIN post p ON p.id = pc.post_id
INNER JOIN post_details pd ON p.id = pd.id
Fetching – Entity vs Projection
[Chart: time (ms) for All columns vs Custom projection; series: DB_A, DB_B, DB_C, DB_D]
Fetching – DTO Projections
• Read-only views
• Tree structures (Recursive CTE)
• Paginated Tables
• Analytics (Window functions)
Fetching – Entity Queries
• Writing data
• Web flows / Multi-request logical transactions
• Application-level repeatable reads
• Detached entities / PersistenceContextType.EXTENDED
• Optimistic concurrency control (e.g. version, dirty properties)
Fetching – Relationships
Association FetchType
@ManyToOne EAGER
@OneToOne EAGER
@OneToMany LAZY
@ManyToMany LAZY
• LAZY associations can be fetched eagerly
• EAGER associations cannot be fetched lazily
Fetching – Best Practices
• Default to FetchType.LAZY
• Fetch directive in JPQL/Criteria API queries
• Entity graphs / @FetchProfile
• LazyInitializationException
Fetching – Open Session in View Anti-Pattern
Fetching – Temporary Session Anti-Pattern
• “Band aid” for LazyInitializationException
• One temporary Session/Connection for every lazily fetched association
<property name="hibernate.enable_lazy_load_no_trans" value="true"/>
Caching
$T = t_{acq} + t_{req} + t_{exec} + t_{res} + t_{idle}$
Caching – Why 2nd-Level Caching
“There are only two hard things in Computer Science: cache invalidation and naming things.”
Phil Karlton
Caching – Strategies
Strategy               Cache type      Particularity
READ_ONLY              READ-THROUGH    Immutable
NONSTRICT_READ_WRITE   READ-THROUGH    Invalidation / inconsistency risk
READ_WRITE             WRITE-THROUGH   Soft locks
TRANSACTIONAL          WRITE-THROUGH   JTA
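The read-through and write-through cache types in the table can be contrasted with a small in-memory sketch. It is only a conceptual model with hypothetical names: it does not reproduce Hibernate's second-level cache, and it leaves out soft locks and JTA coordination entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual model only; two maps stand in for the database table and the cache region.
final class CacheSketch {
    private final Map<Long, String> database = new HashMap<>();
    private final Map<Long, String> cache = new HashMap<>();
    private int databaseReads = 0;

    // Read-through: a cache miss loads from the database and populates the cache.
    String read(long id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        String loaded = database.get(id);
        if (loaded != null) {
            cache.put(id, loaded);
        }
        return loaded;
    }

    // Write-through: the write updates the database and the cache entry together.
    void writeThrough(long id, String value) {
        database.put(id, value);
        cache.put(id, value);
    }

    // Invalidation: the write updates the database and evicts the cache entry,
    // so the next read has to go back to the database.
    void writeAndInvalidate(long id, String value) {
        database.put(id, value);
        cache.remove(id);
    }

    int databaseReads() {
        return databaseReads;
    }
}
```

After `writeThrough`, a read is served from the cache without touching the database. After `writeAndInvalidate`, the first read misses and reloads the entry, which is the cost that invalidation-based strategies pay.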
Caching – Collection Cache
• It complements entity caching
• It stores only entity identifiers
• Read-Through
• Invalidation-based (Consistency over Performance)
Caching – Read-Write Aggregates
Questions and Answers
https://leanpub.com/high-performance-java-persistence