This is a very old presentation of mine about LOB internals and some performance tuning...
RMOUG Training Days 1/40Tanel Põder
LOB Internals and Performance Tuning
Tanel Põder, independent consultant
http://integrid.info
11-Feb-04
RMOUG Training Days 2004

Agenda
• Introduction & about
• Storing large content in Oracle database
• LOB internal architecture
• LOB physical storage planning
• LOB cache layer tuning
• Accessing and loading LOBs
• Dev/architecture strategies
• Temporary LOBs
• Summary

Introduction & About
• Name: Tanel Põder
• Experience: 7 years as Oracle DBA
• Occupation: independent consultant
• Company: integrid.info
• Other: Europe, Middle-East & Africa Oracle User Group BoD member
• Oracle Certified Master DBA
• More information and presentations: http://integrid.info

Storing Content in Oracle
• CHAR/VARCHAR2 - 2000/4000 byte limit
• RAW - 2000 byte limit
• LONG/LONG RAW - always stored in row
  - no random access
  - one column per table
  - hard to maintain
LOBs - a powerful way for storing, accessing and maintaining large content in Oracle database

Large Content in VARCHARs
• DBA_SOURCE
  SQL> desc dba_source
   Name                Null?    Type
   ------------------- -------- ---------------
   OWNER                        VARCHAR2(30)
   NAME                         VARCHAR2(30)
   TYPE                         VARCHAR2(12)
   LINE                         NUMBER
   TEXT                         VARCHAR2(4000)
• Oracle holds its PL/SQL source code in VARCHAR2 pieces
• Source is split up by rows

Large Content in LONGs
• DBA_VIEWS
  SQL> desc dba_views
   Name                 Null?    Type
   -------------------- -------- --------------
   OWNER                NOT NULL VARCHAR2(30)
   VIEW_NAME            NOT NULL VARCHAR2(30)
   TEXT_LENGTH                   NUMBER
   TEXT                          LONG
   . . .
• LONGs are still used in the data dictionary even though they were deprecated a long time ago…

LOB Concepts
• Able to store large content
  – 4GB in 9i, 2-128TB in 10g (2^32-1 blocks)
• Must also cope with some small content
• Store both binary and character content
• Ability for random access
• Be manageable
  – Partitioning, reorganizing
• Can be used in parallel operations
• External objects accessible and domain indexable

LOB Architecture
• A LOB can be either stored inline with row
  – enable storage in row option
  – Will be automatically moved out of line if it grows large
• … or can be defined to always remain out-of-line
  – disable storage in row option
• A LOB can also reside outside the database (BFILEs)
  – Can be indexed using Oracle Text

Inline LOB Architecture
• create table emp (id number, name varchar2(100), fingerprint blob) lob (fingerprint) store as lob_fingerprint (enable storage in row);
[Diagram: EMP table (ID, NAME, FINGERPRINT) whose LOB column holds null, empty_clob(), inline LOB items and inodes, drawn next to the LOB index (leaf block) and the LOB segment (segment header, bitmap, chunks)]

Multiple LOBs in a Table
• create table emp (id number, name varchar2(100), fingerprint blob, picture blob) lob (fingerprint) store as lob_fprint (enable storage in row) lob (picture) store as lob_picture (enable storage in row);
[Diagram: EMP table with ID, NAME, FINGERPRINT and PICTURE columns; each LOB column has its own LOB index (leaf block) and LOB segment (segment header, bitmap, chunks)]

Out-of-line LOB Architecture
• create table emp (id number, name varchar2(100), fingerprint blob, picture blob) lob (fingerprint) store as lob_fprint (enable storage in row) lob (picture) store as lob_picture (disable storage in row);
[Diagram: the FINGERPRINT and PICTURE columns hold LOB IDs (or null); the LOB index (branch block and leaf blocks) is drawn next to the LOB segment (segment header, bitmap, chunks)]

LOB Column Internal Structures
• enable storage in row, inline LOB
  – LOB LOCATOR (20 bytes) + LOB INODE (16 bytes) + DATA (max 3964 bytes)
  – Max. total column size 4000 bytes
  – No LOB index entries
  – No LOB segment entries
• enable storage in row, out-of-line LOB
  – LOB LOCATOR (20 bytes) + LOB INODE (16 bytes) + POINTERS TO DATA (0-48 bytes, 12 x RDBA)
  – DATA is allocated in chunks
  – Up to 12 4-byte relative DBAs are stored inline. If LOB grows larger, LOB index will be used to find data chunks
• disable storage in row
  – LOB LOCATOR (20 bytes) in row, inodes in index leafs, DATA in the LOB segment
  – LOB index stores inodes for large LOB items, also for old chunk versions for providing read consistency

LOB Locator Structure
• LOB locator is a pointer to a LOB instance's physical location
  – Locator works the same way for both persistent and temporary LOBs
  – 20 bytes in size
  – Contains 10 byte LOB ID (2B+8B LOB OID)
  – Includes some meta-information
• LOB locator column dump and structure:
  len(2), vsn(2), flg(4), bytl(2), lobid(10), inode(16), data
  col 1: [36]
  00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d1 <- LOB locator
  00 10 09 00 00 00 00 00 00 00 00 00 00 00 00 00 <- kdlinode
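The field list above can be applied to the dumped bytes mechanically. The following is a minimal sketch, not Oracle code: it assumes the numeric fields are big-endian unsigned integers (the slide gives only their widths), and the function name `split_lob_column` is made up for illustration.

```python
import struct

# Column dump copied from the slide: 20-byte locator followed by the 16-byte inode.
COLUMN_DUMP = (
    "00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d1 "
    "00 10 09 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

def split_lob_column(hex_dump: str) -> dict:
    """Split a dumped LOB column into len(2), vsn(2), flg(4), bytl(2), lobid(10), inode(16), data."""
    raw = bytes.fromhex(hex_dump)
    # ">" = big-endian without padding (an assumption); H = 2 bytes, I = 4 bytes, 10s = 10 raw bytes
    length, version, flags, bytl, lob_id = struct.unpack(">HHIH10s", raw[:20])
    return {
        "len": length,
        "vsn": version,
        "flg": flags,
        "bytl": bytl,
        "lobid": lob_id.hex(),
        "inode": raw[20:36].hex(),
        "data": raw[36:].hex(),  # empty for this dump: the LOB holds no data
    }

fields = split_lob_column(COLUMN_DUMP)
for name, value in fields.items():
    print(f"{name:6} {value}")
```

The same function can be fed the longer dumps from the inline LOB examples; anything after byte 36 then shows up under `data`.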

LOB inode Structure
• LOB inode is a structure for keeping track of chunks belonging to a LOB item
• LOB inode information can be kept in-row
  – Stores RDBAs (relative DBAs) for chunks
  – Requires 16 bytes + 4 bytes per chunk in lobitem
  – max 12 chunks inline
• LOB inode information can be kept in LOB index
  – disable storage in row LOBs
  – enable storage in row LOBs with over 12 chunks
  – Old versions for LOB chunks
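The placement rules above can be written down as a small decision function. This is a sketch under stated assumptions: every chunk is treated as holding exactly `chunk_bytes` of data (block overhead inside LOB blocks is ignored), and the function name and return strings are invented for illustration.

```python
import math

MAX_INLINE_DATA = 3964    # bytes of LOB data that still fit in the row (from the slides)
MAX_INLINE_POINTERS = 12  # chunk RDBAs that fit in the in-row inode (from the slides)

def inode_placement(lob_bytes: int, chunk_bytes: int, storage_in_row: bool) -> str:
    """Classify where the chunk pointers of one LOB item are kept."""
    if not storage_in_row:
        return "LOB index"
    if lob_bytes <= MAX_INLINE_DATA:
        return "in row, data inline"
    # Assumption: each chunk carries chunk_bytes of payload.
    chunks = math.ceil(lob_bytes / chunk_bytes)
    if chunks <= MAX_INLINE_POINTERS:
        # 16-byte inode plus one 4-byte RDBA per chunk
        return f"in row, {chunks} RDBA pointer(s), {16 + 4 * chunks} inode bytes"
    return "LOB index"

print(inode_placement(2000, 8192, True))
print(inode_placement(4000, 8192, True))
print(inode_placement(100 * 8192, 8192, True))
print(inode_placement(10, 8192, False))
```

For a 4000-byte LOB in one chunk the function reports 20 inode bytes, which is the 16 + 4 arithmetic given in the bullet list.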

An Inline LOB inode Example
  create table t (a number, b clob) tablespace lobtest lob (b) store as lob_tx (enable storage in row tablespace lobtest nocache nologging);
  insert into t values (1,NULL);
• The column simply has NULL value (or isn't stored in row)
  insert into t values (2,empty_clob());
  col 1: [36]
  00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d1
  00 10 09 00 00 00 00 00 00 00 00 00 00 00 00 00
• The column stores 20-byte LOB locator, 16-byte inode structure, but no pointers to chunks, since the LOB is empty
  insert into t values (3,'X');
  col 1: [38]
  00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d2
  00 12 09 00 00 00 00 00 00 02 00 00 00 00 00 01
  58 00
• The inserted value itself is stored in row (CLOBs are always in fixed width 2-byte charset)

An Inline LOB inode Example Cont'd
  insert into t values (4, rpad('X', 1000, 'X'));
  col 1: [2036]
  00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d3 <-loc.
  07 e0 09 00 00 00 00 00 07 d0 00 00 00 00 00 01 <-inode
  58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 <- data
  58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 ...
• The total column size doesn't exceed 4000 bytes, thus stored in the row
  insert into t values (5, rpad('X', 2000, 'X'));
  col 1: [40]
  00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 00 00 02 8c d4 <-loc.
  00 14 05 00 00 00 00 00 0f a0 00 00 00 00 00 02 03 c0 00 14 <-RDBA
• The column is stored out-of-line since its total size with 20-byte locator and 16-byte inode structure would exceed 4000 bytes
• Instead a 4-byte relative datablock address pointing to first block of the LOB chunk is stored inline
• Still no LOB index entries are inserted
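The column sizes in these dumps follow from simple arithmetic: 20 bytes of locator, 16 bytes of inode and two bytes per character. A small sketch of that calculation, with hypothetical helper names and the 2-byte character width taken from the dumps rather than from any documented guarantee:

```python
LOCATOR_BYTES = 20
INODE_BYTES = 16
MAX_COLUMN_BYTES = 4000
BYTES_PER_CHAR = 2  # each 'X' appears as two bytes in the dumps

def clob_column_bytes(char_count: int) -> int:
    """Size of the LOB column if the character data were kept in the row."""
    return LOCATOR_BYTES + INODE_BYTES + BYTES_PER_CHAR * char_count

def stays_in_row(char_count: int) -> bool:
    """True when locator + inode + data fit within the 4000-byte column limit."""
    return clob_column_bytes(char_count) <= MAX_COLUMN_BYTES

for chars in (0, 1, 1000, 1982, 1983, 2000):
    print(chars, clob_column_bytes(chars), stays_in_row(chars))
```

The sizes for 0, 1 and 1000 characters come out as 36, 38 and 2036, matching the `col 1: [..]` lengths in the dumps, and 1982 characters is the largest value that still fits (3964 data bytes).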

A Quick Look Into LOB datablock
  alter system dump datafile 15 block 20;
  Start dump data blocks tsn: 18 file#: 15 minblk 20 maxblk 20
  buffer tsn: 18 rdba: 0x03c00014 (RDBA) (15/20)
  scn: 0x0000.002ec55d seq: 0x02 flg: 0x04 tail: 0xc55d1b02
  frmt: 0x02 chkval: 0xe8ea type: 0x1b=LOB BLOCK
  Long field block dump:
  Object Id 8582
  LobId: 000100028CD4 PageNo 0
  Version: 0x0000.00000001 pdba: 0
  58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00
  58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00
  58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00
  58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00 58 00
  ..............
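The RDBA stored in the row (03 c0 00 14) and the `(15/20)` shown in the block dump are the same value in two notations. A minimal sketch of the conversion, assuming the usual layout of a 32-bit relative DBA with the upper 10 bits as relative file number and the lower 22 bits as block number; the function name is illustrative:

```python
from typing import Tuple

def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a 32-bit relative DBA into (relative file number, block number)."""
    # Assumption: 10 bits of file number followed by 22 bits of block number.
    return rdba >> 22, rdba & 0x3FFFFF

file_no, block_no = decode_rdba(0x03C00014)
print(file_no, block_no)
```

This yields file 15, block 20, which is why the dump command above targets `datafile 15 block 20`.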

LOB Index Structure
• A B-tree like index structure
  – Leafs contain pointers to LOB chunks (RDBAs)
  – Contains the inode if disable storage in row LOB
  – One LOB item may have several associated rows in index, when lobitem consists of many chunks
  – Is subject to normal caching, logging and undo mechanisms (unlike LOB segments)
• Has to be stored in the same tablespace with the LOB segment
  – That way addressing can be done using RDBAs
  – Logically related segments also grouped physically

disable storage in row LOB index
  row#9[7536] flag: ------, lock: 2, len=50, data:(32):
   00 20 03 00 00 00 00 7d 1d 4c 00 00 00 00 00 01 <- kdlinode
   01 80 00 eb 01 80 00 ec 01 80 00 ed 01 80 00 ee <- RDBAs
  col 0; len 10; (10): 00 00 00 01 00 00 00 00 2f 46 <- LOB ID
  col 1; len 4; (4): 00 00 00 00
  row#10[7486] flag: ------, lock: 2, len=50, data:(32):
   01 80 00 ef 01 80 00 f0 01 80 01 39 01 80 01 3a <- only RDBAs
   01 80 01 3b 01 80 01 3c 01 80 01 3d 01 80 01 3e <- only RDBAs
  col 0; len 10; (10): 00 00 00 01 00 00 00 00 2f 46 <- LOB ID
  col 1; len 4; (4): 00 00 00 04 <- offset for above chunks in LOB
  row#11[7436] flag: ------, lock: 2, len=50, data:(32):
   01 80 01 3f 01 80 01 40 01 80 01 41 01 80 01 42
   01 80 01 43 01 80 01 44 01 80 01 45 01 80 01 46
  col 0; len 10; (10): 00 00 00 01 00 00 00 00 2f 46
  col 1; len 4; (4): 00 00 00 0c
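Reading the three index rows together gives a chunk-number-to-block map for the LOB. The sketch below is only an interpretation of this dump: it assumes `col 1` is the number of the first chunk covered by the row, that each RDBA is a big-endian 4-byte value, and that an RDBA splits into 10 bits of file number and 22 bits of block number. `INDEX_ROWS` and `build_chunk_map` are illustrative names.

```python
from typing import Dict, List, Tuple

# (first chunk number from col 1, RDBA bytes from the data area) for rows 9-11;
# the 16-byte kdlinode that leads row#9 is left out.
INDEX_ROWS: List[Tuple[int, str]] = [
    (0x00, "01 80 00 eb 01 80 00 ec 01 80 00 ed 01 80 00 ee"),
    (0x04, "01 80 00 ef 01 80 00 f0 01 80 01 39 01 80 01 3a "
           "01 80 01 3b 01 80 01 3c 01 80 01 3d 01 80 01 3e"),
    (0x0C, "01 80 01 3f 01 80 01 40 01 80 01 41 01 80 01 42 "
           "01 80 01 43 01 80 01 44 01 80 01 45 01 80 01 46"),
]

def build_chunk_map(rows: List[Tuple[int, str]]) -> Dict[int, Tuple[int, int]]:
    """Map chunk number -> (relative file number, block number)."""
    chunk_map: Dict[int, Tuple[int, int]] = {}
    for first_chunk, hex_rdbas in rows:
        raw = bytes.fromhex(hex_rdbas)
        for i in range(0, len(raw), 4):
            rdba = int.from_bytes(raw[i:i + 4], "big")
            chunk_map[first_chunk + i // 4] = (rdba >> 22, rdba & 0x3FFFFF)
    return chunk_map

chunk_map = build_chunk_map(INDEX_ROWS)
print(len(chunk_map), chunk_map[0], chunk_map[4], chunk_map[19])
```

Under these assumptions the first row covers chunks 0-3 and each following row covers eight more, so the offsets 0, 4 and 0x0c in `col 1` line up with the number of RDBAs per row.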

LOB Segment Structure
• A bunch of datablocks organized in chunks
  – Chunks are sometimes called pages or fatblocks
• A chunk is the minimum size of IO for LOBs
• Read consistency and rollback are implemented by creating versions of chunks
  – During update, a new LOB ID is created
  – A new version of updated chunks is created
  – New locator's inode and chunk pointers in LOB indexes point to new chunk version
• Old chunks are retained in LOB segment
  – Depending on PCTVERSION or RETENTION

Changes Inside LOB Segment
• Users access LOBs using locators
• A locator allows read consistent access to a LOB instance
  – Even when LOB chunks are updated by another transaction
  – Old locators' pointers are kept in LOB index
  – Old chunk versions are read; if they have been overwritten, ORA-1555 is returned
[Diagram: LOB index (root block) pointing to chunk versions 1 and 2 in the LOB segment (segment header, bitmap)]

(N)CLOB charset issues
• (N)CLOBs are always fixed-width starting from 8i (UTF-8, AL32UTF16)
  – Varying width charsets are converted to fixed width internally
  – Fixed width charsets will remain unchanged
• AL16UTF16 is always Big-Endian encoded
  – big-endian means that most significant byte of a word is stored first
• UCS2 encoding is platform dependent
• Comparing different-endian internal data structures may use more CPU

Little Endian vs. Big Endian
• Little endian stores less significant values of a word first (to lower memory address):
                   Original  Little Endian  Big Endian
    Character 'X': 00 58     58 00          00 58
• Transparent to user, but may have performance impact
• 10g automatically stores all (N)CLOBs in big-endian
  – Use this query to find old-fashioned LOBs
  – Convert them to a new table using CTAS
  select o.name, c.name
  from obj$ o, col$ c, lob$ l
  where l.obj#=o.obj# and o.obj#=c.obj# and l.intcol# = c.intcol#
    and c.type#=112 and bitand(l.property, 512) = 512;
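The byte order difference in the table above is easy to reproduce outside the database. A short sketch using Python's standard UTF-16 codecs as a stand-in for the 2-byte internal CLOB encoding (the stand-in is an assumption, the byte patterns are what matters):

```python
import sys

big = "X".encode("utf-16-be")     # most significant byte first
little = "X".encode("utf-16-le")  # least significant byte first

print("big endian:   ", big.hex())
print("little endian:", little.hex())
print("this machine: ", sys.byteorder)
```

The CLOB dumps earlier in this deck show each 'X' as `58 00`, which is the little-endian form.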

LOB Storage Planning and Tuning
• LOB block size
  – Blocksize can be different from table block size
  – Chunk sizes are multiples of Oracle blocks
• Chunk size
  – Bigger for big LOBs
  – Bigger chunks allow Oracle to do multiblock reads
  – Bigger sizes keep LOB indexes smaller
• PCTVERSION / RETENTION
• Caching (file system read, HW read/write)
• Asynch IO

Block Size Considerations
  SQL> create table tl (a clob);
  SQL> insert into tl values (rpad('X', 1982, 'X'));
  SQL> insert into tl values (rpad('X', 1982, 'X'));
  SQL> analyze table tl compute statistics;
  SQL> select blocks, avg_space from user_tables where table_name = 'TL';
      BLOCKS  AVG_SPACE
  ---------- ----------
           2       4070
  SQL> truncate table tl;
  SQL> alter table tl pctfree 0;
  SQL> insert into tl values (rpad('X', 1982, 'X'));
  SQL> insert into tl values (rpad('X', 1982, 'X'));
  SQL> analyze table tl compute statistics;
  SQL> select blocks, avg_space from user_tables where table_name = 'TL';
      BLOCKS  AVG_SPACE
  ---------- ----------
           1         62
May cause space and IO wastage, especially in FREELIST managed tables
[Diagram: datablock with PCTFREE 10% and PCTUSED 40% marks]
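The two AVG_SPACE figures can be reproduced with a toy model of one block. Everything numeric here except the slide's outputs is an assumption: an 8192-byte block, a fixed block overhead of 114 bytes and a row size of 4008 bytes (4000-byte LOB column plus a few bytes of row overhead) are back-derived from the 4070 and 62 values, and the insert rule is deliberately simplified. It is a sketch of the reasoning, not of Oracle's actual space management.

```python
BLOCK_BYTES = 8192     # assumption: 8K block size
BLOCK_OVERHEAD = 114   # assumption: back-derived from AVG_SPACE in the session above
ROW_BYTES = 4008       # assumption: 4000-byte LOB column plus ~8 bytes of row overhead

def free_after(rows: int) -> int:
    """Free bytes left in a block that holds `rows` rows."""
    return BLOCK_BYTES - BLOCK_OVERHEAD - rows * ROW_BYTES

def rows_per_block(pctfree: int) -> int:
    """Rows accepted by one block under a simplified PCTFREE rule."""
    reserve = BLOCK_BYTES * pctfree // 100
    rows = 0
    # Simplified rule: accept a row only if `reserve` bytes stay free afterwards.
    while free_after(rows + 1) >= reserve:
        rows += 1
    return rows

for pctfree in (10, 0):
    rows = rows_per_block(pctfree)
    print(f"pctfree={pctfree}: {rows} row(s) per block, {free_after(rows)} bytes free")
```

With PCTFREE 10 the second row would leave only 62 bytes free, less than the reserve, so it goes to a second block and each block keeps about half of its space unused. With PCTFREE 0 both rows share one block.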

Partitioned LOBs
• Similar to regular table partitioning
• LOB segment partitions have to be in same blocksize
  – But LOB segment block size may differ from parent table segment's blocksize
• Partitioned LOBs are fully supported for tables
• 10g supports partitioned IOT with LOBs
• Eases backup & recovery in VLDBs

LOB Cache Layer Tuning
• LOBs can have the following internal caching attributes:
  – CACHE
  – NOCACHE
  – CACHE READS (8.1.6+)
• Regardless of these settings, in-line LOBs are cached anyway, since they are regular columns in a row
• LOB indexes are always cached
• Client side caching can be done using OCI and Thick JDBC
  – Uses LOB buffering subsystem (LBS)
  create table t (a clob) lob (a) store as lob_a (disable storage in row nocache nologging);

CACHE LOB
• Reads and writes are done through buffer cache
• Any modifications will be logged
  – Only redo vectors for actual changes in the block are logged
• Modifications are asynchronously written to disk by DBWR
• Can saturate buffer cache heavily
  – Especially for older databases without buffer cache touch count mechanism
  – There is no _small_table_threshold parameter for LOBs to avoid large buffered scans

CACHE LOB Continued
• Consider different blocksize and buffer pool for LOBs
• Wait event:
  – db file sequential/scattered read
  – P1: file number
  – P2: first DBA
  – P3: blockcount
• Good for accessing and modifying the same data frequently
[Diagram: server process PGA and DBWR]

NOCACHE LOB
• Server process reads blocks directly to PGA
• Writes are also done by server process
• Server process waits until writes are completed
• Even though buffer cache is bypassed, some coordination work is still required
• Wait event:
  – direct path read / write (lob)
  – P1: file number
  – P2: first DBA
  – P3: blockcount
[Diagram: server process PGA]

CACHE READS LOB
• Reads will be done through buffer cache
• Writes are direct
• Good for frequently read, but rarely modified data
• Particularly good in environments with huge incoming data feeds, where writes can be cached and batched on client side or in OCI
• Where caching writes would cause too much logging

LOB Logging Attributes
• Inline LOBs are always logged!
• LOB indexes are always logged!
• CACHE LOBs are always logged!
• NOLOGGING works only for NOCACHE and CACHE READS LOBs
• Using LOGGING NOCACHE LOB, the whole chunks will be logged, regardless of the size of updated content – performance impact
• NOLOGGING NOCACHE LOB introduces backup & recovery issues
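The logging rules above reduce to a short truth table. The sketch below encodes only what the bullets state; the function and parameter names are invented for illustration, and it covers LOB data only (LOB index changes are always logged).

```python
def effective_logging(cache_mode: str, stored_inline: bool, nologging: bool) -> bool:
    """Return True when changes to the LOB data generate redo, per the rules above."""
    if cache_mode not in ("CACHE", "NOCACHE", "CACHE READS"):
        raise ValueError(f"unknown cache mode: {cache_mode}")
    # Inline LOBs and CACHE LOBs are always logged, whatever was requested.
    if stored_inline or cache_mode == "CACHE":
        return True
    # NOLOGGING only takes effect for out-of-line NOCACHE and CACHE READS LOBs.
    return not nologging

for mode in ("CACHE", "NOCACHE", "CACHE READS"):
    print(mode, effective_logging(mode, stored_inline=False, nologging=True))
```

Only the last two modes report `False`, which is the combination that carries the backup and recovery caveat.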

Nologging Direct IO Implications
• Controlfile updates on every NOLOGGING operation
  – Extreme controlfile parallel read and write events
• Set event 10359 at level 1 to avoid controlfile updates
  – Documented in 9.0.1 docs and Note 1058851.6
  – The event doesn't change recoverability, only RMAN's ability to detect files needing backup
  – Affects unrecoverable_change# in V$DATAFILE
• Backup & Recovery considerations

Developer Strategies
• Simply storing and reading LOBs is relatively simple
• Frequently manipulated LOBs may become resource hungry
• Use temporary LOBs
  – Support in PL/SQL, OCI, JDBC
  – dbms_lob.createtemporary()
• Use LOB Buffering Subsystem (LBS)
  – Allows buffering, modifying and batching updates to LOBs on the client side

Temporary LOBs
• Regular LOBs stored in database are called internal persistent LOBs
• Temporary LOBs are in-session structures used for efficient LOB manipulation
  – In-memory objects in UGA
  – “paged” to temporary tablespace
  – No logging, rollback or CR mechanisms
  – Is empty when instantiated
  – Can be instantiated using CACHE or NOCACHE attribute
• Very lightweight

Architectural Strategies
• Uniform access to LOBs whether stored inside or outside the database (BFILEs)
  – The same code can be reused regardless of the storage type (although BFILEs are read only)
• When loading, buffer and bundle at client side as much as possible
  – Means fewer calls to database and fewer direct write requests if NOCACHE or CACHE READS LOBs
• Use LOB random access facilities if required
  – A major benefit over LONGs

General Recommendations
• The best way for optimizing performance of an operation is not to do the operation at all
  – Cache, bundle, etc.
  – However, caching is not always the best approach
• Several bugs related to ASSM and LOBs
  – Free space never reused, etc.
  – Use manual (freelist) segment space mgmt.
  – Any new functionality (such as partitioned IOT with LOBs) should be evaluated and tested carefully
• Incremental backups of fairly static content

Summary
• LOBs should be used instead of LONGs
  – However in some cases RAW or VARCHAR may be better
• Can be enable storage in row or disable storage in row
  – Disable storage in row always out of line, leaving 20-byte LOB locator in row
  – Content accessed through LOB index
  – Enable storage in row keeps content in row if content+overhead <= 4000 bytes (3964+20+16)
• NOCACHE, CACHE, CACHE READS, LOGGING, NOLOGGING options
Thank you!
integrid.info