© Copyright IBM Corporation, 2011
Tzahi Shahak
Product Manager, Real-time Compression
Real-time Compression in Storwize V7000 & SAN Volume Controller (SVC)
Compression Without Compromise
Enhancing Storwize V7000 to Deliver Extraordinary Efficiency
• Introducing IBM Storwize V7000 Real-time Compression
• Innovative, easy-to-use compression fully integrated into Storwize V7000
• High performance implementation supports active primary workloads
• Storwize V7000 Real-time Compression typically delivers 50% or better compression for data that is not already compressed
• Compression helps reduce
– Storage purchase costs
– Rack space
– Cooling
– Software costs for additional functions
• Compression can help freeze storage growth or delay need for additional purchases
Compression Without Compromise
Advantages Compared with other Technologies
IBM Real-time Compression can be used with active primary data
– High performance compression supports workloads off-limits to other alternatives
– Significantly expands candidate data for compression
– Greater compression benefits through use on more types of data
IBM Real-time Compression operates immediately and is easy to manage
– No need to schedule periods to run post-process compression
– Eliminates need to reserve space for uncompressed data waiting post-processing
IBM Real-time Compression supports all Storwize V7000 storage
– Internal or externally virtualized storage
– Can significantly enhance value of existing storage assets
Real-time Compression – Basics
Compression is an alternative to Thin Provisioning
– They both allow you to use less physical space on disk than is presented to the host
A Compressed Volume is “a kind of” Thin Provisioning
– Only uses physical storage to store compressed data
– Volume can be built from a pool using internal or external MDisks
Compression requires the I/O group hardware be one of the following platforms
– SVC Model 2145-CF8/CG8 Nodes
– Storwize V7000 Model 2076-1xx/3xx Control Enclosure
Can use Volume mirroring to convert to a Compressed Volume
Real-time Compression – Basics
Maximum of 200 Compressed Volumes per I/O group will initially be supported
Licensing is as follows:
– For SVC it is per TB of Volume capacity as seen by a host
• E.g. fifty 100GB Compressed Volumes require a 5TB license
– For Storwize V7000 it is per enclosure
• E.g. a customer with a 4-enclosure system that is virtualizing an external disk system with 2 enclosures would require a 6-enclosure license
Note: Creating the first Compressed Volume in an I/O group will instantly dedicate CPU and memory resources from the nodes/node canisters in that I/O group to the compression engine
– So planning/sizing should be done before implementing in a production environment
More detail on this and how compression works will be provided on the June 13th call tomorrow
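The two licensing examples on this slide reduce to simple arithmetic. The sketch below only restates them; the function names are hypothetical, the use of decimal units (1 TB = 1000 GB) is an assumption, and any rounding rules in the real license terms are not covered.

```python
def svc_license_tb(volume_sizes_gb):
    """SVC: licensed capacity is the host-visible size of all Compressed Volumes."""
    return sum(volume_sizes_gb) / 1000  # assumption: decimal units, 1 TB = 1000 GB


def v7000_license_enclosures(internal_enclosures, external_enclosures):
    """Storwize V7000: one license per enclosure, internal or externally virtualized."""
    return internal_enclosures + external_enclosures


print(svc_license_tb([100] * 50))       # fifty 100GB Compressed Volumes
print(v7000_license_enclosures(4, 2))   # 4-enclosure system plus 2 external enclosures
```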
Real-time Compression – Basics
[Figure: software stack between Clients and Storage. SVC S/W components: Front End, Remote Copy, Cache, FlashCopy, Mirroring, Thin Provisioning, Virtualization, Back End. RACE S/W component: Random Access Compression Engine™]
All copy services will interoperate with compressed Volumes
– All copy services will be working with uncompressed data
• No real changes in sizing and planning for FlashCopy or replication
– Bandwidth sizing for replication is the same for compressed/non-compressed Volumes
– Compression engine resources allocated per I/O group need to be considered in sizing
All Thin Provisioning properties apply to compressed Volumes
– Virtual capacity, real capacity, used capacity, etc.
New property introduced
– Uncompressed capacity
• Provides an indication of how much uncompressed data has been written to the Volume
Real-time Compression – GUI Support
GUI Displays Compression Savings on a Volume, Pool and System basis:
Real-time Compression – GUI Support
GUI Performance panel shows separate CPU utilization for Compression and System workloads
Real-time Compression – Sizing Tools
The following tools will be available to support customers deploying Compression
– Disk Magic
• Will ask the user to provide an “Effectiveness” value (similar to Easy Tier)
– Available later this year
– Capacity Magic
• Will ask the user to provide a compression ratio to complete the sizing
– Comprestimator
• A tool to estimate the compression ratio which is achievable for a given set of data
• Loaded on customer’s hosts
Real-time Compression – 45 Day Trial License
45 Day Free Trial License of Compression Function
– Included in the software, so simply activate it using the GUI by setting the license value to something other than zero, which avoids errors in the event log
Compression Without Compromise – VMware
• Storwize V7000 with Real-time Compression delivers up to 4x compression while maintaining VMware and application performance
[Figure: VMware VMmark Performance Benchmarks. Four panels (MailServer Score, MailServer QoS, OLIO Score, OLIO QoS), each comparing Uncompressed and Compressed; vertical axes are Measured Score (Performance) and Measured QoS (Response Time)]
Compression Without Compromise
Storwize V7000 with Real-time Compression delivers up to 5x compression while maintaining or improving application business throughput
Business throughput / Transactional IOPS – higher is better
Source: IBM lab measurements, 96/48-drive configurations
[Figure: bar chart of Transaction Processing, No Compression vs. Compressed; vertical axis is Transaction Throughput, 0 to 3000]
Database Performance
Storwize V7000 with Real-time Compression delivers up to 5x compression while maintaining or improving database transaction response time and overall business throughput
Tested using industry standard TPC-C Benchmark – 1.2TB DB2 Database with 700 users
The benchmark was performed using a Storwize V7000 system with 300GB SAS HDDs and 300GB SSDs. A 1.2TB DB2 database with 700 concurrent clients was used in the benchmark. The same test was performed with compressed volumes and non-compressed volumes.
Response time in seconds – lower is better (faster response time)
[Figure: bar chart of Response Time in Seconds (axis 0 to 1.2) for the Stock Level, Delivery and Order Status transactions, with three series: No Compression – 96 disks, Compressed – 48 disks, Compressed – 6 Flash Drives. Data labels in the chart: 1.144, .857, .468, .701, .665, .385, .46, .20, .501]
Beta Customer database testing results
SVC virtualizing 6-node XIV Gen2 configuration
Orion (Oracle I/O Calibration Tool) is a standalone tool for calibrating the I/O performance for storage systems that are intended to be used for Oracle databases. The calibration results are useful for understanding the performance capabilities of a storage system, either to uncover issues that would impact the performance of an Oracle database or to size a new database installation
Expected Compression Rates
IBM Comprestimator tool should be used to evaluate expected compression benefits in existing environments
Comprestimator
Comprestimator is a host based utility for a fast estimation of a block device compression ratio
Objectives:
– Run over a block device
– Estimates:
• Portion of non-zero blocks in the volume
• Compression rate of non-zero blocks
Performance:
– Runs FAST! < 60 seconds, no matter what the volume size is
– Provides accuracy level for the estimation: ~5% max error
• Can improve guarantee with more samples (longer running time)
Method:
– Random sampling and compression throughout the volume
– Collect enough non-zero samples to gain desired confidence
• More zero blocks slower (takes more time to find non-zero blocks)
– Mathematical analysis gives confidence guarantees
Note: the tool is estimating compression during migration of a volume into RtC
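The sampling method described on this slide can be illustrated with a minimal sketch. This is not the Comprestimator implementation: it uses zlib as a stand-in compressor, a bytes object as a stand-in for a block device, and a fixed sample count instead of sampling until a confidence target is met. The function name, block size and sample count are all assumptions.

```python
import random
import zlib


def estimate_compression(volume, block_size=4096, samples=200, seed=0):
    """Estimate (non_zero_fraction, compressed_fraction) by random sampling.

    `volume` is a bytes-like object standing in for a block device.
    """
    rng = random.Random(seed)
    n_blocks = len(volume) // block_size
    zero_block = bytes(block_size)
    non_zero = raw = packed = 0
    for _ in range(samples):
        start = rng.randrange(n_blocks) * block_size
        block = volume[start:start + block_size]
        if block == zero_block:
            continue  # zero blocks lower the non-zero fraction and are not compressed
        non_zero += 1
        raw += len(block)
        packed += len(zlib.compress(block))
    non_zero_fraction = non_zero / samples
    compressed_fraction = packed / raw if raw else 0.0
    return non_zero_fraction, compressed_fraction


# Half of this stand-in volume is zeros, half is repetitive text.
volume = bytes(4096 * 100) + (b"compressible text " * 30000)[:4096 * 100]
print(estimate_compression(volume))
```

Only the sampled blocks are read, so the running time depends on the sample count rather than on the volume size, which is the property the slide highlights.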
Compression Implementation Guidelines
Compression performance
– Performance of thin provisioned volumes (supported by the same number of HDDs) with and without compression is roughly equivalent
Use Comprestimator to identify workloads that are good candidates for compression
– More than 45% savings – recommend to compress
– Between 25-45% savings – recommend evaluating workload with compression
– Less than 25% savings – recommend avoiding compression
Common workloads suitable for compression
– Databases – DB2, Oracle, MS-SQL, etc.
– Applications based on databases – SAP, Oracle Applications, etc.
– Server Virtualization – KVM, VMware, Hyper-V, etc.
– Other compressible workloads – engineering, seismic, collaboration, etc.
Common workloads NOT suitable for compression
– Workloads using pre-compressed data types such as video, images, audio, etc.
– Workloads using encrypted data
– Heavy sequential write oriented workloads
– Other workloads using incompressible data or data with low compression rate
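The Comprestimator savings thresholds on this slide can be written as a small decision helper. This is a sketch only: the function name and return labels are hypothetical, and treating exactly 25% and exactly 45% as part of the "evaluate" band is an assumption, since the slide does not say which side the boundaries fall on.

```python
def compression_recommendation(savings_percent):
    """Map an estimated savings percentage to the guideline on the slide."""
    if savings_percent > 45:
        return "compress"
    if savings_percent >= 25:
        return "evaluate"  # assumption: both boundary values land in this band
    return "avoid"


for savings in (60, 45, 30, 10):
    print(savings, compression_recommendation(savings))
```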
Additional Considerations
Compression is supported for a maximum of 200 compressed volume copies per I/O group. This limit applies only to compressed volumes; there is no restriction on the number of non-compressed volumes
Recommended to use compression on:
– 4 core systems (V7000, CF8, older CG8) with less than 25% CPU utilization (before enabling compression)
– 6 core systems (newer CG8) with less than 50% CPU utilization (before enabling compression)
The CPU reallocation is done as soon as the first compressed volume is defined (even if it is not used)
If existing system CPU utilization is over the thresholds mentioned above, a new I/O group to support compression can be added to the cluster in environments with fewer than 4 I/O groups
Compressed volumes are not supported with Easy Tier in this release
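As a planning aid, the limits on this slide can be combined into one pre-check. This is a hypothetical helper, not a product command: the names are invented for illustration, and the numbers are the ones quoted above (200 compressed volume copies per I/O group, 25% CPU on 4 core systems, 50% on 6 core systems).

```python
MAX_COMPRESSED_COPIES = 200          # per I/O group, from the slide
CPU_LIMIT_PERCENT = {4: 25, 6: 50}   # cores -> utilization limit before enabling


def can_enable_compression(cores, cpu_utilization_percent, compressed_copies=0):
    """Return True if the I/O group is within the recommended limits."""
    if cores not in CPU_LIMIT_PERCENT:
        raise ValueError("only 4 core and 6 core systems are covered by the guideline")
    within_cpu = cpu_utilization_percent < CPU_LIMIT_PERCENT[cores]
    within_copies = compressed_copies < MAX_COMPRESSED_COPIES
    return within_cpu and within_copies


print(can_enable_compression(4, 20))   # 4 core system at 20% CPU
print(can_enable_compression(6, 55))   # 6 core system at 55% CPU
```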
Compression Basics – Lempel-Ziv
Detects repetitions in the data
Replaces portions of the data with references to matching data
ASJKDFHASJABCDEFORIUFSDFWEIRUCMNXSDFKWIOEUZXCMZXNVSFJSDFLSJCXCSKLRJHWEIOUOCZXCVKMSNDKSFSJZXM23NB33KJK1J1HJGJHHJ1VFHJGFHJ1GHJG23GJ123ABCDEFDHJKWEIORUWOEIRIXCVLXVJLASDFSDF’LSDGRERMNJDFJKGDGJERTYUIRDJKGHDKJTEHTREUITYEUIDWOSIOSDFWEOIRUABCDEFKDFHSDFJHWEIORWERYWEFUWYEIRUWERYXDKJFHSWETR5DFGCVBNA1SFSKLJFSKLDFJSLKDFJSLKDFJSKLDFJSDLFKJSDFKLJSDFKLJSDLFJSDFKLSJDFKLSJDEI4SDFDFDFDSDSDFSDFSDFSDFDSDFSDFSSDF2834HKJH
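The idea of replacing repeated data with references can be shown with a toy LZ77-style coder. This is a brute-force teaching sketch, not DEFLATE and not the RACE implementation; the function names, the token format and the minimum match length are assumptions made for illustration.

```python
def lz77_tokens(data, window=32768, min_match=3):
    """Greedy LZ77: emit literals or (offset, length) references to earlier data."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):
            k = 0
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_match:
            tokens.append((best_off, best_len))  # reference to matching data
            i += best_len
        else:
            tokens.append(data[i])               # literal character
            i += 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the original string by copying from already decoded output."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(token)
    return "".join(out)


text = "XYABCDEFQRSABCDEFTUV"
tokens = lz77_tokens(text)
print(tokens)  # the second ABCDEF becomes one (offset, length) pair
```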
Compression Basics – Sliding Window
• Repetitions can be detected only within the sliding window history
• Common sliding window size – 32K
• Repetitions outside the window cannot be referenced
Window size limit
• Memory footprint required to hold history in searchable manner
• Processing power required for searching larger history window
• Size of pointer needed to reference small repetition
ASJKDFHASJHWRETORIUFSDFWEIRUCMNXSDFKWIOEUZXCMZXNVSFJSDFLSJCXABCDEFHWEIOUOCZXCVKMSNDKSFSJZXM23NB33KJK1J1HJGJHHJ1VFHJGFHJ1GHJG23GJ123ABCDEFDHJKWEIORUWOEIRIXCVLXVJLASDFSDF’LSDGRERMNJDFJKGDGJERTYUIRDJKGHDKJTEHTREUITYEUIDWOSIOSDFWEOIRUKDFHSDFJHWEIORWERYWEFUWYEIRUWERYXDKJFHSWETR5DFGCVBNA1SFSKLJFSKLDFJSLKDFJSLKDFJSKLDFJSDLFKJSDFKLJSDFKLJSDABCDEFLFJSDFKLSJDFKLSJDEI4SDFDFDFDSDSDFSDFSDFSDFDSDFSDFSSDF2834HKJH
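The effect of the window limit can be observed with zlib, whose DEFLATE format uses a 32 KB sliding window. The sketch below compresses a block of pseudo-random bytes followed by an exact copy of itself; the helper name and block sizes are assumptions chosen so that one copy falls inside the window and the other outside it.

```python
import random
import zlib


def repeated_block_ratio(block_len, seed=0):
    """Compressed/original size for a pseudo-random block followed by its copy."""
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(block_len))
    data = block + block
    return len(zlib.compress(data)) / len(data)


near = repeated_block_ratio(20_000)  # copy starts 20,000 bytes back: inside the window
far = repeated_block_ratio(40_000)   # copy starts 40,000 bytes back: outside the window
print(f"repeat inside window:  {near:.2f}")
print(f"repeat outside window: {far:.2f}")
```

In the first case the second copy is encoded almost entirely as references, so the output is roughly half the input; in the second case the repetition is invisible to the compressor and nothing is saved.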
Compression Basics – Huffman Coding
Detects common characters in data
Represents common characters using fewer bits
IAIIIABIBIIDMIBBBMIIIIIBBBMADKLEBBIIIIBBBIIIIAJHJKJDAMMMMIIIIIIIBBBIIIIISDFDIOIIIIIIIABBBBBMIIIIMMMIIIIIIDDFMMMMIIGFMMAEERTGMMDFMMMIIIIIIIAAABBBBBBBIIIIIUIIDIIIIIIDDGDBBBBBBBMMMEERMBBIIBMI
Common Char    Bit Representation
I              0
B              10
M              110
Other          111 + 8 Bits
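The code table on this slide can be applied directly. The sketch below hard-codes that table rather than building a Huffman tree from character frequencies, and it assumes 8-bit characters for the "Other" escape; the function names are hypothetical.

```python
CODES = {"I": "0", "B": "10", "M": "110"}  # table from the slide
ESCAPE = "111"                             # prefix for any other character


def encode(text):
    """Encode text as a bit string using the fixed prefix code."""
    bits = []
    for ch in text:
        if ch in CODES:
            bits.append(CODES[ch])
        else:
            bits.append(ESCAPE + format(ord(ch), "08b"))  # assumes ord(ch) < 256
    return "".join(bits)


def decode(bits):
    """Decode a bit string produced by encode()."""
    by_code = {code: ch for ch, code in CODES.items()}
    out, i = [], 0
    while i < len(bits):
        if bits.startswith(ESCAPE, i):
            out.append(chr(int(bits[i + 3:i + 11], 2)))
            i += 11
            continue
        for code, ch in by_code.items():
            if bits.startswith(code, i):
                out.append(ch)
                i += len(code)
                break
        else:
            raise ValueError("invalid bit string")
    return "".join(out)


sample = "IAIIIABIBIIDMIBBBMIIIIIBBBM"  # start of the slide's sample data
encoded = encode(sample)
print(len(encoded), "bits instead of", 8 * len(sample))
```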
Compression – Random Access
• Data is dependent on preceding data due to the nature of compression
• In order to read from a specific location, all data before it has to be decompressed
• To write to a specific location, all data after it has to be recompressed as well
• Not effective for large files or block devices
• Conventional compression implementations do not support random access
Compression – Random Access Chunks
• Break original data into fixed chunks
• Each chunk is compressed and decompressed independently
• Enables some random access to the data (reads, not writes)
• Large chunks – Heavy I/O penalty
– 4KB update = 1MB read + 1MB write
• Small chunks – Poor compression
• Variable output
– Data fragmentation
– Lower performance over time
– Lower compression ratio over time
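The fixed-chunk approach and its two drawbacks can be sketched with zlib. The 64 KiB chunk size and the function names are assumptions for illustration; the point is that a read touches only the covering chunks, while a small update forces a whole chunk to be decompressed and recompressed, and the compressed chunk changes size afterwards.

```python
import zlib

CHUNK = 64 * 1024  # hypothetical chunk size


def compress_chunks(data, chunk_size=CHUNK):
    """Compress each fixed-size chunk of `data` independently."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def read_at(chunks, offset, length, chunk_size=CHUNK):
    """Random read: decompress only the chunks covering the requested range."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    plain = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return plain[start:start + length]


def write_at(chunks, offset, payload, chunk_size=CHUNK):
    """Update: every touched chunk is decompressed, patched and recompressed."""
    first = offset // chunk_size
    last = (offset + len(payload) - 1) // chunk_size
    plain = bytearray(b"".join(zlib.decompress(c) for c in chunks[first:last + 1]))
    start = offset - first * chunk_size
    plain[start:start + len(payload)] = payload
    for n, index in enumerate(range(first, last + 1)):
        chunks[index] = zlib.compress(bytes(plain[n * chunk_size:(n + 1) * chunk_size]))


data = b"abcdefgh" * 32768                       # 256 KiB, four 64 KiB chunks
chunks = compress_chunks(data)
before = len(chunks[1])
write_at(chunks, 70000, bytes(range(256)) * 16)  # 4 KiB update inside chunk 1
print(before, "->", len(chunks[1]), "compressed bytes for chunk 1")
```

Because the recompressed chunk no longer has its old size, it cannot simply be written back in place, which is the source of the fragmentation listed above.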
Variable Input Fixed Output
• RACE flips this approach, taking a variable data stream size and producing “fixed” output units
– Compressed volumes have a consistent layout
– Temporal locality: data that’s accessed together is compressed together
– Variable sized input chunks get better compression
– Requires fewer disk I/Os
– Delivers better performance
• No Fragmentation
• Consistent performance over time
• Consistent compression ratio over time
Temporal Compression
Applications make multiple updates to data
Traditional and post-process compression uses fixed-sized chunks and compresses each update based on its location on a volume
RACE compression acts on data that is written around the same time (“temporal locality”), not according to location
Temporal locality is more related to real application operations
RACE takes advantage of the structure of the data and its application level relations
Better compression efficiency and performance
[Figure: data updates 1, 2, 3 (# = Data Update) shown along a Time axis, contrasting a Temporal Compression Window with a Location Compression Window]
Compressed Data Indexing
• Data is mapped to its location in the compressed container
• Efficient data updates are made possible with remapping
• Hierarchical indexing enables fast access and efficient memory usage
• Efficient write of the map with low I/O overhead
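A minimal sketch of such a mapping is shown below. The slides do not describe the actual index layout, so the two-level dictionary, the fanout value and the (container block, offset, length) tuple are all assumptions; the sketch only illustrates that an update is a remap and that leaves are created on demand.

```python
class CompressedIndex:
    """Two-level map: logical block -> (container block, offset, length)."""

    def __init__(self, fanout=1024):
        self.fanout = fanout
        self.root = {}  # top level: region number -> leaf dictionary

    def remap(self, logical_block, location):
        """Point a logical block at the new location of its compressed data."""
        leaf = self.root.setdefault(logical_block // self.fanout, {})
        leaf[logical_block % self.fanout] = location

    def lookup(self, logical_block):
        """Return the current location, or None if the block was never written."""
        leaf = self.root.get(logical_block // self.fanout)
        if leaf is None:
            return None
        return leaf.get(logical_block % self.fanout)


index = CompressedIndex()
index.remap(5, (0, 0, 512))      # first write of logical block 5
index.remap(5, (7, 128, 300))    # update: the data now lives elsewhere
print(index.lookup(5), index.lookup(999_999))
```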
Compression Journaling
• Compressed Data is written in a journal
– Physical location
– Length
– Data
• Journal entries are compressed
• Compressed data populates fixed length blocks
• Enables temporal data compression
• Compressed data write – no read before write
• Compressed data write – less data written to disk
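The journal idea can be sketched as an append-only log of (physical location, length, data) entries. The entry layout, the class name and the use of zlib are assumptions; the sketch shows that an overwrite is just another appended entry, so nothing already stored has to be read before the write.

```python
import struct
import zlib

HEADER = struct.Struct("<QI")  # assumed layout: 8-byte location, 4-byte length


class Journal:
    def __init__(self):
        self._log = bytearray()

    def append(self, location, data):
        """Record a write by appending; existing entries are never read back."""
        self._log += HEADER.pack(location, len(data)) + data

    def compressed(self):
        """Compress the accumulated journal entries."""
        return zlib.compress(bytes(self._log))

    @staticmethod
    def replay(blob):
        """Rebuild location -> data; later entries win over earlier ones."""
        raw = zlib.decompress(blob)
        state, pos = {}, 0
        while pos < len(raw):
            location, length = HEADER.unpack_from(raw, pos)
            pos += HEADER.size
            state[location] = raw[pos:pos + length]
            pos += length
        return state


journal = Journal()
journal.append(4096, b"old")
journal.append(8192, b"other")
journal.append(4096, b"new!")   # overwrite of location 4096, appended without a read
print(Journal.replay(journal.compressed()))
```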
Progressive Compressed Block Write
• Each write from the host is compressed independently
• Compression ratio of the resulting block is nearly identical to the ratio achieved by compressing the entire data in one operation
• Compression dictionary is preserved between the independent writes
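The effect of preserving the dictionary between writes can be approximated with zlib's sync flush, which emits everything compressed so far while keeping the history window. This is a stand-in for illustration, not the RACE mechanism; the function names and the test data (pages that repeat across writes but not within a write) are assumptions.

```python
import random
import zlib


def compress_progressively(writes):
    """Compress each write as it arrives, keeping the dictionary between writes."""
    compressor = zlib.compressobj()
    return [compressor.compress(w) + compressor.flush(zlib.Z_SYNC_FLUSH)
            for w in writes]


def compress_isolated(writes):
    """Compress each write on its own, with no shared history."""
    return [zlib.compress(w) for w in writes]


rng = random.Random(0)
pages = [bytes(rng.getrandbits(8) for _ in range(2000)) for _ in range(4)]
writes = [pages[i % 4] for i in range(20)]   # redundancy exists only across writes

progressive = sum(len(p) for p in compress_progressively(writes))
isolated = sum(len(p) for p in compress_isolated(writes))
one_shot = len(zlib.compress(b"".join(writes)))
print("progressive:", progressive, "isolated:", isolated, "one shot:", one_shot)
```

With a shared dictionary the progressive total stays close to the one-shot result, whereas compressing each write in isolation cannot exploit repetition between writes.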
The following terms are trademarks of International Business Machines Corporation in the United States, other countries, or both: IBM, IBM Logo, on demand business logo, Enterprise Storage Server, xSeries, BladeCenter, eServer, ServeRAID and FlashCopy, System Storage, Tivoli, Easy Tier, Active Cloud Engine
The following are trademarks or registered trademarks of other companies.
Intel is a trademark of the Intel Corporation in the United States and other countries.
Java and all Java-related trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc., in the United States and other countries.
Lotus, Notes, and Domino are trademarks or registered trademarks of Lotus Development Corporation.
Linux is a registered trademark of Linus Torvalds.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Storwize and the Storwize logo are trademarks or registered trademarks of Storwize Inc., an IBM Company.
* All other products may be trademarks or registered trademarks of their respective companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
The information on the new products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information on the new products is for informational purposes only and may not be incorporated into any contract. The information on the new products is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
This presentation and the claims outlined in it were reviewed for compliance with US law. Adaptations of these claims for use in other geographies must be reviewed by the local country counsel for compliance with local laws.
Legal Information and Trademarks