CommVault Building Block Configuration White Paper
June 2011
© Copyright 2011 CommVault Systems, Incorporated. All rights reserved. CommVault, CommVault and logo, the
"CV" logo, CommVault Systems, Solving Forward, SIM, Singular Information Management, Simpana, CommVault
Galaxy, Unified Data Management, QiNetix, Quick Recovery, QR, CommNet, GridStor, Vault Tracker, InnerVault,
Quick Snap, QSnap, SnapProtect, Recovery Director, CommServe, CommCell, ROMS, and CommValue are
trademarks or registered trademarks of CommVault Systems, Inc. All other third party brands, products, service
names, trademarks, or registered service marks are the property of and used to identify the products or services of
their respective owners. All specifications are subject to change without notice.
The information in this document has been reviewed and is believed to be accurate. However, neither CommVault
Systems, Inc. nor its affiliates assumes any responsibility for inaccuracies, errors, or omissions that may be
contained herein. In no event will CommVault Systems, Inc. or its affiliates be liable for direct, indirect, special,
incidental, or consequential damages resulting from any defect or omission in this document, even if advised of
the possibility of such damages.
CommVault Systems, Inc. reserves the right to make improvements or changes to this document and information
contained within, and to the products and services described at any time, without notice or obligation.
Contents
1. Introduction: What is a Building Block?
   1.1. Physical Layer
      1.1.1. At a Glance Specifications and Configurations
      1.1.2. Examples of Servers that Meet Building Block Requirements
   1.2. Logical Layer
      1.2.1. Average Throughput
      1.2.2. Deduplication Databases
      1.2.3. Number of Deduplication Databases per Building Block
      1.2.4. Deduplication Building Block Size Settings
      1.2.5. Managing Multiple DDBs and Hardware Requirements
      1.2.6. Disk Space Required for DDBs
      1.2.7. Disk Library
   1.3. Disk Attachment Considerations
2. Global Deduplication Storage Policy
   2.1. Block Size
   2.2. Disk Libraries
   2.3. Remote Offices
   2.4. Global Deduplication Storage Policy Caveats
   2.5. Streams
   2.6. Data Path Configuration
   2.7. Use Store Priming Option with Source-Side Deduplication
3. Deduplication Database Availability
   3.1. Considerations
4. Building Block Design
   4.1. Choosing the Right Building Block
   4.2. Building Block Configuration Examples
5. Conclusion
Introduction
What is a Building Block?
1. What is a Building Block?
A large data center requires a data management solution that is flexible, scalable, and hardware
agnostic. This paper illustrates how the CommVault Building Block Data Management Solution
delivers all three.
The Building Blocks are flexible because they can grow by adding mount paths. They can also
accommodate different retentions and different data types all within the same deduplication
framework.
The Building Blocks are scalable because they can grow to hundreds of TB of unique data. By
staggering full backups, the Building Blocks can protect large amounts of data with minimal
infrastructure, which holds down cost and liability.
The Building Blocks are hardware agnostic by requiring hardware classes instead of specific models.
Within this paper we describe six different examples of adequate servers from three major
manufacturers. Additionally, the solution is completely flexible with respect to the storage
infrastructure including disk types, connectivity and brand.
A Building Block is a modular approach to data management. A single Building Block is capable of
managing 64 TB of deduplicated data within a Disk Library. Each Building Block also provides processing
throughput of at least 2 TB/hr. The Deduplication Building Block design comprises two layers: the
physical layer and the logical layer. The physical layer is the actual hardware specification and configuration.
The logical layer is the CommCell® configuration that controls that hardware.
Physical Layer
There are four design considerations that make up the Building Block’s physical layer:
Server
Data Throughput Rate
Disk Library Hardware
Deduplication Database (DDB) LUN
Logical Layer
There are seven aspects that comprise the Building Block’s logical layer:
Average Throughput
Deduplication Databases
Number of Deduplication Databases per Building Block
Deduplication Building Block Size Settings
Managing Multiple Global Deduplication Databases and Hardware Requirements
Disk Space required for Deduplication Database
Disk Library
1.1. The Physical Layer
The physical layer comprises the hardware of the solution. In addition to servers, storage and
networking play a part in the physical layer.
1.1.1. At A Glance Specifications and Configurations
Minimum Server Specifications
  OS:   64-bit Windows/Linux
  CPU:  2 CPUs, quad core
  RAM:  32 GB

Minimum Data Throughput Port Specifications
  Option 1 (Recommended): 1 exclusive 10 GigE port
  Option 2:               4 x 1 GigE ports with NIC teaming on the host
Disk Library Configuration
  Option 1 (Recommended): Network Attached Storage (NAS)
    - Exclusive 10 GigE port
    - 7.2K RPM SAS spindles
  Option 2: SAS/FC/iSCSI
    - SAS: 6 Gbps HBA
    - FC: 8 Gbps HBA
    - iSCSI: exclusive 10 GigE NIC
    - 7.2K RPM SATA/SAS spindles
    - Minimum RAID 5 groups with 7+ spindles each
    - 2 TB LUNs, up to 50 LUNs
    - Dedicated storage adapters
Minimum DDB LUN Specifications
  Option 1 - Internal Disk: 6 Gbps SAS HBA
    DDB Volume Specifications: 15K RPM SAS spindles
      RAID 0:  4 spindles
      RAID 5:  5-6 spindles
      RAID 10: 8 spindles
      RAID 50: 10-12 spindles
  Option 2 - SAN Disk: FC: 8 Gbps HBA; iSCSI: exclusive 10 GigE NIC
    DDB Volume Specifications: 15K RPM physical disks
      RAID 0:  4 spindles
      RAID 5:  5-6 spindles
      RAID 10: 8 spindles
      RAID 50: 10-12 spindles
  Note: The LUN hosting the DDB should be 3x the size of the active DDB in order to allow for recovery point reconstruction operations.
1.1.2. Examples of Servers that Meet Building Block Requirements
Rack Servers:
  - Dell R710 with H700 and H800 controllers and MD storage
  - HP DL380 G6 with 480i internal controller and FC/10 GigE iSCSI/ Gbps SAS for external storage
  - IBM x3550 or above with internal SAS controller and external SAS/FC/10 GigE iSCSI controller

Blades:
  - Dell M610 blades in a Dell M1000e enclosure with 10 GigE backplane, with EqualLogic or MD3000i storage or an 8 Gbps FC fabric
  - HP BL460 or BL600 blades in an HP c7000 enclosure with 8 Gbps FC fabric and 10 GigE Ethernet fabric
  - IBM JS, PS or HS blade servers with FC/10 GigE fabrics
1.2. The Logical Layer
The logical layer is the software and configuration that controls the hardware. A properly configured
logical layer allows the physical layer to achieve its potential.
1.2.1. Average Throughput
The Building Block has a minimum throughput rate of 2 TB/hr up to a maximum of 4 TB/hr. A
single Building Block can therefore transfer between 48 TB and 96 TB in a 24-hour period. A typical
streaming backup window is 8 hours, which allows a Building Block to transfer 16 TB to 32 TB of
data. The following table shows expected amounts of data transferred over specific time periods
and throughputs. Most designs should be scaled from an assumption of 2 TB/hr, assuming
a configuration as recommended in this document.
Total amount of data (TB) per backup window

Backup Window   2 TB/hr   3 TB/hr   4 TB/hr
8 Hours         16        24        32
10 Hours        20        30        40
12 Hours        24        36        48
14 Hours        28        42        56
16 Hours        32        48        64
18 Hours        36        54        72
20 Hours        40        60        80
22 Hours        44        66        88
24 Hours        48        72        96
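The table values are simply the window length multiplied by the throughput rate. As a quick illustrative sketch (a hypothetical helper, not a CommVault tool):

```python
def data_transferred_tb(window_hours, throughput_tb_per_hr):
    """Total data a single Building Block can move in a backup window."""
    return window_hours * throughput_tb_per_hr

# An 8-hour window at the 2 TB/hr minimum matches the table's first row.
print(data_transferred_tb(8, 2))   # 16
print(data_transferred_tb(24, 4))  # 96
```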
1.2.2. Deduplication Database
The Simpana® v9 Deduplication engine utilizes a multi-threaded C-Tree server mode database.
This database can scale to a maximum of 750 million records. This record limit is equivalent to
90 TB of data residing on the Disk Library and 900 TB of application data assuming a 10:1
deduplication ratio. The DDB has a recommended maximum of 50 concurrent connections, or
streams. Any configuration above 50 concurrent DDB connections will have a negative impact on
the Building Block's performance and scalability.
Deduplication Database Characteristics
  Database:         C-Tree server mode
  Threading:        Multi-threaded
  DDB Rows:         500 million to a maximum of 750 million records
  Capacity:         60-90 TB of unique data @ 128K block size
  Application Data: 600-900 TB @ 10:1 deduplication ratio
  Connections:      50 concurrent connections
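The capacity figures follow from the record limit: one DDB record per unique 128 K block. A back-of-the-envelope check (binary units are assumed here, which is how the 750-million-record limit lines up with the 90 TB figure):

```python
def ddb_capacity_tib(records, block_kib=128):
    """Unique data addressable by the DDB: one record per deduplicated block."""
    return records * block_kib * 1024 / 1024**4  # bytes -> TiB

max_unique = ddb_capacity_tib(750_000_000)  # ~89 TiB, the "90 TB" ceiling
app_data = max_unique * 10                  # ~900 TB at a 10:1 dedupe ratio
print(round(max_unique, 1))
```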
1.2.3. Number of Deduplication Databases per Building Block
CommVault recommends hosting a single deduplication database per Building Block. However,
certain workloads may require higher concurrency but lower capacity. The Simpana
Desktop/Laptop Solution is a perfect example of this workload. For such workloads, it is possible
to host up to 2 DDBs per Building Block. This is known as DDB Extended Mode. Having the
additional DDB allows a total of 100 streams per Building Block enabling higher concurrency for
the workloads.
In DDB Extended Mode, the total capacity of the DDBs will reach 60-90 TB combined. One DDB
may scale to 20 TB of raw data and the other to 40 TB. There is no easy way to predict
the size to which a DDB will grow. In this configuration, it is a best practice to stagger the
backups so that only one DDB is utilized at a time. This ensures that each DDB will scale
closer to the 60-90 TB raw data capacity.
1.2.4. Deduplication Block Size Setting
It is a CommVault best practice to configure Simpana v9 Deduplication Storage Policy block sizes
at a minimum of 128 K. This recommendation is for all data types other than databases larger
than 100 GB. Large databases can be configured at 256 K (1TB to 5TB) or 512 K (> 5TB) block
sizes and should be configured WITHOUT software compression enabled at the Storage Policy
Level. This setting represents the block size that the data stream is cut up into. In Simpana v9,
enhancements have been made to eliminate the need for Storage Policies per data type. Any
block from 16 k to the configured block size will automatically be hashed and checked into the
deduplication database. This eliminates the complexity of multiple storage policies per data
type.
1.2.5. Managing Multiple DDBs and Hardware Requirements
The scalability of a DDB is highly dependent upon the deduplication block size. The larger the
block size, the more data can be stored in the Disk Library. Assuming a standard block size of
128 K, a DDB using a single store can comfortably grow to 64 TB without performance penalty.
Using this conservative number as a guide, one can predict the number of DDBs required for a
given amount of unique data. By default, the software will generate hashes for blocks that are
smaller than the specified size down to a minimum size of 16k. In Simpana v9, the block size
hashing can be further reduced by using the registry key SignatureMinFallbackDataSize. This
further reduces the minimum deduplication block size from 16 k to 4 k. With a 128 k block
storage policy, any block of 4 k or larger will be checked into the deduplication database.
This registry key is ideal for Client Side Deduplication or a network optimized DASH copy over a
slow network.
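The fallback behavior can be pictured as follows. This is an illustrative sketch only, not CommVault's actual signature code; MD5 is assumed here purely for demonstration, and the real engine's hashing algorithm is not documented in this paper:

```python
import hashlib

def signatures(stream, block_size=128 * 1024, min_fallback=4 * 1024):
    """Cut the stream into block_size chunks; generate a signature for
    any chunk of at least min_fallback bytes so it can be checked
    against the deduplication database."""
    sigs = []
    for i in range(0, len(stream), block_size):
        chunk = stream[i:i + block_size]
        if len(chunk) >= min_fallback:
            sigs.append(hashlib.md5(chunk).hexdigest())
    return sigs

# A 260 KB stream yields two full 128 KB blocks plus a 4 KB tail.
# With the fallback lowered to 4 KB, all three chunks are hashed;
# at the 16 KB default, the tail is sent without a signature.
data = b"x" * (260 * 1024)
print(len(signatures(data)))                            # 3
print(len(signatures(data, min_fallback=16 * 1024)))    # 2
```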
SignatureMinFallbackDataSize
  Location: MediaAgent
  Type:     DWORD
  Value:    4096

This registry key should be installed on the MediaAgent or client performing signature generation.
It can be pushed out via the CommCell® GUI; the MediaAgent subkey will be created on the client.
1.2.6. Disk Space Required for DDB
The amount of disk space required for the DDB depends on the amount of data protected,
deduplication ratios, retention, and change rate. This information should be entered in the
Storage_Policy_Plan table of the Deduplication Calculator. The calculator's top number is the
total amount of disk space required for the active DDBs; the lower number is the individual total
used by each Storage Policy copy. These numbers do not take into account the DDB recovery point
or the required working space, which is 3 times the store size.
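The 3x rule can be expressed directly. A minimal sketch, assuming sizes in GB:

```python
def ddb_lun_size_gb(active_ddb_gb):
    """LUN sizing per the 3x rule: the LUN must hold the active DDB,
    its recovery point, and an equal amount of working space."""
    return 3 * active_ddb_gb

# A 500 GB active DDB calls for a 1.5 TB LUN.
print(ddb_lun_size_gb(500))  # 1500
```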
1.2.7. Disk Library
A best practice is to create a single Disk Library for deduplication shared by no more than three
Building Blocks. This is illustrated in the following table.
Data per DDB   Data Total in Disk Library   Application Data @ 10:1   Throughput
60 TB          180 TB                       1.8 PB                    6 TB/hr
90 TB          270 TB                       2.7 PB                    6 TB/hr
Non-deduplicated data should back up to a separate Disk Library whenever possible.
Sequestering the data types into separate Disk Libraries allows for easier reporting on the
overall deduplication savings. Mixing deduplicated and non-deduplicated data in a single
library will skew the overall disk usage information and make space usage prediction difficult.
Each Building Block can support 100 TB of disk storage. The disk storage should be partitioned
into 2-4 TB LUNs and configured as mount points in the operating system. This equates to fifty
2 TB LUNs, thirty-three 3 TB LUNs, or twenty-five 4 TB LUNs. This LUN size is recommended for
ease of Disk Library maintenance. Additionally, a larger array of smaller LUNs reduces the impact
of a failure of any given LUN.
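The LUN counts above follow directly from the 100 TB ceiling:

```python
def lun_count(total_tb=100, lun_tb=2):
    """Number of equal-size LUNs needed to carve up a Building Block's disk."""
    return total_tb // lun_tb

# 100 TB partitioned three ways, matching the text:
print(lun_count(lun_tb=2))  # 50
print(lun_count(lun_tb=3))  # 33
print(lun_count(lun_tb=4))  # 25
```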
Additional disk capacity should be added in 2-4 TB LUNs matching the original LUN configuration
if possible. When GridStor is used, apply an equal amount of capacity across all MediaAgents.
For example, three MediaAgents would require a total of 6 TB, 2 TB per Building Block. It is not
recommended to use third-party real-time disk defragmentation software on a Disk Library or
DDB LUN. This can cause locks on files that are being accessed by backup, restore, DASH copy,
and data aging operations. Third-party software can be used to defragment a mount path after it
has been taken offline. Antivirus software should also be configured NOT to scan CommVault
Disk Libraries and DDB LUNs.
1.3. Disk Attachment Considerations
Mount paths can be of two types: NAS paths (Disk Library over shared storage) or direct attached
block storage (Disk Library over direct attached storage). In direct attached block storage (SAN),
the mount paths are locally attached to the MediaAgent. With NAS, the disk storage is on the
network and the MediaAgent connects via a network protocol. The NAS mount path is the preferred
method for a mount path configuration, as it provides several benefits over the direct attached
configuration. If a MediaAgent goes offline, the Disk Library is still accessible by the other
MediaAgents in the library. With direct attached storage, if a MediaAgent is lost then the Disk
Library is offline. Secondly, all network communication to the mount path occurs directly from
the MediaAgent to the NAS device.
During restores and DASH copies, there is no intermediate communication between MediaAgents.
In direct attached, all communication must pass through the hosting MediaAgent in order to service
the DASH copy or restore. Backup activities are not affected by the mount path choice.
In a direct attached design, configure the mount paths as mount points instead of drive letters.
This allows larger capacity solutions to configure more mount paths than there are drive letters.
Smaller capacity sites can use drive letters as long as they do not exceed the number of available
drive letters. From an administration perspective, it is better to standardize on either drive
letters or mount paths and not mix the two. There are no performance advantages to either
configuration.
Each MediaAgent should have no more than 50 writers across all of its mount paths. A MediaAgent
with ten 2 TB mount paths (20 TB of raw capacity) would have 5 writers per mount path. The
purpose behind this is to evenly distribute the load across all mount paths and to ensure the
number of concurrent connections to the DDB remains under the 50 connection limit. In a 3 Building
Block GridStor configuration the total number of writers should not exceed 150 writers, 50 writers
per MediaAgent.
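The writer distribution described above can be sketched as:

```python
def writers_per_mount_path(mount_paths, writer_cap=50):
    """Spread the 50-writer MediaAgent cap evenly across its mount paths
    so concurrent DDB connections stay under the 50-connection limit."""
    return writer_cap // mount_paths

# Ten 2 TB mount paths (20 TB raw) -> 5 writers each, as in the text.
print(writers_per_mount_path(10))  # 5
```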
Configure the Disk Library to use Spill and Fill mount paths, as this load balances the
writers evenly across all mount paths in the library. This setting is located in the Disk Library
Properties > Mount Paths Tab. For further information please refer to Establish the Parameters for
Mount Path Usage.
Regardless of the type of disk being used, SAN or NAS, the configuration is the same. The Disk
Library consists of disk devices that point to the location of the Disk Library folders. Each disk
device will have a read/write path and a read-only path. The read/write path is for the MediaAgent
controlling the mount path to perform backups. The read-only path allows an alternate MediaAgent
to read the data from the host MediaAgent, enabling restores or aux copy operations while the
local MediaAgent is busy. For step-by-step instructions on configuring a shared Disk Library with
alternate data paths, please reference Configuring a Shared Disk Library with Alternate Data Paths.
Global Deduplication Storage Policy
2. Global Deduplication Storage Policy
Global Deduplication Policy introduces the concept of a common deduplication store that can be shared
by multiple Storage Policy copies, Primary or DASH, to provide one large global deduplication store. Each
Storage Policy copy defines its own retention rules. However, all participating Storage Policy copies
share the same data paths, which consist of MediaAgents and Disk Library mount paths.
A Global Deduplication Storage Policy (GDSP) should be used instead of a standard deduplication
storage policy whenever possible. A GDSP allows multiple standard deduplication policies to be
associated to it, enabling global deduplication across all associated clients. The requirements for
a standard Deduplication Storage Policy to be associated to a GDSP are a common block size and a
common Disk Library.
2.1. Block Size
All associated standard Deduplication Policies must be configured with the same block size,
regardless of the copy being associated to the GDSP. For example, suppose the primary copy has a
standalone deduplication database and the DASH copy is associated to a GDSP. Both the primary
and DASH copies will require the same block size, because the block size is configured at the
Storage Policy level and all copies adhere to that value. Trying to associate a Storage Policy copy
to a GDSP with a different block size will generate an error.
2.2. Disk Libraries
All storage policies associated with a GDSP back up to the same Disk Library. If a different Disk
Library is required, then a different GDSP is needed. All disk-based library configurations are
supported for a GDSP. There is no limit to the number of standard Deduplication Policies that can
be associated to a GDSP; however, there are operational benefits to maintaining a simple design.
Create standard Deduplication Policies based on client-specific requirements and retention needs,
such as compression, signature generation, and encryption requirements. With standalone standard
Deduplication Policies, each such backup requirement would need a separate DDB.
2.3. Remote Offices
Remote offices with local restorability requirements typically have small data sets and low
retention. Although a single standard Deduplication Policy will, in most cases, service the remote
site's requirements for data availability, it is recommended to use a GDSP. Remote sites may need
flexibility to handle special data, such as legal information; a GDSP would allow this data to
deduplicate with the other data at the site.
2.4. Global Deduplication Storage Policy Caveats
There are three important considerations when using Global Deduplication Storage Policies:
Client computers cannot be associated to a GDSP; only to standard storage policies.
Once a storage policy copy has been associated to a GDSP there is no way to change that
association.
Multiple copies within a storage policy cannot use the same GDSP.
2.5. Streams
The stream configuration in a Storage Policy design is also important. When a Round-Robin design is
configured, ensure the total number of streams across the storage policies associated to the GDSP
does not exceed 50. This ensures that no more than 50 jobs protect data at any given time, which
would otherwise overload the DDB. For example, a GDSP may have four associated storage policies
with 50 streams each, for a total of 200 streams. If all policies were in concurrent use, the DDB
would have 200 connections and performance would degrade. By limiting the number of writers to a
total of 50, all 200 jobs may start, but only 50 will run at any one time. As jobs complete and
resources become available, the waiting jobs will resume.
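The throttling described above amounts to running jobs in waves of at most 50:

```python
import math

def backup_waves(queued_jobs, stream_cap=50):
    """All jobs may start, but only stream_cap streams run at once;
    the rest wait and resume as streams free up."""
    return math.ceil(queued_jobs / stream_cap)

# 200 queued jobs against the 50-stream GDSP cap run in 4 waves.
print(backup_waves(200))  # 4
```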
2.6. Data Path Configuration
When using SAN storage for the mount path, use Alternate Data Paths -> When Resources are
Offline -> Immediately. In a GridStor® environment this ensures the backups are configured to
go through the designated Building Block. If a data path fails or is marked offline for maintenance,
the job will fail over to the next data path configured in the Data Path tab. Although Round-Robin
between data paths will work for SAN storage, it is not recommended because of the performance
penalty during DASH copies and restores, caused by the multiple hops that must occur in order to
restore or copy the data. When using Alternate Data Paths with When Resources are Offline, the
number of streams per client storage policy should not exceed 50.
When using NAS storage for the mount path, Round-Robin between data paths is recommended.
This is configured in the Storage Policy copy properties -> Data Path Configuration tab of the
storage policy associated to the GDSP, not in the GDSP properties. NAS mount paths do not incur
the same performance penalty because the network communication is directly between the servicing
MediaAgent and the NAS mount path.
2.7. Use Store Priming Option with Source-Side Deduplication
The store priming feature queries a previously sealed DDB for hash lookup before requesting a client
to send the data. The purpose of this feature is to leverage existing protected data in the Disk
Library before sending new data over the network. The feature is designed for slow network based
backup only. This would include Client-Side Deduplication and DASH copies. The feature is not
recommended for LAN based backup or network links faster than 1 Gbps. Lab testing has shown that
using this feature on the LAN can actually hinder backup performance. This is because it is faster to
request the data from the client than perform the queries on the previously sealed DDB. This
feature does not eliminate the need to re-baseline after a deduplication database is sealed. It only
eliminates the need for the client to resend the data over the network to the MediaAgent. This feature
requires Source-Side Deduplication to be enabled.
Deduplication Database Availability
3. Deduplication Database Availability
The DDB recovery point is a copy of the active DDB, used to rebuild the DDB in the event of a
failure. When the recovery point process is initiated, all communication to the active DDB is
paused. The information in memory is committed to disk to ensure the DDB is in a quiesced state.
The DDB is then copied from the active location to the backup location. After a DDB has been
backed up successfully, the previous recovery point is deleted and all communication to the DDB
resumes. Throughout this time, the Job Controller will show the jobs in a running state. By default,
the DDB recovery point is placed in a folder called "BACKUP" in the DDB location. Since this is a
copy of the active DDB, the LUN hosting the DDB will need THREE times the amount of disk space as
the active DDB: space for the active DDB, the DDB recovery point, and an equal amount of working
space. The DDB recovery point can be moved to an alternate location if more space is required. If
this approach is used, the DDB LUN requires enough disk space for only the active DDB plus growth,
and the DDB recovery point location will require two times the size of the active DDB: the recovery
point itself plus working space for the recovery point process. The best practice is to use the
Disk Library as the recovery point destination.
The default interval for recovery point creation is 8 hours, controlled by the Create Recovery
Points Every registry key. Once the time interval has been reached, the next backup will create
the recovery point. It is not recommended to lower the Create Recovery Points Every setting below
4 hours, as doing so can have a negative impact on backup performance for two reasons. First, the
recovery point flushes the in-memory DDB to disk; when the jobs resume, the DDB has to be loaded
back into memory, which can be time consuming. Secondly, all backup activity pauses while the
active DDB is copied to the recovery point.
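The interval is stored in seconds, which is how it later appears in the engine log:

```python
def interval_seconds(hours):
    """Recovery point interval as the DDB engine records it."""
    return hours * 3600

# The 8-hour default corresponds to the [28800] value logged by SIDBEngine.
print(interval_seconds(8))  # 28800
```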
3.1. Considerations
Changing the DDB recovery point interval requires the DDB engine to be restarted. This can be done
by restarting the Media Management services from the CommVault Services Control Panel. To view
the running time interval, locate the following entry in the SIDBEngine.log file; the value in
brackets represents the interval in seconds. The valid range for the DDB recovery point interval
is 0-99 hours.

### Backup interval set to [28800]

When moving the DDB recovery point to a network share, take the network speed into
consideration when choosing the destination. Best practice is to use the fastest network connection
available. During the DDB recovery point operation, if the copy process of the DDB to the backup
folder takes longer than 20 minutes the running jobs will move into a pending state. This is because
clients, by default, wait a maximum of 20 minutes when there is no response from the DDB. While
the default value can be changed the best practice is to ensure the DDB recovery point process
completes within 20 minutes. In order to extend the wait time, three possible registry keys may
need to be applied. The examples that follow are all set for one hour. If the timeout value is set to
accommodate the backup time for the DDB, then the backup will wait until the SIDB starts allowing
threads to continue and will not go pending or show any errors.
MediaAgent when Source-Side Deduplication is not in use:
  Location: MediaAgent
  Key:      SIDBReplyTimeoutInS
  Type:     DWORD
  Value:    3600

Client for Source-Side Deduplication:
  Location: iDataAgent
  Key:      SignatureWaitTimeSeconds
  Type:     DWORD
  Value:    3600
MediaAgent for DASH copy (which uses the same code path as Source-Side Deduplication):
  Location: MediaAgent
  Key:      SignatureWaitTimeSeconds
  Type:     DWORD
  Value:    3600
When using a Disk Library, as a recovery point destination, ensure that the mount path reserve
space is set appropriately to accommodate the DDB recovery point. Without this the mount path
could run out of disk space and fail all DDB recovery point operations until free space is available.
To move the DDB recovery point to a network path the following registry value must be created.
This change requires a support case to be opened as the SIDBBackupPathPassword string must be
encrypted via a proprietary encryption tool that is not publically available.
SIDBBackupPath
  Location: MediaAgent
  Type:     String
  Value:    Local or network path

SIDBBackupPathUser
  Location: MediaAgent
  Type:     String
  Value:    Domain/User (only required for a network share)

SIDBBackupPathPassword
  Location: MediaAgent
  Type:     String
  Value:    Encrypted by a CommVault tool (only required for a network share)
Building Block Design
4. Building Block Design

Designing the operational architecture involves several important considerations, including
backup windows, data sets, throughput, and retention.
4.1. Choosing the Right Building Block
Backup Window   The total amount of time allotted to protect the data set
Data Set        The amount of data to protect in the backup window
Throughput      The required throughput to protect the data set within the backup window
Retention       How long the data is kept before aging off the system
To determine the correct Building Block configuration, populate the Deduplication Calculator
with the appropriate data. The summary page of the Deduplication Calculator provides the backup
window, the total amount of data to protect in a full cycle, and the number of DDBs required to
protect the data for the required retention.
To determine the required throughput, divide the production site size by the backup window. The
result is the throughput the Building Blocks must sustain to protect the data within the backup
window. Divide the required throughput by 2 (the per-Building-Block throughput, in TB/hr, used
throughout the examples below) and round up to generate the number of Building Blocks required to
protect the data set.
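The sizing arithmetic above can be sketched in a few lines of Python. The 2 TB/hr per-node figure is an assumption inferred from the worked examples in this paper, not a vendor specification:

```python
import math

# Assumed per-node rate, inferred from this paper's worked examples.
BB_THROUGHPUT_TB_PER_HR = 2.0

def size_building_blocks(data_set_tb, backup_window_hr):
    """Return (required throughput in TB/hr, number of Building Blocks)."""
    required_throughput = data_set_tb / backup_window_hr
    nodes = math.ceil(required_throughput / BB_THROUGHPUT_TB_PER_HR)
    return required_throughput, nodes

# Example 1 below: 16 TB in an 8-hour window -> 2 TB/hr -> 1 Building Block
print(size_building_blocks(16, 8))   # (2.0, 1)
# Example 3 below: 120 TB in 8 hours -> 15 TB/hr -> 8 Building Blocks
print(size_building_blocks(120, 8))  # (15.0, 8)
```

The same function reproduces Example 2 as well: 48 TB over 8 hours yields 6 TB/hr and three Building Blocks.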
When using Building Blocks, configuring different block sizes for different storage policies requires
an additional deduplication database. Each deduplication database has the specific hardware
requirements outlined in this document.
4.2. Building Block Configuration Examples
This section covers several configuration examples: a one-Building-Block configuration, a
three-Building-Block configuration, and a staggered full backup configuration.
Example 1: One Building Block

Full backups performed one day a week. Information obtained from the Deduplication Calculator:

Data Set: 16 TB
Backup Window: 8 Hours
Retention: 4 Weeks
Daily Change Rate: 2% (320 GB)
Only one Building Block is required to protect this amount of data within the backup window. The
daily change rate of 320 GB can also be handled by a single Building Block.
[Diagram: one Building Block — clients backed up through a single MediaAgent with 50 writers, its DDB, and disk library]
Example 2: Three Building Blocks

Full backups performed one day a week. Information obtained from the Deduplication Calculator:

Data Set: 48 TB
Backup Window: 8 Hours
Retention: 4 Weeks
Daily Change Rate: 2% (950 GB)
This site requires three Building Blocks to protect the data within the backup window. The
incremental change rate of 950 GB can be handled by the Building Blocks. This allows a total of 150
concurrent streams and an overall deduplication capacity, across the three nodes, of 180-270 TB of
unique data across all the DDBs. The Deduplication Calculator estimates the deduplication store at
42 TB and the total required Disk Library space at 52 TB.
Using 2 TB LUNs would yield 26 mount paths (52 TB/2 TB LUN = 26). Rounding the number of mount
paths up to 27 results in each node hosting 9 mount paths (27 mount paths/3 BBs = 9). Increasing
the number of mount paths to 27 would also increase the disk space to 54 TB.
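The mount-path rounding described above can be checked with a short calculation. This is a sketch, assuming 2 TB LUNs and an even distribution of mount paths across the nodes, as in the text:

```python
import math

# Mount path layout for Example 2.
library_tb, lun_tb, nodes = 52, 2, 3
paths = math.ceil(library_tb / lun_tb)    # 52 TB / 2 TB LUN = 26 mount paths
# Round up to the next multiple of the node count so each
# MediaAgent hosts the same number of mount paths:
paths = math.ceil(paths / nodes) * nodes  # 27
per_node = paths // nodes                 # 9 per MediaAgent
total_tb = paths * lun_tb                 # disk space grows to 54 TB
print(paths, per_node, total_tb)          # 27 9 54
```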
[Diagram: three Building Blocks — clients distributed across three MediaAgents, each with 50 writers, its own DDB, and disk library]
4.3. Staggering Full Backups
Staggering full backups can have a major impact on the overall architecture and design. The next
example will show the architectural impact of staggered full backups in a large environment.
Example 3, Part 1: Traditional Backups

Full backups performed one day a week. Information obtained from the Deduplication Calculator:

Data Set: 120 TB
Backup Window: 8 Hours
Retention: 4 Weeks
Daily Change Rate: 2% (2.4 TB)
Number of DDBs: 2
This site would require eight Building Blocks in order to protect the data within the backup window.
The incremental change rate is 2.4 TB and can be handled by the Building Blocks. To protect the data
within the backup window, 8 DDBs will be required. This will allow a total of 400 concurrent
operational streams and an overall deduplication capacity between the eight nodes of 660-720 TB of
unique data across all the DDBs. The Deduplication Calculator estimates the deduplication store to
be at 106 TB. Per the Deduplication Calculator, the total required Disk Library space is 130 TB. Using
2 TB LUNs would yield 65 mount paths (130 TB/2 TB LUN = 65). For evenly distributed mount paths
the number would have to decrease to 64 or increase to 72. Decreasing the mount paths to 64
would reduce the overall capacity to 128 TB. Increasing the mount paths to 72 would increase the
capacity to 144 TB. In this case, keep the mount paths at 65: configure eight mount paths on seven
of the MediaAgents and nine mount paths on the eighth (7 × 8 + 9 = 65).
Example 3, Part 2: Staggered Full Backups

Full backups performed six days a week. Information obtained from the Deduplication Calculator:

Data Set: 120 TB
Backup Window: 8 Hours
Full Backups: Monday - Saturday
Retention: 4 Weeks
Daily Change Rate: 2% (2.4 TB)
Number of DDBs: 2
In this scenario, the site has a total data set of 120 TB. The full backups will occur Friday through
Wednesday, leaving Thursday available for data aging operations. The daily data to protect is:

120 TB ÷ 6 full backup days = 20 TB per day

Next, determine the number of Building Blocks required for this data rate:

20 TB ÷ 8 hours = 2.5 TB/hr; 2.5 TB/hr ÷ 2 TB/hr per Building Block = 1.25, rounded up to 2
Staggering the full backups allows this site to protect the same data footprint as Part 1 with only
two Building Blocks. The Deduplication Calculator calls for only 2 DDBs for the amount of data being
protected and the retention. This allows a total of 100 concurrent streams and an overall
deduplication capacity, between the two nodes, of 120-180 TB of unique data across the DDBs. The
Deduplication Calculator estimates the deduplication store at 106 TB, and the total required Disk
Library space remains 130 TB, the same deduplication footprint as in Part 1. Using 2 TB LUNs would
yield 65 mount paths (130 TB/2 TB LUN = 65). For evenly distributed mount paths, the number would
increase to 66, which also increases the total capacity to 132 TB. Each Building Block would host 33
mount paths. Staggering the backups across the week significantly reduces the infrastructure
required to protect the data set.
[Diagram: two Building Blocks — clients split across two MediaAgents, each with 50 writers, its own DDB, and disk library]
5. Conclusion
The Building Block data management solution is flexible, scalable, and hardware agnostic. Building
Blocks are flexible because they can grow by adding mount paths and can accommodate different
retentions and different data types within the same deduplication framework. They are scalable
because they can grow to hundreds of TB of unique data, and by staggering full backups they can
protect large amounts of data with minimal infrastructure, holding down cost and liability. Finally,
Building Blocks are hardware agnostic because they specify hardware classes instead of specific
models; as detailed in the preceding sections, six examples of adequate servers across three major
manufacturers meet these requirements.