DATA DOMAIN-OPTIMIZED BACKUP USING NETWORKER 8.1
Gururaj Kulkarni, Software QA Manager, EMC [email protected]
Soumya Gupta, Senior Software QA Engineer, EMC [email protected]
2014 EMC Proven Professional Knowledge Sharing 2
Table of Contents
Introduction
Test Approach
Test Results
Best Practices
Disclaimer: The views, processes, or methodologies published in this article are those of the
authors. They do not necessarily reflect EMC Corporation’s views, processes, or
methodologies.
Introduction
EMC NetWorker®, an enterprise-class backup and recovery solution, is three-tiered software:
the NetWorker Server (which coordinates the entire backup/recover process and tracks the
metadata), the NetWorker Client (which hosts the data to be backed up), and the NetWorker
Storage Node (which connects to diverse storage devices and writes/reads data).
EMC Data Domain® Deduplication Storage System is a storage appliance that is revolutionizing
disk backup, archiving, and disaster recovery with high-speed, inline deduplication.
NetWorker 8.1 introduced two important new features for Data Domain integration.
This article summarizes the performance benefits gained from both features.
Data Domain Boost Over Fibre Channel
Customers using NetWorker with Virtual Tape Libraries (including EDL, Data Domain as VTL,
and others), autochangers, or tape devices as their backup solution could not previously
transition to NetWorker backup to disk with Data Domain, because they have a dedicated Fibre
Channel environment and Data Domain devices supported data transfer only over TCP/IP.
This article describes the new feature introduced in NetWorker 8.1 in which NetWorker clients
and storage nodes support Fibre Channel connectivity to Data Domain devices (for both backup
and recovery operations) by leveraging the Fibre Channel capability available in the DD Boost
2.6 library.
This support not only makes better use of customers’ existing investment in their Fibre
Channel infrastructure but also offers both client-side deduplication and support for the
Fibre Channel protocol using a backup-to-disk workflow.
Backup via Boost over Fibre Channel with Client Direct is 20-25% faster than backup via
Data Domain VTL.
DFA-Recover throughput via Fibre Channel is 2.5x faster than recover throughput
via Data Domain VTL.
Virtual Synthetics
In the existing Synthetic Full (SF) backup feature, data is sent from the DDR to a NetWorker
process, which sends it back to the same DDR. This increases both the time to synthesize the
saveset and network bandwidth usage.
This article describes the new feature introduced in NetWorker 8.1 where Virtual Synthetic Full
(VSF) backups are an out-of-the-box integration with NetWorker, making it ‘self-aware.’
Therefore, if your customer is using a Data Domain System as their backup target, NetWorker
will use VSF backups as the backup workflow by default when a synthetic full backup is
scheduled, thus optimizing incremental backups for file systems.
VSF backups reduce the processing overhead associated with traditional synthetic full backups
by using metadata on the Data Domain system to synthesize a full backup without moving data
across the network.
In the tests reported here, VSF backup is 21x faster than a traditional full backup and 29x
faster than a Synthetic Full (SF) backup.
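The idea of synthesizing a full backup from metadata alone can be illustrated with a toy model. The sketch below is not NetWorker or Data Domain code: the `Extent` type, the `synthesize_full` function, and the file names are hypothetical, and each backup is assumed to be a simple map from file path to a reference to data already stored on the target. The point it shows is that the new full is assembled by copying references, with no file data read or rewritten.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Hypothetical pointer to data already stored on the dedup target."""
    saveset: str  # backup that originally wrote the data
    path: str     # file the data belongs to


def synthesize_full(full, incrementals):
    """Merge a full backup's file map with later incrementals.

    Each backup is a dict mapping file path -> Extent, or None for a
    file deleted in that incremental. Only references are copied.
    """
    synthetic = dict(full)
    for incr in incrementals:  # oldest to newest
        for path, extent in incr.items():
            if extent is None:
                synthetic.pop(path, None)  # file was deleted
            else:
                synthetic[path] = extent   # newer version wins
    return synthetic


full = {p: Extent("full", p) for p in ("a.db", "b.log", "c.txt")}
incr1 = {"b.log": Extent("incr1", "b.log")}
incr2 = {"c.txt": None, "d.cfg": Extent("incr2", "d.cfg")}

result = synthesize_full(full, [incr1, incr2])
for path in sorted(result):
    print(path, "->", result[path].saveset)
```

In this model the cost of building the new full depends on the number of file entries rather than on the amount of data, which is one plausible reading of why synthesis time is so small compared with moving the data across the network.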
Test Approach
NetWorker Server, Storage Node, and Clients were installed with NetWorker
dev.Build.6064 for Data Domain over Fibre Channel Feature Testing and
NetWorker dev.Build.6297 for Virtual Synthetics Feature Testing.
All tests were carried out on a Windows NetWorker Server platform.
A network speed of 1 Gb/s and a Fibre Channel speed of 4 Gb/s were maintained in the
setup.
Data Domain was initialized to “zero” state prior to attempting each scenario.
Tests were carried out via Dedicated Storage Node (DSN) with Boost over Fibre
Channel device.
Synthetic Full and Virtual Synthetic Full backups were run for all the scenarios.
Test Results
Test results for both features, Data Domain Boost over Fibre Channel and Virtual Synthetic Full
Backup, are shown below.
Data Domain Boost over Fibre Channel
Client Direct backup via Dedicated Storage Node with Boost over Fibre Channel was
carried out.
Backup via Boost over Fibre Channel with Client Direct is 20-25% faster compared to
backup via Data Domain VTL.
The subsequent full backup is 3x faster than the first full backup.
Resource Utilization
Memory and CPU usage by a single save process during Client Direct backup over Fibre
Channel are shown below.
[Figure: Memory Consumption by Single Save Session during Client Direct Backup (DFA-FC). Memory usage in MB by OS platform: Linux 49, Windows 54.]
[Figure: CPU Usage by Single Save Session during Client Direct Backup (DFA-FC). CPU usage by OS platform: Linux 15, Windows 15.]
Recovery of 1TB of data was carried out.
Recovery via Boost over Fibre Channel is 2.5x faster than recovery via
Data Domain VTL.
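The 2.5x figure can be checked against the measured recover times for 1TB reported in this article (1.4 hours over Fibre Channel, 3.6 hours via Data Domain VTL). The short sketch below does that arithmetic and also derives an approximate average throughput; the conversion of 1 TB to 1024 * 1024 MB is an assumption, as the article does not state which unit convention was used.

```python
# Recover times for 1TB, in hours, as reported in the article's recover-time figure.
RECOVER_HOURS = {"FC": 1.4, "DD-VTL": 3.6}

# Assumption: binary units, 1 TB = 1024 * 1024 MB.
DATA_MB = 1024 * 1024


def speedup(baseline_hours, new_hours):
    """How many times faster the new path is than the baseline."""
    return baseline_hours / new_hours


def throughput_mb_s(data_mb, hours):
    """Average throughput in MB/s for moving data_mb in the given time."""
    return data_mb / (hours * 3600)


fc_speedup = speedup(RECOVER_HOURS["DD-VTL"], RECOVER_HOURS["FC"])
print(f"FC recover speedup over DD-VTL: {fc_speedup:.2f}x")

throughputs = {
    name: throughput_mb_s(DATA_MB, hours)
    for name, hours in RECOVER_HOURS.items()
}
for name, value in throughputs.items():
    print(f"{name}: about {value:.0f} MB/s")
```

The ratio comes out slightly above 2.5, consistent with the rounded figure quoted in the text.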
Virtual Synthetic Full Backup
Results of Virtual Synthetic Full Backup
The results figure depicts the backup of a Medium Density FS (2360 files), with 1%, 2%, and
5% more data added in the 3 incremental backups.
File size ranged from 200MB to 4GB.
Database file size ranged from 10GB to 35GB.
The Next Full was run with Virtual Synthetic Full, Traditional Full, and Synthetic Full.
Results are given in the last column of the table below.
[Figure: Recover Time for 1TB. Time in hours: FC 1.4, DD-VTL 3.6.]
[Figure: Time taken in minutes per backup: Full 475, incr1 13, incr2 33, incr5 77, Next FULL 7.]
Time to synthesize a full backup is almost negligible using VSF.
• Virtual Synthetic Full backup is 21x faster than Traditional Full backup.
• Virtual Synthetic Full backup is 29x faster than Synthetic Full backup.
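These factors follow from the backup times measured for the 4 TB Medium Density FS (7, 150, and 205 minutes). A small sketch of the calculation, which also reports the wall-clock time saved per full backup; the dictionary and variable names are illustrative only:

```python
# Backup times in minutes for the 4 TB Medium Density FS, as reported in the article.
BACKUP_MINUTES = {
    "Virtual Synthetic Full": 7,
    "Traditional Full": 150,
    "Synthetic Full": 205,
}

vsf_minutes = BACKUP_MINUTES["Virtual Synthetic Full"]

comparison = {}
for level, minutes in BACKUP_MINUTES.items():
    if level == "Virtual Synthetic Full":
        continue
    comparison[level] = {
        "speedup": minutes / vsf_minutes,       # times faster with VSF
        "minutes_saved": minutes - vsf_minutes,  # wall-clock time saved
    }

for level, stats in comparison.items():
    print(
        f"VSF vs {level}: {stats['speedup']:.1f}x faster, "
        f"{stats['minutes_saved']} min saved"
    )
```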
[Figure: Backup Time for Medium Density FS of 4 TB. Backup time in minutes by backup level: Virtual Synthetic Full 7, Traditional Backup 150, Synthetic Full 205.]
This figure is for illustration only; it is not intended as a feature comparison with L0 (traditional) backup and Synthetic Full backup.
As file size increases, Virtual Synthetic Full backup throughput
increases.
As the number of files increases, Virtual Synthetic Full backup
throughput decreases.
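One way to read these two observations together is to convert the reported High Density FS throughputs (144 MB/s with 10KB files, 1389 MB/s with 100KB files) into files processed per second. The sketch below does this under the assumption that 1 MB = 1024 KB. It is an interpretation of the reported numbers, not an analysis made by the authors.

```python
KB_PER_MB = 1024  # assumption: binary units

# (label, file size in KB, reported VSF throughput in MB/s)
HDF_RESULTS = [
    ("HDF 10KB", 10, 144),
    ("HDF 100KB", 100, 1389),
]


def files_per_second(throughput_mb_s, file_size_kb):
    """Convert a data throughput into an approximate file rate."""
    return throughput_mb_s * KB_PER_MB / file_size_kb


rates = {
    label: files_per_second(throughput, size_kb)
    for label, size_kb, throughput in HDF_RESULTS
}
for label, rate in rates.items():
    print(f"{label}: about {rate:,.0f} files/s")
```

Under these assumptions the two file rates are within a few percent of each other, which would suggest that per-file processing, rather than data volume, bounds VSF throughput for small files.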
Resource Utilization
Memory Usage
Memory usage by the nsrrecopy process on Linux and Windows Storage Nodes and by the
nsrconsolidate process on the Windows Server is shown below.
Memory usage by nsrrecopy during Virtual Synthetic Full backup is roughly 3x
lower than during Synthetic Full backup.
Memory usage by the nsrrecopy process is similar during the backup of a
High Density File System and a Medium Density File System.
[Figure: VSF Backup Throughput for MDF vs HDF. Throughput in MB/s: HDF (file size = 10KB) 144, HDF (file size = 100KB) 1389, MDF (file size = 200MB - 4GB) 8959.]
[Figure: Memory Usage per nsrrecopy on Windows Storage Node (32GB RAM). Memory in MB for VSF and SF, with series MDF and HDF. Data labels as extracted: 35, 80, 27, 85.]
[Figure: Memory Usage per nsrrecopy on Linux Storage Node (8GB RAM). Memory in MB for VSF and SF, with series MDF and HDF. Data labels as extracted: 20, 72, 16, 70.]
Memory usage by the nsrconsolidate process on the server is the same during
both Virtual Synthetic Full and Synthetic Full backup.
Memory usage by the nsrconsolidate process on the server is the same during the
backup of a Medium Density File System and a High Density File System.
CPU Usage
CPU usage by the nsrconsolidate process on the Windows NetWorker Server and by the
nsrrecopy process on both Linux and Windows Storage Nodes is shown below.
CPU usage by the nsrconsolidate process is higher during the High Density FS Virtual
Synthetic Full backup because index processing is done by the nsrconsolidate process.
[Figure: Memory Usage per nsrconsolidate process on Windows Server (32GB RAM). Memory in MB for VSF and SF, with series MDF and HDF. Data labels as extracted: 42, 41, 43, 43.]
[Figure: User CPU Usage by nsrconsolidate on Windows Server (2 CPU @ 2.13 GHz). % CPU usage for VSF and SF, with series MDF and HDF. Data labels as extracted: 1, 1, 8, 1.]
Best Practices
Recover throughput via Boost over Fibre Channel is 2.5x faster than
recover throughput via Data Domain VTL.
Virtual Synthetic Full backup throughput varies with the size of the files
backed up. As file size increases, Virtual Synthetic Full backup throughput also
increases.
Virtual Synthetic Full backup time varies with the type of file system (high
density / medium density). Virtual Synthetic Full backup of a Medium Density
File System performs better than Virtual Synthetic Full backup of a High
Density File System.
During Virtual Synthetic Full backup, data processing is offloaded to Data
Domain. Hence, resource utilization on the NetWorker Storage Node is
significantly lower.
[Figure: User CPU Usage by nsrrecopy on Windows Storage Node (2 CPU @ 2.13 GHz). % CPU usage for VSF and SF, with series MDF and HDF. Data labels as extracted: 20, 27, 20, 31.]
[Figure: CPU Usage by nsrrecopy on Linux Storage Node (8 [email protected] GHz). % CPU usage for VSF and SF, with series MDF and HDF. Data labels as extracted: 12, 10, 12, 10.]
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.