7/25/2019 ONTAP Select Product Architecture
1/44
Technical Report
ONTAP Select Product Architecture and Best Practices
Tudor Pascu, NetApp
July 2016 | TR-4517
TABLE OF CONTENTS
1 Introduction .......................................................................................................................................... 4
1.1 Software-Defined Infrastructure ................................................................................................................. 4
1.2 Running ONTAP as Software ..................................................................................................................... 5
1.3 ONTAP Select Versus Data ONTAP Edge ................................................................................................ 5
1.4 ONTAP Select Platform and Feature Support ........................................................................................... 6
2 Architecture Overview ......................................................................................................................... 7
2.1 Virtual Machine Properties ......................................................................................................................... 7
2.2 RAID Services ............................................................................................................................................ 8
2.3 Virtualized NVRAM .................................................................................................................................. 10
2.4 High Availability ........................................................................................................................................ 11
2.5 Network Configurations ............................................................................................................................ 16
2.6 Networking: Internal and External ............................................................................................................ 20
3 Deployment and Management .......................................................................................................... 22
3.1 ONTAP Select Deploy .............................................................................................................................. 22
3.2 Licensing ................................................................................................................................................... 24
3.3 ONTAP Management ............................................................................................................................... 24
4 Storage Design Considerations ........................................................................................................ 24
4.1 Storage Provisioning ................................................................................................................................ 24
4.2 ONTAP Select Virtual Disks ..................................................................................................................... 27
4.3 ONTAP Select Deploy .............................................................................................................................. 28
5 Network Design Considerations ....................................................................................................... 29
5.1 Supported Network Configurations .......................................................................................................... 29
5.2 vSphere: vSwitch Configuration ............................................................................................................... 30
5.3 Physical Switch Configuration .................................................................................................................. 32
5.4 Data and Management Separation .......................................................................................................... 33
5.5 Four-NIC Configuration ............................................................................................................................ 35
5.6 Two-NIC Configuration ............................................................................................................................. 36
6 Use Cases ........................................................................................................................................... 37
6.1 Mirrored Aggregate Creation .................................................................................................................... 37
6.2 Remote/Branch Office .............................................................................................................................. 39
6.3 Private Cloud (Data Center) ..................................................................................................................... 39
7 Upgrading ........................................................................................................................................... 40
7.1 Increasing Capacity .................................................................................................................................. 40
7.2 Single-Node to Multinode Upgrade ................................................................................................................ 41
8 Performance ....................................................................................................................................... 41
Version History ......................................................................................................................................... 43
LIST OF TABLES
Table 1) ONTAP Select versus Data ONTAP Edge. ..................................................................................................... 5
Table 2) ONTAP Select virtual machine properties. ..................................................................................................... 7
Table 3) Internal versus external network quick reference. ........................................................................................ 21
Table 4) Network configuration support matrix. .......................................................................................................... 29
Table 5) Performance results. ..................................................................................................................................... 42
LIST OF FIGURES
Figure 1) Virtual disk to physical disk mapping. ............................................................................................................ 9
Figure 2) Incoming writes to ONTAP Select VM. ........................................................................................................ 11
Figure 3) Four-node ONTAP Select cluster. ................................................................................................................ 12
Figure 4) ONTAP Select mirrored aggregate. ............................................................................................................. 13
Figure 5) ONTAP Select write path workflow. ............................................................................................................. 14
Figure 7) ONTAP Select multinode network configuration. ......................................................................................... 17
Figure 8) Network configuration of a multinode ONTAP Select VM. .......................................................................... 18
Figure 9) Network configuration of single-node ONTAP Select VM. .......................................................................... 20
Figure 10) Server LUN configuration with only RAID-managed spindles. .................................................................. 25
Figure 11) Server LUN configuration on mixed RAID/non-RAID system. ................................................................... 26
Figure 12) Virtual disk provisioning. ............................................................................................................................ 28
Figure 14) Standard vSwitch configuration. ................................................................................................................ 30
Figure 15) LACP distributed vSwitch configuration. .................................................................................................... 31
Figure 16) Network configuration using shared physical switch. ................................................................................ 32
Figure 17) Network configuration using multiple physical switches. ........................................................................... 33
Figure 18) Data and management separation using VST. .......................................................................................... 34
Figure 19) Data and management separation using VGT. ......................................................................................... 34
Figure 20) Four-NIC network configuration. ................................................................................................................ 35
Figure 21) Two-NIC network configuration. ................................................................................................................. 37
Figure 22) Scheduled backup of remote office to corporate data center. ................................................................... 39
Figure 23) Private cloud built on direct-attached storage. ........................................................................................... 40
1 Introduction
NetApp ONTAP Select is helping pioneer the newly emerging software-defined storage (SDS) area by
bringing enterprise-class storage management features to the software-defined data center. ONTAP
Select is a critical component of the Data Fabric envisioned by NetApp, allowing customers to run
ONTAP management services on commodity hardware.
This document describes the best practices that should be followed when building an ONTAP Select
cluster, from hardware selection to deployment and configuration. Additionally, it aims to answer the
following questions:
- How is ONTAP Select different from our engineered FAS storage platforms?
- Why were certain design choices made when creating the ONTAP Select architecture?
- What are the performance implications of the various configuration options?
1.1 Software-Defined Infrastructure
Fundamental changes in IT require that companies reevaluate the technology they use to provide
business services. We've seen this with the migration away from the monolithic infrastructures used in
mainframe computing and, more recently, in the adoption of virtualization technologies. Today, the next
wave of IT transformation is upon us. The implementation and delivery of IT services through software
provide administrators with the ability to rapidly provision resources with a level of speed and agility that
was previously impossible. Service-oriented delivery models, the need to dynamically respond to
changing application requirements, and a dramatic shift in infrastructure consumption by enterprises have
given rise to software-defined infrastructures (SDIs).
Modern data centers are moving toward software-defined infrastructures as a mechanism to provide IT
services with greater agility and efficiency. Separating IT value from the underlying physical
infrastructure allows organizations to react quickly to changing IT needs by dynamically shifting
infrastructure resources to where they are needed most.
Software-defined infrastructures are built on three tenets:
- Flexibility
- Scalability
- Programmability
These environments provide a set of IT services across a heterogeneous physical infrastructure,
achieved through the abstraction layer provided by hardware virtualization. They are able to dynamically
increase or decrease the amount of IT resources available, without needing to add or remove hardware.
Most important, they are automatable, allowing for programmatic deployment and configuration at the
touch of a button. No equipment racking and stacking or recabling is necessary.
SDI covers many aspects of IT infrastructure, from networking to compute to storage. In this document
we focus on one specific area, SDS, and discuss how it relates to NetApp.
Software-Defined Storage
The shift toward software-defined infrastructures may be having its greatest impact in an area that has
traditionally been one of the least affected by the virtualization movement: storage. Software-only
solutions that separate out storage management services from the physical hardware are becoming more
commonplace. This is especially evident within private cloud environments: enterprise-class
service-oriented architectures designed from the ground up with software defined in mind. Many of these
environments are being built on commodity hardware: white box servers with locally attached storage,
with software controlling the placement and management of user data.
This is also seen within the emergence of hyperconverged infrastructures (HCIs), a building-block style of
IT design based on the premise of bundling compute, storage, and networking services. The rapid
adoption of hyperconverged solutions over the past several years has highlighted the desire for simplicity
and flexibility. However, as companies make the decision to replace enterprise-class storage arrays with
a more customized, "make your own" model by building storage management solutions on top of
home-grown components, a set of new problems emerges.
In a commodity world, where data lives fragmented across silos of direct-attached storage, data mobility
and data management become complex problems that need to be solved. This is where NetApp can help.
1.2 Running ONTAP as Software
There is a compelling value proposition in allowing customers to determine the physical characteristics of
their underlying hardware while still giving them the ability to consume ONTAP and all of its storage
management services. Decoupling ONTAP from the underlying hardware allows us to provide
enterprise-class file and replication services within a software-defined environment.
This is enabled by leveraging the abstraction layer provided by server virtualization, which allows us to
tease apart ONTAP from any dependencies on the underlying physical hardware and put ONTAP into
places that our FAS arrays can't reach.
Why do we require a hypervisor? Why not run ONTAP on bare metal? There are two answers to that
question:
- Qualification
- Hyperconvergence
Running ONTAP as software on top of another software application allows us to leverage much of the
qualification work done by the hypervisor, which is critical in helping us rapidly expand our list of supported
platforms. Additionally, positioning ONTAP as a hyperconverged solution allows customers to plug into
existing orchestration frameworks, enabling rapid provisioning and end-to-end automation, from
deployment and configuration to the provisioning of storage resources through supported OnCommand
tooling such as WFA or the NMSDK.
This is the goal of the ONTAP Select product.
1.3 ONTAP Select Versus Data ONTAP Edge
If you're familiar with the past NetApp software-defined offering, Data ONTAP Edge, you may be
wondering how ONTAP Select is different. Although much of this is covered in additional detail in the
architecture overview section of this document, Table 1 highlights some of the major differences
between the two products.
Table 1) ONTAP Select versus Data ONTAP Edge.
Description | Data ONTAP Edge | ONTAP Select
Node count | Single node | Two offerings: single node and four-node with HA
Virtual machine CPU/memory | 2 vCPUs/8GB | 4 vCPUs/16GB
Hypervisor | vSphere 5.1, 5.5 | vSphere 5.5 update 3a (build 3116895 or greater) and 6.0 (build 2494585 or greater)
High availability | No | Yes
iSCSI/CIFS/NFS | Yes | Yes
SnapMirror and SnapVault | Yes | Yes
Compression | No | Yes
Capacity limit | 10TB, 25TB, 50TB | Up to 100TB/node
Hardware platform support | Select families within qualified server vendors | Wider support for major vendor offerings that meet minimum criteria
1.4 ONTAP Select Platform and Feature Support
The abstraction layer provided by the hypervisor allows ONTAP Select to run on a wide variety of
commodity platforms from virtually all of the major server vendors, provided they meet minimum
hardware criteria. These specifications are detailed below.
Hardware Requirements
ONTAP Select requires that the hosting physical server meet the following requirements:
- Intel Xeon E5-26xx v3 (Haswell) CPU or greater: 6 cores (4 for ONTAP Select, 2 for OS)
- 32GB RAM
- 8 to 24 internal SAS disks
- Minimum of two 10GbE NIC ports (four recommended)
- Hardware RAID controller with writeback cache
For a complete list of supported hardware platforms and management applications, refer to the ONTAP
Select 9.0 Release Notes.
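As an illustration, the minimums above can be encoded as a small validation routine. This is only a sketch: the dictionary fields and the function are hypothetical and not part of any NetApp tooling.

```python
# Hypothetical checker for the ONTAP Select host minimums listed above.
MIN_CORES = 6          # 4 for ONTAP Select + 2 for the host OS
MIN_RAM_GB = 32
MIN_DISKS, MAX_DISKS = 8, 24
MIN_10GBE_PORTS = 2

def check_host(host: dict) -> list:
    """Return a list of requirement violations (empty list means the host qualifies)."""
    problems = []
    if host["cpu_cores"] < MIN_CORES:
        problems.append("need >= 6 physical cores")
    if host["ram_gb"] < MIN_RAM_GB:
        problems.append("need >= 32GB RAM")
    if not MIN_DISKS <= host["sas_disks"] <= MAX_DISKS:
        problems.append("need 8-24 internal SAS disks")
    if host["nic_10gbe_ports"] < MIN_10GBE_PORTS:
        problems.append("need >= 2 10GbE ports (4 recommended)")
    if not host["raid_writeback_cache"]:
        problems.append("need a hardware RAID controller with writeback cache")
    return problems

host = {"cpu_cores": 8, "ram_gb": 64, "sas_disks": 12,
        "nic_10gbe_ports": 4, "raid_writeback_cache": True}
print(check_host(host))  # -> []
```

A host failing any check would return the corresponding messages, which maps directly onto the bulleted requirements.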
ONTAP Feature Support
The ONTAP Select product has shipped with clustered Data ONTAP release 9.0 and offers full support
for most functionality, with the exception of those features that have hardware-specific dependencies
such as MetroCluster and FCoE.
This includes support for:
- NFS/CIFS/iSCSI
- SnapMirror and SnapVault
- FlexClone
- SnapRestore
- Dedupe and compression
Additionally, support for the OnCommand management suite is included. This includes most tooling used
to manage NetApp FAS arrays, such as OnCommand Unified Manager (OCUM), OnCommand Insight
(OCI), Workflow Automation (WFA), and SnapCenter. Consult the IMT for a complete list of supported
management applications.
Note that the following ONTAP features are not supported by ONTAP Select:
- Interface groups (IFGRPs)
- Hardware-centric features such as MetroCluster, Fibre Channel (FC/FCoE), and full disk encryption (FDE)
- SnapLock
- Compaction
- Inline dedupe
In traditional FAS systems, interface groups are used to provide aggregate throughput and fault tolerance using a single, logical, virtualized network interface configured on top of multiple physical network interfaces. ONTAP Select leverages the underlying hypervisor's virtualization of multiple physical network interfaces to achieve the same goals of throughput aggregation and resiliency. The network interface cards that ONTAP Select manages are therefore logical constructs, and configuring additional interface groups will not achieve the goals of throughput aggregation or recovering from hardware failures.
2 Architecture Overview
ONTAP Select is clustered Data ONTAP deployed as a virtual machine, providing storage
management services on a virtualized commodity server by managing the server's direct-attached
storage.
The ONTAP Select product can be deployed in two different ways:
- Non-HA (single node). The single-node version of ONTAP Select is well suited for remote and branch offices, providing customers with the ability to run enterprise-class file services, backup, and disaster recovery solutions on commodity hardware.
- High availability (multinode). The multinode version of the platform uses four ONTAP Select nodes and adds support for high availability and clustered Data ONTAP nondisruptive operations, all within a shared-nothing environment.
When choosing a solution, resiliency requirements, environment restrictions, and cost factors should be
taken into consideration. Although both versions run clustered Data ONTAP and support many of the
same core features, the multinode solution provides the addition of high availability and supports
nondisruptive operations, a core value proposition for clustered Data ONTAP.
Note: The single-node and multinode versions of ONTAP Select are deployment options, not separate products. Although the multinode solution requires the purchase of additional node licenses, both share the same product model, FDvM300.
This section attempts to provide a deeper dive into various aspects of the system architecture for both the
single-node and multinode solutions while highlighting important differences between the two variants.
2.1 Virtual Machine Properties
The ONTAP Select virtual machine has a fixed set of properties, described in Table 2. Increasing or
decreasing the amount of resources allocated to the VM is not supported. Additionally, the ONTAP Select
instance hard reserves its CPU and memory resources, meaning the physical resources backing the
virtual machine are unavailable to any other VMs hosted on the server.
Table 2 shows the resources used by the ONTAP Select VM.
Table 2) ONTAP Select virtual machine properties.
Description | Single Node | Multinode (per Node)
CPU/memory | 4 vCPUs/16GB | 4 vCPUs/16GB
Virtual network interfaces | 2 | 6
SCSI controllers | 4 | 4
System boot disk | 10GB | 10GB
System coredump disk | 120GB | 120GB
Mailbox disk | 556MB | 556MB
Cluster root disk | 68GB | 68GB (x2 because the disk is mirrored)
Serial ports | 2 network serial ports | 2 network serial ports
Note: The coredump disk partition is separate from the system boot disk. Because the corefile size is directly related to the amount of memory allocated to the ONTAP instance, this allows NetApp to support larger memory instances in the future without requiring a redesign of the system boot disk.
ONTAP makes use of locally attached physical hardware, specifically the hardware RAID controller
cache, to achieve a significant increase in write performance. Additionally, because ONTAP Select is
designed to manage the locally attached storage on the system, certain restrictions apply to the ONTAP
Select virtual machine. Specifically:
- Only one ONTAP Select VM can reside on a single server.
- ONTAP Select may not be migrated or vMotioned to another server. This includes storage vMotion of the ONTAP Select VM.
- Enabling vSphere Fault Tolerance (FT) is not supported. This is in part because the system disks of the ONTAP Select VM are IDE disks, which vSphere FT does not support.
2.2 RAID Services
Although some software-defined solutions require the presence of an SSD to act as a higher speed write
staging device, ONTAP Select uses a hardware RAID controller to achieve both a write performance
boost and the added benefit of protection against physical drive failures by moving RAID services to the
hardware controller. As a result, RAID protection for all nodes within the ONTAP Select cluster is
provided by the locally attached RAID controller and not through Data ONTAP software RAID.
Note: ONTAP Select data aggregates are configured to use RAID 0 because the physical RAID controller is providing RAID striping to the underlying drives. No other RAID levels are supported.
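As a quick arithmetic aside, the usable capacity seen by ONTAP Select is determined by the hardware RAID level, since the parity overhead is paid in the controller's RAID group rather than in ONTAP. A minimal sketch, using standard RAID parity math (the disk counts and sizes are example values):

```python
# Illustrative arithmetic only: usable capacity of the single RAID group
# backing ONTAP Select, for the two supported parity levels.
def usable_tb(disks: int, disk_tb: float, level: int) -> float:
    """RAID 5 gives up one disk to parity, RAID 6 gives up two."""
    parity = {5: 1, 6: 2}[level]
    if disks <= parity:
        raise ValueError("not enough disks for this RAID level")
    return (disks - parity) * disk_tb

# Example: 12 x 4TB SAS drives
print(usable_tb(12, 4.0, 5))  # 44.0 TB usable
print(usable_tb(12, 4.0, 6))  # 40.0 TB usable
```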
RAID Controller Configuration
All locally attached disks that provide ONTAP Select with backing storage must sit behind a RAID
controller. Most commodity servers come with multiple RAID controller options across multiple price
points, each with varying levels of functionality. The intent is to support as many of these options as
possible, provided they meet certain minimum requirements placed on the controller.
The RAID controller that is managing the ONTAP Select disks must support the following:
- The HW RAID controller must have a battery backup unit (BBU) or flash-backed write cache (FBWC).
- The RAID controller must support a mode that can withstand at least one or two disk failures (RAID 5, RAID 6).
- The drive cache should be set to disabled.
- The write policy should be configured for writeback mode with a fallback to writethrough upon BBU or flash failure. This is explained in further detail in the RAID Mode section of this document.
- The I/O policy for reads must be set to cached.
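These controller requirements lend themselves to a simple checklist. The sketch below is illustrative only; the setting names are hypothetical stand-ins for what vendor utilities (for example, MegaCLI or ssacli) actually report.

```python
# Hypothetical audit of RAID controller settings against the requirements above.
# Each entry maps a setting name to a predicate that the value must satisfy.
REQUIRED = {
    "cache_protection": lambda v: v in ("BBU", "FBWC"),
    "raid_level":       lambda v: v in (5, 6),
    "drive_cache":      lambda v: v == "disabled",
    "write_policy":     lambda v: v == "writeback",
    "read_policy":      lambda v: v == "cached",
}

def audit(controller: dict) -> list:
    """Return the names of settings that do not meet ONTAP Select requirements."""
    return [name for name, ok in REQUIRED.items() if not ok(controller.get(name))]

ctrl = {"cache_protection": "FBWC", "raid_level": 6, "drive_cache": "disabled",
        "write_policy": "writeback", "read_policy": "cached"}
print(audit(ctrl))  # -> []
```

A controller reporting, say, a writethrough write policy would be flagged by name, matching the troubleshooting guidance in the Best Practice box below the RAID Mode section.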
All locally attached disks that provide ONTAP Select with backing storage must be placed into a single
RAID group running RAID 5 or RAID 6. Using a single RAID group allows ONTAP to reap the benefits of
spreading incoming read requests across a higher number of disks, providing a significant gain in
performance. Additionally, performance testing was done against single-LUN and multi-LUN
configurations. No significant differences were found, so for simplicity's sake, we strongly recommend
creating the fewest number of LUNs necessary to support your configuration needs.
Best Practice
Although most configurations should require the creation of only a single LUN, if the physical server
contains a single RAID controller managing all locally attached disks, we recommend creating two
LUNs: one to provide backing storage for the server OS and a second for ONTAP Select. In the event
of boot disk corruption, this allows the administrator to recreate the OS LUN without affecting ONTAP
Select.
At its core, ONTAP Select presents Data ONTAP with a set of virtual disks, provisioned from a backing
storage pool, using LUNs composed of locally attached spindles. Data ONTAP is presented with a set of
virtual disks, which it treats as physical, and the remaining portion of the storage stack is abstracted by
the hypervisor and RAID controller. Figure 1 shows this relationship in more detail, highlighting the
relationship between the physical RAID controller, the hypervisor, and the ONTAP Select VM. Note the
following:
RAID group and LUN configuration occur from within the server's RAID controller software.
Storage pool configuration happens from within the hypervisor.
Virtual disks are created and owned by individual VMs, in this case, ONTAP Select.
Figure 1) Virtual disk to physical disk mapping.
RAID Mode
Many RAID controllers support three modes of operation, each representing a significant difference in the
data path taken by write requests. These are:
- Writethrough. All incoming I/O requests are written to the RAID controller cache and then immediately flushed to disk before the request is acknowledged back to the host.
- Writearound. All incoming I/O requests are written directly to disk, circumventing the RAID controller cache.
- Writeback. All incoming I/O requests are written directly to the controller cache and immediately acknowledged back to the host. Data blocks are flushed to disk asynchronously by the controller.
Writeback mode offers the shortest data path, with I/O acknowledgement occurring immediately after the
blocks enter cache, and thus lower latency and higher throughput for mixed read/write workloads.
However, without the presence of a BBU or nonvolatile flash technology, when operating in this mode,
users run the risk of losing data should the system incur a power failure.
Because ONTAP Select requires the presence of a battery backup or flash unit, we can be confident that
cached blocks are flushed to disk in the event of this type of failure. For this reason, it is a requirement
that the RAID controller be configured in writeback mode.
Best Practice
The server RAID controller should be configured to operate in writeback mode. If write workload
performance issues are seen, check the controller settings and make sure that writethrough or
writearound is not enabled.
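The three modes can be summarized by where the host acknowledgement falls relative to the disk write. The toy sketch below is purely illustrative and not any controller's actual firmware logic:

```python
# Toy model of the three write policies: the ordered events for one write.
# "cache" = block enters controller cache, "disk" = block hits the spindles,
# "ack" = acknowledgement returned to the host.
def write_path(mode: str) -> list:
    if mode == "writeback":
        return ["cache", "ack", "disk"]   # ack as soon as the block is cached
    if mode == "writethrough":
        return ["cache", "disk", "ack"]   # ack only after the flush completes
    if mode == "writearound":
        return ["disk", "ack"]            # controller cache bypassed entirely
    raise ValueError(mode)

print(write_path("writeback"))  # ['cache', 'ack', 'disk']
```

Writeback's early "ack" is exactly why it yields the lowest write latency, and why a BBU or flash-backed cache is mandatory: the acknowledged block exists only in cache until the asynchronous flush.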
2.3 Virtualized NVRAM
NetApp FAS systems are traditionally fitted with a physical NVRAM PCI card: a high-performing card
containing nonvolatile flash memory that provides a significant boost in write performance by granting
Data ONTAP the ability to:
- Immediately acknowledge incoming writes back to the client
- Schedule the movement of modified data blocks back to the slower storage media (this process is known as destaging)
Commodity systems are not traditionally fitted with this type of equipment because it can be cost
prohibitive. Therefore, the functionality of the NVRAM card has been virtualized and placed into a partition
on the ONTAP Select system boot disk. It is for precisely this reason that placement of the system virtual
disk of the instance is extremely important, and why the product requires the presence of a physical RAID
controller with a resilient cache.
Data Path Explained: vNVRAM and RAID Controller
The interaction between the virtualized NVRAM system partition and the RAID controller can be best
highlighted by walking through the data path taken by a write request as it enters the system.
Incoming write requests to the ONTAP Select virtual machine are targeted at the VM's NVRAM partition.
At the virtualization layer, this partition exists within an ONTAP Select system disk: a VMDK attached to
the ONTAP Select VM. At the physical layer, these requests are cached in the local RAID controller, like
all block changes targeted at the underlying spindles. From here, the write is acknowledged back to the
host.
So at this point:
- Physically, the block resides in the RAID controller cache, waiting to be flushed to disk.
- Logically, the block resides in NVRAM, waiting to be destaged to the appropriate user data disks.
Because changed blocks are automatically stored within the RAID controller's local cache, incoming
writes to the NVRAM partition are automatically cached and periodically flushed to the physical storage
media. This should not be confused with the periodic flushing of NVRAM contents back to Data ONTAP
data disks. These two events are unrelated and occur at different times and frequencies.
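The two independent views described above, the physical controller cache and the logical NVRAM partition, can be sketched as follows. All names are illustrative; this is a conceptual model of the walk-through, not ONTAP code.

```python
# Conceptual model of the ONTAP Select write path: one incoming block is
# tracked in two independent views, each drained on its own schedule.
class SelectWritePath:
    def __init__(self):
        self.controller_cache = []  # physical: blocks awaiting flush to spindles
        self.nvram = []             # logical: blocks awaiting destage to data disks

    def write(self, block):
        self.nvram.append(block)             # targeted at the NVRAM VMDK partition
        self.controller_cache.append(block)  # cached by the RAID controller
        return "ack"                         # acknowledged to the host at this point

    def controller_flush(self):
        """Controller cache flush to physical media (independent of ONTAP)."""
        flushed, self.controller_cache = self.controller_cache, []
        return flushed

    def nvram_destage(self):
        """ONTAP destage of NVRAM contents to data disks (a separate event)."""
        destaged, self.nvram = self.nvram, []
        return destaged

p = SelectWritePath()
print(p.write("blk0"))       # ack
print(p.controller_flush())  # ['blk0'] -- physical flush...
print(p.nvram_destage())     # ['blk0'] -- ...and logical destage happen separately
```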
Figure 2 is intended to show the I/O path an incoming write takes, highlighting the difference between the
physical layer, represented by the RAID controller cache and disks, and the virtual layer, shown through
the virtual machine's NVRAM and data virtual disks.
Note: Although blocks changed on the NVRAM VMDK are cached in the local RAID controller cache, the cache is not aware of the VM construct or its virtual disks. It stores all changed blocks on the system, of which NVRAM is only a part. This includes write requests bound for the hypervisor,
which is provisioned from the same backing spindles.
Figure 2) Incoming writes to ONTAP Select VM.
Best Practice
Because the RAID controller cache is used to store all incoming block changes and not only those
targeted toward the NVRAM partition, when choosing a RAID controller, select one with the largest
cache available. A larger cache allows for less frequent disk flushing and an increase in performance of
the ONTAP Select VM, the hypervisor, and any compute VMs colocated on the server.
2.4 High Availability
Although customers are starting to move application workloads from enterprise-class storage appliances
to software-based solutions running on commodity hardware, the expectations and needs around
resiliency and fault tolerance have not changed. A high-availability solution providing a zero RPO is
required, one that protects the customer from data loss due to the failure of any component in the
infrastructure stack. This makes asynchronous replication engines poor candidates to provide these
services.
A large portion of the SDS market is built on the notion of nonshared storage, with software replication
providing data resiliency by storing multiple copies of user data across different storage silos. ONTAP
Select builds on this premise by using the synchronous replication features (RAID SyncMirror) provided
by clustered Data ONTAP to store an additional copy of user data within the cluster. This occurs within
the context of an HA pair. Every HA pair stores two copies of user data: one on storage provided by the
local node and one on storage provided by the HA partner. Within an ONTAP Select cluster, HA and
synchronous replication are tied together, and the functionality of the two cannot be decoupled or used
independently. As a result, the synchronous replication functionality is only available in the multinode
offering.
Note: In an ONTAP Select cluster, synchronous replication functionality is a function of the HA implementation, not a replacement for the asynchronous SnapMirror or SnapVault replication engines. Synchronous replication cannot be used independently from HA.
Synchronous Replication
The Data ONTAP HA model is built on the notion of HA partners. As explained earlier, ONTAP Select
extends this architecture into the nonshared commodity server world by using the RAID SyncMirror
functionality that is present in clustered Data ONTAP to replicate data blocks between cluster nodes,
providing two copies of user data spread across an HA pair.
Note: This product is not intended to be an MCC-style disaster recovery replacement and cannot be used as a stretch cluster. Cluster network and replication traffic occurs using link-local IP addresses and requires a low-latency, high-throughput network. As a result, spreading out cluster nodes across long distances is not supported.
This architecture is represented by Figure 3. Note that the four-node ONTAP Select cluster is composed
of two HA pairs, each synchronously mirroring blocks back and forth. Data aggregates on each cluster
node are guaranteed to be identical, and in the event of a failover there is no loss of data.
Figure 3) Four-node ONTAP Select cluster.
Note: Only one ONTAP Select instance may be present on a physical server. That instance is tied to the server, meaning the VM may not be migrated off to another server. ONTAP Select requires unshared access to the local RAID controller of the system and is designed to manage the locally attached disks, which would be impossible without physical connectivity to the storage.
Mirrored Aggregates
An ONTAP Select cluster is composed of four nodes and contains two copies of user data, synchronously
mirrored across HA pairs over an IP network. This mirroring is transparent to the user and is a property of
the aggregate assigned at the time of creation.
Note: All aggregates in an ONTAP Select cluster must be mirrored in order to ensure data availability in the case of a node failover and to avoid a single point of failure in the case of a hardware failure. Aggregates in an ONTAP Select cluster are built from virtual disks provided by each
node in the HA pair and use:
A local set of disks, contributed by the current ONTAP Select node
A mirror set of disks, contributed by the HA partner of the current node
Note: Both the local and mirror disks used to build a mirrored aggregate must be of the same size. We refer to these disk sets as Plex 0 and Plex 1 to indicate the local and remote mirror sets, respectively. The actual plex numbers may differ in your installation.
This is an important point and fundamentally different from the way standard ONTAP clusters work. This
applies to all root and data disks within the ONTAP Select cluster. Because the aggregate contains both
local and mirror copies of data, an aggregate that contains N virtual disks actually offers N/2 disks worth
of unique storage, because the second copy of data resides on its own unique disks.
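The capacity arithmetic above can be illustrated with a short sketch (the disk sizes and helper name are ours for illustration; this is not an ONTAP tool):

```python
def usable_capacity_gb(virtual_disk_sizes_gb):
    """Unique (usable) capacity of a mirrored aggregate.

    A mirrored aggregate built from N virtual disks stores two copies of
    every block (Plex 0 locally, Plex 1 on the HA partner), so only half
    of the raw capacity is unique storage.
    """
    return sum(virtual_disk_sizes_gb) / 2

# Four 100GB virtual disks: two local (Plex 0), two on the partner (Plex 1)
print(usable_capacity_gb([100, 100, 100, 100]))  # -> 200.0
```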
Figure 4 depicts an HA pair within a four-node ONTAP Select cluster. Within this cluster is a single
aggregate, test, which uses storage from both HA partners. This data aggregate is composed of two
sets of virtual disks: a local set, contributed by the ONTAP Select owning cluster node (Plex 0), and a
remote set, contributed by the failover partner (Plex 1).
Plex 0 is the bucket that holds all local disks. Plex 1 is the bucket that holds mirror disks, or disks
responsible for storing a second replicated copy of user data. The node that owns the aggregate
contributes disks to Plex 0, and the HA partner of that node contributes disks to Plex 1.
In our figure, we have a mirrored aggregate with two disks. The contents of this aggregate are mirrored
across our two cluster nodes, with local disk NET-1.1 placed into the Plex 0 bucket and remote disk
NET-2.1 placed into Plex 1. In this example, aggregate test is owned by the cluster node on the left and
uses local disk NET-1.1 and HA partner mirror disk NET-2.1.
Figure 4) ONTAP Select mirrored aggregate.
Note: When an ONTAP Select cluster is deployed, all virtual disks present on the system are autoassigned to the correct plex, requiring no additional step from the user with respect to disk assignment. This prevents the accidental assignment of disks to an incorrect plex and ensures an optimal mirror disk configuration.
For an example of the process of building a mirrored aggregate using the Data ONTAP command line
interface, refer to Configuration of a mirrored aggregate.
Best Practice
While the existence of the mirrored aggregate is used to guarantee an up-to-date (RPO 0) copy of the
primary aggregate, care should be taken that the primary aggregate does not run low on free space. A
low-space condition in the primary aggregate may cause ONTAP to delete the common snapshot used
as the baseline for storage giveback. While this works as designed to accommodate client writes, the
lack of a common snapshot on failback requires the ONTAP Select node to perform a full baseline from
the mirrored aggregate. This operation can take a significant amount of time in a shared-nothing
environment.
A good threshold for monitoring aggregate space utilization is 85%.
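A monitoring hook for this threshold might look like the following sketch (the function name and threshold handling are ours; real monitoring would query ONTAP through its management APIs):

```python
AGGR_UTILIZATION_THRESHOLD_PCT = 85  # per the best practice above

def check_aggregate_space(used_gb, total_gb,
                          threshold=AGGR_UTILIZATION_THRESHOLD_PCT):
    """Return (utilization_pct, alert) for an aggregate.

    Crossing the threshold risks ONTAP deleting the common snapshot used
    as the giveback baseline, forcing a full rebaseline on failback.
    """
    utilization = 100.0 * used_gb / total_gb
    return utilization, utilization >= threshold

pct, alert = check_aggregate_space(used_gb=880, total_gb=1000)
print(f"utilization={pct:.0f}% alert={alert}")  # -> utilization=88% alert=True
```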
Write Path Explained
Synchronous mirroring of data blocks between cluster nodes and the requirement of no data loss in the
event of a system failure have a significant impact on the path an incoming write takes as it propagates
through an ONTAP Select cluster. This process consists of two stages:
Acknowledgement
Destaging
Writes to a target volume occur over a data LIF and are committed to the virtualized NVRAM partition,
present on a system disk of the ONTAP Select node, before being acknowledged back to the client. In
an HA configuration, an additional step occurs, because these NVRAM writes are immediately mirrored to
the HA partner of the target volume's owner before being acknowledged. This ensures file system
consistency on the HA partner node in case of a hardware failure on the original node.
After the write has been committed to NVRAM, Data ONTAP periodically moves the contents of this
partition to the appropriate virtual disk, a process known as destaging. This process only happens once,
on the cluster node owning the target volume, and does not happen on the HA partner.
Figure 5 shows the write path of an incoming write request to an ONTAP Select node.
Figure 5) ONTAP Select write path workflow.
Incoming write acknowledgement:
1. Writes enter the system through a logical interface owned by Select A
2. Writes are committed to both local system memory and NVRAM, then synchronously mirrored to the HA partner
a. Once the I/O request is present on both HA nodes, it is acknowledged back to the client
Destaging to virtual disk:
3. Writes are destaged from system memory to the aggregate
4. The mirror engine synchronously replicates blocks to both plexes
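The numbered steps above can be modeled as a simplified sketch (a toy model of the flow, not ONTAP internals; all class and function names are ours, and the partner's NVRAM copy lifecycle is glossed over for brevity):

```python
class SelectNode:
    """Toy model of an ONTAP Select HA node."""
    def __init__(self, name):
        self.name = name
        self.nvram = []    # virtualized NVRAM partition
        self.plex0 = []    # data disks contributed by this node
        self.plex1 = []    # mirror disks this node hosts for its partner
        self.partner = None

def client_write(owner, block):
    """Stage 1: commit to NVRAM, mirror to the HA partner, then acknowledge."""
    owner.nvram.append(block)
    owner.partner.nvram.append(block)  # synchronous NVRAM mirror
    return "ack"                       # safe to acknowledge the client

def destage(owner):
    """Stage 2: only the owning node destages; the mirror engine writes both plexes."""
    while owner.nvram:
        block = owner.nvram.pop(0)
        owner.plex0.append(block)          # local copy (Plex 0)
        owner.partner.plex1.append(block)  # partner-hosted mirror copy (Plex 1)

a, b = SelectNode("A"), SelectNode("B")
a.partner, b.partner = b, a
assert client_write(a, "block-1") == "ack"
destage(a)
print(a.plex0, b.plex1)  # -> ['block-1'] ['block-1']
```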
Disk Heartbeating
Although the ONTAP Select HA architecture leverages many of the code paths used by the traditional
FAS arrays, some exceptions exist. One of these exceptions is in the implementation of disk-based
heartbeating, a nonnetwork-based method of communication used by cluster nodes to prevent network
isolation from causing split-brain behavior. Split brain is the result of cluster partitioning, typically caused
by network failures, whereby each side believes the other is down and attempts to take over cluster
resources. Enterprise-class HA implementations must gracefully handle this type of scenario, and Data
ONTAP does this through a customized disk-based method of heartbeating. This is the job of the HA
mailbox, a location on physical storage that is used by cluster nodes to pass heartbeat messages. This
helps the cluster determine connectivity and therefore define quorum in the event of a failover.
On FAS arrays, which use a shared-storage HA architecture, Data ONTAP resolves split-brain issues
through:
1. SCSI persistent reservations
2. Persistent HA metadata
3. HA state sent over HA interconnect
However, within the shared-nothing architecture of an ONTAP Select cluster, a node is only able to see
its own local storage and not that of the HA partner. Therefore, when network partitioning isolates each
side of an HA pair, the preceding methods of determining cluster quorum and failover behavior are
unavailable.
Although the existing method of split-brain detection and avoidance cannot be used, a method of
mediation is still required, one that fits within the constraints of a shared-nothing environment. ONTAP
Select extends the existing mailbox infrastructure further, allowing it to act as a method of mediation in
the event of network partitioning. Because shared storage is unavailable, mediation is accomplished
through access to the mailbox disks over network-attached storage. These disks are spread throughout
the cluster, across an iSCSI network, so intelligent failover decisions can be made by a cluster node
based on access to these disks. If a node is able to access the mailbox disks of all cluster nodes other
than its HA partner, it is likely up and healthy. If not, the node itself is likely network isolated.
Note: The mailbox architecture and disk-based heartbeating method of resolving cluster quorum and
split-brain issues are the reasons the multinode variant of ONTAP Select requires four separate nodes.
HA Mailbox Posting
The HA mailbox architecture uses a message post model. At repeated intervals, cluster nodes post
messages to all other mailbox disks across the cluster, stating that the node is up and running. Within a
healthy cluster, at any given point in time, a single mailbox disk on a cluster node will have messages
posted from all other cluster nodes.
Attached to each Select cluster node is a virtual disk that is used specifically for shared mailbox access.
This disk is referred to as the mediator mailbox disk, since its main function is to act as a method of
cluster mediation in the event of node failures or network partitioning. This mailbox disk contains
partitions for each cluster node and is mounted over an iSCSI network by other Select cluster nodes.
Periodically, these nodes will post health status to the appropriate partition of the mailbox disk. Using
network-accessible mailbox disks spread throughout the cluster allows us to infer node health through a
reachability matrix. For example, if cluster nodes A and B can post to the mailbox of cluster node D, but
not node C, and cluster node D cannot post to the mailbox of node C, it's likely that node C is either down
or network isolated and should be taken over.
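The reachability reasoning in the example above can be expressed as a small sketch (the data structure and function are illustrative; the actual mediation logic is internal to Data ONTAP):

```python
def suspect_nodes(posts, nodes):
    """Flag nodes that appear down or network isolated.

    posts maps each node name to the set of mailbox disks it successfully
    posted to. A node is suspect if no other node managed to post to its
    mailbox partition.
    """
    suspects = []
    for node in nodes:
        writers = [w for w in nodes
                   if w != node and node in posts.get(w, set())]
        if not writers:
            suspects.append(node)
    return suspects

# A and B can post to D but not to C, and D cannot post to C either
posts = {"A": {"B", "D"}, "B": {"A", "D"}, "D": {"A", "B"}}
print(suspect_nodes(posts, ["A", "B", "C", "D"]))  # -> ['C']
```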
HA Heartbeating
Like NetApp's FAS platforms, ONTAP Select periodically sends HA heartbeat messages over the HA
interconnect. Within the ONTAP Select cluster, this is done over a TCP/IP network connection that exists
between HA partners. Additionally, disk-based heartbeat messages are passed to all HA mailbox disks,
including mediator mailbox disks. These messages are passed every few seconds and read back
periodically. The frequency with which these are sent and received allows the ONTAP Select cluster to
detect HA failure events within 15 seconds, the same window available on FAS platforms. When
heartbeat messages are no longer being read, a failover event is triggered.
Figure 6 illustrates the process of sending and receiving heartbeat messages over the HA interconnect
and mediator disks from the perspective of a single ONTAP Select cluster node, node C. Note that
network heartbeats are sent over the HA interconnect to the HA partner, node D, while disk heartbeats
use mailbox disks across all cluster nodes, A, B, C, and D.
Figure 6) HA heartbeating: steady state.
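The 15-second detection window described above can be modeled with a simple timeout check (an illustrative sketch; interval handling and failover arbitration in ONTAP are more involved):

```python
import time

HEARTBEAT_TIMEOUT_S = 15  # detection window cited above

class HeartbeatMonitor:
    """Track the last heartbeat seen from the HA partner and decide
    whether a takeover should be considered."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT_S):
        self.timeout = timeout
        self.last_seen = time.monotonic()

    def record_heartbeat(self):
        self.last_seen = time.monotonic()

    def partner_failed(self, now=None):
        now = time.monotonic() if now is None else now
        return (now - self.last_seen) > self.timeout

monitor = HeartbeatMonitor()
monitor.record_heartbeat()
print(monitor.partner_failed())                            # -> False
print(monitor.partner_failed(now=monitor.last_seen + 20))  # -> True
```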
2.5 Network Configurations
Decoupling Data ONTAP from physical hardware and providing it to customers as a software package
designed to run on commodity servers introduced a new problem to NetApp, one best summed up by two
important questions:
How can we be confident that Data ONTAP will run reliably with variability in the underlying network configuration?
Does running Data ONTAP as a virtual machine guarantee implicit support for any software-defined network configuration?
Providing a storage management platform as software, instead of hardware, requires supporting a level of
abstraction between the storage OS and its underlying hardware resources. Configuration variability,
which can affect the hardware's ability to meet the needs of the OS, becomes a real problem. Supporting
any VM network configuration would potentially introduce resource contention into cluster network
communications, an area that has traditionally been owned exclusively by Data ONTAP.
Therefore, to make sure that Data ONTAP has sufficient network resources needed for reliable cluster
operations, a specific set of architecture requirements has been placed around the ONTAP Select
network configuration.
The network architecture of the single-node variant of the ONTAP Select platform is different from the
multinode version. This section first dives into the more complicated multinode solution and then covers
the single-node configuration, which is a logical subset.
Network Configuration: Multinode
The multinode ONTAP Select network configuration consists of two networks: an internal network,
responsible for providing cluster and internal replication services, and an external network, responsible for
providing data access and management services. End-to-end isolation of traffic that flows within these
two networks is extremely important in building an environment that ensures cluster resiliency.
Figure 7) ONTAP Select multinode network configuration.
These networks are represented in Figure 7, which shows a four-node ONTAP Select cluster running on
a VMware vSphere platform. Note that each ONTAP Select instance resides on a separate physical
server, and internal and external traffic is isolated through the use of separate network port groups, which
are assigned to each virtual network interface and allow the cluster nodes to share the same physical
switch infrastructure.
Each ONTAP Select virtual machine contains six virtual network adapters, presented to Data ONTAP as
a set of six network ports, e0a through e0f. Although ONTAP treats these adapters as physical NICs, they
are in fact virtual and map to a set of physical interfaces through a virtualized network layer. As a result,
each hosting server does not require six physical network ports.
Note: Adding virtual network adapters to the ONTAP Select VM is not supported.
These ports are preconfigured to provide the following services:
e0a, e0b. Data and management LIFs
e0c, e0d. Cluster network LIFs
e0e. RAID SyncMirror (RSM)
e0f. HA interconnect
Ports e0a and e0b reside on the external network. Although ports e0c through e0f perform several different
functions, collectively they comprise the internal Select network. When making network design decisions,
these ports should be placed on a single L2 network. There is no need to separate these virtual adapters
across different networks.
The relationship between these ports and the underlying physical adapters can be seen in Figure 8,
which depicts one ONTAP Select cluster node on the ESX hypervisor.
Figure 8) Network configuration of a multinode ONTAP Select VM.
Note that in Figure 8, internal traffic and external traffic are split across two different vSwitches.
Segregating traffic across different physical NICs makes sure that we are not introducing latencies into
the system due to insufficient access to network resources. Additionally, aggregation through NIC
teaming makes sure that failure of a single network adapter does not prevent the ONTAP Select cluster
node from accessing the respective network.
Refer to the Networking section for network configuration best practices.
LIF Assignment
With the introduction of IPspaces, Data ONTAP port roles have been deprecated. Like FAS, ONTAP
Select clusters contain both a default and a cluster IPspace. By placing network ports e0a and e0b into the
default IPspace and ports e0c and e0d into the cluster IPspace, we have essentially walled off those ports
from hosting LIFs that do not belong. The remaining ports within the ONTAP Select cluster are consumed
through the automatic assignment of interfaces providing internal services; these are not exposed through the
ONTAP shell, as is the case with the RSM and HA interconnect interfaces.
Note: Not all LIFs are visible through the ONTAP command shell. The HA interconnect and RSMinterfaces are hidden from ONTAP and used internally by FreeBSD to provide their respectiveservices.
The network ports/LIFs are explained in further detail in the following sections.
Data and Management LIFs (e0a, e0b)
Data ONTAP ports e0a and e0b have been delegated as candidate ports for logical interfaces that carry the following types of traffic:
SAN/NAS protocol traffic (CIFS, NFS, iSCSI)
Cluster, node, and SVM management traffic
Intercluster traffic (SnapMirror, SnapVault)
Note: Cluster and node management LIFs are automatically created during ONTAP Select clustersetup. The remaining LIFs may be created postdeployment.
Cluster Network LIFs (e0c, e0d)
Data ONTAP ports e0c and e0d have been delegated as home ports for cluster interfaces. Within each
ONTAP Select cluster node, two cluster interfaces are automatically generated during Data ONTAP setup
using link-local IP addresses (169.254.x.x).
Note: These interfaces cannot be assigned static IP addresses, and additional cluster interfaces should not be created.
Cluster network traffic must flow through a low-latency, nonrouted layer 2 network. Due to cluster
throughput and latency requirements, the ONTAP Select cluster is expected to be physically located
within close proximity (for example, multirack, single data center). Building a stretch cluster configuration
by separating HA nodes across a wide area network or across significant geographical distances is not
supported.
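The link-local addressing of cluster LIFs can be checked with the standard library (a quick illustration; the helper function is ours, not an ONTAP tool):

```python
import ipaddress

def is_cluster_lif_address(addr):
    """Cluster LIFs are auto-assigned from the IPv4 link-local range
    169.254.0.0/16, which is nonroutable by definition."""
    return ipaddress.ip_address(addr).is_link_local

print(is_cluster_lif_address("169.254.10.1"))  # -> True
print(is_cluster_lif_address("10.0.0.5"))      # -> False
```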
Note: To make sure of maximum throughput for cluster network traffic, this network port is configured to use jumbo frames (9000 MTU). This is not configurable, so to make sure of proper cluster operation, verify that jumbo frames are enabled on all upstream virtual and physical switches providing internal network services to ONTAP Select cluster nodes.
RAID SyncMirror Traffic (e0e)
Synchronous replication of blocks across HA partner nodes occurs using an internal network interface
residing on network port e0e. This functionality happens automatically, using network interfaces
configured by Data ONTAP during cluster setup, and requires no configuration by the administrator.
Because this port is reserved by Data ONTAP for internal replication traffic, neither the port nor the
hosted LIF is visible in the Data ONTAP CLI or management tooling. This interface is configured to use
an automatically generated link-local IP address, and the reassignment of an alternate IP address is not
supported.
Note: This network port requires the use of jumbo frames (9000 MTU).
Throughput and latency requirements that are critical to the proper behavior of the replication network
dictate that ONTAP Select nodes be located within close physical proximity, so building a hot disaster
recovery solution is not supported.
HA Interconnect (e0f)
NetApp FAS arrays use specialized hardware to pass information between HA pairs in an ONTAP cluster.
Software-defined environments, however, do not tend to have this type of equipment available (such as
Infiniband or iWARP devices), so an alternate solution is needed. Although several possibilities were
considered, ONTAP requirements placed on the interconnect transport required that this functionality be
emulated in software. As a result, within an ONTAP Select cluster, the functionality of the HA interconnect
(traditionally provided by hardware) has been designed into the OS, using Ethernet as a transport
mechanism.
Each ONTAP Select node is configured with an HA interconnect port, e0f. This port hosts the HA
interconnect network interface, which is responsible for two primary functions:
Mirroring the contents of NVRAM between HA pairs
Sending/receiving HA status information and network heartbeat messages between HA pairs
HA interconnect traffic flows through this network port using a single network interface by layering RDMA
frames within Ethernet packets. Similar to RSM, neither the physical port nor the hosted network interface
is visible to users from either the ONTAP CLI or management tooling. As a result, the IP address of this
interface cannot be modified, and the state of the port cannot be changed.
Note: This network port requires the use of jumbo frames (9000 MTU).
Network Configuration: Single Node
Single-node ONTAP Select configurations do not require the ONTAP internal network, because there is
no cluster, HA, or mirror traffic. Unlike the multinode version of the product, which contains six virtual
network adapters, each single-node ONTAP Select virtual machine contains two virtual network adapters,
presented to Data ONTAP as network ports e0a and e0b.
These ports are used to provide the following services: data, management, and intercluster LIFs.
The relationship between these ports and the underlying physical adapters can be seen in Figure 9,
which depicts one ONTAP Select cluster node on the ESX hypervisor.
Figure 9) Network configuration of single-node ONTAP Select VM.
Note that unlike the multinode configuration, the ONTAP Select VM is configured to use only a single vSwitch: vSwitch0. Also note that similar to the multinode solution, this vSwitch is backed by two physical NIC ports, eth0 and eth1, which are required to ensure the resiliency of the configuration.
Best Practice
We encourage splitting physical network ports into vSwitches across ASIC boundaries. When a NIC has
two ASICs, pick one port from each when teaming for the internal and external networks.
LIF Assignment
As explained in the multinode LIF assignment section of this document, IPspaces are used by ONTAP
Select to keep cluster network traffic separate from data and management traffic. The single-node variant
of this platform does not contain a cluster network; therefore, no ports are present in the cluster IPspace.
Note: Cluster and node management LIFs are automatically created during ONTAP Select clustersetup. The remaining LIFs may be created postdeployment.
2.6 Networking: Internal and External
ONTAP Select Internal Network
The internal ONTAP Select network, which is only present in the multinode variant of the product, is
responsible for providing the ONTAP Select cluster with cluster communication, HA interconnect, and
synchronous replication services. This network includes the following ports and interfaces:
e0c, e0d. Hosting cluster network LIFs
e0e. Hosting the RAID SyncMirror (RSM) interface
e0f. Hosting the HA interconnect
The throughput and latency of this network are critical in determining the performance and resiliency of
the ONTAP Select cluster. Network isolation is required for cluster security and to make sure that system
interfaces are kept separate from other network traffic. Therefore, this network must be used exclusively
by the ONTAP Select cluster.
Note: Using the internal network for non-Select cluster traffic, such as application or management traffic, is not supported. There can be no other VMs or hosts on the ONTAP-internal VLAN.
Network packets traversing the internal network must be on a dedicated VLAN-tagged layer 2 network. This can be accomplished either by:
Assigning a VLAN-tagged port group to the internal virtual NICs (e0c through e0f)
Using the native VLAN provided by the upstream switch
Although this substantially reduces broadcast traffic on the network, the VLAN ID is also used in the MAC
address generation of the Data ONTAP ports associated with the internal network.
ONTAP Select External Network
The ONTAP Select external network is responsible for all outbound communications by the cluster and
therefore is present on both the single-node and multinode configurations. Although this network does not
have the tightly defined throughput requirements of the internal network, the administrator should be
careful not to create network bottlenecks between the client and ONTAP VM, because performance
issues could be mischaracterized as ONTAP Select problems.
Internal Versus External Network
Table 3 highlights the major differences between the ONTAP Select internal and external networks.
Table 3) Internal versus external network quick reference.
Description            Internal Network                        External Network
Network services       Cluster, HA/IC, RAID SyncMirror (RSM)   Data, management, intercluster (SnapMirror and SnapVault)
VLAN tagging           Required                                Optional
Frame size (MTU)       9,000                                   1,500 (default), 9,000 (supported)
NIC aggregation        Required                                Required
IP address assignment  Autogenerated                           User defined
DHCP support           No                                      No
NIC Aggregation
To make sure that the internal and external networks have both the necessary bandwidth and resiliency
characteristics required to provide high performance and fault tolerance, physical network adapter
aggregation is used. This is a requirement on both the internal and external networks of the ONTAP
Select cluster, regardless of the underlying hypervisor being used, and provides the ONTAP Select
cluster with two major benefits:
Isolation from a single physical port failure
Increased throughput
NIC aggregation allows the ONTAP Select instance to balance network traffic across two physical ports.
LACP-enabled port channels are only supported on the external network (note that LACP is only
available when using distributed vSwitches).
Best Practice
In the event that a NIC has multiple ASICs, select one network port from each ASIC when building
network aggregation constructs through NIC teaming for the internal and external networks.
MAC Address Generation
The MAC addresses assigned to all ONTAP Select network ports are generated automatically by the
included deployment utility, using a NetApp-specific organizationally unique identifier (OUI) to make sure
there is no conflict with FAS systems. A copy of this address is then stored in an internal database within
the ONTAP Select installation VM (ONTAP Deploy) to prevent accidental reassignment during future
node deployments. At no point should the administrator modify the assigned MAC address of a network
port.
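The general idea of hash-derived, instance-local MAC assignment can be sketched as follows (the OUI value and hashing scheme are illustrative assumptions, not the actual ONTAP Deploy algorithm, and the real NetApp OUI differs):

```python
import hashlib

OUI = "0c:11:22"  # illustrative placeholder OUI

def generate_mac(deploy_ip, index):
    """Derive a MAC from the Deploy VM's IP plus a per-port counter.

    Two hash-derived bytes identify the Deploy instance; the final byte
    is a per-port counter, so addresses are unique within one instance
    but, as noted later, not guaranteed unique across instances.
    """
    digest = hashlib.sha256(deploy_ip.encode()).digest()
    return f"{OUI}:{digest[0]:02x}:{digest[1]:02x}:{index & 0xFF:02x}"

macs = {generate_mac("192.168.1.50", i) for i in range(6)}
print(len(macs))  # -> 6 distinct MACs within this Deploy instance
```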
3 Deployment and Management
This section covers the deployment and management aspects of the ONTAP Select product.
3.1 ONTAP Select Deploy
The ONTAP Select cluster is deployed using specialized tooling that provides the administrator with the
ability to build the ONTAP cluster as well as manage various aspects of the virtualized server. This utility,
called ONTAP Select Deploy, comes packaged inside of an installation VM along with the ONTAP Select
OS image. Bundling the deployment utility and ONTAP Select bits inside of a single virtual machine
allows NetApp to include all the necessary support libraries and modules while helping reduce the complexity of the interoperability matrix between various versions of ONTAP Deploy and ONTAP Select.
The ONTAP Deploy application can be accessed two ways:
Command-line interface (CLI)
REST API
The ONTAP Deploy CLI is shell-based and immediately accessible upon connecting to the installation VM
using SSH. Navigation of the shell is similar to that of the ONTAP shell, with commands bundled into
groupings that provide related functionality (for example, network create, network show, network
delete).
For automated deployments and integration into existing orchestration frameworks, ONTAP Deploy can
also be invoked programmatically, through a REST API. All functionality available through the shell-based
CLI is available through the API.
Further ONTAP Deploy details can be found in the ONTAP Select 9.0 Installation and Setup Guide.
The ONTAP Deploy VM can be placed anywhere in the environment, provided there is network
connectivity to the ONTAP Select target physical server. For more information, refer to the ONTAP
Deploy VM Placement section in the design considerations portion of this document.
Server Preparation
Although ONTAP Deploy provides the user with functionality that allows for configuration of portions of
the underlying physical server, there are several requirements that must be met before attempting to
manage the server. This can be thought of as a manual preparation phase, because many of the steps
are difficult to orchestrate through automation. This preparation phase involves the following:
RAID controller and attached local storage are configured.
RAID groups and LUNs have been provisioned.
Physical network connectivity to server is verified.
Hypervisor is installed.
Virtual networking constructs are configured (vSwitches/port groups).
Note: After the ONTAP Select cluster has been deployed, the appropriate ONTAP management toolingshould be used to configure SVMs, LIFs, volumes, and so on. ONTAP Deploy does not providethis functionality.
The ONTAP Deploy utility and ONTAP Select software are bundled together into a single virtual machine,
which is then made available as a .OVA file for vSphere. The bits are available from the NetApp Support
site, from this link:
http://mysupport.netapp.com/NOW/cgi-bin/software
This installation VM runs the Debian Linux OS and has the following properties:
2 vCPUs
4GB RAM
40GB virtual disk
Multiple ONTAP Select Deploy Instances
Depending on the complexity of the environment, it may be beneficial to have more than one ONTAP
Deploy instance managing the ONTAP Select environment. When this is desired, make sure that each
ONTAP Select cluster is managed by a dedicated ONTAP Deploy instance. ONTAP Deploy stores cluster
metadata within an internal database, so managing an ONTAP Select cluster using multiple ONTAP
Deploy instances is not recommended.
When deciding whether to use multiple installation VMs, keep in mind that while ONTAP Deploy attempts
to create unique MAC addresses by using a numeric hash based on the IP address of the installation VM,
the uniqueness of the MAC address can only be guaranteed within that Deploy instance. Because there is
no communication across Deploy instances, it's possible for two separate instances to assign multiple
ONTAP Select network adapters the same MAC address.
Best Practice
To eliminate the possibility of having multiple Deploy instances assign duplicate MAC addresses, use
one Deploy instance per L2 network to manage existing or deploy new Select clusters and nodes.
Note: Each ONTAP Deploy instance can generate 64,000 unique MAC addresses. Each ONTAP Select node consumes 4 MAC addresses for its internal communication network schema. Therefore, each ONTAP Deploy instance could deploy a theoretical maximum of 16,000 Select nodes (the equivalent of 4,000 four-node Select clusters).
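The arithmetic in the note above can be verified directly:

```python
MACS_PER_DEPLOY = 64_000
MACS_PER_NODE = 4  # internal network ports per Select node

max_nodes = MACS_PER_DEPLOY // MACS_PER_NODE
print(max_nodes)       # -> 16000 Select nodes
print(max_nodes // 4)  # -> 4000 four-node clusters
```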
3.2 Licensing
ONTAP Select provides a flexible, consumption-based licensing model, specifically designed to allow
customers to only pay for the storage that they need. Capacity licenses are sold in 1TB increments and
must be applied to each node in the ONTAP Select cluster within 30 days of deployment. Failure to apply
a valid capacity license to each cluster node results in the VM being shut down until a valid license is applied.
3.3 ONTAP Management
Because ONTAP Select runs Data ONTAP, it supports many common NetApp management tools. As a
result, after the product is deployed and Data ONTAP is configured, it can be administered using the
same set of applications that a system administrator would use to manage FAS storage arrays. There is
no special procedure required to build out an ONTAP configuration, such as creating SVMs, volumes,
LIFs, and so on.
4 Storage Design Considerations
This section covers the various storage-related options and best practices that should be taken into
consideration when building a single-node or a four-node ONTAP Select cluster. The choices made by the
administrator when building the underlying infrastructure can have a significant impact on both the performance and resiliency of the ONTAP Select cluster.
4.1 Storage Provisioning
The flexibility provided to the administrator by the ONTAP Select product requires that it support
variability in the underlying hardware configurations. Server vendors offer the customer numerous
choices, providing different families of servers designed for different types of application workloads or
performance requirements. Even within a family, substantial variability may exist. Customers may
customize virtually every aspect of their configuration, so although two physical servers may come from
the same vendor and may even have the same model, they are likely composed of completely different
physical components. This has the potential to impact the ONTAP Select installation workflow, explained
in further detail later.
Homogeneous Configurations
The ONTAP Select product requires that all managed storage sit behind a single RAID controller.
Managed storage is defined as storage that is consumed by the ONTAP Select VM and may not
represent all storage attached to the system. A server may have different types of locally attached
storage available, such as an internal flash drive (possibly used as a boot device) or even an SSD/SAS
hybrid setup with the SSDs being managed by one controller or HBA and the SAS spindles by another.
Note: All locally attached storage that is managed by ONTAP Select must be of like kind with respect to storage type and speed. Furthermore, spreading the VM's virtual disks across multiple RAID controllers or storage types (including non-RAID-backed storage) is not supported.
From a storage standpoint, physical servers that are candidates for hosting an ONTAP Select cluster are frequently configured in two different ways, described later.
Single RAID Group
Most RAID controllers support a maximum of 32 drives for a single RAID group. Extensive performance
testing was done to determine whether there was any benefit in splitting ONTAP Select virtual drives
across LUNs from multiple RAID groups. None was found.
In the unlikely event that the target server has more than 32 attached drives, split the disks evenly across
multiple RAID groups. Provision an equal number of LUNs to the server and subsequently stripe the
virtualized file system across all LUNs. This creates a single storage pool that ONTAP Select can then
use to carve into virtual disks.
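The even-split guidance above can be sketched as follows, assuming the 32-drive-per-RAID-group controller limit mentioned in the text; the helper name is hypothetical.

```python
import math

MAX_DRIVES_PER_RAID_GROUP = 32   # typical RAID controller limit (see text)

def split_raid_groups(total_drives):
    """Split drives evenly across the minimum number of RAID groups,
    each holding at most 32 drives. Returns drives per group."""
    groups = math.ceil(total_drives / MAX_DRIVES_PER_RAID_GROUP)
    base, extra = divmod(total_drives, groups)
    # Distribute the remainder so group sizes differ by at most one drive.
    return [base + 1] * extra + [base] * (groups - extra)

print(split_raid_groups(24))   # [24] -> one RAID group, under the limit
print(split_raid_groups(40))   # [20, 20] -> two even groups
```

One LUN per group would then be provisioned and the datastore striped across all of them, producing the single storage pool described above.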
Best Practice
All locally attached storage on the server should be configured into a single RAID group. In the
multinode configuration, no hot spares should be used because a mirror copy of the data makes sure of data availability in the event of multiple drive failures.
Local Disks Shared Between ONTAP Select and OS
The most common server configuration is one where all locally attached spindles sit behind a single RAID controller. In this style of configuration, a single RAID group using all of the attached storage should be created. From there, two LUNs should be provisioned: one for the hypervisor and another for the ONTAP Select VM.
Note: The one LUN for ONTAP Select statement assumes that the physical storage capacity of the system doesn't surpass the hypervisor-supported file system extent limits. See the Multiple LUNs section for more information.
For example, let's say a customer purchases an HP DL380 G8 with six internal drives and a single Smart Array P420i RAID controller. All internal drives are managed by this RAID controller, and no other storage is present on the system.
Figure 10 shows this style of configuration. In this example, no other storage is present on the system, so the hypervisor needs to share storage with the ONTAP Select node.
Figure 10) Server LUN configuration with only RAID-managed spindles.
Provisioning both LUNs from the same RAID group allows the hypervisor OS (and any client VMs that are
also provisioned from that storage) to benefit from RAID protection, preventing a single-drive failure from
bringing down the entire system.
Best Practice
Separating the OS LUN from the storage managed by ONTAP Select prevents a catastrophic failure
that requires a complete OS reinstallation or LUN reprovisioning from affecting the ONTAP Select VM
or user data. It's strongly encouraged that two LUNs be used in this style of configuration.
Local Disks Split Between ONTAP Select and OS
The other possible configuration provided by server vendors involves configuring the system with multiple
RAID or disk controllers. In this configuration, a set of disks is managed by one disk controller, which may
or may not offer RAID services, with a second set of disks being managed by a hardware RAID controller
that is able to offer RAID 5/6 services.
With this style of configuration, the set of spindles that sits behind the RAID controller that is able to
provide RAID 5/6 services should be used exclusively by the ONTAP Select VM. All spindles should be
configured into a single RAID group, and from there, a single LUN should be provisioned and used by
ONTAP Select. The second set of disks is reserved for the hypervisor OS (and any client VMs not using
ONTAP storage).
This is shown in further detail with Figure 11.
Figure 11) Server LUN configuration on mixed RAID/non-RAID system.
Multiple LUNs
As servers become equipped with larger drives, the guidance around single RAID group/single-LUN
configurations must change. When a single LUN becomes larger than the supported extent limit of the
underlying hypervisor, storage must be broken up into multiple LUNs to allow for successful file system
creation. The term extent refers to an area of storage that is used by the file system and, for the purpose
of this section, to the size of the disk or LUN that the hypervisor can use within a single file system.
Best Practice
ONTAP Select receives no performance benefit from increasing the number of LUNs within the RAID
group. Additional LUNs should only be added to bypass hypervisor file system limitations.
vSphere VMFS Limits
The maximum extent size on a vSphere 5.5 server is 64TB. A VMFS file system cannot use disks or
LUNs that are larger than this size. If a server has more than 64TB of storage attached, multiple LUNs
must be provisioned for the host, each smaller than 64TB. A single vSphere datastore can contain
multiple extents (multiple disks/LUNs), and the underlying file system VMFS can stripe across multiple
storage devices.
When multiple LUNs are required, use the following guidance:
Continue to group all locally attached spindles that are managed by ONTAP Select into a single RAID group.
Create multiple equal-sized LUNs (two should be sufficient).
Provision the vSphere datastore using all attached LUNs.
The virtualized file system is then striped across all available LUNs.
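A minimal sketch of this LUN-planning guidance, assuming the vSphere 5.5 extent limit of 64TB described above; the function name is illustrative, not a NetApp tool.

```python
import math

VMFS_EXTENT_LIMIT_TB = 64   # vSphere 5.5 maximum extent size (see text)

def plan_luns(total_storage_tb):
    """Return a list of equal-sized LUNs (in TB), each no larger than
    the VMFS extent limit, covering the total attached storage."""
    count = max(1, math.ceil(total_storage_tb / VMFS_EXTENT_LIMIT_TB))
    return [total_storage_tb / count] * count

print(plan_luns(48))    # [48.0] -> one LUN is enough
print(plan_luns(100))   # [50.0, 50.0] -> two equal LUNs under the limit
```

All LUNs still come from the same single RAID group; only the file system layer is divided.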
4.2 ONTAP Select Virtual Disks
The ONTAP Select cluster consumes the underlying storage provided by the locally attached spindles
through the abstraction layers provided by both the RAID controller and virtualized file system. ONTAP
Select is completely unaware of the underlying spindle type and does not attempt to manage the disk
directly. Storage presented to the ONTAP Select node is done through the window of a virtual disk, a
mechanism provided by the hypervisor that allows a virtualized file system to be broken up into pieces
that can be managed by an individual virtual machine and treated as if they were locally attached disks.
For example, on vSphere, an ONTAP Select cluster node is presented with a datastore that is nothing
more than a single LUN on which the VMFS file system has been configured. ONTAP Select provisions a
set of virtual disks, or VMDKs, and treats these disks as if they were physical, locally attached spindles.
ONTAP then assembles these disks into aggregates, from which volumes are provisioned and exported
to clients through the appropriate access protocol.
Best Practice
Similar to creating multiple LUNs, ONTAP Select receives no performance benefits by increasing the
number of virtual disks used by the system.
Virtual Disk Provisioning
To provide for a more streamlined user experience, the ONTAP Select management tool, ONTAP
Deploy, automatically provisions virtual disks from the associated storage pool and attaches them to the
ONTAP Select virtual machine. Virtual disks are then automatically assigned to a local and mirror storage
pool.
Because all virtual disks on the ONTAP Select VM are striped across the underlying physical disks, there
is no performance gain in building configurations with a higher number of virtual disks and structuring
application workloads across different aggregates. Additionally, shifting the responsibility of virtual disk
creation and assignment from the administrator to the management tool prevents the user from
inadvertently assigning a virtual disk to an incorrect storage pool.
ONTAP Select breaks up the underlying attached storage into equal-sized virtual disks, each not
exceeding 8TB. A minimum of two virtual disks is created on each cluster node and assigned to the local
and mirror plex to be used within a mirrored aggregate.
For example, if ONTAP Select is assigned a datastore or LUN that is 31TB (space remaining after VM is
deployed and system and root disks are provisioned), four 7.75TB virtual disks are created and assigned to
the appropriate ONTAP local and mirror plex.
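The carving rule above can be sketched as follows. The 8TB per-disk cap and the 31TB example come from the text; the even-count rounding is an assumption made here so that disks split evenly between the local and mirror plexes.

```python
import math

MAX_VDISK_TB = 8   # per-virtual-disk cap stated in the text

def carve_virtual_disks(usable_tb):
    """Carve usable capacity into equal virtual disks of at most 8TB each,
    with a minimum of two. Assumption: the count is rounded up to an even
    number so the disks split evenly between plex 0 (local) and plex 1
    (mirror) of a mirrored aggregate."""
    count = max(2, math.ceil(usable_tb / MAX_VDISK_TB))
    if count % 2:
        count += 1
    size = usable_tb / count
    half = count // 2
    return {"plex0": [size] * half, "plex1": [size] * half}

print(carve_virtual_disks(31.0))
# 31TB -> four 7.75TB disks: two in the local plex, two in the mirror plex
```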
Figure 12 shows this provisioning further. In this example, a single server has 16 locally attached 2TB
disks. Note that:
All disks are assigned to a single 32TB RAID group, with one disk acting as a hot spare.
From the RAID group, a single 30TB LUN is provided.
The ONTAP Select VM has ~250GB worth of system and root disks, leaving 29.75TB storage to be divided into virtual data disks.
Four 7.4TB data disks are created and placed into the appropriate ONTAP storage pools: two disks in the local pool (plex 0) and two in the mirror pool (plex 1).
Figure 12) Virtual disk provisioning.
4.3 ONTAP Select Deploy
Careful consideration should be given to the placement of the ONTAP Deploy installation VM, because
there is flexibility provided to the administrator with respect to the physical server that hosts the virtual
machine.
VM Placement
The ONTAP Select installation VM can be placed on any virtualized server in the customer environment;
it can be collocated on the same host as an ONTAP Select instance or on a separate virtualized server.
The only requirement is that there exists network connectivity between the ONTAP Select installation VM
and the ONTAP Select virtual servers.
Figure 13 shows both of these deployment options.
Figure 13) ONTAP Select installation VM placement.
Note: To reduce overall resource consumption, the installation VM can be powered down when not actively managing ONTAP Select VMs or virtualized servers through the ONTAP Deploy utility.
5 Network Design Considerations
This section covers the various network configurations and best practices that should be taken into
consideration when building an ONTAP Select cluster. Like the design and implementation of the
underlying storage, care should be taken when making network design decisions because these choices
have a significant impact on both the performance and resiliency of the ONTAP Select cluster.
5.1 Supported Network Configurations
Server vendors understand that customers have different needs, and choice is critical. As a result, when
purchasing a physical server, there are numerous options available when making network connectivity
decisions. Most commodity systems ship with a variety of NIC choices, offering single-port and multiport
options with varying permutations of 1Gb and 10Gb ports. Care should be taken when selecting server
NICs, because the choices provided by server vendors can have a significant impact on the overall
performance of the ONTAP Select cluster.
As mentioned in the Network Configurations section of this document, link aggregation is a core construct
used to provide sufficient bandwidth to the external ONTAP Select network. Link Aggregation Control
Protocol (LACP) is a vendor-neutral standard providing an open protocol for network endpoints to use to bundle groupings of physical network ports into a single logical channel.
When choosing an ONTAP Select network configuration, use of LACP, which requires specialized
hardware support, may be a primary consideration. Although LACP requires support from both the
software virtual switch and the upstream physical switch, it can provide a significant throughput benefit to
incoming client protocol traffic.
Table 4 shows the supported NIC permutations and the underlying hypervisor support. Additionally, use
of LACP is also called out, because hypervisor-specific dependencies prevent all combinations from
being supported.
Table 4) Network configuration support matrix.
Available Network Interfaces | Internal Network (LACP not supported) | External Network
2 x 1Gb + 2 x 10Gb | 2 x 10Gb | 2 x 1Gb (LACP supported)
4 x 10Gb | 2 x 10Gb | 2 x 10Gb (LACP supported)
2 x 10Gb | 2 x 10Gb | 2 x 10Gb (same physical ports, no LACP support)
Because the performance of the ONTAP Select VM is tied directly to the characteristics of the underlying
hardware, increasing the throughput to the VM by selecting 10Gb-capable NICs results in a higher-performing cluster and a better overall user experience. When cost or form factor prevents the user from
designing a system with four 10Gb NICs, two 10Gb NICs can be used.
These choices are explained in further detail later.
5.2 vSphere: vSwitch Configuration
ONTAP Select supports the use of both standard and distributed vSwitch configurations. This section
describes the vSwitch configuration and load-balancing policies that should be used in both two-NIC and
four-NIC configurations.
vSphere: Standard vSwitch
All vSwitch configurations require a minimum of two physical network adapters bundled into a single link
aggregation group (referred to as NIC teaming). On a vSphere server, NIC teams are the aggregation
construct used to bundle multiple physical network adapters into a single logical channel, allowing the
network load to be shared across all member ports. It's important to remember that NIC teams can be
created without support from the physical switch. Load-balancing and failover policies can be applied
directly to a NIC team, which is unaware of the upstream switch configuration. In this case, policies are
only applied to outbound traffic. In order to balance inbound traffic, the physical switch must be properly
configured. Port channels are the primary way this is accomplished.
Note: LACP-enabled port channels are not supported on standard vSwitches due to the lack of LACP support in vSphere standard switches. For this functionality, distributed vSwitches (vDSs) are required. Static port channels are not supported with ONTAP Select. Therefore, we recommend using distributed vSwitches for the external network.
Best Practice
To make sure of optimal load balancing across both the internal and the external ONTAP Select
networks, the load-balancing policy of Route based on originating virtual port should be used.
Figure 14 shows the configuration of a standard vSwitch and port group, responsible for handling internal
communication services for the ONTAP Select cluster.
Figure 14) Standard vSwitch configuration.
vSphere: Distributed vSwitch
When using distributed vSwitches in your configuration, LACP can be used on the external network only,
in order to increase the throughput available for data, management, and intercluster replication traffic.
Best Practice
When using a distributed vSwitch with LACP for the external ONTAP Select network, we recommend
configuring the load-balancing policy to Route based on IP Hash on the port group and Source and
Destination IP Address and TCP/UDP port on the link aggregation group (LAG).
Regardless of the type of vSwitch, the internal ONTAP Select network does not support LACP. The
recommended load-balancing policy for the internal network remains Route based on originating
virtual port ID.
Figure 15 shows LACP configured on the external distributed port group responsible for handling
outbound services for the ONTAP Select cluster. The unique number of network endpoints connecting to
the ONTAP Select instance should be taken into consideration when determining the load-balancing
policy for the link aggregation group (LAG). Although all load-balancing policies are technically supported,
the algorithms available are tailored to specific network configurations and topologies and more efficiently
distribute network traffic. In the event that the upstream physical switch has already been configured to
use LACP, use the existing settings within the ESX vDS, because mixing LACP load-balancing algorithms
can have unintended consequences.
Note that in Figure 15, the load-balancing mode for the external LACP-enabled aggregation group should be set to Source and Destination IP Address and TCP/UDP port to make sure of optimal balancing
across all adapters.
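A toy illustration of why an IP-hash policy spreads traffic: each source/destination IP pair maps deterministically to one LAG member, so many distinct clients end up distributed across all uplinks. The XOR hash below is a stand-in; ESXi's actual hashing algorithm differs.

```python
import ipaddress

def uplink_for_flow(src_ip, dst_ip, num_uplinks=2):
    """Pick a LAG member for a flow by hashing the source and
    destination IP addresses (illustrative XOR hash, not ESXi's)."""
    s = int(ipaddress.ip_address(src_ip))
    d = int(ipaddress.ip_address(dst_ip))
    return (s ^ d) % num_uplinks

# Eight clients talking to one ONTAP Select data LIF spread across uplinks.
for i in range(1, 9):
    src = "10.0.0.%d" % i
    print(src, "-> uplink", uplink_for_flow(src, "10.0.1.50"))
```

This also shows why a single client/server pair never exceeds one uplink's bandwidth: the hash is constant for that pair, which is why the policy benefits many-client workloads most.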
Figure 15) LACP distributed vSwitch configuration.
Note: LACP requires the upstream switch ports to be configured as a port channel. Prior to enabling this on the distributed vSwitch, make sure that an LACP-enabled port channel is properly configured.
Best Practice
When using LACP, NetApp recommends that the LACP mode be set to ACTIVE on both the ESX and
the physical switches. Furthermore, the LACP timer should be set to FAST (1 sec) on the port channel
interfaces and on the VMNICs.
5.3 Physical Switch Configuration
Careful consideration should be taken when making connectivity decisions from the virtual switch layer to
physical switches. Separation of internal cluster traffic from external data services should extend to the
upstream physical networking layer through isolation provi