April 18, 2023 Dusan Baljevic, 2007 2
Server Market Trends
Vendor    2Q 2005 Revenue   2Q 2005 Share   2Q 2006 Revenue   2Q 2006 Share   Growth
IBM       $3,893            31.9%           $3,808            31.2%           -2.2%
HP        $3,480            28.5%           $3,420            28.0%           -1.7%
Sun       $1,372            11.2%           $1,585            13.0%           15.5%
Dell      $1,286            10.5%           $1,269            10.4%           -1.3%
Fujitsu   $551              4.5%            $554              4.5%            0.5%
Others    $1,638            13.4%           $1,651            13.5%           0.8%
All       $12,219           --              $12,287           --              0.6%
Note: Revenue in millions of USD
Source: IDC http://www.itjungle.com/tug/tug082406-story02.html
ZFS Design Goals
• A zettabyte (derived from the SI prefix zetta-) is a unit of information or computer storage equal to one sextillion (one long-scale trilliard) bytes: 1 ZB = 10^21 bytes ≈ 2^70 bytes
• Officially introduced with Solaris 10 06/06
• Combines volume manager and file system (storage pool concept)
• Based on a transactional object model
• 128-bit file system (if Moore's Law holds, in 10 to 15 years humanity will need the 65th bit)
• Data integrity
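A back-of-envelope check of these magnitudes (an illustrative Python sketch, not part of ZFS):

```python
# Back-of-envelope check of the sizes quoted above.
zettabyte = 10 ** 21                   # SI zettabyte, in bytes

# 10^21 bytes is roughly 2^70 bytes (the ratio is about 0.85):
print(zettabyte / 2 ** 70)

# A 64-bit file system addresses 2^64 units; ZFS uses 128-bit
# quantities, a head-room factor of 2^64 over a 64-bit design:
print(2 ** 128 // 2 ** 64 == 2 ** 64)
```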
ZFS Design Goals (continued)
• Snapshot and compression support
• Endian-neutral
• Automated common administrative tasks and near-zero administration
• Performance
• Virtually unlimited scalability
• Flexibility
ZFS – Modern Approach
F/S      Creator             Introduced   O/S
HFS      Apple Computer      1985         MacOS
VxFS     VERITAS             1991         SVR4.0
AdvFS    DEC                 pre-1993     Digital Unix
UFS1     Kirk McKusick       1994         4.4BSD
…
ZFS      Sun Microsystems    2004         Solaris
Reiser4  Namesys             2004         Linux
OCFS2    Oracle              2005         Linux
NILFS    NTT                 2005         Linux
GFS2     Red Hat             2006         Linux
ext4     Andrew Morton       2006         Linux
Traditional File Systems and ZFS (diagram courtesy of Sun Microsystems)
ZFS Storage Pools
• Logical collection of physical storage devices
• Virtual storage pools make it easy to expand or contract file systems by adding physical devices
• A storage pool is also the root of the ZFS file system hierarchy. The root of the pool can be accessed as a file system (for example, mount or unmount, snapshots, change properties)
• ZFS storage pools are divided into datasets (file system, volume, or snapshot). Datasets are identified by unique paths:
/pool-name/dataset-name
ZFS Administration Tools
• Standard tools are zpool and zfs
• Most tasks can be run through a web interface (the ZFS Administration GUI and the SAM FS Manager use the same underlying web console packages):
https://hostname:6789/zfs
Prerequisite:
# /usr/sbin/smcwebserver start
# /usr/sbin/smcwebserver enable
• zdb (ZFS debugger) command for support engineers
ZFS Transactional Object Model
• All operations are copy-on-write (COW). Live data is never overwritten
• ZFS writes data to a new block before changing the data pointers and committing the write. Copy-on-write provides several benefits:
* Always-valid on-disk state
* Consistent, reliable backups
* Data rollback to a known point in time
• Time-consuming recovery procedures like fsck are not required if the system is shut down in an unclean manner
ZFS Snapshot versus Clone
• A snapshot is a read-only copy of a file system (or volume) that initially consumes no additional space. It cannot be mounted as a file system. ZFS snapshots are immutable. This feature is critical for supporting legal compliance requirements, such as Sarbanes-Oxley, where businesses have to demonstrate that the view of the data at a given point in time is correct
• A clone is a write-enabled "snapshot". It can only be created from a snapshot. A clone can be mounted
• Snapshot properties are inherited at creation time and cannot be changed
ZFS Data Integrity
• Solaris 10 with ZFS is the only known operating system designed to provide end-to-end checksum capability for all data
• All data is protected by 256-bit checksums
• ZFS constantly reads and checks data to help ensure it is correct. If it detects an error in a mirrored pool, the technology can automatically repair the corrupt data
ZFS Data Integrity (continued)
• Checksums stored with indirect blocks
• Self-validating, self-authenticating checksum tree
• Detects phantom writes, misdirections, common administrative errors (for example, swap on an active ZFS disk)
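The self-validating checksum tree can be sketched conceptually: a parent block stores the checksums of its children, so corruption anywhere is caught by walking down from the root. This is an illustrative Python sketch, not ZFS code; the data and helper names are made up:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Stand-in for a ZFS block checksum (ZFS uses fletcher/SHA-256)."""
    return hashlib.sha256(data).hexdigest()

# Leaf data blocks
blocks = [b"data-block-0", b"data-block-1"]

# An "indirect block" records each child's checksum. Storing the checksum
# with the pointer, not with the data, is what makes phantom and
# misdirected writes detectable.
indirect = [checksum(b) for b in blocks]

def verify(blocks, indirect) -> bool:
    return all(checksum(b) == c for b, c in zip(blocks, indirect))

print(verify(blocks, indirect))          # True: tree is self-validating
blocks[1] = b"data-block-1-corrupted"    # simulate a phantom write
print(verify(blocks, indirect))          # False: caught by parent checksum
```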
ZFS Endianess
• ZFS is supported on both SPARC and x86 platforms
• One can easily move storage pools from a SPARC to an x86 server. Neither architecture pays a byte-swapping "tax" due to "adaptive endian-ness" technology (unique to ZFS)
ZFS Scalability
• Enables administrators to state the intent of their storage policies rather than all of the details needed to implement them
• Resizing ZFS is easy. Resizing a UFS file system requires downtime and restoring data from tapes or other disks
• Maximum filename length = 255 bytes
• Allowable characters in directory entries = any Unicode except NUL
• Maximum pathname length = no limit defined
• Maximum file size = 16 EB
• Maximum volume size = 16 EB
ZFS Scalability (continued)
• 2^48 snapshots in any file system
• 2^48 files in any individual file system
• 16 EB files
• 16 EB attributes
• 3×10^23 PB storage pools
• 2^48 attributes for a file
• 2^48 files in a directory
• 2^64 devices in a storage pool
• 2^64 storage pools per system
• 2^64 file systems per storage pool
ZFS Scalability (continued)
Feature                          ZFS Status
Stores File Owner                Yes
POSIX File Permissions           Yes
Creation Timestamps              Yes
Last Access/Read Timestamp       Yes
Last Metadata Change Timestamps  Yes
Last Archive Timestamps          Yes
Access Control Lists             Yes
ZFS Scalability (continued)
Feature                    ZFS Status
Extended Attributes        Yes
Checksums                  Yes
XIP (eXecute In Place)     No
Hard Links                 Yes
Soft Links                 Yes
Block Journaling           Yes
Metadata-only Journaling   No
ZFS Scalability (continued)
Feature                   ZFS Status
Transparent Compression   Yes
Extents                   No
Variable Block Size       Yes
Allocate-on-flush         Yes
Incremental Snapshots     Yes
Case-sensitive            Yes
Case-preserving           Yes
ZFS Scalability (continued)
Feature                                              ZFS Status
Iostat Gives Details of I/O Utilization              Yes
Rollbacks                                            Yes
Clones                                               Yes
Snapshots Use More Than 1% of Data Space to Create   No
Handles Whole-Disk Failures                          Yes
Built-in Backup/Restore                              Yes
Integrated Quotas                                    Yes (per file system)
ZFS Flexibility
• When additional disks are added to an existing mirrored or RAID-Z pool, ZFS is "rebuilt" to redistribute the data. This feature owes a lot to its conceptual predecessor, WAFL, the "Write Anywhere File Layout" file system developed by NetApp for their network file server appliances
• Dynamic striping across all devices to maximize throughput
• Copy-on-write design (most disk writes are sequential)
• Variable block sizes (up to 128 kilobytes), automatically selected to match workload
ZFS Flexibility (continued)
• Globally optimal I/O sorting and aggregation
• Multiple independent prefetch streams with automatic length and stride detection
• Unlimited, instantaneous read/write snapshots
• Parallel, constant-time directory operations
• Explicit I/O priority with deadline scheduling
ZFS Simplifies NFS
• To share file systems via NFS, no entries in /etc/dfs/dfstab are required
• Automatically handled by ZFS if the property sharenfs=on is set
• Commands: zfs share and zfs unshare
ZFS RAID Levels
• ZFS file systems automatically stripe across all top-level disk devices
• Mirrors and RAID-Z devices are considered to be top-level devices
• It is not recommended to mix RAID types in a pool (zpool tries to prevent this, but it can be forced with the -f flag)
ZFS RAID Levels (continued)
The following RAID levels are supported:
* RAID-0 (striping)
* RAID-1 (mirroring)
* RAID-Z (similar to RAID-5, but with variable-width stripes to avoid the RAID-5 write hole)
* RAID-Z2 (double-parity RAID-5)
ZFS RAID Levels (continued)
• A RAID-Z configuration with N disks of size X and P parity disks can hold approximately (N-P)*X bytes and can withstand P devices failing
• Start a single-parity RAID-Z configuration at 3 disks (2+1)
• Start a double-parity RAID-Z2 configuration at 5 disks (3+2)
• (N+P) with P = 1 (RAID-Z) or 2 (RAID-Z2) and N equal to 2, 4, or 8
• The recommended number of disks per group is between 3 and 9 (use multiple groups if larger)
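The (N-P)*X capacity rule can be written as a quick calculator (illustrative Python; the disk sizes and helper name are made up):

```python
def raidz_usable_gb(n_disks: int, parity: int, disk_gb: int) -> int:
    """Approximate usable capacity of a RAID-Z group: (N - P) * X."""
    if n_disks - parity < 2:
        raise ValueError("a RAID-Z group needs at least two data disks")
    return (n_disks - parity) * disk_gb

# Minimal single-parity RAID-Z group, 2+1, with 73 GB disks:
print(raidz_usable_gb(3, 1, 73))   # 146
# Minimal double-parity RAID-Z2 group, 3+2:
print(raidz_usable_gb(5, 2, 73))   # 219
```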
ZFS and Disk Arrays with Own Cache
• ZFS does not "trust" that anything it writes to the ZFS Intent Log (ZIL) has made it to storage until it flushes the storage cache
• After every write to the ZIL, ZFS issues a cache-flush request to instruct the storage to flush its write cache to disk. ZFS will not consider a write operation done until both the ZIL write and the flush have completed
• Problems might occur when layering ZFS over an intelligent storage array with a battery-backed cache (due to the array's ability to use its cache and override ZFS activities)
ZFS and Disk Arrays with Own Cache (continued)
• Initial tests show that HP EVA and Hitachi/HP XP SAN arrays are not affected and work well with ZFS
• Lab tests will provide more comprehensive data for HP EVA and XP SAN
ZFS and Disk Arrays with Own Cache (continued)
Two possible solutions for SANs that are affected:
• Disable the ZIL. The ZIL is the way ZFS maintains consistency until it can get the blocks written to their final place on the disk. BAD OPTION!
• Configure the disk array to ignore ZFS flush commands. Quite safe and beneficial
Configure Disk Array to Ignore ZFS flush
For Engenio arrays (Sun StorageTek FlexLine
200/300 series, Sun StorEdge 6130, Sun
StorageTek 6140/6540, IBM DS4x00, many SGI
InfiniteStorage arrays):
• Shut down the server or, at minimum, export ZFS pools before running this
• Cut and paste the following into the script editor of the "Enterprise Management Window" of the SANtricity management GUI:
Configure Disk Array to Ignore ZFS flush – Script Example
//Show Solaris ICS option
show controller[a] HostNVSRAMbyte[0x2, 0x21];
show controller[b] HostNVSRAMbyte[0x2, 0x21];
//Enable ICS
set controller[a] HostNVSRAMbyte[0x2, 0x21]=0x01;
set controller[b] HostNVSRAMbyte[0x2, 0x21]=0x01;
//Make changes - rebooting controllers
show "Rebooting A controller.";
reset controller[a];
show "Rebooting B controller.";
reset controller[b];
Why ZFS Now?
• ZFS is positioned to support more file systems, snapshots, and files in a file system than can possibly be created in the foreseeable future
• Complicated storage administration concepts are automated and consolidated into straightforward language, reducing administrative overhead by up to 80 percent
• Unlike traditional file systems that require a separate volume manager, ZFS integrates volume management functions. It breaks out of the "one-to-one mapping between the file system and its associated volumes" limitation with the storage pool model. When capacity is no longer required by one file system in the pool, it becomes available to others
Dusan's own experience
• Numerous tests on a limited number of Sun SPARC and Intel platforms consistently showed that ZFS outperformed Solaris Volume Manager (SVM) on the same physical volumes by at least 25%
• Ease of maintenance made ZFS a very attractive option
• Ease of adding new physical volumes to pools and automated resilvering was impressive
Typical ZFS on SPARC Platform
Unless specifically requested, this type of configuration is very common:
• 73 GB internal disks are used
• Internal disks are c0t0d0s2 and c1t0d0s2
• SVM is used for mirroring
• Physical memory is 8 GB
Typical Swap Consideration
• Primary swap is calculated as follows
• If no additional storage is available (only two internal disks used):
2 x RAM, if RAM <= 8GB
1 x RAM, if RAM > 8GB
• If additional storage is available, make primary swap relatively small (4GB) and add additional swap areas on other disks as necessary
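The sizing rule above can be written out as a tiny helper (illustrative Python; the function name is made up):

```python
def primary_swap_gb(ram_gb: int, extra_storage: bool = False) -> int:
    """Primary swap size in GB, per the rule above."""
    if extra_storage:
        return 4                # keep primary swap small; add swap elsewhere
    return 2 * ram_gb if ram_gb <= 8 else ram_gb

print(primary_swap_gb(4))                        # 8
print(primary_swap_gb(8))                        # 16
print(primary_swap_gb(16))                       # 16
print(primary_swap_gb(16, extra_storage=True))   # 4
```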
Typical ZFS on SPARC Platform (no Sun Cluster)
Slice  Filesystem   Size   Type  Mirror  Description
s0     /            11GB   UFS   Yes     Root
s1     Swap         8GB          Yes     Swap
s2     All          73GB
s3     /var         5GB    UFS   Yes     /var
s4     /            11GB   UFS   Yes     Alternate Root
s5     /var         5GB    UFS   Yes     Alternate /var
s6                  30GB   ZFS   Yes     Application SW
s7     SDB replica  64MB                 SVM mgmt
Typical ZFS on SPARC Platform (up to Sun Cluster 3.1 – ZFS not supported)
Slice  Filesystem   Size    Type  Mirror  Description
s0     /            30GB    UFS   Yes     Root
s1     Swap         8GB           Yes     Swap
s2     All          73GB
s3     /global      512MB   UFS   Yes     Sun Cluster (SC)
s4     /            30GB    UFS   Yes     Alternate Root
s5     /global      512MB   UFS   Yes     Alternate SC
s6     Not used
s7     SDB replica  64MB                  SVM mgmt
Support for ZFS in Sun Cluster 3.2
• ZFS is supported as a highly available local file system in the Sun Cluster 3.2 release
• ZFS with Sun Cluster offers a file system solution combining high availability, data integrity, performance, and scalability, covering the needs of the most demanding environments
ZFS Best Practices
• Run ZFS on servers that run a 64-bit kernel
• One GB or more of RAM is recommended
• Because ZFS caches data in kernel addressable memory, the kernel will possibly be larger than with other file systems. Use the size of physical memory as an upper bound on the extra amount of swap space that might be required
• Do not use slices on the same disk for both swap space and ZFS file systems. Keep the swap areas separate from the ZFS file systems
• Set up one storage pool using whole disks per system
ZFS Best Practices (continued)
• Set up a replicated pool (raidz, raidz2, or mirrored configuration) for all production environments
• Do not use disk slices for storage pools intended for production use
• Set up hot spares to speed up healing in the face of hardware failures
• For replicated pools, use multiple controllers to reduce hardware failures and improve performance
ZFS Best Practices (continued)
• Run zpool scrub on a regular basis to identify data integrity problems. For consumer-quality drives, set up a weekly scrubbing schedule. For datacenter-quality drives, set up a monthly scrubbing schedule
• If workloads have predictable performance characteristics, separate loads into different pools
• For better performance, use individual disks or at least LUNs made up of just a few disks
• Pool performance can degrade when a pool is very full and its file systems are updated frequently (busy mail or web proxy server). Keep pool space under 80% utilization to maintain pool performance
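The 80% guideline is easy to wire into monitoring; a minimal sketch (illustrative Python; the pool numbers are hypothetical):

```python
def pool_over_threshold(used_gb: float, total_gb: float,
                        limit: float = 0.80) -> bool:
    """True when pool utilization exceeds the recommended 80% limit."""
    return used_gb / total_gb > limit

print(pool_over_threshold(240, 292))   # True: add capacity or clean up
print(pool_over_threshold(200, 292))   # False
```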
ZFS Backups
• Currently, ZFS does not provide a comprehensive backup or restore utility like ufsdump and ufsrestore
• Use the zfs send and zfs receive commands to capture ZFS data streams
• You can use the ufsrestore command to restore UFS data into a ZFS file system
• Use ZFS snapshots as a quick and easy way to back up file systems
• Create an incremental snapshot stream (zfs send -i)
ZFS Backups (continued)
• The zfs send and zfs receive commands are not enterprise-backup solutions
• Sun StorEdge Enterprise Backup Software (Legato Networker 7.3.2 and above) can fully back up and restore ZFS files, including ACLs
• The Veritas NetBackup product can be used to back up ZFS files, and this configuration is supported. However, it does not currently support backing up or restoring NFSv4-style ACL information from ZFS files. Traditional permission bits and other file attributes are correctly backed up and restored
ZFS Backups (continued)
• IBM Tivoli Storage Manager backs up and restores ZFS file systems with the CLI tools, but the GUI seems to exclude ZFS file systems. Non-trivial ZFS ACLs are not preserved
• Computer Associates' BrightStor ARCserve product backs up and restores ZFS file systems. ZFS ACLs are not preserved
ZFS and Data Protector
• According to "HP Openview Storage Data Protector Planned Enhancements to Platforms, Integrations, Clusters and Zero Downtime Backups version 4.2", dated 12th of January 2007:
Backup Agent (Disk Agent) support for Solaris 10 ZFS will be released in Data Protector 6 in March 2007
http://storage.corp.hp.com/Application/View/ProdCenter.asp?OID=326828&rdoType=filter&inpCriteria1=QuickLink&inpCriteria2=nothing&inpCriteriaChild1=3#QuickLink
ZFS by Example
• To create a simple ZFS RAID-1 pool:
zpool create custpool mirror c0t0d0 c1t0d0
zpool create p2m mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0
• When only one file system is needed in a storage pool, the zpool create command does everything:
* Writes an EFI label on the disk
* Creates a pool of the specified name, including structures for data protection
* Creates a file system of the same name, creates the directory, and mounts the file system on /custpool
* Mount point is preserved across reboots
ZFS by Example (continued)
• The pool itself is created with zpool create; file systems within it are created with the more powerful zfs command
• To create three file systems that share the same pool, with one of them limited to, say, 525 MB:
zfs create custpool/apps
zfs create custpool/home
zfs create custpool/web
zfs set quota=525m custpool/web
ZFS by Example (continued)
• Mount a file system via /etc/vfstab:
zfs set mountpoint=legacy custpool/db
• Check status of pools: zpool status
• Unmount / mount all ZFS file systems:
zfs umount -a
zfs mount -a
• Upgrade the ZFS pool version (for example, to enable newer features such as RAID-Z2 support):
zpool upgrade custpool
ZFS by Example (continued)
• Clear a pool's error count: zpool clear custpool
• Delete a file system: zfs destroy custpool/web
• Data integrity can be checked by running a manual scrub:
zpool scrub custpool
zpool status -v custpool
• Check if any problem pools exist: zpool status -x
ZFS by Example (continued)
• To offline a failing disk drive:
zpool offline custpool c0t1d0
• Once the drive has been physically replaced, run the replace command against the device:
zpool replace custpool c0t1d0
• After an offlined drive has been replaced, it can be brought back online:
zpool online custpool c0t1d0
• Show the installed ZFS version (V2 on Solaris 10u2, V3 on Solaris 10u3):
zpool upgrade
ZFS by Example (continued)
• Create a snapshot of a file system:
zfs snapshot custpool/web@Mon
• Roll back a snapshot (unmounts and remounts the file system automatically):
zfs rollback -r custpool/web@Mon
• Delete a snapshot (important: a clone must be deleted first, if one exists for a given snapshot!):
zfs destroy custpool/web@Mon
• A destroyed pool can sometimes be recovered:
zpool import -D
ZFS by Example (continued)
• Display I/O statistics for ZFS pools:
zpool iostat 5
fsstat zfs
fsstat -F 5 5
• Show current mirror/pool device layout: zpool status
• Add a mirror to a pool:
zpool add -f pool mirror c0t1d0s3 c0t1d0s4
• The zfs promote command enables you to replace an existing ZFS file system with a clone of that file system. This feature is helpful when you want to run tests on an alternative version of a file system and then make that alternative version the active file system
ZFS by Example (continued)
• Display ZFS properties:
zfs get -o property,value,source all custpool
• List a file system and its snapshots/clones:
zfs list -r custpool/web
• Detach a device from a mirror (the operation is refused if there are no other valid replicas of the data):
zpool detach custpool c0t2d0
ZFS by Example (continued)
• Snapshots cannot be mounted; hence, they cannot be backed up directly. Backups of snapshots can be done in two ways:
* Create a clone, or
* Run:
zfs send custpool/web@Mon | zfs receive bckpool/webtmp
• Incremental backup (relative to an earlier snapshot, here called @Sun):
zfs send -i custpool/web@Sun custpool/web@Mon > /dev/rmt/0
ZFS by Example (continued)
• The following commands illustrate how to test out changes to a file system, and then replace the original file system with the changed one, using clones, clone promotion, and renaming:
zfs create pool/project/production
zfs snapshot pool/project/production@today
zfs clone pool/project/production@today pool/project/beta
zfs promote pool/project/beta
zfs rename pool/project/production pool/project/legacy
zfs rename pool/project/beta pool/project/production
zfs destroy pool/project/legacy
ZFS by Example (continued)
• The following commands send a full backup and then an incremental to a remote machine, restoring them into poolB/restored/web@a and poolB/restored/web@b, respectively. poolB must contain the file system poolB/restored, and must not initially contain poolB/restored/web
zfs send custpool/web@a | \
  ssh host zfs receive poolB/restored/web@a
zfs send -i custpool/web@a custpool/web@b | \
  ssh host zfs receive -d poolB/restored/web
ZFS-based swap
• Do not swap to a file on a ZFS file system. A ZFS swap file configuration is not supported
• Using a ZFS volume as a dump device is not supported
• A ZFS volume can be used as a swap device:
zfs create -V 5gb custpool/swapvol
swap -a /dev/zvol/dsk/custpool/swapvol
Cross-platform ZFS Data Migration
• To move a pool from SPARC machine sunsrv1 to AMD machine amdsrv2:
1. On sunsrv1:
zpool export tank
2. Physically move the disks from sunsrv1 to amdsrv2
3. On amdsrv2:
zpool import tank
Fun with ZFS Pool Planning
• Create two 3-way mirrors:
zpool create -f ex1 mirror c1t0d0 c1t1d0 c1t2d0
zpool add -f ex1 mirror c1t6d0 c1t7d0 c1t8d0
• Stripe across 3 disks with a 2-way mirror each:
zpool create -f ex2 mirror c1t0d0 c1t6d0 mirror c1t1d0 c1t7d0 mirror c1t2d0 c1t8d0
ZFS and RBAC • Role-Based Access Control already
supports ZFS. /etc/security/prof_attr:
ZFS File System Management:::Create and \
Manage ZFS File \
Systems:help=RtZFSFileSysMngmnt.html
ZFS Storage Management:::Create and Manage \
ZFS Storage \
Pools:help=RtZFSStorageMngmnt.html
ZFS Command History
ZFS logs the last group of commands onto the zpool devices. Even if the pool is moved to a different system (export/import), the command history is still available:
zpool history custpool
History for 'custpool':
2006-11-10.03:01:56 zpool create custpool raidz c0t0d0 c0t0d1 c1t0d0 c1t1d1
2006-11-10.12:19:47 zpool export custpool
2006-11-10.12:20:07 zpool import custpool
ZFS Boot Recovery
• If a panic-reboot loop is caused by a ZFS software programming error, the server can be instructed to boot without the ZFS file systems:
boot -m milestone=none
• When the system is up, remount the root file system "rw" and remove the file /etc/zfs/zpool.cache. The remainder of the boot can proceed with the command:
svcadm milestone all
• At that point, import the good pools. The damaged pools may need to be re-initialized
ZFS and Containers
• Solaris 10 06/06 supports the use of ZFS file systems
• It is possible to install a zone into a ZFS file system, but the installer/upgrader program does not understand ZFS well enough to upgrade zones that reside on a ZFS file system
• Upgrading a server that has a zone installed on a ZFS file system is not yet supported
ZFS Test (OpenSolaris only)
• Ztest was written by the ZFS developers as a ZFS unit test. The tool was developed in parallel with the ZFS functionality
• As features were added to ZFS, unit tests were also added to ztest. In addition, a separate test development team wrote and executed more functional and stress tests
• On an OpenSolaris server, ztest is located in /usr/bin/ztest
Usage: ztest
  [-v vdevs (default: 5)]
  [-s size of each vdev (default: 64M)]
  [-a alignment shift (default: 9) (use 0 for random)]
  [-m mirror copies (default: 2)]
  [-r raidz disks (default: 4)]
  [-R raidz parity (default: 1)]
  [-d datasets (default: 7)]
  [-t threads (default: 23)]
  [-g gang block threshold (default: 32K)]
  [-i initialize pool i times (default: 1)]
  [-k kill percentage (default: 70%)]
  [-p pool name (default: ztest)]
  [-f file directory for vdev files (default: /tmp)]
  [-V(erbose)] (use multiple times for ever more blather)
  [-E(xisting)] (use existing pool instead of creating a new one)
  [-T time] total run time (default: 300 sec)
  [-P passtime] time per pass (default: 60 sec)
ZFS Current Status
• Solaris 10 Update 3 (11/06) has been released. It brings a large number of features that have been in OpenSolaris into the fully supported release: ZFS command improvements and changes, including RAID-Z2, hot spares, recursive snapshots, promotion of clones, compact NFSv4 ACLs, destroyed-pool recovery, error clearing, ZFS integration with FMA, and more
• As of OpenSolaris Build 53, a powerful new feature: ZFS and iSCSI integration
• One of the missing features: file systems are not mountable after detaching "submirrors"
ZFS Current Status (continued)
• Improvements for NFS over ZFS. ZFS complies with all NFS semantics even with write caches enabled. Disabling the ZIL (setting zil_disable to 1 using mdb and then mounting the file system) is one way to produce an improper NFS service. With the ZIL disabled, commit requests are ignored, with potential corruption of the client's view
• No support for reducing a zpool's size by removing devices
ZFS Mountroot (currently x86 only)
• ZFS Mountroot provides the capability to configure a ZFS root file system. It is not a complete boot solution; it relies on the existence of a small UFS boot environment
• ZFS Mountroot was integrated in Solaris Nevada Build 37, an OpenSolaris release (disabled by default)
• ZFS Mountroot does not currently work on SPARC
• ACLs are not fully preserved when copying files from a UFS root to a ZFS root. ZFS will convert UFS-style ACLs to the new NFSv4 ACLs, but they may not be entirely identical