Ibm spectrum scale fundamentals workshop for americas part 3 Information LifeCycle Management

Spectrum Scale 4.1 System Administration

Spectrum Scale

Information Lifecycle Management (ILM) Tools

© Copyright IBM Corporation 2015


Unit objectives

After completing this unit, you should be able to:

• Understand the value of Information Lifecycle Management

• Understand how ILM is managed in Spectrum Scale

• Understand Storage Pools & Policy Engine Queries

– File placement/movement policies

– File Analysis Queries

– File management policies

– Working with Filesets


What is ILM?

• You define the policy based on business rules

• Policies apply to ILM (disk placement and analysis) and HSM (tape movement)

• It is built into Spectrum Scale (no additional licensing)

– Transparently manages file data using a set of rules

– Allows dry-run testing (test your policy before you apply it)

– Not Easy Tier, but it is a set-it-and-forget-it policy engine


IBM Spectrum Scale can help you achieve ILM efficiencies through powerful policy-driven, automated, tiered storage management. The Spectrum Scale ILM toolkit helps you manage sets of files and pools of storage and enables you to automate the management of that data.

ILM = Information Lifecycle Management

The ability to manage information efficiently through the lifecycle of its value.

The policy engine's main competitive advantage is that it scans the file system metadata, rather than walking the directory tree, to build and analyze work lists.


How does it help client ILM?

This is a valuable tool for client file system management.

Types of policies:

File placement policies automatically place newly created files in a specific file system pool. They are useful for tiering for efficiency or performance.

File management policies manage files during their lifecycle by moving them to another file system pool, moving them to nearline storage, copying them to archival storage, changing their replication status, or deleting them.

Analysis (discovery) policies do not move or manage data; they are used simply to understand something about the data that you have.

*These tools are a huge competitive advantage and are used by all clients.


How are ILM policies applied


Policies are managed through administrator-defined rules.

Characteristics of a policy are as follows:

• A policy can contain any number of rules.

• A policy file is limited to a size of 1 MB.

A policy rule is an SQL-like statement that tells Spectrum Scale what to do with the data for a file in a specific storage pool, if the file meets specific criteria. A rule can apply to any file being created or only to files being created within a specific fileset or group of filesets.

Rules specify conditions that, when true, cause the rule to be applied. Conditions that cause Spectrum Scale to apply a rule are as follows:

• Date and time when the rule is evaluated, that is, the current date and time

• Date and time when the file was last accessed

• Date and time when the file was last modified

• Fileset name

• File name or extension

• File size

• User ID and group ID

Creating a policy

Create a text file for your policy with the following guidelines:

– A policy must contain at least one rule.

– The last placement rule in the policy should be a default rule that applies when no other placement rule matches a file, so that the file is assigned to a default pool.

Installing a policy

Issue the mmchpolicy command

Changing a policy

Edit the text file containing the policy and issue the mmchpolicy command.

Listing policies

The mmlspolicy command displays policy information for a given file system.

Validating policies

The mmchpolicy -I test command validates but does not install a policy file.
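
As a minimal sketch of the workflow above, a placement policy file might look like the following. The file name placement.pol, the file system name gpfs1, and the pool name 'silver' are assumptions for illustration; only the 'system' pool is guaranteed to exist. Under those assumptions, mmchpolicy gpfs1 placement.pol -I test validates the file, mmchpolicy gpfs1 placement.pol installs it, and mmlspolicy gpfs1 -L shows the result.

```
/* placement.pol (hypothetical file name) */

/* Assumed pool 'silver': place log files there at creation time. */
RULE 'logs' SET POOL 'silver' WHERE LOWER(NAME) LIKE '%.log'

/* Default rule, last in the list: everything else goes to the system pool. */
RULE 'default' SET POOL 'system'
```

Because rules are matched in order, the catch-all rule has to come last; placing it first would send every file to the system pool.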


How ILM policies are evaluated by Spectrum Scale


Spectrum Scale evaluates policy rules in order, from first to last, as they appear in the policy. The first rule that matches determines what is to be done with that file. For example, when a client creates a file, Spectrum Scale scans the list of rules in the active file-placement policy to determine which rule applies to the file. When a rule applies to the file, Spectrum Scale stops processing the rules and assigns the file to the appropriate storage pool. If no rule applies, an EINVAL error code is returned to the application.

Several rule types exist:

Placement policies, evaluated at file creation, for example:

– RULE 'xxlfiles' SET POOL 'gold' FOR FILESET ('xxlfileset')

– RULE 'otherfiles' SET POOL 'silver'

Migration policies, evaluated periodically, for example:

– RULE 'cleangold' MIGRATE FROM POOL 'gold' THRESHOLD(90,70) TO POOL 'silver'

– RULE 'cleansilver' WHEN (DAYOFWEEK(CURRENT_DATE) = 2) MIGRATE FROM POOL 'silver' TO POOL 'pewter' WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30

Deletion policies, evaluated periodically, for example:

– RULE 'purgepewter' WHEN (DAY(CURRENT_DATE) = 1) DELETE FROM POOL 'pewter' WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 365


ILM tools

• Storage pools

– A collection of disks or arrays with similar properties that are managed together as a group.

• File placement policies

– Determine where file data is placed at creation.

• File management policies

– Migrate or delete files based on business rules.

• Filesets

– Logical subtrees within a file system that act as metadata containers for files.


What is a storage pool?

• Two types of Storage pools

– Internal

– External

• Internal: A collection of disks or arrays with similar properties that are managed together as a group.

– Used to:

• Group storage devices and create classes of storage within a file system

• Match the cost of storage to the value of the data

• Improve performance

• Improve reliability.

• External

– An interface to an external application.


Internal storage pool properties

• Every file system has at least one storage pool, the system pool

– Maximum of 8 storage pools per file system

• Pools are created by mmcrfs, mmadddisk, or mmchfs -V

• Only one pool, the system pool, stores metadata

– It also holds the policy file

– It may be created as metadataOnly

• All other pools are dataOnly and store user data.

• When a pool is full, the user gets ENOSPC.

• A file system without a valid policy file can only create files in the system pool.

– An invalid policy file is deleted by mmfsck.

• The storage pool is an extra attribute on the definition of each disk

– Each disk belongs to exactly one storage pool.


Storage pool properties

• Only the system pool may contain metadataOnly, dataAndMetadata, or descOnly disks

– mmdeldisk is not allowed to delete the system pool.

• mmchdisk and mmrpldisk are not allowed to change a disk’s storage pool

– Changing the pool would require all existing data to be migrated from the disk, just like mmdeldisk.

• mmlsdisk shows the storage pool for each disk.


Defining storage pool properties

• The storage pool is an attribute of each disk and is defined when the disk is added to the file system

– Disk stanza pool attribute

%nsd:
  nsd=NsdName
  usage={dataOnly | metadataOnly | dataAndMetadata | descOnly}
  failureGroup=FailureGroup
  pool=StoragePool
  servers=ServerList
  device=DiskName


Storage pool: mmlsdisk

mmlsdisk gpfs1 -L

disk         driver   sector failure holds    holds                                    storage
name         type       size   group metadata data  status        availability disk id pool         remarks
------------ -------- ------ ------- -------- ----- ------------- ------------ ------- ------------ ---------
nsdb1_1      nsd         512       5 yes      yes   ready         up                 1 system       desc
nsdb1_2      nsd         512       2 yes      yes   ready         up                 2 system       desc
nsdb1_3      nsd         512       1 yes      yes   ready         up                 3 system       desc
nsdb1_4      nsd         512       2 yes      yes   ready         up                 4 system
nsdb2_1      nsd         512       1 yes      yes   ready         up                 5 system
nsdb2_2      nsd         512       2 yes      yes   ready         up                 6 system
nsdb2_3      nsd         512       1 yes      yes   ready         up                 7 system
nsdb2_4      nsd         512       2 yes      yes   ready         up                 8 system
nsdb3_2      nsd         512       1 no       yes   ready         up                 9 pool3

Number of quorum disks: 3
Read quorum value: 2
Write quorum value: 2


Storage pool: mmdf

# mmdf sunalpha

disk                disk size  failure holds    holds              free KB             free KB
name                    in KB    group metadata data        in full blocks        in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system
nsdb2_3             140095488        1 yes      yes       139563776 (100%)           288 ( 0%)
nsdb1_3             140095488        1 yes      yes       139389696 ( 99%)          2568 ( 0%)
nsdb2_1             140095488        1 yes      yes       139512576 (100%)          1448 ( 0%)
nsdb1_4             140095488        2 yes      yes       139489536 (100%)          2644 ( 0%)
nsdb2_2             140095488        2 yes      yes       139747328 (100%)          1044 ( 0%)
nsdb1_2             140095488        2 yes      yes       139490944 (100%)          2632 ( 0%)
nsdb2_4             140095488        2 yes      yes       139835904 (100%)           884 ( 0%)
nsdb1_1             140095488        5 yes      yes       139537536 (100%)          1388 ( 0%)
                -------------                         -------------------- -------------------
(pool total)       1120763904                            1116567296 (100%)         12896 ( 0%)

Disks in storage pool: pool3
nsdb3_2             140095488        1 no       yes       140093312 (100%)           124 ( 0%)
                -------------                         -------------------- -------------------
(pool total)        140095488                             140093312 (100%)           124 ( 0%)

                =============                         ==================== ===================
(data)             1260859392                            1256660608 (100%)         13020 ( 0%)
(metadata)         1120763904                            1116567296 (100%)         12896 ( 0%)
                =============                         ==================== ===================
(total)            1260859392                            1256660608 (100%)         13020 ( 0%)

Inode Information
-----------------
Number of used inodes:            4524
Number of free inodes:          542730
Number of allocated inodes:     547254
Maximum number of inodes:       547254


Storage pool: mmlsattr

mmlsattr -L /alpha/junk1.p3

file name: /alpha/junk1.p3

metadata replication: 2 max 2

data replication: 2 max 2

flags: exposed,illreplicated,unbalanced

storage pool name: pool3

fileset name: root

snapshot name:


Storage pool creation/deletion

• To create a new storage pool

– Define disks that name the new pool, using mmadddisk or mmcrfs

– Install a new policy file.

• To delete an existing storage pool

– Install a policy file that does not include the pool

– Change the storage pool attribute for all files assigned to the pool

– Migrate the data to a new pool (or delete the files)

– Delete the disks (which deletes the storage pool).
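
To illustrate the migration step above, one way to empty a pool before deleting its disks is a one-rule management policy run with mmapplypolicy. This is only a sketch: 'pool3' and 'system' are the pool names from the sample output earlier in this unit, and drain.pol is a hypothetical file name. Running mmapplypolicy gpfs1 -P drain.pol -I test first reports what would move without moving anything.

```
/* drain.pol (hypothetical): move every file out of pool3 so its disks can be deleted. */
RULE 'drain_pool3' MIGRATE FROM POOL 'pool3' TO POOL 'system'
```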


External storage pools

• A script-based interface for external applications.

• Can be used for custom applications

• Benefits to external applications

– Speed of Spectrum Scale policy engine

– Scalability of namespace

– High availability of Spectrum Scale

• Supported external applications

– IBM Tivoli Storage Manager HSM (TSM/HSM)

– LTFS

– High Performance Storage System (HPSS)
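
An external pool is declared in the policy file with an EXTERNAL POOL rule that names the interface script; after that it can be used as a migration target like any other pool. In the sketch below, the pool name 'hsm', the script path, the option string, and the 'silver' source pool are all placeholders, not shipped defaults.

```
/* Declare the external pool; EXEC names the interface script (placeholder path). */
RULE 'hsm_def' EXTERNAL POOL 'hsm' EXEC '/path/to/interface-script' OPTS '-v'

/* When the assumed pool 'silver' passes 85% full, move the least recently
   accessed files to the external pool until occupancy drops to 70%. */
RULE 'to_hsm' MIGRATE FROM POOL 'silver' THRESHOLD(85,70)
  WEIGHT(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME))
  TO POOL 'hsm'
```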


What are filesets?

• A fileset is a subtree of a file system namespace that provides a means of partitioning the file system to allow administrative operations at a finer granularity than the entire file system

– In many ways it behaves like an independent file system

– Used to define quotas on data blocks and inodes.

• A fileset has a root directory

– All files belonging to the fileset are accessible only via this root directory

– No hard links between filesets are allowed

– Renames are not allowed to cross fileset boundaries.

• Max of 10,000 total filesets.


Dependent and independent filesets

• Dependent fileset

– Shares inode space

– 10,000 dependent filesets per file system.

• Independent fileset

– Distinct inode space

– 1,000 independent filesets per file system.


Why filesets?

• Filesets play an important role in ILM

– Not directly tied to storage pools, although policies may tie storage pools to filesets.

• Added administrative control through fileset quotas

– Implement tree-based quotas

– Per-fileset quotas add a dimension to the existing user and group quotas.
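
The link between filesets and pools mentioned above is made in the policy, not in the fileset definition. A minimal sketch, assuming a fileset named 'projectA' and a pool named 'gold' (both hypothetical):

```
/* Files created inside the assumed fileset 'projectA' land in the assumed pool 'gold'. */
RULE 'projA' SET POOL 'gold' FOR FILESET ('projectA')

/* Everything else falls through to the system pool. */
RULE 'default' SET POOL 'system'
```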


Fileset properties

• The root fileset always exists.

• At creation a fileset is ‘unlinked’

– It is not visible in the directory space

– The sysadmin can then link the fileset to an arbitrary point within a file system

– mmlinkfileset is analogous to the file system mount operation.

• Once linked, a fileset can be populated by normal means, that is, by copying and creating files.

• Hard links are not allowed to cross fileset boundaries.


Fileset aging and demise

• Once linked into the file system:

– The fileset root directory looks like a normal directory, except that rmdir cannot remove it.

– A fileset root directory can be moved with mv (within the confines of the parent fileset). It can be unlinked and relinked under a different fileset.

• A fileset that is no longer needed can be unlinked, at which point it is unreachable but all of its files still exist. An unlinked fileset can then be deleted.


Fileset commands

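mmcrfileset – create a fileset

mmlinkfileset – link a fileset at a junction path

mmunlinkfileset – unlink a fileset

mmdelfileset – delete a fileset

mmlsfileset – list filesets and their status

mmchfileset – change fileset attributes

mmlsattr – display file attributes, including the fileset name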


Special Spectrum Scale file attributes

• These are Spectrum Scale-specific attributes of a file

– They do not affect POSIX-compliant file operations

– They provide additional information for use by Spectrum Scale and are accessible by other applications if needed (TSM, for example).

• Attributes are stored in a sparse file

– The i’th block of the xattr file contains the attributes for the file with inode number i.

• Current use of extended attributes includes:

– Storage Pool

– Fileset

– DMAPI

– direct-IO

– Custom extended attributes.


Policy-based management

• Two types of policies

– File placement

– File management.

• File placement policies

– Determine the initial storage pool for each file’s data; the data is striped across all disks in the selected pool

– Also determine the file’s replication factor.

• File management policies

– Determine when a file’s data should be migrated

– Determine where the data should go.


Policy rules

• Syntax is similar to the SQL-92 standard.

• You can have 1 MB of rule text.

• Rule order matters

– Rules are evaluated top to bottom

– Once a rule matches, processing ends for that file.

• You can use built-in functions. Examples:

– Date – CURRENT_TIMESTAMP, DAYOFWEEK(), DAY(), HOUR()

– String – LOWER(), UPPER(), LENGTH()

– Numeric – INT(), MOD()
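
The built-in functions listed above can be combined in rule conditions. The following sketch uses one function from each group; the pool name 'silver' and the conditions themselves are purely illustrative assumptions.

```
/* String function: match the extension regardless of case. */
RULE 'iso' SET POOL 'silver' WHERE LOWER(NAME) LIKE '%.iso'

/* Date functions: act only on Sundays (DAYOFWEEK returns 1 for Sunday) before 06:00. */
RULE 'weekly' WHEN (DAYOFWEEK(CURRENT_DATE) = 1 AND HOUR(CURRENT_TIMESTAMP) < 6)
  MIGRATE FROM POOL 'silver' TO POOL 'system'
  WHERE MOD(USER_ID, 2) = 0   /* numeric function, illustrative condition only */
```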


Rule syntax: Placement policy

• Syntax

RULE ['RuleName']
  SET POOL 'PoolName'
  [LIMIT (OccupancyPercentage)]
  [REPLICATE (DataReplication)]
  [FOR FILESET (FilesetName[,FilesetName]...)]
  [WHERE SqlExpression]

• Can be set on attributes you know about a file when it is created

– Name, location, user.
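
A short sketch of how the optional clauses in the syntax above fit together. The fileset 'scratch', the pool 'silver', and the user ID are assumptions made for illustration.

```
/* Scratch data: one copy, placed in 'silver' unless that pool is already 95% full.
   If the limit is exceeded, this rule is skipped and later rules are tried. */
RULE 'scratch' SET POOL 'silver' LIMIT(95) REPLICATE(1) FOR FILESET ('scratch')

/* Files created by an assumed user ID get two data replicas in the system pool. */
RULE 'critical' SET POOL 'system' REPLICATE(2) WHERE USER_ID = 1001

/* Default placement for everything else. */
RULE 'default' SET POOL 'system'
```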


File Management policy processing

• Batch process.

• Very efficient metadata scans.

• When a batch is executed there are 3 steps:

– Directory Scan

– Rule Evaluation

– File Operations.

• Can operate in parallel over multiple machines.


Rule syntax: Migration policy

• Syntax

RULE ['rule_name'] [WHEN time-boolean-expression]
  MIGRATE [FROM POOL 'pool_name_from'
    [THRESHOLD(high-occupancy-percentage[,low-occupancy-percentage])]]
  [WEIGHT(weight_expression)]
  TO POOL 'pool_name'
  [LIMIT(occupancy-percentage)]
  [REPLICATE(data-replication)]
  [FOR FILESET('fileset_name1', 'fileset_name2', ...)]
  [WHERE SQL_expression]

• Operates on existing files

– Allows more attributes in rules

• File size, last accessed time

• Can perform the following operations:

– Migration

– Deletes

– Change of replication status

– Reporting
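
As a sketch of two of the operations listed above, the rules below show a threshold-driven migration and an age-based deletion. Pool names 'gold' and 'silver' and the '.tmp' pattern are assumptions; the age test is written with ACCESS_TIME because that is the file attribute the policy language exposes.

```
/* Migration: when 'gold' reaches 90% full, move files (largest first, by WEIGHT)
   to 'silver' until occupancy drops to 70%. */
RULE 'cleangold' MIGRATE FROM POOL 'gold' THRESHOLD(90,70)
  WEIGHT(KB_ALLOCATED)
  TO POOL 'silver'

/* Deletion: remove temporary files that have not been accessed for 365 days. */
RULE 'purge_tmp' DELETE
  WHERE LOWER(NAME) LIKE '%.tmp'
    AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 365
```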


Policy set example

Placement Rules

RULE 'mpg0' SET POOL 'scsi' WHERE UPPER(NAME) LIKE '%.MPG'
RULE 'dbfiles' SET POOL 'premium' FOR FILESET ('db-fileset')
RULE 'devfiles' SET POOL 'normal' WHERE GROUP_ID = 1100

Migration Rules

RULE 'mpg30' WHEN (DAYOFWEEK(CURRENT_DATE) = 1) MIGRATE FROM POOL 'scsi'
  TO POOL 'sata'
  WHERE UPPER(NAME) LIKE '%.MPG' AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30

RULE 'mpg90' WHEN (DAYOFWEEK(CURRENT_DATE) = 7) MIGRATE FROM POOL 'sata'
  TO POOL 'tape'
  WHERE LOWER(NAME) LIKE '%.mpg' AND (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) > 90

Deletion Rule

RULE 'mpg999' WHEN (MONTH(CURRENT_DATE) = 12 AND DAYOFWEEK(CURRENT_DATE) = 1)
  DELETE FROM POOL 'tape' WHERE UPPER(NAME) LIKE '%.MPG'
  AND (DAYS(CURRENT_TIMESTAMP) - DAYS(CREATION_TIME)) > 999

Exclude Rule

RULE 'xclude1' EXCLUDE WHERE GROUP_ID = 1


Policy language example using macros

define(east_adjustment,
  CASE
    WHEN XATTR_FLOAT('user.e',1,-1,'DECIMAL') < 0
    THEN 180+(180+XATTR_FLOAT('user.e',1,-1,'DECIMAL'))
    ELSE XATTR_FLOAT('user.e',1,-1,'DECIMAL')
  END )

define(west_adjustment,
  CASE
    WHEN XATTR_FLOAT('user.w',1,-1,'DECIMAL') < 0
    THEN 180+(180+XATTR_FLOAT('user.w',1,-1,'DECIMAL'))
    ELSE XATTR_FLOAT('user.w',1,-1,'DECIMAL')
  END )

define(north_adjustment, 90+XATTR_FLOAT('user.n',1,-1,'DECIMAL'))

define(south_adjustment, 90+XATTR_FLOAT('user.s',1,-1,'DECIMAL'))

RULE 'listall' list 'geo_files'
  SHOW( varchar(kb_allocated) || ' ' || fileset_name )
  WHERE KB_ALLOCATED > 0
    AND FILESET_NAME='master_t1'
    AND south_adjustment <= 130.993664
    AND north_adjustment >= 126.994021
    AND east_adjustment >= 250.964755
    AND west_adjustment <= 257.946178
    AND DAYS(XATTR('user.t')) >= (DAYS(CURRENT_TIMESTAMP)-90)

Slide callouts: the macros manipulate the data, the policy rule calls the macros, and the rule queries custom file extended attributes.

Policy commands: Placement policies

• mmchpolicy Device PolicyFileName [-I {yes|test}]

– Sets the placement policy

– The policy file is read into memory and passed to the file system manager (sg mgr)

– Rules are validated

– The policy is stored in an internal file and recorded in the file system descriptor (sg desc)

– Rules are broadcast in a message to all nodes

• mmlspolicy Device [-L]

– Displays the current policy


Display installed policy using mmlspolicy

# mmlspolicy gpfs1
Policy file for file system '/dev/gpfs1':
  Installed by root@c35f1n01 on Wed May 30 12:27:01 2013.
  First line from original file policyRule was:
  rule 'p3' set pool 'pool3' where LOWER(NAME) like '%.p3'

# mmlspolicy gpfs1 -L
rule 'p3' set pool 'pool3' where LOWER(NAME) like '%.p3'
rule 'default' SET POOL 'system' /* when all else fails */


Invoking File Management policies

• Command is mmapplypolicy

• Drives Migration/Deletion Policy: What and when

– Invocation manually or via cron

– Runs on node on which invocation was made

– Multi-threaded

– File system must be mounted and in home cluster

– Usage

mmapplypolicy {Device|Directory} [-A IscanBuckets] [-a IscanThreads]
  [-B MaxFiles] [-D yyyy-mm-dd[@hh:mm[:ss]]] [-e] [-f FileListPrefix]
  [-g GlobalWorkDirectory] [-I {yes|defer|test|prepare}]
  [-i InputFileList] [-L n] [-M name=value...] [-m ThreadLevel]
  [-N {all | mount | Node[,Node...] | NodeFile | NodeClass}]
  [-n DirThreadLevel] [-P PolicyFile] [-q] [-r FileListPathname...]
  [-S SnapshotName] [-s LocalWorkDirectory]

– Some parameters

-I  Controls execution; -I test evaluates the policy without performing file operations
-g  Shared directory for temporary data
-m  Number of threads used for processing
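
mmapplypolicy can also be used for analysis only. The sketch below lists files larger than 1 GiB without moving anything; report.pol, the list name 'big', and the /tmp/report prefix are hypothetical. With an empty EXEC string and a command such as mmapplypolicy gpfs1 -P report.pol -I defer -f /tmp/report, the candidate list is expected to be written to a file under the -f prefix instead of being passed to a script.

```
/* report.pol (hypothetical): list large files, take no action on them. */
RULE 'biglist' EXTERNAL LIST 'big' EXEC ''

RULE 'bigfiles' LIST 'big'
  SHOW(VARCHAR(KB_ALLOCATED) || ' ' || POOL_NAME)
  WHERE KB_ALLOCATED > 1048576   /* more than 1 GiB allocated */
```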


DMAPI

• Data Management API

• Enable DMAPI at the file system level with the -z {yes|no} option of mmcrfs or mmchfs

• Requires a DMAPI listener to be running before the file system can be mounted

• Use for:

– Auto retrieval for offline data

– Custom applications


Review

• Spectrum Scale ILM tools implement business rules

• Filesets allow you to organize data

• Storage pools provide grouping of storage

• File Placement Policies assign data to pools on file creation

• File management policies automate migration, deletion, replication, and reporting


Exercise 3

Pools and Policies

Exercise


Unit summary

Having completed this unit, you should be able to describe:

• Information Lifecycle Management

• Storage pools

• File placement policies

• File management policies

• Filesets
