Overview of Lustre
ECE, U of MN
Changjin Hong (Prof. Tewfik’s group)
[email protected]
Monday, Aug. 19, 2002
Outline
• Reference
• Lustre Cluster
• Lustre System Components
• Distributed Lock Manager
• Object Based Storage
• Conclusion (security issues)
Reference
• Lustre: A SAN File System for Linux
– http://www/lustre.org/docs/lustre/luswhite.pdf
• Several presentation materials from Dr. Peter J. Braam
A Lustre Cluster
[Figure: a Lustre cluster. Surviving scale labels: 10,000’s, 1,000’s, and 10’s of nodes]
Key Design Issue: Scalability
• I/O throughput
– How to avoid bottlenecks
• Metadata scalability
– How can 10,000’s of nodes work on files in the same folder
• Cluster recovery
– If something fails, how can transparent recovery happen
• Management
– Adding, removing, replacing systems; data migration & backup
System Components
Interaction between systems
[Figure: Client, MDS, and OST, with the following labels on the diagram]
• CMD protocol: (directory) metadata handling, inode updates, concurrency
• Pre-allocation, file creation, recovery purpose, file status
• OS protocol: file I/O, allocation of blocks, striping, security enforcement
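
The slide lists striping among the duties of the OS protocol but gives no layout. As a rough illustration only, the sketch below assumes plain round-robin (RAID-0 style) striping with a fixed stripe size. The function name `locate_stripe` and its parameters are hypothetical and not taken from Lustre.

```python
def locate_stripe(offset: int, stripe_size: int, stripe_count: int):
    """Map a file byte offset to (object index, offset inside that object).

    Assumption: stripes are laid out round-robin over `stripe_count`
    objects, each stripe being `stripe_size` bytes long.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; sizes and counts must be > 0")
    stripe_number = offset // stripe_size          # which stripe of the file
    object_index = stripe_number % stripe_count    # which object holds it
    full_rounds = stripe_number // stripe_count    # earlier stripes on that object
    object_offset = full_rounds * stripe_size + offset % stripe_size
    return object_index, object_offset


# With 4-byte stripes over 3 objects, byte 13 of the file falls in
# stripe 3, which wraps around to the first object.
example = locate_stripe(13, stripe_size=4, stripe_count=3)
```

Because consecutive stripes land on different objects, a large sequential read can be served by several OSTs at once, which is the usual motivation for striping.
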
Client File System
• A directory tree, subdivided into filesets for cluster-wide Unix file sharing semantics
• CMD protocol
– Transaction-based
– Authenticated access
– Write-behind caching for MD updates with strict data/metadata coherency
Metadata Service (MDS)
• All access to a file is governed by the MDS, which directly or indirectly authorizes access.
• Controls the namespace and manages inodes
• Load-balanced cluster service for scalability (a well-balanced API, a stackable framework for logical MDS, replicated MDS)
• Journaled, batched metadata updates
Object Storage Targets (OST)
• Keep file data objects
• File I/O service -> access to the objects
• Block allocation for data objects is done on the OST, leading to distribution and scalability
• OST s/w modules
– OBD server, Lock server
– Obj. storage driver, OBD filter
– Portal API
Distributed Lock Manager
VAXCluster DLM adapted
• A generic and rich lock service
• Lock resources: resource database
– Organize resources in trees
• High performance
– The node that acquires a resource manages its tree
Big Picture
[Figure: resource tree and namespace. A namespace (Name1 to Name4) maps to objects Obj.1 to Obj.4; resource managers (R) serve the Apps. through a distributed resource directory / hash function (LDWV) / lock directory]
Mechanism in resource dB
• Hash binary string % N -> get h
• Lookup system in lock directory weight vector [h] -> find system K
• Systems
– may occupy 0, 1 or more slots in LDWV
– Number of slots is lock directory weight
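
The lookup above can be sketched in a few lines. This is a minimal illustration, assuming SHA-1 as a stand-in hash (the slide does not name one) and hypothetical system names. `build_ldwv`, `resource_hash`, and `lookup_system` are illustrative helpers, not Lustre functions.

```python
import hashlib


def build_ldwv(weights):
    """Expand {system: weight} into a lock directory weight vector (LDWV).

    A system with weight w occupies w slots; weight 0 means no slots.
    """
    ldwv = []
    for system, weight in weights.items():
        ldwv.extend([system] * weight)
    return ldwv


def resource_hash(name: bytes) -> int:
    # Assumption: any stable hash of the binary string will do here.
    return int.from_bytes(hashlib.sha1(name).digest(), "big")


def lookup_system(name: bytes, ldwv):
    """Hash the resource name, reduce modulo N, and read slot h."""
    if not ldwv:
        raise ValueError("the weight vector has no slots")
    h = resource_hash(name) % len(ldwv)   # N is the number of slots
    return ldwv[h]


# Hypothetical cluster: node-a carries twice the directory load of
# node-b, and node-c holds no slots at all.
weights = {"node-a": 2, "node-b": 1, "node-c": 0}
ldwv = build_ldwv(weights)
owner = lookup_system(b"Name1", ldwv)
```

Every node that holds the same weight vector computes the same owner for a given name, so no central directory lookup is needed, and changing a system's weight changes its share of the resource directory.
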
Lustre DLM features
• Low concurrency
– Want write-back caching
• High concurrency
– Want load balancing in cluster
– Subdivide directories etc. with hashes
– Want the server of a request to limit lock revocations -> ops. on the MD cluster in a client-server RPC model
• Deadlock detection
Object Based Storage
• Object Based Storage Device
– More intelligent than block device
• Speak storage at “inode level”
– create, unlink, read, write, getattr, setattr…
– Iterators, security, almost arbitrary processing
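
To make the "inode level" interface concrete, here is a toy in-memory object store exposing the operations named on the slide. It is a sketch only: the class `ObjectStore`, its method signatures, and the `size` attribute are assumptions for illustration, not the OBD API.

```python
import itertools


class ObjectStore:
    """Toy object device: callers name objects, never disk blocks."""

    def __init__(self):
        self._data = {}    # object id -> bytearray with the object's bytes
        self._attrs = {}   # object id -> attribute dict
        self._ids = itertools.count(1)

    def create(self, object_id=None):
        """Create an object, optionally with a caller-chosen id."""
        if object_id is None:
            object_id = next(self._ids)
            while object_id in self._data:   # skip ids already taken
                object_id = next(self._ids)
        elif object_id in self._data:
            raise FileExistsError(object_id)
        self._data[object_id] = bytearray()
        self._attrs[object_id] = {"size": 0}
        return object_id

    def unlink(self, object_id):
        del self._data[object_id]
        del self._attrs[object_id]

    def write(self, object_id, offset, data):
        buf = self._data[object_id]
        end = offset + len(data)
        if end > len(buf):                   # grow, zero-filling any hole
            buf.extend(b"\0" * (end - len(buf)))
        buf[offset:end] = data
        self._attrs[object_id]["size"] = len(buf)

    def read(self, object_id, offset, length):
        return bytes(self._data[object_id][offset:offset + length])

    def getattr(self, object_id):
        return dict(self._attrs[object_id])

    def setattr(self, object_id, **attrs):
        self._attrs[object_id].update(attrs)


store = ObjectStore()
oid = store.create()
store.write(oid, 2, b"hi")
store.setattr(oid, mode=0o644)
```

The store decides where the bytes live, so block allocation stays inside the device. That is the property that separates an object device from a block device.
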
Components of OB Storage
• Storage Object Device Drivers
– Class drivers: attach driver to interface
– Targets, clients: remote access
– Direct drivers: to manage physical storage
– Logical drivers: for intelligence & storage management
• Object storage application (OSA)
– (Cluster) file systems
– Advanced storage: parallel I/O, snapshots
– Specialized apps.: caches, db’s, filesrv
System Interface
• Modules
– Load the kernel modules to get drivers of a certain type
– Name devices to be of a certain type
– Build stacks of devices with assigned types
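
The three steps above (load driver types, name devices of a type, stack them) can be mimicked in user space. The sketch below is an analogy under stated assumptions: the registry `DRIVER_TYPES`, the helper `attach`, and both driver classes are hypothetical, and a dict stands in for physical storage.

```python
class DirectDriver:
    """Bottom of a stack: manages the storage itself (here, a dict)."""

    def __init__(self):
        self._data = {}

    def write(self, object_id, data):
        self._data[object_id] = bytes(data)

    def read(self, object_id):
        return self._data[object_id]


class LoggingDriver:
    """A logical driver: adds behaviour, then forwards to the device below."""

    def __init__(self, lower):
        self.lower = lower
        self.log = []

    def write(self, object_id, data):
        self.log.append(("write", object_id))
        self.lower.write(object_id, data)

    def read(self, object_id):
        self.log.append(("read", object_id))
        return self.lower.read(object_id)


# "Loaded modules": the driver types currently available.
DRIVER_TYPES = {"direct": DirectDriver, "logging": LoggingDriver}
devices = {}


def attach(name, type_name, lower=None):
    """Name a device of a given type, optionally stacked on another device."""
    cls = DRIVER_TYPES[type_name]
    devices[name] = cls() if lower is None else cls(devices[lower])
    return devices[name]


attach("disk0", "direct")
top = attach("log0", "logging", lower="disk0")
top.write(1, b"payload")
```

Both drivers expose the same `read`/`write` interface, so any logical driver can sit on top of any other device without either side knowing what lies above or below it.
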
Layering of Object Drivers
Interaction of Obj. Storage s/w modules
Benefits-clustering/SM
• Suitable for use in a SAN file system
• Shared at the level of an individual block
• Obj namespace: divided into obj groups. It is very advantageous to be able to create obj w/ given obj id’s. Good for snapshot!
• Hot file migration
Conclusion
• Object Based Storage
– Processes disk operations at the higher-level concept of individual files and file inodes, rather than at the low-level h/w disk block level.
• Security Issues
– Auxiliary services in cluster: LDAP, PKI, Kerberos
– Purpose: CFS / MDS / OST authenticate to each other, set up session keys, etc.
• GSS-API for authentication and integrity checks
• Remote DMA
– Layer must NEVER bypass security processing
– Request processing so that authentication is checked by a higher-level layer in the networking stack
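
As a rough illustration of what a session key buys once two parties have authenticated, the sketch below tags each message with an HMAC and verifies it on receipt. This is not GSS-API: Python's standard `hmac` module stands in for the integrity check, and `sign` and `verify` are hypothetical helper names.

```python
import hashlib
import hmac
import secrets


def sign(session_key: bytes, message: bytes) -> bytes:
    """Return an integrity tag for `message` under the shared session key."""
    return hmac.new(session_key, message, hashlib.sha256).digest()


def verify(session_key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time, so timing does not leak the answer."""
    return hmac.compare_digest(sign(session_key, message), tag)


# Assumption: both ends already hold the same key, e.g. from a Kerberos
# exchange. Here it is simply generated locally for the demonstration.
session_key = secrets.token_bytes(32)
request = b"read object 42"
tag = sign(session_key, request)

accepted = verify(session_key, request, tag)
tampered = verify(session_key, b"read object 43", tag)
```

A receiver that checks the tag before acting on a request rejects anything altered in transit or sent by a party that lacks the session key.
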