www.ischool.drexel.edu
INFO 320 Server Technology I
Week 2
Server architectures
1INFO 320 week 2
OS and Server Architecture
• Last week we outlined the basic functions of an operating system
• Since an OS exists to serve as a connection between apps and the hardware, what kind of hardware is available and how it’s used are critical things to consider
• …So what is a server architecture?
Server Architecture
• Key issues in server architecture include
  – What is the extent of centralization or distribution of functions?
    • There are many possible answers, not just A or B
  – The main functions are managing data, performing processing (e.g. running apps), and determining how to display the results to the user
    • i.e. Who does what where in your system?
Server Architecture
• Other issues to keep in mind could include
  – Reliability
  – Availability
  – Security
  – Performance
Server Architecture
• So defining server architecture is a key step in the larger process of designing a network
• Once the architecture is set, we can work on details such as
  – How many of each server are needed?
  – How big are they (CPUs, RAM, storage)?
  – What kind of links are needed among them?
Centralization
• Centralizing all or some aspects of a system can be good
  – Take advantage of economies of scale
  – Easier to staff support people
  – Easier to control procurement
  – Easier to enforce programming and data structure standards
  – Easier to manage security
Centralization
• We can centralize computers, as was done with mainframes
• Centralization doesn’t necessarily apply to the entire system though
• We could centralize processing
  – Data processing, payroll, apps unique to a given department (CAD) might be centralized
• We could centralize data
  – Big database server(s)
Distributed data processing
• Distributed data processing (DDP) is a possible step away from centralization
  – Servers are distributed throughout the organization in order to meet operational, economic, and/or geographic needs
  – Could still have a larger central facility with satellite facilities, or all peer facilities
Distributed data processing
• DDP advantages include
  – Responsiveness to local needs
  – Higher availability, more redundancy to minimize impact of a single system failure
  – Resource sharing can still be done with expensive hardware
  – Incremental growth is easier
    • Avoids all or nothing upgrades
  – More user involvement, control, productivity
Distributed data processing
• DDP operating systems need
  – Good networking capability to exchange data
  – Ability to cluster machines for high availability and high performance
  – To manage processes across the distributed environment
Distributed processing overview
• We’ll look at critical technologies in distributed processing
  – Client/server computing
  – Distributed message passing
  – Remote procedure calls
  – Clusters
Client/server computing
• In a client/server environment, a client requests information from the servers
  – An API (Application Programming Interface), drivers, or other forms of middleware allow communication between them
• Clients present the information in a user-cuddly GUI format
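The request/response pattern above can be sketched with plain TCP sockets. This is a minimal illustration, not a production server: the function names `start_server` and `client_request` and the "server got:" reply format are made up for this example, and the server handles exactly one request.

```python
import socket
import threading


def start_server(host="127.0.0.1"):
    """Start a tiny one-shot server on an OS-chosen port; return (thread, port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0 asks the OS for any free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_one():
        # Accept a single client, answer its request, then shut down.
        conn, _addr = listener.accept()
        with conn:
            request = conn.recv(1024).decode("utf-8")
            conn.sendall(f"server got: {request}".encode("utf-8"))
        listener.close()

    thread = threading.Thread(target=serve_one, daemon=True)
    thread.start()
    return thread, port


def client_request(port, text, host="127.0.0.1"):
    """Client side: connect, send one request, return the server's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(text.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")
```

Here the socket library plays the role of the API between client and server; a real client would then render the reply in its GUI.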
Client/server computing
• Servers exist to provide shared services to clients
  – What kind of servers could we see?
• Also keep in mind the network connecting the clients and servers
  – Is it a LAN, WAN, the Internet, or ???
  – We need to be aware of the amount of traffic we expect the network to bear
Client/server characteristics
• A client/server architecture differs from other distributed processing in many ways
  – Strong emphasis on user-friendly apps for the user on their system
  – Often centralize database, network management, and utility functions to control overhead and support costs
  – Open and modular systems are increasingly common – mix products from various vendors
Client/server characteristics
  – Networking is critical, so a lot of attention goes to network management and security issues
• Client/server apps communicate directly, depending on the network protocols (TCP/IP) to make that possible
  – Even though the client and server often have different platforms and OS’s
  – Client/server apps look like Internet apps!
Client/server characteristics
Images from (Stallings, 2009)
Client/server database
• A common client/server app is to use a database server
• The DBMS resides on the server, and is called by the application logic
• Part of the app design challenge is to make sure the network isn’t overwhelmed by the data transfer expectations
Client/server database
Client/server database
• The first example is good use of client/server, since
  – The server has the job of sorting through one million records, at which a desktop system might cringe
  – The network doesn’t have to support moving the entire database across itself
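To make the traffic argument concrete, here is a rough sketch that uses an in-memory SQLite database as a stand-in for a database server. The `customers` table, its columns, and the function names are hypothetical; the point is only to compare how many rows would have to cross the network in each design.

```python
import sqlite3


def build_database(n_rows=10_000):
    """Create a throwaway in-memory table standing in for the server's data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT)")
    rows = (
        (i, "Philadelphia" if i % 1000 == 0 else "Elsewhere")
        for i in range(1, n_rows + 1)
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn


def server_side_filter(conn, city):
    """The DBMS does the searching; only matching rows are returned."""
    query = "SELECT id FROM customers WHERE city = ?"
    return conn.execute(query, (city,)).fetchall()


def client_side_filter(conn, city):
    """The client pulls every row, then filters locally."""
    all_rows = conn.execute("SELECT id, city FROM customers").fetchall()
    matches = [(row_id,) for row_id, row_city in all_rows if row_city == city]
    return matches, len(all_rows)  # second value: rows that were transferred
```

Both functions return the same matches, but the second one moves the whole table to get them, which is the situation the slide warns about.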
Client/server classes
• Four classes of client/server (C/S) apps
  – Host-based processing, much like a mainframe & dumb terminal, is not really C/S
  – Server-based processing, the most server-heavy class of C/S processing
  – Cooperative processing, where application processing is split between client and server to use the strengths of each
  – Client-based processing, where nearly all application processing runs on the client
Client/server classes
(b) is a “thin” client app; (c) and (d) are “fat” client apps
Three-tier client/server architecture
• In three-tier C/S, we now have a client, a middle tier server, and a backend server
  – The client is typically a thin client
  – The middle tier is often an application server
    • It acts as a server to the client, and as a client to the backend server
  – The backend server is often one or more database servers
    • The app server chooses which one is needed
File consistency
• Clients and servers often cache files which are frequently used
• When a file or database record is being changed, the cache can be inconsistent with the correct version
• This is often addressed by locking files or records, so the level at which data is locked can be a key performance issue
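A small sketch of record-level locking with a caching client. Everything here is illustrative: `RecordStore` and `CachingClient` are made-up names, and the version counter used to detect a stale cache is one simple assumption about how consistency could be checked, not something the slides prescribe.

```python
import threading


class RecordStore:
    """Server-side records, each guarded by its own lock (record-level locking)."""

    def __init__(self, records):
        self._records = dict(records)
        self._versions = {key: 0 for key in self._records}
        self._locks = {key: threading.Lock() for key in self._records}

    def read(self, key):
        with self._locks[key]:
            return self._records[key], self._versions[key]

    def update(self, key, func):
        # Only this one record is locked; other records stay available.
        with self._locks[key]:
            self._records[key] = func(self._records[key])
            self._versions[key] += 1

    def version(self, key):
        with self._locks[key]:
            return self._versions[key]


class CachingClient:
    """Client that caches records and revalidates them by version number."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, key):
        cached = self._cache.get(key)
        if cached is not None and cached[1] == self._store.version(key):
            return cached[0]  # cached copy still matches the server
        value, version = self._store.read(key)  # stale or missing: refetch
        self._cache[key] = (value, version)
        return value
```

Locking per record rather than per file lets unrelated updates proceed in parallel, which is why lock granularity shows up as a performance issue.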
What is middleware?
Middleware
• Development of C/S apps has exceeded anyone’s ability to make standardized application support tools
• APIs and other programming interfaces help address this, and are generically known as middleware
  – ‘Common definitions are that middleware is the "glue" between software components or between software and the network or it is the slash in Client/Server.’ (from “What is Middleware?”, see References)
Middleware
Middleware
• Middleware describes software that connects two or more software applications so they can exchange data
• There are many types of middleware, hence the confusion
  – Message Oriented Middleware, Object Middleware, RPC Middleware, Database Middleware, Transaction Middleware, Portals
Distributed message passing
• Within one computer, processes can coordinate through shared-memory techniques such as semaphores
• In distributed systems, processes are on different systems that share no memory, so that isn’t possible; they exchange messages instead
  – One issue is message reliability (did it get there?)
  – Can processing continue before getting a response? (if so, called nonblocking or asynchronous)
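The blocking versus nonblocking distinction can be sketched as follows. This is only an analogy on one machine: a thread with two queues stands in for a process on another system, and the names `remote_worker`, `blocking_exchange`, and `nonblocking_exchange` are invented for the example.

```python
import queue
import threading
import time


def remote_worker(inbox, outbox):
    """Stand-in for a process on another machine: replies after a delay."""
    message = inbox.get()
    time.sleep(0.1)  # pretend network and processing latency
    outbox.put(message.upper())


def _send(message):
    """Deliver a message to a fresh worker and return its reply queue."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=remote_worker, args=(inbox, outbox), daemon=True)
    worker.start()
    inbox.put(message)
    return outbox


def blocking_exchange(message):
    """Synchronous style: the caller does nothing until the reply arrives."""
    outbox = _send(message)
    return outbox.get(timeout=5)


def nonblocking_exchange(message):
    """Asynchronous style: the caller keeps working and polls for the reply."""
    outbox = _send(message)
    units_of_other_work = 0
    while True:
        try:
            return outbox.get_nowait(), units_of_other_work
        except queue.Empty:
            units_of_other_work += 1  # do something useful in the meantime
            time.sleep(0.01)
```

The `timeout` on the blocking receive is a nod to the reliability question: without it, a lost message would leave the caller waiting forever.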
Distributed message passing
(Slide 29: figure not reproduced)
Distributed message passing
Remote procedure calls
• Remote procedure calls (RPCs) allow distributed systems to communicate as though they were on the same machine
  – A remote interface can have named operations with specific types
    • Allows clearly defined documentation and static error checking
  – Helps generate code automatically, and port code to different platforms and OS’s
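As one concrete flavor of RPC, Python's standard library ships an XML-RPC implementation. The sketch below is just an illustration of the idea (the `add` operation and the helper names are made up); the slides do not tie RPC to any particular protocol.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """The procedure that will be exposed to remote callers."""
    return a + b


def start_rpc_server():
    """Serve `add` over XML-RPC on an OS-chosen local port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")  # a named remote operation
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_remote_add(server, a, b):
    """Client side: the remote call reads like an ordinary method call."""
    host, port = server.server_address
    with ServerProxy(f"http://{host}:{port}/") as proxy:
        return proxy.add(a, b)  # arguments are marshalled and sent by value
```

The call `proxy.add(a, b)` looks local, but the arguments are serialized, sent over HTTP, and executed in the server's thread, and the call blocks until the result comes back.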
Remote procedure calls
This expands on image (b) on slide 29.
Remote procedure calls
• Issues with using RPC include
  – Passing parameters by value or pointer
  – Representation of parameters (int, float, $, …)
  – Client/server binding
    • Nonpersistent (always make new connection)
    • Persistent (keep the same binding until it expires)
  – Asynchronous (let other processes continue) or synchronous (block everything until done)
  – Object-oriented RPC (see OLE or CORBA)
SMP
• In order to get lots of computational power, symmetric multiprocessing (SMP) was the first option
  – SMP has multiple processors
  – They share main memory (RAM) and I/O
  – They are connected by a bus
  – All processors are the same type (hence the ‘symmetric’ part)
Clustering
• As the need for more computational power grew, clustering was developed
  – What kind of problems need massive CPU power?
• A cluster is a group of interconnected standalone computers working together as one
  – Each computer in a cluster is a node
Clustering
• Clustering has several benefits
  – Absolute scalability – can keep adding more systems to get as much power as you can afford
  – Incremental scalability – you can add a little more power as well, avoiding complex upgrade paths
  – High availability – lots of separate computers means if one fails it’s not a big deal
Clustering
  – Superior price/performance since cheap computers can be clustered
• Clusters can be classified based on whether they share hard disks (among other ways)
  – In the first approach, each standby server has separate disks, and they communicate via a high speed link
  – In the second approach, they share a RAID array
Clustering
(Slide 38: figure not reproduced)
Clustering
• A better approach for cluster classification is by functionality
  – Passive Standby
  – Active secondary
  – Separate servers
  – Servers connected to disks
  – Servers share disks
Clustering
• Passive Standby
  – A second server takes over if the primary fails
  – Easy to implement
  – Wastes second server since it’s mostly unused
  – Doesn’t improve performance over a single server
  – Often not considered a true cluster
Clustering
• Active secondary
  – The second server is also used for processing tasks
  – Cheaper since second server is now used
  – Increased complexity
Clustering
• Separate servers (is (a) on slide 38)
  – Servers have separate disks
  – Data is copied from primary to second server
  – Gives high availability and high performance
  – High network and server overhead due to copying
Clustering
• Servers connected to disks
  – Also called the shared nothing approach
  – Servers are connected to a set of disks, but each server has its own disks in that set
  – Reduces need for copying among servers
  – Often needs mirroring or RAID in case of disk failure
  – Windows Cluster Server is an example
Clustering
• Servers share disks (is (b) on slide 38)
  – Multiple servers share a set of disks
  – Low network and server overhead
  – Reduced chance of disk failure
  – Requires lock manager software, plus mirroring and/or RAID
Clustering and the OS
• Clustering produces interesting OS problems
  – Failure management
    • Either a high availability approach or a fault tolerant approach can be used
    • The latter is better at handling partial transactions if a system fails
    • Failover is the function of handing off an app and its data when there’s a failure; the opposite is failback
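Failover and failback can be sketched in a few lines. `Node` and `FailoverPair` are hypothetical classes for illustration only; a real cluster would also have to hand over the app's data and detect failures with heartbeats rather than a flag.

```python
class Node:
    """A server that can be marked up or down for the demo."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class FailoverPair:
    """Primary plus standby: requests go to whichever node is active."""

    def __init__(self, primary, standby):
        self.primary = primary
        self.standby = standby
        self.active = primary

    def handle(self, request):
        try:
            return self.active.handle(request)
        except ConnectionError:
            # Failover: hand the work to the other node and retry once.
            self.active = self.standby if self.active is self.primary else self.primary
            return self.active.handle(request)

    def failback(self):
        # Failback: return to the primary once it has recovered.
        if self.primary.alive:
            self.active = self.primary
```

For example, after `primary.alive = False` the next `pair.handle(...)` is answered by the standby, and `pair.failback()` restores the primary once it is marked alive again.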
Clustering and the OS
  – Load balancing
    • How do you balance how much work each system is performing?
    • A load-balancing facility must handle this and schedule tasks accordingly
  – Parallelized computation
    • How is the application run on multiple systems?
      – Could have a parallelizing compiler
      – A parallelized application is written to run on a cluster
      – Parametric computing tools can be used for simulations that require a lot of similar runs with different conditions
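One simple load-balancing policy is "send the next task to the least-loaded node". The sketch below assumes each task has a known cost estimate, which is a simplification; `assign_tasks` is a made-up helper, not part of any cluster product.

```python
import heapq


def assign_tasks(task_costs, node_names):
    """Greedy least-loaded scheduling.

    task_costs: estimated cost of each task, in arrival order.
    Returns a dict mapping node name -> list of task indexes placed there.
    """
    # Heap of (current_load, node_name); the smallest load is always on top.
    heap = [(0, name) for name in node_names]
    heapq.heapify(heap)
    placement = {name: [] for name in node_names}

    for task_id, cost in enumerate(task_costs):
        load, name = heapq.heappop(heap)  # least-loaded node right now
        placement[name].append(task_id)
        heapq.heappush(heap, (load + cost, name))
    return placement


if __name__ == "__main__":
    print(assign_tasks([5, 3, 2, 2], ["a", "b"]))
    # → {'a': [0, 3], 'b': [1, 2]}
```

Ties are broken by node name here; a real facility would measure actual load instead of trusting estimates.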
Clustering architecture
• A cluster presents itself to the user as a single system, the single-system image
  – This is possible thanks to the clustering middleware
  – The middleware also may perform load balancing and respond to system failures
Clustering architecture
Clustering architecture
• The single-system image provides
  – Single entry point
    • The user logs into the cluster, not a machine
  – Single file hierarchy
    • The user sees files in a single file structure
  – Single control point
    • There is a default node used to manage the cluster
  – Single virtual networking
    • Any node can access the rest of the cluster
Clustering architecture
  – Single memory space
    • Distributed shared memory allows programs to share variables
  – Single job management system
    • A user can commit a job to run without specifying where it runs (which node)
  – Single user interface
    • The same GUI supports users regardless of where they log into the cluster
Clustering architecture
• To improve availability, the OS allows
  – Single I/O space
    • Any node can access any I/O peripheral or disk device no matter where it is
  – Single process space
    • A uniform process identification scheme is used
  – Checkpointing
    • Saves process state and data in case of failure
  – Process migration
    • Enables load balancing
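Checkpointing can be illustrated at the application level: save the state after each step so a restarted job resumes instead of starting over. This is only a sketch under that assumption; OS-level checkpointing saves whole process images, and `run_with_checkpoints` with its `fail_at` switch is invented for the demo.

```python
import os
import pickle


def run_with_checkpoints(total_steps, path, fail_at=None):
    """Sum the step numbers 0..total_steps-1, checkpointing after every step.

    If a checkpoint file exists at `path`, resume from it.
    `fail_at` simulates a node failure just before that step runs.
    """
    if os.path.exists(path):
        with open(path, "rb") as fh:
            state = pickle.load(fh)  # resume from the saved state
    else:
        state = {"step": 0, "total": 0}

    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        with open(path, "wb") as fh:
            pickle.dump(state, fh)  # save process state and data
    return state["total"]
```

If a run fails partway and is then restarted with the same `path` (possibly on another node that can see the file), it continues from the last saved step, which is also what makes process migration practical.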
SMP versus clustering
• SMP is a more mature technology, and is easier to manage and configure than a cluster
  – SMP takes less space and power
• Clusters win when scalability, either absolute or incremental, is critical
  – Availability for clusters is also higher
Clustering examples
• Windows Cluster Server is a shared-nothing approach
• Sun Cluster is an object-oriented approach using CORBA
  – The object framework handles calls to other nodes
  – A virtual node (vnode) file system is used
Beowulf
• Beowulf (no, not that Beowulf) is one of the oldest clustering approaches, started in 1994 using clustered PCs
  – Most Beowulf clusters use Linux systems, connected by Ethernet (LAN) or via TCP/IP
• Each node runs an autonomous Linux kernel, yet participates in global namespaces
Beowulf
• Key pieces of Beowulf software are
  – BPROC, the distributed process space package, which allows a process to span multiple nodes and can allow a new process to be created on other nodes
  – Ethernet Channel Bonding, which joins multiple local networks into one high speed network and does load balancing
Beowulf
– Pvmsync is a programming environment which helps perform synchronization and shares data objects among processes
  – EnFuzion is a set of tools for parametric computing: creating a lot of jobs with different input parameters or initial conditions
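The idea of parametric computing (not EnFuzion's actual interface, which is not shown here) can be sketched by expanding a parameter grid into one job per combination. The parameter names `rate` and `years` and the toy `simulate` function are hypothetical.

```python
import itertools


def make_jobs(parameter_grid):
    """Yield one job (a dict of parameter values) per combination in the grid."""
    names = sorted(parameter_grid)
    value_lists = (parameter_grid[name] for name in names)
    for values in itertools.product(*value_lists):
        yield dict(zip(names, values))


def simulate(job):
    """Toy stand-in for a real simulation run."""
    return job["rate"] * job["years"]


def run_all(parameter_grid):
    """Run every job; on a cluster each job could go to a different node."""
    return [(job, simulate(job)) for job in make_jobs(parameter_grid)]
```

Because the jobs are independent of each other, they parallelize trivially, which is why this style of workload suits clusters so well.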
References
• Operating Systems: Internals and Design Principles, by William Stallings, 6th Ed, Pearson/Prentice Hall 2009. ISBN 0136006329
  – His web site
• What is Middleware? http://www.middleware.org/whatis.html