Computer Systems Modeling (CS-417) 1
CS-417: COMPUTER SYSTEMS MODELING
Performance Evaluation of High Parallel
Systems Architecture
S. Zaffar Qasim
Assistant Professor (CIS)
BE (CIS)
Spring Semester 2014
Evaluation of High Parallel Systems Architecture
• The system chosen here is more indicative of realistic systems,
where multiple servers are interconnected to serve several users.
• Here we essentially have the problem of allocating memory to
processors.
• A processor can have all of the memory, none of the memory, or
anything in between.
Fig 1: Multibank shared memory model
• Allocations are done in units of entire memory modules.
• That is, a CPU cannot share a memory module with another CPU
during a cycle.
• On each CPU cycle, each processor makes a memory request.
• If the requested memory module is free, the request is filled;
otherwise, the CPU must wait until the next cycle.
• When several processors request the same memory module,
only one is served (chosen at random from those requesting).
• New memory requests for each processor are chosen randomly
from the M memory modules using a uniform distribution.
• Let the system state be the number of memory requests for each
memory module:-
K = (k1, k2, k3, …, km)
where ki is the number of processor requests for memory bank i.
• At the start of a cycle every processor has exactly one outstanding
request, so the requests sum to the number of processors in the system, N:-
k1 + k2 + k3 + … + km = N
• The total number of possible states is the number of ways N
processor requests can be distributed to M memory modules:-
(N + M - 1)! / (N! (M - 1)!)
or, in other terms, the number of ways to allocate N indistinguishable balls to M cells.
• For N = 2 and M = 4 (see Fig 2), the possible ways to distribute the
requests of the two processors (indistinguishable from each other) over
the four memory modules are shown in Table 1.
Fig 2: Multiprocessor system with N = 2 and M = 4.
• The number of states listed in Table 1 is found by:
(N + M - 1)! / (N! (M - 1)!) = 5! / (2! 3!) = 10
Table 1
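The states for N = 2 and M = 4 can also be listed mechanically. The following is a minimal sketch, assuming a state is simply a tuple (k1, …, kM) of non-negative integers summing to N; the function name `enumerate_states` is illustrative and not part of the course material.

```python
def enumerate_states(n_requests, n_modules):
    """List every state (k1, ..., kM) with non-negative entries summing to n_requests."""
    if n_modules == 1:
        # Only one module left: it must hold all remaining requests.
        return [(n_requests,)]
    states = []
    for first in range(n_requests, -1, -1):
        # Fix the count for the first module, then distribute the rest recursively.
        for rest in enumerate_states(n_requests - first, n_modules - 1):
            states.append((first,) + rest)
    return states


states = enumerate_states(2, 4)
for s in states:
    print(s)
print(len(states), "states")
```

For N = 2 and M = 4 this lists ten tuples, in agreement with the count (N + M - 1)! / (N! (M - 1)!) = 10.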
• We can see that if the number of processors requesting
memory modules and the number of memory modules are
increased,
o the number of possible states grows very quickly,
o making this analysis difficult for even relatively small
problems, as shown in Table 2.
Table 2
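To get a feel for how fast the state space grows, the count can be evaluated directly. This is a small sketch using the balls-in-cells count; the choice of N = M values below is purely illustrative and is not taken from Table 2.

```python
from math import comb


def num_states(n_processors, n_modules):
    """Number of ways to distribute n_processors requests over n_modules modules."""
    return comb(n_processors + n_modules - 1, n_processors)


# Illustrative sizes only (assumed, not the entries of Table 2).
for n in (2, 4, 8, 16):
    print(f"N = M = {n:2d}: {num_states(n, n)} states")
```

Since the transition matrix has one row and one column per state, its size grows with the square of these counts.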
• Let H = (h1, h2, …, hm) represent the intermediate state, when
the memory accesses requested on a cycle have been filled and
the new requests have not yet been made. Each requested module
serves exactly one of its requests, so:
hi = max(0, ki - 1)
• Let G represent a new (feasible) system state:
G = (g1, g2, g3, …, gm)
• First, let's define:-
Properties
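1. If G is reachable from K in one cycle, the probability it will in
fact be the next state is given by:-
P(K → G) = x! / ((g1 - h1)! (g2 - h2)! … (gm - hm)!) · (1/M)^x
where x = (g1 - h1) + (g2 - h2) + … + (gm - hm) represents the number of new requests.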
2. The system can be described by a Markov chain, since the
next state probabilities at any time depend only on the
current state.
3. The system is aperiodic, since a one-step transition from a
state to itself is possible at any time.
4. The system is irreducible, since any state can be reached from any
other state in a finite number of steps.
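Property 1 can be sketched in code. This is a minimal illustration, assuming that each requested module serves exactly one request per cycle (so hi = max(0, ki - 1)) and that the x new requests are independent and uniform over the M modules, which makes the one-step probability multinomial. The function names are illustrative only.

```python
from math import factorial


def intermediate_state(k):
    """State H after each requested module has served one request (assumed rule)."""
    return tuple(max(0, ki - 1) for ki in k)


def transition_probability(k, g):
    """One-step probability of moving from state k to state g (multinomial sketch)."""
    m = len(k)
    h = intermediate_state(k)
    d = [gi - hi for gi, hi in zip(g, h)]  # new requests directed at each module
    if sum(g) != sum(k) or any(di < 0 for di in d):
        return 0.0  # g is not reachable from k in one cycle
    x = sum(d)  # number of new requests issued this cycle
    ways = factorial(x)
    for di in d:
        ways //= factorial(di)  # multinomial coefficient x! / (d1! ... dm!)
    return ways / m ** x


print(transition_probability((2, 0), (1, 1)))
print(transition_probability((1, 1), (2, 0)))
```

For the N = 2, M = 2 system, the rows of the resulting matrix sum to one, which is a quick sanity check on the assumed rule.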
Performance Assessment
• Since these conditions hold, there is an equilibrium state
probability distribution, Π, such that:-
Π = Π P
where P is the state transition matrix and
Π = (Π1, Π2, Π3, Π4, …, Πj)
• A performance assessment typically made for such system
configurations is to determine the effective processor
power of the system with N processors and M memories:
o EP(N, M) = the expected number of instructions executed
per second compared with an N = 1, M = 1 system.
• Let Proc(i) represent the number of memory requests
serviced (instructions executed) when the system is in state i. Then:-
EP(N, M) = Σi Proc(i) Πi
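The effective processor power is then a weighted sum over states. The sketch below assumes Proc(i) equals the number of memory modules with at least one pending request, since each such module serves exactly one request per cycle. The distribution used is the N = 2, M = 2 equilibrium that appears later in these notes; the function names are illustrative.

```python
def proc(state):
    """Requests served in one cycle: one per module with at least one request (assumed)."""
    return sum(1 for k in state if k > 0)


def effective_power(distribution):
    """Expected number of requests served per cycle under the given state distribution."""
    return sum(proc(state) * p for state, p in distribution.items())


# Equilibrium probabilities for N = 2, M = 2, keyed by state.
example = {(2, 0): 0.25, (1, 1): 0.5, (0, 2): 0.25}
print(effective_power(example))
```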
• For the simple case where N = 2 and M = 2, we have the
system illustrated in Fig 3.
Fig 3: Multiprocessor system with N = 2 and M = 2.
Fig 4: Probability state transition diagram.
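• The possible states of this model, representing the
memory modules requested by the two processors, are
(see Fig 4):-
(2,0), (1,1), (0,2)
• In state (2,0) only one of the two requests is served, and the served
processor's new request goes to either module with probability 1/2, so:-
P[(2,0) → (1,1)] = 1/2
which represents the probability of being in state (2,0) and
transitioning to state (1,1).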
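• Similarly, the probability of being in state (1,1) and transitioning
to state (2,0) would be found as:
P[(1,1) → (2,0)] = (1/2)(1/2) = 1/4
since both requests are served and both new requests must select module 1,
and so on.
• The balance equations for this Markov chain can be found
using the relationship:-
Flow In = Flow Out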
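The balance equations for the three-state chain can be solved exactly. In the sketch below, the transition matrix is derived from the model rules (one request served per requested module, new requests uniform over the two modules) and is an assumption in that sense. States are ordered (2,0), (1,1), (0,2), and the helper `stationary` is an illustrative name.

```python
from fractions import Fraction as F

# Rows and columns ordered as (2,0), (1,1), (0,2); entries derived from the model rules.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]


def stationary(P):
    """Solve pi = pi P with sum(pi) = 1 by Gauss-Jordan elimination over fractions."""
    n = len(P)
    # Balance equation for each state j < n-1: sum_i pi_i * (P[i][j] - [i == j]) = 0.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    # Replace the redundant last balance equation by the normalisation condition.
    A.append([F(1)] * n + [F(1)])
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


pi = stationary(P)
print([str(p) for p in pi])
```

Exact fractions are used so that the result is free of rounding error; for larger N and M a numerical linear-algebra routine would be the more practical choice.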
• The resulting effective processor power is computed using the
relationship:-
EP(2, 2) = 1·Π1 + 2·Π2 + 1·Π3 = 0.25 + 1.0 + 0.25 = 1.5
• Limitations: The model does not take into account memory interference
caused by I/O operations.
o It also assumes that the processors and memory are synchronized, as are
the memory access cycles.
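The analytical value EP(2, 2) = 1.5 can be cross-checked by simulating the cycle-by-cycle behaviour directly. This is a minimal Monte Carlo sketch under the same assumptions as the model (synchronous cycles, uniform requests, one request served per requested module per cycle); the function name, cycle count, and seed are illustrative choices.

```python
import random


def simulate_effective_power(n_proc, n_mem, cycles=200_000, seed=0):
    """Estimate the average number of requests served per cycle by simulation."""
    rng = random.Random(seed)
    # pending[p] is the memory module currently requested by processor p.
    pending = [rng.randrange(n_mem) for _ in range(n_proc)]
    served_total = 0
    for _ in range(cycles):
        # Group the processors by the module they are requesting this cycle.
        by_module = {}
        for p, m in enumerate(pending):
            by_module.setdefault(m, []).append(p)
        for procs in by_module.values():
            # Each requested module serves one requester, chosen at random.
            winner = rng.choice(procs)
            # The served processor issues a new uniformly chosen request.
            pending[winner] = rng.randrange(n_mem)
            served_total += 1
    return served_total / cycles


print(simulate_effective_power(2, 2))
```

The estimate should land close to 1.5 for N = 2 and M = 2, and the same function can be tried on larger configurations where the exact Markov chain becomes unwieldy.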
Evaluation of Parallel Systems Architecture Petri net Perspective
• Assumptions: There are
o np processors,
o nm shared memory modules, and
o nb data buses.
• Each of the processors has local memory, which
o gets used until a page miss occurs,
o at which point a new page is loaded into local memory from an external memory module.
• The time between page misses is exponentially distributed with rate λ (the miss rate).
• The access time to shared memory is also assumed to be exponentially distributed, with mean 1/µ.
• The model depicted contains two places per memory module,
o one place for processor tokens and one place for bus tokens, and
o one timed transition (for memory allocation and use).
• There are also two immediate transitions associated with
synchronizing and controlling the memory access.
• In total, we have nine places, four timed transitions, and six
immediate transitions.
Fig 5: Petri net model for multiprocessor system (np = 5, nm = 3, and nb = 2)
• Tokens in place P1 represent processors executing on their local
memory.
• Tokens in place P2 represent data buses available for use.
• An important assumption: every processor and memory module acts
in an identical manner.
• When a processor completes its local memory access (has a page
miss, resulting in the firing of transition t1) and requires more shared
memory resources, a token is moved from place P1 to place P3.
• A processor determines which memory it needs by firing the immediate
transition, t2, on the memory module it has chosen using a probabilistic
branch.
• Once t2 fires, a token is moved from place P3 to place P4.
• Once a token is in place P4, the processor is requesting access to a data bus.
• The processor acquires the memory desired, and then acquires a data bus to
retrieve the needed information.
• Once a processor has the bus, signaled by the firing of transition t3, and has
acquired the memory (indicated by the token in place P5), it begins
using the memory module, which is modeled by starting the timer on transition t4.
• Once the processor has finished using the bus, the tokens representing the
processor and the bus are routed back to their initial places, P1 and
P2, respectively.
• If we run this model with inputs similar to those applied to
the queuing model, we would find results that very closely match
the queuing model case.
• That is, we would find that the effective processor power
would be about 2.05 for the configuration as specified.
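The token flow described above can be played out as a small stochastic simulation. This is only a sketch under stated assumptions: every processor misses independently at rate `lam`, each memory access takes an exponential time with rate `mu`, a waiting request starts as soon as its module is idle and a bus is free, and modules are chosen uniformly. The rate values are hypothetical, so the output is not expected to reproduce the 2.05 figure, and the variable names only loosely follow the places P1, P2, P4, and P5.

```python
import random


def simulate(n_proc=5, n_mem=3, n_bus=2, lam=1.0, mu=1.0, horizon=20_000.0, seed=1):
    """Time-averaged number of processors running on local memory (tokens in P1)."""
    rng = random.Random(seed)
    p1, p2 = n_proc, n_bus          # processors computing locally, free buses
    p4 = [0] * n_mem                # requests waiting at each memory module
    p5 = [0] * n_mem                # 1 if the module is currently being accessed
    clock, area = 0.0, 0.0
    while clock < horizon:
        busy = [j for j in range(n_mem) if p5[j]]
        rate = p1 * lam + len(busy) * mu   # total rate of enabled timed transitions
        dt = rng.expovariate(rate)
        area += p1 * dt                    # accumulate processor-time spent in P1
        clock += dt
        if rng.random() * rate < p1 * lam:
            # t1 fires (page miss), then t2 picks a module uniformly (assumed).
            p1 -= 1
            p4[rng.randrange(n_mem)] += 1
        else:
            # t4 fires on a busy module: release the bus and return the processor.
            j = rng.choice(busy)
            p5[j] = 0
            p2 += 1
            p1 += 1
        # t3: start waiting requests where the module is idle and a bus is free.
        ready = [j for j in range(n_mem) if p4[j] and not p5[j]]
        rng.shuffle(ready)                 # random tie-break for the shared buses
        for j in ready:
            if p2 == 0:
                break
            p4[j] -= 1
            p2 -= 1
            p5[j] = 1
    return area / clock


print(simulate())
```

Because all delays are exponential, choosing the next event by its share of the total rate is equivalent to racing independent timers, which keeps the sketch short. Varying `lam` and `mu` shows how contention for modules and buses reduces the number of processors doing useful local work.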