2016/1/5Part I1 Models of Parallel Processing. 2016/1/5Part I2 Parallel processors come in many...

112/04/21 Part I 1

Models of Parallel Processing


• Parallel processors come in many different varieties.

• Thus, we often deal with abstract models of real machines.


Development of Early Models (1)

• Associative processing (AP) was perhaps the earliest form of parallel processing.

– Associative or content-addressable memories (AMs, CAMs) allow memory cells to be accessed based on their contents rather than their physical locations within the memory array.

– AM/AP architectures are essentially based on incorporating simple processing logic into the memory array so as to remove the need for transferring large volumes of data through the limited-bandwidth interface between the memory and the processor (the von Neumann bottleneck).
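
To make content-based access concrete, here is a minimal software sketch of an associative search. The function name `cam_search` and the use of a bit mask to select which bits take part in the comparison are illustrative assumptions, not part of the slides. In an actual associative memory every cell would perform the comparison at the same time; the loop below only simulates that behavior.

```python
# Hypothetical sketch of a content-addressable (associative) memory search.
# Names and the masked-compare interface are assumptions for illustration.

def cam_search(cells, key, mask):
    """Return indices of all cells whose masked bits equal the masked key.

    A hardware associative memory compares all cells simultaneously;
    this list comprehension merely simulates that parallel comparison.
    """
    return [i for i, word in enumerate(cells) if (word & mask) == (key & mask)]


cells = [0b1010, 0b0110, 0b1011, 0b0010]

# Search on the two low-order bits only: find every word ending in binary 10.
matches = cam_search(cells, key=0b0010, mask=0b0011)
print(matches)  # prints [0, 1, 3]
```

The caller supplies a value rather than an address, and the result is the set of locations that hold matching contents.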


• The AM/AP model has evolved through the incorporation of additional capabilities, so that it is in essence converging with SIMD-type array processors.


• Neural networks

• Cellular automata


SIMD Vs. MIMD (1)

• Most early parallel machines had SIMD designs.

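• Within the SIMD category, two fundamental design choices exist:

– Synchronous versus asynchronous SIMD

• Synchronous SIMD handles conditional computations inefficiently, because processors whose condition is not met must sit idle. A possible cure is to use the asynchronous version of SIMD, known as SPMD (single-program, multiple-data).

– Custom- versus commodity-chip SIMD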


SIMD Vs. MIMD (2)

• In the 1990s, the MIMD paradigm became more popular.

• MIMD machines are most effective for medium- to coarse-grain parallel applications, where the computation is divided into relatively large subcomputations or tasks whose executions are assigned to the various processors.


SIMD Vs. MIMD (3)

• Within the MIMD class, three fundamental issues or design choices are subjects of ongoing debate in the research community:

– MPP: massively or moderately parallel processor

• Is it more cost-effective to build a parallel processor out of a relatively small number of powerful processors or a massive number of very simple processors?

– Tightly versus loosely coupled MIMD

• Network of workstations (NOW), cluster computing, grid computing

– Explicit message passing versus virtual shared memory


Global Vs. Distributed Memory (1)

• Within the MIMD class of parallel processors, memory can be global or distributed.

• Global memory may be visualized as being in a central location where all processors can access it with equal ease.

• Memory latency-hiding techniques must be employed. An example of such methods is the use of multithreading.


• Examples for both the processor-to-memory and processor-to-processor networks include:

• An abstract model of global-memory computers is known as PRAM.

• One approach to reducing the amount of data that must pass through the processor-to-memory interconnection network is to use a private cache memory for each processor. This relies on locality of data access and introduces the cache coherence problem.


• Distributed-memory architectures can be conceptually viewed as in Fig. 4.5.

• In addition to the types of interconnection networks enumerated for shared-memory parallel processors, distributed-memory MIMD architectures can also be interconnected by a variety of direct networks. Distributed-memory machines are sometimes described as nonuniform memory access (NUMA) architectures.


PRAM Shared-Memory Model (1)

• The theoretical model used for conventional or sequential computers (SISD class) is known as the random-access machine (RAM).

• The parallel version of RAM (PRAM) constitutes an abstract model of the class of global-memory parallel processors. The abstraction consists of ignoring the details of the processor-to-memory interconnection network and taking the view that each processor can access any memory location in each machine cycle, independent of what other processors are doing.


• In the formal PRAM model, a single processor is assumed to be active initially. In each computation step, each active processor can read from and write into the shared memory and can also activate another processor.

• Even though the global-memory architecture was introduced as a subclass of the MIMD class, the abstract PRAM model depicted in Fig. 4.6 can be SIMD or MIMD.
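
The activation rule of the formal PRAM model (one processor active at the start, and each active processor able to activate another in every step) means the number of active processors can at most double per step. The sketch below counts the steps needed to reach p active processors under that doubling assumption. The function name `activation_steps` is hypothetical and used only for illustration.

```python
def activation_steps(p):
    """Count steps until p processors are active, starting from one.

    Assumption: in each step, every active processor activates exactly
    one new processor, so the active count doubles (capped at p).
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    active, steps = 1, 0
    while active < p:
        active = min(p, 2 * active)
        steps += 1
    return steps


for p in (1, 2, 8, 1000):
    print(p, activation_steps(p))
# prints:
# 1 0
# 2 1
# 8 3
# 1000 10
```

Under this assumption the count equals the ceiling of log2 p, so starting up p processors costs a logarithmic number of steps.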


• This implies that each instruction cycle would have to consume Ω(log p) real time.

• The above point is important when we try to compare PRAM algorithms with those for distributed-memory models. An O(log p)-step PRAM algorithm may not be faster than an O(log² p)-step algorithm for a hypercube architecture.
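
A rough numerical sketch of this comparison follows. It assumes, purely for illustration, that all hidden constants equal 1, that a PRAM cycle costs exactly log2 p time units, and that a hypercube step costs one time unit. None of these constants come from the slides, and the function names are hypothetical.

```python
import math


def pram_real_time(p, cycle_constant=1.0):
    """Real time of a (log2 p)-step PRAM algorithm.

    Assumption: each PRAM cycle costs cycle_constant * log2(p) time units,
    matching the logarithmic lower bound on the cycle time.
    """
    steps = math.log2(p)
    cycle_time = cycle_constant * math.log2(p)
    return steps * cycle_time


def hypercube_real_time(p, step_time=1.0):
    """Real time of a (log2 p)**2-step hypercube algorithm.

    Assumption: each hypercube step costs a constant step_time.
    """
    return step_time * math.log2(p) ** 2


p = 1024
print(pram_real_time(p), hypercube_real_time(p))  # prints 100.0 100.0
```

With these assumed constants the two running times coincide, which is why step counts alone do not settle which algorithm is faster.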


Distributed-Memory or Graph Models (1)

• Given the internal processor and memory structures in each node, a distributed-memory architecture is characterized primarily by the network used to interconnect the nodes.

• This network is usually represented as a graph.

• Important parameters of an interconnection network include:

– Network diameter: the longest of the shortest paths between various pairs of nodes

– Bisection (band)width: the smallest number (total capacity) of links that need to be cut in order to divide the network into two subnetworks of half the size

– Vertex or node degree: the number of communication ports required of each node
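
The three parameters above can be computed directly for small networks. The following sketch does so for a 3-dimensional hypercube and an 8-node ring, both chosen here only as examples. All function names are hypothetical, the graphs are assumed to be connected with unit-capacity links, and the bisection width is found by brute force, which is practical only for a handful of nodes.

```python
from collections import deque
from itertools import combinations


def hypercube(d):
    """Adjacency lists of a d-dimensional hypercube with 2**d nodes."""
    return {v: [v ^ (1 << k) for k in range(d)] for v in range(2 ** d)}


def ring(n):
    """Adjacency lists of an n-node ring (assumes n >= 3)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}


def eccentricity(graph, source):
    """Longest shortest-path distance from source, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def diameter(graph):
    """Longest of the shortest paths over all node pairs."""
    return max(eccentricity(graph, v) for v in graph)


def node_degree(graph):
    """Largest number of links (ports) at any node."""
    return max(len(neighbors) for neighbors in graph.values())


def bisection_width(graph):
    """Fewest links cut by any split into two halves (brute force)."""
    nodes = sorted(graph)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        half = set(half)
        cut = sum(1 for u in half for w in graph[u] if w not in half)
        if best is None or cut < best:
            best = cut
    return best


for name, g in (("hypercube", hypercube(3)), ("ring", ring(8))):
    print(name, diameter(g), bisection_width(g), node_degree(g))
# prints:
# hypercube 3 4 3
# ring 4 2 2
```

For the same eight nodes, the hypercube has a smaller diameter and a larger bisection width than the ring, at the cost of one more port per node.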


Distributed-Memory or Graph Models (2)

• Even though the distributed-memory architecture was introduced as a subclass of the MIMD class, machines based on networks of the type shown in Fig. 4.8 can be SIMD- or MIMD-type.

• Schemes such as the one in Fig. 4.9 are available for reducing bus traffic by taking advantage of the locality of communication within small clusters of processors.
