SOFTWARE EXPLOITS OF INSTRUCTION-LEVEL PARALLELISM FOR SUPERCOMPUTERS

  • 7/30/2019 SOFTWARE EXPLOITS OF INSTRUCTION-LEVEL PARALLELISM FOR SUPERCOMPUTERS

    SOFTWARE EXPLOITS OF INSTRUCTION-LEVEL PARALLELISM FOR SUPERCOMPUTERS

    1S. N. TAZI, 2PRAKASH MEENA, 3ISHITA SHARMA, 4A. K. DUBEY & 5NEETU SHARMA

    1, 2, 3M.Tech Scholar - Computer Engineering, Govt. Engineering College, Ajmer-305002, Rajasthan

    3M.Tech Scholar - Computer Engineering, Govt. Women Engineering College, Ajmer-305002, Rajasthan

    4IEEE Member, India

    5Govt. Engineering College, Ajmer-305002, Rajasthan, India

    ABSTRACT

    For decades, hardware algorithms have dominated the field of parallel processing. With Moore's law reaching its limit, however, the need for software pipelining is increasingly being felt. This area has long eluded researchers, although a significant measure of success has been obtained in graphics processing using software approaches to pipelining. This project aims at developing software to detect the various kinds of data dependencies (data flow dependency, anti-dependency and output dependency) in a basic code block. Graphs would be generated for the various kinds of dependencies present in a code block and combined to obtain a single data dependency graph. This graph would be further processed to obtain a transitive closure graph and finally an ILP graph. The ILP graph can be used to predict the possible combinations of instructions that may be executed in parallel. A scheduling algorithm would be developed to obtain an instruction schedule in the form of instruction execution start times. The schedule obtained would be used to compute performance metrics such as speed-up factor, efficiency and throughput.

    KEYWORDS: ILP, Dependencies, System Design, Agile, Performance Metrics

    INTRODUCTION

    Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously. A central goal of compiler and processor designers is to identify ILP and take advantage of as much of it as possible. Programs are commonly written under a sequential execution model, in which instructions execute one after another in the order specified by the programmer. ILP allows the compiler and the processor to overlap the execution of multiple instructions, or even to change the order in which instructions are executed [1]. To achieve high performance, supercomputers use both super-pipelining and EPIC (Explicitly Parallel Instruction Computing) processors. Of the two common approaches to exploiting ILP, hardware-based and software-based, this work takes the software-based approach.

    The amount of ILP that exists in a program depends strongly on the application. In fields such as graphics and scientific computing the amount of ILP can be very large, whereas workloads such as cryptography exhibit much less. Micro-architectural techniques used to exploit ILP include:

    • Instruction pipelining, in which the execution of multiple instructions is partially overlapped.
    • Superscalar execution, VLIW and the closely related concept of Explicitly Parallel Instruction Computing, in which multiple execution units are used to execute multiple instructions in parallel.
    • Out-of-order execution, in which instructions execute in any order that does not violate data dependencies. This technique is independent of both pipelining and superscalar execution. Current implementations of out-of-order execution extract ILP from ordinary programs dynamically, while the program is executing. An alternative is to extract this parallelism at compile time and convey the information to the hardware. Because of the complexity of scaling out-of-order execution, the industry has re-examined instruction sets that explicitly encode multiple independent operations per instruction.

    International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR), ISSN 2249-6831, Vol. 2, Issue 4, Dec 2012, 19-38, TJPRC Pvt. Ltd.

    • Register renaming, a technique used to avoid the unnecessary serialization of program operations that is imposed when those operations reuse registers.
    • Speculative execution, which allows complete instructions or parts of instructions to be executed before the target of a control flow instruction has been determined.
    • Branch prediction, which is used with speculative execution to avoid stalling while control dependencies are resolved. [1]

    Figure 1.1: A Canonical Five-Stage Pipeline in a RISC Machine (IF = Instruction Fetch, ID = Instruction Decode,

    EX = Execute, MEM = Memory Access, WB = Register Write Back) [2]

    DEPENDENCIES

    In computer science, a data dependency is a situation in which a program statement (or instruction) refers to the data of a preceding statement. In compiler theory, dependence analysis is the technique used to discover data dependencies among statements (or instructions). Two common types of dependencies are as follows:

    DATA DEPENDENCIES

    Assume two statements S1 and S2. S2 depends on S1 if:

    $[I(S_1) \cap O(S_2)] \cup [O(S_1) \cap I(S_2)] \cup [O(S_1) \cap O(S_2)] \neq \varnothing$

    where,

    $I(S_i)$ represents the set of memory locations read by $S_i$,

    $O(S_j)$ represents the set of memory locations written by $S_j$,

    and there is a feasible run-time execution path from S1 to S2.

    This condition is called the Bernstein condition, named after A. J. Bernstein.

    Three cases exist:

    • True (data) dependence: $O(S_1) \cap I(S_2) \neq \varnothing$, S1 -> S2, and S1 writes something read by S2.
    • Anti-dependence: $I(S_1) \cap O(S_2) \neq \varnothing$, the mirror relationship of true dependence.
    • Output dependence: $O(S_1) \cap O(S_2) \neq \varnothing$, S1 -> S2, and both write to the same memory location.

    True Dependencies

    A true dependency, also known as a data dependency, occurs when the current instruction depends on the result of a previous instruction.

    Anti-dependence

    An anti-dependency occurs when an instruction requires a value that is updated by a later instruction.

    Output Dependencies

    An output dependency occurs when the ordering of instructions affects the final output value of a variable.

    A commonly used convention for the data dependencies is the following:

    • Read-after-Write (true dependence)
    • Write-after-Write (output dependence)
    • Write-after-Read (anti-dependency)
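    The three cases above can be checked mechanically once the read and write sets of two statements are known. The following is a minimal Python sketch, not the authors' Java implementation; the function name `classify_dependence` and the example register names are assumptions made for illustration, and the check is purely pairwise (it does not account for intervening writes).

```python
def classify_dependence(reads1, writes1, reads2, writes2):
    """Classify the dependence of S2 on S1, assuming S1 precedes S2.

    Each argument is a set of location names (registers or addresses).
    Returns a list drawn from "RAW", "WAR", "WAW"; empty means independent.
    """
    kinds = []
    if writes1 & reads2:    # O(S1) and I(S2) overlap: true (flow) dependence
        kinds.append("RAW")
    if reads1 & writes2:    # I(S1) and O(S2) overlap: anti-dependence
        kinds.append("WAR")
    if writes1 & writes2:   # O(S1) and O(S2) overlap: output dependence
        kinds.append("WAW")
    return kinds


# Hypothetical statements: S1: R1 = R2 + R3, S2: R4 = R2 - R1
print(classify_dependence({"R2", "R3"}, {"R1"}, {"R2", "R1"}, {"R4"}))  # ['RAW']
# Hypothetical statements: S1: R4 = R2 - R1, S2: R2 = input from a port
print(classify_dependence({"R2", "R1"}, {"R4"}, {"PORT1"}, {"R2"}))     # ['WAR']
```

    Representing the read and write sets as Python sets keeps each Bernstein test a single intersection.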

    CONTROL DEPENDENCY

    Under the sequential execution model, the instructions of a program are executed one after another, atomically, in the order specified by the program. Dependencies among instructions may, however, hinder the parallel execution of multiple instructions by a processor exploiting instruction-level parallelism. Executing multiple instructions in parallel without considering the related dependencies risks producing wrong results, namely hazards.

    We restrict ourselves to data dependencies in this project and do not deal with control dependencies. [2]

    REQUIREMENT ELICITATION

    Basic

    Requirement elicitation captures all the information relevant to the development of the system, i.e., customer details, identification of the client's problem, and the appropriate developer for that particular problem.

    Requirement elicitation acts as the interface between the system specification (within the developer team) and the customer's records (the problem). Its main purpose is to focus on the customer's view of the system. [3][4]

    In the requirement analysis phase, the analyst focuses on two basic things: clarifying and understanding the real problem, and working out the procedure for solving it. The problem may be the automation of an existing system, the development of a new automated system, or a combination of the two. Large systems have many features and must perform many different tasks, so understanding the requirements of the system is one of the most important tasks. The problem analyst examines the real meaning of the problem and its context, and needs the complete report generated by the previous analysis to understand the system and its individual automated parts.

    Proposed System

    This project aims at developing software to detect various kinds of data dependencies, such as data flow dependency, anti-dependency and output dependency, for a basic code block.

    Figure 1.2: ILP Application Algorithm

    Graphs would be generated for the various kinds of dependencies present in a code block and would be combined

    to obtain a single data dependency graph. This graph would further be processed using certain backtracking algorithms to

    obtain a transitive closure graph (TCG). The TCG is an indication of the various kinds of dependencies and can be used to

    predict the possible combinations of instructions that may execute in parallel. Finally an ILP graph would be obtained. A

    scheduling algorithm would be applied to obtain an instruction schedule in the form of instruction start times. Certain

    performance metrics would then be computed.

    Specification of Software & Hardware

    Processor: Intel Core2 Duo @ 2.66GHz
    RAM: 2GB DDR2
    Hard Disk: Samsung HD 161 (160 GB)
    Operating system: Fedora 10 (Linux kernel version 2.6.27.5-117.fc10.i686)
    X-Windows system: GNOME
    Editor: Gedit
    Development kit: JDK 1.6
    Programming paradigm: Object Oriented
    Programming language: JAVA 2 SE
    Development philosophy: Agile
    Process model: Scrum
    Technology: Open Source
    Image manipulator: GIMP

    [Figure 1.2 flowchart, boxes in order: Detect data flow dependency; Detect anti-dependency; Detect output dependency; Obtain data dependency graph; Obtain transitive closure graph; Obtain architectural restrictions graph; Obtain dependence graph; Obtain ILP graph; Obtain instruction schedule; Compute performance metrics. Side label: (Instructions without ILP)]

    SYSTEM DESIGN

    The objective of analysis modeling is to create a variety of representations that depict software requirements for information, function, and behavior. To accomplish this, two different modeling philosophies can be applied: structured analysis and object-oriented analysis. Structured analysis views software as an information transformer. It helps the software engineer identify data objects and the relationships between them, and describes how functions transform those objects as they flow through the system. Object-oriented analysis examines a problem domain defined as a set of use-cases in an effort to extract classes that define the problem. Each class has a set of attributes and operations. Classes are related to one another in a number of ways and are modeled using UML diagrams. The analysis model is composed of four modeling elements: scenario-based, class-based, flow and behavioral models.

    Scenario-Based Modeling

    This model represents the software requirements from the user's point of view. The use-case, a narrative or template-driven description of an interaction between an actor and the software, is the primary modeling element. Derived during requirement elicitation, the use-case defines the key steps for a specific function or interaction. The degree of use-case formality and detail varies, but the end result provides necessary input to all other analysis modeling activities. Scenarios can also be described using an activity diagram, a flowchart-like graphical representation that depicts the processing flow within a specific scenario.

    Figure 2: Use-Case Diagram

    A use-case captures the interactions that occur between producers and consumers of information and the system

    itself. Requirements gathering mechanisms are used to identify stakeholders, define the scope of the problem, specify overall

    operational goals, outline all known functional requirements, and describe the objects that will be manipulated by the

    system.

    Flow Modeling

    Flow models focus on the flow of data objects as they are transformed by processing functions. Derived from

    structured analysis, flow models use the data flow diagram, a modeling notation that depicts how input is transformed into

    output as data objects move through a system.

    Figure 3: Context-Level DFD

    Each software function that transforms data is described by a process specification or narrative. In addition to data

    flow, this modeling element also depicts control flow, a representation that illustrates how events affect the behavior of a

    system. The DFD takes an input-process-output view of a system. That is, data objects flow into the software, are

    transformed by processing elements, and resultant data objects flow out of the software.

    Figure 4: Level 1 DFD

    Figure 5: Level 2 DFD that Refines the Detect Direct Dependency Process

    Class-Based Modeling

    Figure 6: Class Diagram

    Behavioral Modeling

    Data objects are represented by labeled arrows and transformations are represented by bubbles. The DFD is

    presented in a hierarchical fashion. That is, the first data flow model (sometimes called a level 0 DFD or context diagram)

    represents the system as a whole. Subsequent data flow diagrams refine the context diagram, providing increasing detail with

    each subsequent level.

    Figure 7: Sequence Diagram

    SYSTEM ANALYSIS

    Agile Design Philosophy

    Agile is a philosophy and a set of guidelines for building software. The philosophy encourages customer satisfaction and early delivery of software; it motivates the development team and minimizes the work products produced along the way. The guidelines stress active communication between analysts, developers and the client.

    Manifesto for agile software development:

    We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

    • Individuals and interactions over processes and tools
    • Working software over comprehensive documentation
    • Customer collaboration over contract negotiation
    • Responding to change over following a plan

    That is, while there is value in the items on the right, we value the items on the left more. [3][4]

    Software engineers and other project stakeholders work together on an agile team, a team that is self-organizing and in control of its own destiny.

    Figure 8: Agile v/s waterfall

    An agile team fosters communication and collaboration among all who serve on it. Agile development may be best termed "software engineering lite". The basic framework activities (customer communication, planning, modeling, construction, delivery and evaluation) remain, but they morph into a minimal task set that pushes the project team toward construction and delivery (some argue that this is done at the expense of problem analysis and solution design). Customers and software engineers who have adopted the agile philosophy have the same view: the only really important work product is an operational software increment that is delivered to the customer on the appropriate commitment date.

    The Agile Alliance defines 12 principles for those who want to achieve agility [5]:

    1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
    2. Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.
    3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference for the shorter timescale.
    4. Business people and developers must work together daily throughout the project.
    5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
    6. Face-to-face conversation is the most efficient and effective method of conveying information to and within a development team.
    7. Working software is the primary measure of progress.
    8. Agile processes promote sustainable development. The sponsors, developers and users should be able to maintain a constant pace indefinitely.
    9. Continuous attention to technical excellence and good design enhances agility.
    10. Simplicity, the art of maximizing the amount of work not done, is essential.
    11. The best architectures, requirements and designs emerge from self-organizing teams.
    12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

    Agility can be applied to any software process. However, to accomplish this, it is essential that the process be

    designed in a way that allows the project team to adapt tasks and to streamline them, conduct planning in a way that

    understands the fluidity of an agile development approach, eliminate all but the most essential work products and keep them

    lean, and emphasize an incremental delivery strategy that gets working software to the customer as rapidly as feasible for the

    product type and operational environment. [10]

    Any agile software process is characterized in a manner that addresses three key assumptions about the majority of

    software projects [6][7][11]:

    • It is difficult to predict in advance which software requirements will persist and which will change. It is equally difficult to predict how customer priorities will change as a project proceeds.
    • For many types of software, design and construction are interleaved. That is, both activities should be performed in tandem so that design models are proven as they are created. It is difficult to predict how much design is necessary before construction is used to prove the design.
    • Analysis, design, construction, and testing are not as predictable (from a planning point of view) as we might like.

    A number of key traits must exist among the people on an agile team [8][9]:

    • Common focus
    • Competence
    • Collaboration
    • Decision-making ability
    • Fuzzy problem-solving ability
    • Mutual trust and respect
    • Self-organization

    Scrum Process Model[13]

    Scrum is an agile process model that was developed by Jeff Sutherland and his team in the early 1990s. In recent years, further development of the Scrum methods has been performed by Schwaber and Beedle. Scrum principles are consistent with the agile manifesto.

    Figure 9: Scrum[13]

    Scrum emphasizes the use of a set of software process patterns that have proven effective for projects with tight

    timelines, changing requirements, and business criticality. Each of these process patterns defines a set of development

    activities:

    Backlog: a prioritized list of the project requirements, ordered according to their business value. Items can be added to the backlog at any time (this is how changes are introduced). The project manager assesses the backlog and updates the priorities as required.

    Figure 10: Prevalence of Scrum[13]

    Sprints: work units that take priority requirements from the backlog and must be completed within a predefined deadline. During the sprint, the backlog items that the sprint work units address are frozen (i.e., changes are not introduced during the sprint). Hence, the sprint allows the team members to work in a short-term but stable environment.

    Scrum meetings: short meetings held daily by the Scrum team. Three key questions are asked and answered by all team members:

    • What did you do since the last team meeting?
    • What obstacles are you encountering?
    • What do you plan to accomplish by the next team meeting?

    Demos: deliver the software increment to the customer so that the functionality that has been implemented can be demonstrated and evaluated by the customer. It is important to note that the demo may not contain all planned functionality, but rather those functions that can be delivered within the time-box that was established.

    CONSTRUCTION

    Base

    The programming paradigm used for coding is object oriented. It provides the ease of development with the use of

    constructs like classes, constructors, inheritance, interface, encapsulation, and packages.

    Java provides a rich set of language features like pre-defined classes and methods in the form of packages,

    interfaces for establishing guidelines for methods, data hiding, event handling with awt, etc.

    The user interface for this software has been designed using Swing, which provides lightweight components compared to AWT. The use of AWT in this software is restricted to event handling.

    Java gives this software its present platform-independent form. Security is ensured by the sandbox model of the JVM. The packages most prominently used in the development of this software include javax.swing, java.awt, java.util, java.awt.geom, and java.awt.event.

    The interfaces used in this software include ActionListener and Runnable.

    Transitive Closure Graph

    It is the summation of both the direct and indirect dependencies. Given an n-vertex digraph G, we construct the transitive closure graph of G as another n-vertex digraph H by adding edges to G according to the following rule: in H, add an edge (i, j) directed from vertex i to vertex j if, and only if, there is a directed path (of any length 1, 2, 3, ..., n-1) from i to j in G. To compute the transitive closure of G in $\Theta(n^3)$ time in a way that saves time and space in practice, we substitute the logical operations $\lor$ (logical OR) and $\land$ (logical AND) for the arithmetic operations min and + in the Floyd-Warshall algorithm.

    For i, j, k = 1, 2, 3, ..., n, let $t_{ij}^{(k)}$ be 1 if there is a path in G from vertex i to vertex j whose intermediate vertices all lie in {1, 2, ..., k}, and 0 otherwise. Following the Floyd-Warshall algorithm [12], we construct the transitive closure G* = (V, E*) by putting edge (i, j) into E* if and only if $t_{ij}^{(n)} = 1$. The recursive definition is

    $t_{ij}^{(0)} = \begin{cases} 0 & \text{if } i \neq j \text{ and } (i, j) \notin E, \\ 1 & \text{if } i = j \text{ or } (i, j) \in E, \end{cases}$

    and, for $k \geq 1$,

    $t_{ij}^{(k)} = t_{ij}^{(k-1)} \lor \left( t_{ik}^{(k-1)} \land t_{kj}^{(k-1)} \right).$

    Transitive-Closure (G)

    n ← |V[G]|
    for i ← 1 to n
        do for j ← 1 to n
            do if i = j or (i, j) ∈ E[G]
                then t_ij(0) ← 1
                else t_ij(0) ← 0
    for k ← 1 to n
        do for i ← 1 to n
            do for j ← 1 to n
                do t_ij(k) ← t_ij(k-1) ∨ (t_ik(k-1) ∧ t_kj(k-1))
    return T(n)
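    The procedure above translates almost line for line into code. The following is a minimal Python sketch rather than the authors' Java implementation; the function name `transitive_closure`, the 0-based vertex numbering and the edge-set input format are assumptions made for illustration.

```python
def transitive_closure(n, edges):
    """Boolean Floyd-Warshall (Warshall's algorithm) on vertices 0..n-1.

    edges is a set of (i, j) pairs. Returns an n x n matrix t where
    t[i][j] is True if i == j or a directed path leads from i to j.
    """
    # t^(0): reflexive entries plus the direct edges of G
    t = [[i == j or (i, j) in edges for j in range(n)] for i in range(n)]
    # Allow vertex k as an intermediate vertex, one k at a time
    for k in range(n):
        for i in range(n):
            for j in range(n):
                t[i][j] = t[i][j] or (t[i][k] and t[k][j])
    return t


closure = transitive_closure(3, {(0, 1), (1, 2)})
print(closure[0][2])  # True: 0 -> 1 -> 2 is an indirect dependency
print(closure[2][0])  # False: no path leads back from 2 to 0
```

    A single matrix is updated in place instead of keeping one matrix per k; this is safe because row k and column k do not change during iteration k.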

    Scheduling Algorithm

    Schedule (T, index)

    unscheduled_count := index
    initialize inst_state to 0
    initialize pipeline_stage to 0
    while unscheduled_count > 0
        do if stage = EMPTY
            then sel_stage := stage
                for j := 1 to index
                    if inst_state = UNPROCESSED
                        then while dependency or unprocessed predecessor exists
                            if sched_condition
                                break
                        else stage := OCCUPIED
                            sel_stage := stage_no
                            inst_state := PROCESSED
                            time_array[j] := clock
        update stage counters and clock
    return time_array

    where

    • T is the ILP graph
    • time_array is an array that stores the execution start times of instructions
    • sel_stage represents the pipeline stage to which an instruction has been supplied
    • index denotes the total number of instructions
    • inst_state denotes whether an instruction has been scheduled or not
    • stage denotes whether a pipeline stage is empty or occupied.
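    To make the idea of computing instruction start times concrete, the following Python sketch shows a simple greedy scheduler. It is not the Schedule procedure above: it assumes that a dependent instruction may start only after every predecessor has finished, that each instruction takes a fixed `latency` (five cycles, as assumed for the instruction set in this paper), and that at most `width` instructions may start in the same clock cycle. The names `schedule`, `deps`, `latency` and `width` are hypothetical.

```python
def schedule(n, deps, latency=5, width=2):
    """Greedy start-time assignment for instructions 0..n-1 in program order.

    deps is a set of (i, j) pairs with i < j, meaning j must wait for i.
    Returns a list of start cycles, one per instruction.
    """
    start = [0] * n
    issued = {}  # clock cycle -> number of instructions started in it
    for j in range(n):
        # Earliest cycle at which all predecessors of j have completed
        ready = max((start[i] + latency for (i, d) in deps if d == j), default=0)
        cycle = ready
        # Skip cycles in which the issue limit is already reached
        while issued.get(cycle, 0) >= width:
            cycle += 1
        start[j] = cycle
        issued[cycle] = issued.get(cycle, 0) + 1
    return start


# Instructions 1 and 2 both depend on instruction 0; instruction 3 is free.
print(schedule(4, {(0, 1), (0, 2)}))           # [0, 5, 5, 0]
print(schedule(4, {(0, 1), (0, 2)}, width=1))  # [0, 5, 6, 1]
```

    Processing instructions in program order guarantees that the start time of every predecessor is known before a dependent instruction is placed.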

    Instruction Set

    This software operates on a basic code block written in a generic instruction set. All instructions are assumed to

    be of five clock cycles.

    Transfer Instructions Like

    MOV RD , RS

    MVI R, 8-BIT

    OUT [ADDRESS]

    IN [ADDRESS]

    Arithmetic Instructions Like

    ADD R

    ADI 8-BIT

    SUB R

    SUI 8-BIT

    INR R

    DCR R

    Logic Instructions Like

    ANA R

    ANI 8-BIT

    ORA R

    ORI 8-BIT

    XRA R

    XRI 8-BIT

    Machine Control Instructions Like

    HLT

    NOP

    Notes:

    • The accumulator is used as an implicit register.
    • Instructions like INR are presumed to both use and modify the associated register.
    • Branch instructions like JMP have not been scheduled, because control dependencies are not dealt with at this stage of the project.
    • GIGO (garbage in, garbage out), a fundamental law of computer science, applies here as well: this software has no explicit error-handling facility.

    Architectural Restrictions

    Some processors may have some restrictions on which instructions can be combined in parallel. Architectural

    restrictions may be represented by an architectural restrictions graph, which depicts which instructions cannot be combined

    in parallel. We have considered the following architectural restrictions in this software:

    ADD    MOV
    ADDF   MULF
    SUBF   DIVF
    SUB    MOV
    INR    DIV
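    One way to combine the transitive closure with the architectural restrictions is sketched below in Python. It rests on two assumptions that the text does not state explicitly: that the restrictions are unordered pairs of opcodes, and that the ILP graph connects exactly those instruction pairs that are neither dependent (directly or indirectly) nor restricted. The function name `ilp_pairs` and the small example are hypothetical.

```python
from itertools import combinations


def ilp_pairs(opcodes, closure, restricted):
    """Return index pairs (i, j), i < j, that may execute in parallel.

    opcodes    : list of opcode strings, one per instruction
    closure    : boolean matrix, closure[i][j] True if j depends on i
    restricted : set of frozensets, each holding two opcodes that the
                 architecture cannot combine (an assumed representation)
    """
    pairs = []
    for i, j in combinations(range(len(opcodes)), 2):
        dependent = closure[i][j] or closure[j][i]
        blocked = frozenset((opcodes[i], opcodes[j])) in restricted
        if not dependent and not blocked:
            pairs.append((i, j))
    return pairs


# Hypothetical three-instruction block: instruction 1 depends on instruction 0.
ops = ["ADDF", "SUB", "MULF"]
tc = [[True, True, False],
      [False, True, False],
      [False, False, True]]
print(ilp_pairs(ops, tc, {frozenset(("ADDF", "MULF"))}))  # [(1, 2)]
```

    Using `frozenset` for a restriction makes the pair order-independent, so ADDF with MULF and MULF with ADDF are treated alike.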

    Performance Metrics

    For a k-stage linear pipeline processor with clock period τ:
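    As a hedged illustration of how such metrics are commonly computed, the Python sketch below uses the standard textbook expressions for an ideal k-stage linear pipeline that processes n instructions without stalls: pipelined time (k + n - 1)·τ, non-pipelined time n·k·τ, speed-up as their ratio, efficiency as speed-up divided by k, and throughput as n divided by the pipelined time. These definitions and the function name `pipeline_metrics` are assumptions; the exact formulas used by the software may differ.

```python
def pipeline_metrics(n, k, tau):
    """Ideal metrics for n instructions on a k-stage pipeline, clock period tau.

    Returns (speedup, efficiency, throughput), assuming no stalls.
    """
    t_pipelined = (k + n - 1) * tau   # first result after k cycles, then one per cycle
    t_sequential = n * k * tau        # every instruction takes k cycles on its own
    speedup = t_sequential / t_pipelined
    efficiency = speedup / k
    throughput = n / t_pipelined      # instructions completed per unit time
    return speedup, efficiency, throughput


# Eight instructions on a five-stage pipeline with a unit clock period
s, e, h = pipeline_metrics(8, 5, 1.0)
print(round(s, 3), round(e, 3), round(h, 3))  # 3.333 0.667 0.667
```

    As n grows, the speed-up approaches k and the efficiency approaches 1 in this idealized model.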

    Testing

    This software has been tested using a modular testing approach. Finally the integrated product has been tested as a

    single unit and the detected flaws have been removed.

    It has been tested on both the Linux and Windows platforms for consistent performance and the absence of errors of any sort.

    Let us consider a test case to understand the working of the software. The code sequence is given below:

    ADDF R1 R2 R3

    SUB R4 R2 R1

    MOV R2 PORT#1

    INR R4

    DCR R1

    ORA R2

    DIV R7 R5 R3

    MULF R6 R8 R9

    The code consists of a block of eight instructions. The instructions may be defined as:

    ADDF floating-point add the contents of R2 and R3 and store in R1

    SUB subtract the contents of R1 from R2 and store the result in R4

    MOV move the data from port#1 to R2

    INR increment register R4

    DCR decrement register R1

    ORA perform an OR operation over the contents of R2 and accumulator

    DIV divide R5 by R3 and store result in R7

    MULF floating-point multiply R8 and R9 and store result in R6
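    The pairwise dependencies in this block can be worked out by hand from the descriptions above. The Python sketch below does so mechanically; the read and write sets are a reading of those descriptions, with the added assumption that ORA stores its result in the accumulator (called A here). Instructions are numbered 1 to 8 in program order, the function name `dependency_edges` is hypothetical, and the result is not claimed to reproduce Figures 11 to 14 exactly.

```python
# (opcode, locations read, locations written), in program order
BLOCK = [
    ("ADDF", {"R2", "R3"}, {"R1"}),
    ("SUB",  {"R2", "R1"}, {"R4"}),
    ("MOV",  {"PORT1"},    {"R2"}),
    ("INR",  {"R4"},       {"R4"}),
    ("DCR",  {"R1"},       {"R1"}),
    ("ORA",  {"R2", "A"},  {"A"}),   # assumption: result goes to accumulator A
    ("DIV",  {"R5", "R3"}, {"R7"}),
    ("MULF", {"R8", "R9"}, {"R6"}),
]


def dependency_edges(block):
    """Return (raw, war, waw) edge sets using 1-based instruction numbers."""
    raw, war, waw = set(), set(), set()
    for i in range(len(block)):
        for j in range(i + 1, len(block)):
            _, reads1, writes1 = block[i]
            _, reads2, writes2 = block[j]
            if writes1 & reads2:
                raw.add((i + 1, j + 1))   # read-after-write
            if reads1 & writes2:
                war.add((i + 1, j + 1))   # write-after-read
            if writes1 & writes2:
                waw.add((i + 1, j + 1))   # write-after-write
    return raw, war, waw


raw, war, waw = dependency_edges(BLOCK)
print(sorted(raw))  # [(1, 2), (1, 5), (2, 4), (3, 6)]
print(sorted(war))  # [(1, 3), (2, 3), (2, 5)]
print(sorted(waw))  # [(1, 5), (2, 4)]
```

    Under these assumptions DIV and MULF touch no register used by any other instruction, so they carry no data dependencies at all.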

    Figure 11: Data Flow Dependency Graph

    Figure 12: Anti-Dependency Graph

    Figure 13: Output Dependency Graph

    Figure 14: Data Dependency Graph

    Figure 15: Transitive Closure Graph

    Figure 16: Architectural Restrictions Graph

    Figure 17: Dependence Graph

    Figure 18: ILP Graph

    Figure 19: Performance Metrics

    DEPLOYMENT

    System Implementation

    When the theoretical design is turned into a working system, the stage is known as the implementation of the project. It can be considered the most critical stage in achieving a successful new system and in giving the user confidence that the new system will work properly and be effective. The implementation stage involves careful planning, investigation of the existing system and its constraints on implementation, the design of methods to achieve the changeover, and the evaluation of changeover methods. Although the software was developed on the Linux platform, it has been implemented on the Windows platform as well. The platform-independent nature of the software is due to the platform independence of Java. The software has been tested on both platforms for consistent performance. The final working software has been packaged by assembling all the required class files in a jar archive. The delivered software provides benefit for the end user, but it also provides useful feedback for the software team. End users comment on characteristics of the software such as reliability and user-friendliness, and on its functions and features. Feedback should be collected and recorded by the software team and used to:

    • Make immediate modifications to the delivered increment (if required)
    • Define changes to be incorporated into the next planned increment
    • Make necessary design modifications to accommodate changes
    • Revise the plan for the next increment to reflect the changes

    CONCLUSIONS AND FUTURE SCOPE

    Conclusions

    For decades hardware algorithms like Tomasulo algorithm for the IBM System/360s FPU, Scoreboarding for the

    CDC 6600 computer, etc. have dominated the scenario of pipelining in processors. But with the Moores law reaching its

    limit, it is no longer feasible to depend purely on hardware pipelining. A paradigm shift is expected in the nearby future from

    the hardware-centric approaches to a software-oriented approach to exploit the instruction level parallelism. Intel, IBM,

    AMD and other companies have already begun intense research in this field. An area where this approach has found

    significant application is graphics processing, as the graphics data contains a considerable amount of redundancy and


    parallelism. A prominent example is Graphics Processing Unit (GPU) technology, which relies heavily on software
    approaches to pipelining.

    Another example is the Itanium processor developed by Intel Corporation. This processor has found a
    significant application in a supercomputer at NASA. Itanium has features such as software pipelining
    for loop optimization, rotating registers, speculation and branch prediction. This is a field of intense research that offers
    ample opportunities for developers and scientists. It also presents significant challenges for system
    programmers.

    Future Scope

    The project has covered almost all the requirements initially laid out. Further requirements and improvements can
    be easily incorporated since the code is largely modular. The agile nature of the project has provided scope
    for easy accommodation of changes and emerging requirements. Some of the extensions may take the form of:

    • GCD tests before computing dependencies

    • Use of an expanded instruction set

    • Inclusion of control dependencies to extend the software functionality for handling complex branching code blocks

    • Application of global code scheduling algorithms such as trace scheduling

    • Refinement of the scheduling algorithm to handle resource dependencies
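
    The first extension listed above, the GCD test, can be illustrated with a minimal Java sketch. The paper does not specify how the test would be integrated, so the class name GcdDependenceTest and the method mayDepend are hypothetical, and the sketch assumes array subscripts of the linear form a*i + b. For a write to x[a*i + b] and a read of x[c*j + d], the equation a*i + b = c*j + d has integer solutions only if gcd(a, c) divides (d - b); when it does not, no dependence can exist and the detailed dependence analysis can be skipped.

```java
// Illustrative sketch only: class and method names are hypothetical,
// not taken from the software described in the paper.
final class GcdDependenceTest {

    // Euclid's algorithm on absolute values; gcd(0, 0) is defined here as 0.
    static int gcd(int a, int b) {
        a = Math.abs(a);
        b = Math.abs(b);
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // GCD test for a write to x[a*i + b] and a read of x[c*j + d].
    // Returns false when a dependence is impossible, true when it is
    // possible. The answer is conservative because loop bounds are ignored.
    static boolean mayDepend(int a, int b, int c, int d) {
        int g = gcd(a, c);
        int diff = d - b;
        if (g == 0) {
            // Both subscripts are constants: they clash only if equal.
            return diff == 0;
        }
        return diff % g == 0;
    }

    public static void main(String[] args) {
        // x[2*i] written, x[2*j + 1] read: even versus odd indices.
        System.out.println(mayDepend(2, 0, 2, 1));
        // x[4*i + 1] written, x[2*j + 3] read: gcd(4, 2) = 2 divides 2.
        System.out.println(mayDepend(4, 1, 2, 3));
    }
}
```

    A result of true only means that a dependence cannot be ruled out; the full dependence analysis would still be needed for such pairs.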

