
SEMINAR REPORT ON

Multi Core Processing

SUBMITTED BY:

ANKIT MISHRA

UNIV ROLL NO. 0606331018

SUBMITTED TO: 

DEPARTMENT OF

ELECTRONICS & COMMUNICATION

G.L.A. INSTITUTE OF TECHNOLOGY &

MANAGEMENT, MATHURA


ACKNOWLEDGEMENTS

I would like to thank Mr. Asheesh Shukla for his help and guidance. I would also like to express my gratitude to all the faculty members who helped and guided me from time to time.

I would also like to take this opportunity to thank all my friends who helped me complete this report.


Department of Electronics & Communication Engineering

G.L.A. Institute of Technology & Management, Mathura

Date: 3 September 2009

CERTIFICATE

This is to certify that Ankit Mishra has successfully delivered a seminar on the topic "Multi Core Processing" in partial fulfillment of the requirements for the award of the Bachelor of Technology degree in Electronics & Communication Engineering from Uttar Pradesh Technical University, Lucknow.

SEMINAR IN-CHARGE    SUPERVISOR


LIST OF CONTENTS:

1. Preface
2. Evolution of the Intel Microprocessor Family
3. Introduction
4. Current Options to Address Power and Cooling Considerations
5. Scalability of Multi-Core Processors
6. Hyper-Threading Technology
7. Cache Memories
8. What is Multi-Core Processing?
9. How Does a Dual-Core Processor Work?
10. Multi-Core Power Consumption
11. Balancing Power and Performance
12. Power Efficiency
13. Architectural Floor Plans of Multi-Core Processors
14. Features and Advantages of Multi-Core Processing
15. Disadvantages and Uses of Multi-Core Processing
16. The 80-Core Processor
17. Conclusion
18. Bibliography


  Preface 

A microprocessor is a programmable digital electronic component that incorporates the functions of a central processing unit on a single semiconducting integrated circuit. The microprocessor made the personal computer possible. As demands grew, microprocessors became faster by employing newer technologies, but as single-core designs approached saturation, new approaches were sought, one of them being multi-core processing.

A multi-core processor uses many cores and thereby increases the performance of the computer. The market is now flooded with dual-core and quad-core processors, and hundreds of cores may be added to a single chip over the next few years.

The evolution of microprocessors has long followed Moore's law. However, a gap has now opened: although the number of transistors on a chip continues to grow exponentially, single-core performance no longer doubles with it, and the additional transistors are increasingly spent on more cores rather than on a faster single core.


Evolution of the Intel Microprocessor Family:


INTRODUCTION:

Server density has grown dramatically over the past decade to keep pace with escalating performance requirements for enterprise applications. Ongoing progress in processor design has enabled servers to continue delivering increased performance, which in turn helps fuel the powerful applications that support rapid business growth. However, increased performance incurs a corresponding increase in processor power consumption, and heat is a consequence of power use. As a result, administrators must determine not only how to supply large amounts of power to systems, but also how to contend with the large amounts of heat that these systems generate in the data center. As more applications move from proprietary to standards-based systems, the performance demands on industry-standard servers are spiraling upward.

Today, in place of midrange and large mainframe systems, tightly packed racks of stand-alone servers and blade servers can be clustered to handle the same types of business-critical application loads that once required large proprietary systems. Organizations are using databases such as Microsoft® SQL Server, Oracle® Database 10g, and MySQL to enhance business decision making, along with enterprise-wide messaging applications such as Microsoft Exchange. Meanwhile, network infrastructure, Internet connectivity, and e-commerce are growing at tremendous rates.

Altogether, the result is a steady increase in performance demands as user loads and processing loads grow, driving a steady increase in the density of systems in the data center. This trend is intensified by ever-faster processors, and in turn it can create power and cooling challenges for many IT organizations.


Current options to address power and cooling challenges:

Historically, processor manufacturers have responded to the demand for more processing power primarily by delivering faster processor speeds. However, the challenge of managing power and cooling requirements for today's powerful processors has prompted a reevaluation of this approach to processor design. With heat rising incrementally faster than clock speed (the rate at which signals move through the processor), an increase in performance can create an even larger increase in heat.

IT organizations must therefore find ways to enhance the performance of databases, messaging applications, and other enterprise systems while contending with a corresponding increase in system power consumption and heat. Although faster processors are one way to improve server performance, other approaches can help boost performance without increasing clock speed and incurring an attendant increase in power consumption and heat. In fact, excellent overall processing performance may be achieved by reducing clock speed while increasing the number of processing units, and the consequent reduction in clock speed can lead to lower heat output and greater efficiency. For example, by moving from a single high-speed core, which generates a corresponding increase in heat, to multiple slower cores, which produce a corresponding reduction in heat, enterprises can potentially improve application performance while reducing their thermal output.

Balancing performance across each platform:

The first step is to optimize performance across all platform elements. Designing, integrating, and building complete platforms that balance computing capabilities across processor, chipset, memory, and I/O components can significantly improve overall application performance and responsiveness. By integrating flexible technologies and balancing performance across all platform components, administrators can help provide the headroom required to support business growth (such as increases in users, transactions, and data) without having to upgrade the entire server. This approach can help the systems in place today support increased business demands, enhancing scalability for future growth. At the same time, this strategy can help extend the life of existing data center components by enabling administrators to optimize the performance of repurposed platforms when next-generation applications are deployed.

Harnessing multithreading technology:

The second step is to improve the efficiency of computer platforms by harnessing the power of multithreading. Industry-standard servers with multiple processors have been available for many years, and the overwhelming majority of networked applications can take advantage of the additional processors, multiple software threads, and multitasked computing environments. These capabilities have enabled organizations to scale networked applications for greater performance. The next logical step for multiprocessing advancements is expected to come in the form of multiple logical processing units, or processor cores, within a single chip. Multicore processors, coupled with advances in memory, I/O, and storage, can be designed to deliver a balanced platform that enables the requisite performance and scalability for future growth.

Optimizing software applications:

The third step, software optimization, can be an efficient way to enable incremental performance gains without increasing power consumption and heat. Many of today's leading software tools, along with Intel® compilers, can enable significant performance improvements over applications that have not been compiled or tuned using such optimization tools. Actual performance gains will depend on the specific system configuration and application environment. To get the most performance from existing data center components, administrators must not overlook potential gains from optimizing software applications during infrastructure planning.


Scalability potential of multicore processors:

Processors plug into the system board through a socket. Current technology allows one processor socket to provide access to one logical core, but this approach is expected to change, enabling one processor socket to provide access to two, four, or more processor cores. Future processors will be designed to allow multiple processor cores to be contained inside a single processor module. For example, a tightly coupled set of dual processor cores could be designed to compute independently of each other, allowing applications to interact with the processor cores as two separate processors even though they share a single socket. This design would allow the OS to thread the application across the multiple processor cores and could help improve processing efficiency. A multicore structure would also include cache modules, which could be either shared or independent. Actual implementations of multicore processors will vary depending on manufacturer and product development over time. Variations may include shared or independent cache modules, bus implementations, and additional threading capabilities such as Intel Hyper-Threading (HT) Technology.

A multicore arrangement that provides two or more low-clock-speed cores could be designed to provide excellent performance while minimizing power consumption and delivering lower heat output than configurations that rely on a single high-clock-speed core. The following example shows how multicore technology could manifest in a standard server configuration and how multiple low-clock-speed cores could deliver greater performance than a single high-clock-speed core for networked applications. This example uses some simple math and basic assumptions about the scaling of multiple processors and is included for demonstration purposes only. Until multicore processors are available, scaling and performance can only be estimated based on technical models. The example described here shows one possible method of addressing relative performance levels as the industry begins to move from platforms based on single-core processors to platforms based on multicore processors. Other methods are possible, and actual processor performance and processor scalability are tied to a variety of platform variables, including the specific configuration and application environment. Several factors can potentially affect the internal scalability of multiple cores, such as the system compiler as well as architectural considerations including memory, I/O, frontside bus (FSB), chipset, and so on.

For instance, enterprises can buy a dual-processor server today to run Microsoft Exchange and provide e-mail, calendaring, and messaging functions. Dual-processor servers are designed to deliver excellent price/performance for messaging applications. A typical configuration might use dual 3.6 GHz 64-bit Intel Xeon processors supporting HT Technology. In the future, organizations might deploy the same application on a similar server that instead uses a pair of dual-core processors at a clock speed lower than 3.6 GHz; the four cores in this example configuration might each run at 2.8 GHz. The following simple example can help explain the relative performance of a low-clock-speed, dual-core processor versus a high-clock-speed, dual-processor counterpart.

Dual-processor systems available today offer a scalability of roughly 80 percent for the second processor, depending on the OS, application, compiler, and other factors. That means the first processor may deliver 100 percent of its processing power, but the second processor typically suffers some overhead from multiprocessing activities. As a result, the two processors do not scale linearly: a dual-processor system does not achieve a 200 percent performance increase over a single-processor system, but instead provides approximately 180 percent of the performance that a single-processor system provides. In this article, the single-core scalability factor is referred to as external, or socket-to-socket, scalability. When comparing two single-core processors in two individual sockets, the dual 3.6 GHz processors would result in an effective performance level of approximately 6.48 GHz (see Figure 1).

For multicore processors, administrators must take into account not only socket-to-socket scalability but also internal, or core-to-core, scalability: the scalability between multiple cores that reside within the same processor module. In this example, core-to-core scalability is estimated at 70 percent, meaning that the second core delivers 70 percent of its processing power. Thus, in the example system using 2.8 GHz dual-core processors, each dual-core processor would behave more like a 4.76 GHz processor when the performance of the two cores (2.8 GHz plus 1.96 GHz) is combined. For demonstration purposes, this example assumes that, in a server combining two such dual-core processors within the same system architecture, the socket-to-socket scalability of the two dual-core processors would be similar to that in a server containing two single-core processors: 80 percent scalability. This would lead to an effective performance level of 8.57 GHz (see Figure 2).
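The arithmetic above can be sketched as a small model. The scalability factors (80 percent socket-to-socket, 70 percent core-to-core) and the clock speeds are the example values from this section, not measured data:

```python
# Illustrative model of the scaling example above; the 70%/80%
# scalability factors are the assumptions stated in the text.

def effective_ghz(clock_ghz, cores_per_socket, sockets,
                  core_scaling=0.70, socket_scaling=0.80):
    """Estimate effective performance in 'GHz equivalents'.

    The first core of a socket contributes 100% of its clock; each
    additional core contributes core_scaling of it. The first socket
    counts fully; each additional socket contributes socket_scaling
    of a full socket's total.
    """
    per_socket = clock_ghz * (1 + core_scaling * (cores_per_socket - 1))
    return per_socket * (1 + socket_scaling * (sockets - 1))

# Two single-core 3.6 GHz processors:
print(round(effective_ghz(3.6, 1, 2), 2))   # 6.48
# Two dual-core 2.8 GHz processors:
print(round(effective_ghz(2.8, 2, 2), 2))   # 8.57
```

Despite the lower 2.8 GHz clock, the four-core configuration comes out ahead in this model, which is the point of the example.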


What is Hyper-Threading technology?

Today's 64-bit Intel Xeon, Pentium® 4, and Celeron® processors include HT Technology, which enables the processor to execute multiple threads of an application simultaneously. Multithreaded applications perceive a single physical processor as two separate logical processors and will execute threads independently on each logical processor to help speed overall processing execution. Recent benchmark tests by Intel of 64-bit Intel Xeon processor-based platforms have shown a performance gain of up to 33 percent from enabling HT Technology on applications that are HT Technology-aware, as compared to running the same applications with HT Technology disabled.

Today, individual Intel NetBurst® microprocessors appear to the OS as two logical processors. On a dual-processor system supporting HT Technology, the application perceives four processor threads (two physical processors and two logical processors). Equipped with multicore processors, that same dual-socket system could have a total of four processor cores. Through the effective use of HT Technology, those four processor cores could appear to the application as eight total processors. By leveraging HT Technology, a properly compiled application can achieve performance increases because of the improved utilization of the existing system processors, compared to the same application running with HT Technology disabled. Most multiprocessor-aware applications can take advantage of HT Technology, and applications that have been specifically designed for HT Technology have the potential to achieve a significant performance increase.

By combining multicore processors with HT Technology, Intel aims to provide greater scalability and better utilization of processing cycles within the server than is possible using single-core processors with HT Technology. The addition of HT Technology to multicore processor architecture could present an excellent opportunity to help improve the utilization and scalability of future processor subsystems.
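A quick way to observe the logical-processor view described above is to ask the operating system how many processors it sees; on an HT-enabled system this count includes the extra logical processors, so it can be twice the number of physical cores:

```python
# os.cpu_count() reports logical processors as seen by the OS, which
# is what applications perceive; with HT enabled this may be double
# the physical core count.
import os

logical = os.cpu_count()
print(f"OS-visible logical processors: {logical}")
```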


Computer Caches:

A computer is a machine in which we measure time in very small increments. When the microprocessor accesses main memory (RAM), it does so in about 60 nanoseconds (60 billionths of a second). That's pretty fast, but it is much slower than the typical microprocessor. Microprocessors can have cycle times as short as 2 nanoseconds, so to a microprocessor 60 nanoseconds seems like an eternity.

What if we build a special memory bank on the motherboard, small but very fast (around 30 nanoseconds)? That's already two times faster than main memory access. That's called a level 2, or L2, cache. What if we build an even smaller but faster memory system directly into the microprocessor's chip? That way, this memory will be accessed at the speed of the microprocessor and not the speed of the memory bus. That's an L1 cache, which is faster still than the L2 cache and main memory.

Some microprocessors have two levels of cache built right into the chip. In this case, the motherboard cache (the cache that sits between the microprocessor and the main system memory) becomes level 3, or L3, cache.

There are a lot of subsystems in a computer; you can put cache between many of them to improve performance. Here is an example. We have the microprocessor (the fastest thing in the computer). Then there is the L1 cache that caches the L2 cache that caches the main memory, which can be used (and often is used) as a cache for even slower peripherals like hard disks and CD-ROMs. The hard disks are also used to cache an even slower medium: your Internet connection.
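The payoff of this hierarchy can be sketched with the standard average memory access time (AMAT) calculation. The 2 ns / 30 ns / 60 ns latencies echo the figures in the text; the hit rates are assumed purely for illustration:

```python
# Average memory access time for a two-level cache over RAM.
# Latencies from the text; hit rates are illustrative assumptions.

def amat(l1_ns, l2_ns, ram_ns, l1_hit, l2_hit):
    """AMAT = L1 time + L1-miss fraction * (L2 time + L2-miss fraction * RAM time)."""
    return l1_ns + (1 - l1_hit) * (l2_ns + (1 - l2_hit) * ram_ns)

# No caches: every access pays the full 60 ns RAM latency.
print(amat(0, 0, 60, 0.0, 0.0))                 # 60.0
# A 2 ns L1 hitting 95% of the time over a 30 ns L2 hitting 90%:
print(round(amat(2, 30, 60, 0.95, 0.90), 2))    # 3.8
```

Under these assumed hit rates, the hierarchy cuts the average access time from 60 ns to under 4 ns, which is why caching appears between so many subsystems.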


What is Multi Core Processing?

Multi-core processing refers to the use of multiple microprocessors, called "cores," that are built onto a single silicon die. The chip is mounted onto a computer motherboard in precisely the same way as a traditional CPU. There is nothing new about the concept of stringing processors together, a technique known as multiprocessing; however, a multi-core processor is a bit different. A multi-core processor acts as a single unit. As such, it is more efficient, and it establishes a standardized platform for which mass-produced software can easily be developed.

The design of a multi-core processor allows each core to communicate with the others, so that processing tasks may be divided and delegated appropriately. However, the actual delegation is dictated by software. When a task is completed, the processed information from all cores is returned to the motherboard via a single shared conduit. This process can often significantly improve performance over a single-core processor of comparable speed. The degree of performance improvement will depend upon the efficiency of the software code being run.

In addition to raw speed, these new chips vastly increase the amount of multi-tasking that computers can do. Initially, the practical applications of multi-core processors were severely limited, because many software products of the time were not designed to take full advantage of them. The gap was quickly closed as a new generation of operating systems became available, along with new generations of commercial software, including games, simulation products, and even office productivity applications. Software developers quickly shifted their priorities to exploit the new hardware to its fullest.

Multi-core processing has interrupted the ongoing race among chip designers to create ever faster processors. By using multiple slower cores, it is possible to achieve higher overall throughput more efficiently than by designing super-fast individual processors. When personal computers using multi-core processing technology first became widely available to consumers in 2003 and 2004, the new CPUs featured only dual-core processors. This quickly changed in subsequent years, with multi-core processing becoming the standard. Quad-core and octo-core processors followed, and future designs may place dozens or even hundreds of cores on a single chip.
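The "divide and delegate" pattern described above can be sketched with Python's standard library. The work function (summing squares) and the four-way chunking are hypothetical illustrations, not taken from the report:

```python
# Dividing a compute-bound task across cores: each worker process can
# be scheduled onto its own core by the OS, mirroring the delegation
# of sub-tasks described in the text.
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    """Work assigned to one core: sum of squares over its slice."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(100_000))
    n_cores = 4  # e.g. a dual-socket, dual-core system
    chunks = [data[i::n_cores] for i in range(n_cores)]
    with ProcessPoolExecutor(max_workers=n_cores) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total == sum(x * x for x in data))  # True
```

As the text notes, the delegation is dictated by software: the speedup here depends entirely on the work being split into independent chunks that the scheduler can place on separate cores.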


How Does a Dual-Core Processor Work?

Obviously, having two CPUs working together would improve performance, but two separate processors working together are more expensive and would create problems for the motherboard and chipset hosting them. So computer engineers came up with another approach: take two CPUs and put them together on one chip, and you get the power of two CPUs that occupy only one socket on the motherboard. Dual-core technology provides the power of two CPUs (also known as cores, hence the name "dual core") at a cost that is less than that of two separate chips.

So the "dual core" processor is a CPU with two separate cores on the same die, each with its own cache (and newer chips allow cache sharing, which improves the functionality of the processor). It's the equivalent of getting two microprocessors in one.

How does it work?

In a single-core processor, the CPU is fed strings of instructions it must order, execute, and then selectively store in its cache for quick retrieval. When data outside the cache is required, it is retrieved from random access memory (RAM). Accessing this data slows performance down to the maximum speed of the RAM (or the maximum speed of the bus that connects the RAM to the CPU), which is far slower than the speed of the CPU. The situation gets more complicated and difficult when we start multi-tasking. In these cases the processor must switch back and forth between two or more sets of data streams and programs. CPU resources are depleted and performance suffers.

In a dual-core processor system, each core handles incoming data strings simultaneously to improve efficiency. Just as two heads working on the same problem are better than one, so are two processors: when one is executing, the other can be accessing the system bus or executing its own code. Adding to this favorable scenario, both AMD's and Intel's dual-core flagships are 64-bit, which increases the amount of data the CPU can process in one step.

Is it worth it?

There are subtle differences between the Intel and AMD dual-core systems (how they combine two cores onto one chip, and the speeds at which they run each core) that affect how much of a performance boost you can get from a dual-core CPU. Different types of programs also see differing benefits from a dual-core chip.

To utilize a dual-core processor, the operating system must be able to recognize multi-threading. An operating system with multi-threading support will take advantage of the dual core, because the scheduler has twice the CPU processing power to work with. The scheduler is the part of the operating system that tells the CPU which program to run at any given time. When we multi-task and run a lot of programs simultaneously, a computer can begin to seem slow, since the scheduler has to divert the computer's CPU resources in many directions. With a dual-core processor, the scheduler suddenly has twice as much CPU resource to work with. This would allow the scheduler to dedicate one core to, say, video editing, while using the other core for the "background" work that keeps the rest of the system running.

Software will only take full advantage of dual-core processing if it is written to use multiple threads, so that the cores can be served multi-threaded instructions in parallel. Without such multi-threading, the software will effectively use only one core. Adobe Photoshop is an example of multi-thread-aware software. The same approach is used with the multi-processor systems common to servers. If you are running a single program and it is not multi-threaded, you will not see a benefit from more than one CPU or core.

So dual-core processors offer immediate advantages for people looking to buy systems that boost multitasking computing power and improve the throughput of multithreaded applications. An Intel dual-core processor consists of two complete execution cores in one physical processor, both running at the same frequency. Both cores share the same packaging and the same interface with the chipset and memory. Overall, a dual-core processor offers a way of delivering more capabilities while balancing energy-efficient performance. It seems that dual-core processors are the first step toward the multi-core processor future.


Multi Core Power Consumption:

Power consumption levels are not only becoming an increasing concern in the desktop computing world; they are also bordering on unacceptable for embedded markets. What do designers mean when they refer to power? Until now, designers only had to consider the AC component of a device's power consumption. This dynamic power is consumed by the charging and discharging of gates, as described by the following equation:

P(AC) = n * C * V^2 * f

where

n is the fraction of transistors actively switching within the device,
C is the capacitance of all of the gates,
V is the transistor supply voltage, and
f is the core frequency.

Leakage problems:

Another component of the power equation is the leakage, or static, power, also commonly referred to as the DC power component. It is caused by the leakage current present when the transistor is in the off state, and it comprises two inherent transistor characteristics. The first is gate leakage, the tunnelling of carriers through the gate's oxide insulator. The second is a weak sub-threshold leakage current that causes the transistor to begin conducting before the threshold voltage is applied.

Reducing the gate width and its oxide thickness would reduce gate leakage; however, this is not an option, because the critical dimensions of gate width and oxide thickness are fixed to ensure correct operating characteristics of transistors for a given process geometry. For example, a 60-nm gate width and a 1.5-nm-thick oxide, typical for 130 nm, drop to around 50 nm and 1.2 nm respectively at 90 nm. Work is underway on a next-generation 65-nm process with high-k dielectric material for gate insulators, which would allow thicker layers, but these are more difficult to make than the silicon dioxide layers available currently. With a high-k material, the carriers have higher mobility, but because the thickness is increased, the overall relative mobility of the carriers remains the same, so the operational characteristics of the transistor are not affected.

The sub-threshold current decreases for increased threshold voltages. However, an increased threshold causes a loss of switching-speed performance, which translates to a hit in terms of clock frequency. The power equation now has a significant added static component and looks like this:

P = n * C * V^2 * f + V * I(LEAKAGE)

where I(LEAKAGE) is the total of the leakage currents present across all transistors (the static power being the product of the supply voltage and this leakage current).

The reduced operating voltage benefits of smaller 90-nm technology are counterbalanced by

the static power, no longer negligible, because leakage currents associated with smaller

geometry processes are more dominant. 90-nm leakage current is two to three times that of 

130 nm at the same voltage. These increased leakage currents are worse with smaller geometry

processes because shrinking the transistor reduces the distance between gate, drain, and

source. As a result, tunnelling carriers meet relatively low channel resistance, increasing their

mobility and creating larger leakage currents. These currents generate static powercomponents that can account for more than half of the total power in some 90-nm devices.

Smart techniques are being used to minimize such static power effects. Some designers are

developing high-performance processes that reduce power by reducing the supply voltage,

whereas others are designing a lower-power embedded process from the ground up. The latter

design exploits the static power relationship with threshold current by tightly controlling

threshold voltage at the individual functional block. This approach ensures that the threshold

voltage for a given block is specified according to the level of performance and hence the

frequency that it requires.

Silicon-on-insulator (SOI) technology offers even lower power and voltage operating conditions

at higher frequencies. SOI reduces parasitic capacitance by insulating the transistors, resulting

in up to 25% faster switching frequency, or as much as a 20% power reduction. These two

benefits can be traded off to achieve target frequency-to-power ratios. The effectiveness of the

90-nm SOI process is seen in its AC power reduction factor of 3.38, which offsets almost exactly

the leakage effects in 90 nm. The improvements in the process facilitate a reduced operating

voltage and a smaller die area that is proportional to capacitance. The additional die-size

reduction factor of 0.5 yields the following:

P(AC) = nCV^2·f

AC power reduction, 130 nm to 90 nm = (A_130nm × V^2_130nm) / (A_90nm × V^2_90nm)

= (1 × 1.3 × 1.3) / (0.5 × 1.0 × 1.0)

= 3.38 times
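As a quick check, the reduction factor above follows directly from the die-area and voltage figures quoted in the text; a minimal sketch in Python (the function and variable names are mine, not from the report):

```python
# AC power scales as n*C*V^2*f; capacitance is proportional to die area,
# so at fixed n and f the 130 nm -> 90 nm power ratio reduces to (A*V^2)/(A*V^2).
def ac_power_reduction(area_ratio, v_old, v_new):
    """Ratio of old to new dynamic power, given relative die area and voltages."""
    return (1.0 * v_old**2) / (area_ratio * v_new**2)

# Figures from the text: 0.5x die area at 90 nm, 1.3 V at 130 nm, 1.0 V at 90 nm.
factor = ac_power_reduction(area_ratio=0.5, v_old=1.3, v_new=1.0)
print(round(factor, 2))  # 3.38
```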


Balancing Power with Performance:

Doubling the frequency on a single core demands faster-switching transistors with higher operating voltages, increasing power dissipation. It also requires deeper pipelines. This increases the complexity of today's already complicated microprocessors and significantly increases latency following pipeline flushes for interrupts and mispredicted branches. Thus, performance is seriously impeded.

Additional clocking overhead is introduced at higher frequencies in terms of skew distribution, because there is effectively less cycle-period time relative to the skew, which remains almost fixed regardless of frequency. Because faster clocks result in smaller timing windows for system designers, dual-core designs can be less timing-sensitive and give system designers the chance to address power-sensitive markets while offering performance comparable to faster-clocking single-core devices. The fact that higher-performance superscalar discrete processors can introduce a hot spot exceeding 60 W on a small area makes such devices impractical in an embedded environment, where the power budget is typically limited to 10 W per core. In addition, such high-power devices cannot be cooled using a passive heat sink alone; a fan must be mounted on the device, and fans that meet the general embedded market's minimum requirement of 10-year reliability are expensive.
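The trade-off above can be made concrete with the dynamic-power relation P ∝ nCV²f from earlier. The 20% voltage increase assumed for the doubled-frequency core is illustrative only, not a figure from the report:

```python
# Dynamic power scales as n*C*V^2*f. Compare two ways to double throughput,
# relative to a baseline single core with capacitance, voltage and frequency = 1.

def relative_power(cap, voltage, freq):
    """Dynamic power relative to the baseline core (n is common and cancels)."""
    return cap * voltage**2 * freq

# Option 1: one core at 2x frequency. Assume (illustratively) the faster
# switching transistors need ~20% more supply voltage.
single_2x = relative_power(1.0, 1.2, 2.0)   # 2.88x baseline

# Option 2: two cores at the original frequency and voltage (2x capacitance).
dual_core = relative_power(2.0, 1.0, 1.0)   # 2.0x baseline

print(single_2x / dual_core)  # the dual core delivers the throughput for ~1.44x less power
```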


Multi Core Power Efficiency:


Features and Advantages of Multi Core Processing:

  Intel Wide Dynamic Execution:

An advancement enabling delivery of more instructions per clock cycle. Every execution core is 33% wider than previous generations, allowing each core to fetch, dispatch, execute and retire up to 4 instructions simultaneously.

  Intel Intelligent Power Capability:

A power-management strategy designed to reduce power consumption and design requirements. This feature manages the runtime power consumption of all processor execution cores. It includes an advanced power-gating capability that allows for an ultra-fine-grained logic control that turns on individual processor logic subsystems only if they are needed.

  Intel Smart Cache:

The multi-core-optimized cache significantly reduces latency to frequently used data. It improves performance and efficiency by increasing the probability that each execution core can access data from a higher-performance, more efficient cache subsystem.

  Intel Smart Memory Access:

This memory-access innovation enhances system performance by utilizing the available bandwidth of the memory subsystem and hiding its latency.

  Boost multitasking power with improved performance for highly multithreaded and compute-intensive applications.

  Reduce costs and use less power with energy-efficient processors.

  Enjoy the flexibility and performance to handle robust content creation with multimedia.
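The multitasking and multithreading benefits listed above are what parallel programming APIs expose to software. A minimal sketch using Python's standard multiprocessing module (the workload function is a made-up stand-in, not from the report):

```python
from multiprocessing import Pool, cpu_count

def heavy_task(n):
    """A stand-in for a compute-intensive job."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    jobs = [100_000] * 8
    # One worker process per core: each job can run on its own core in parallel,
    # rather than queuing behind the others on a single core.
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(heavy_task, jobs)
    print(len(results))  # 8
```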


Disadvantages of Multi Core Processing:

  Adjustments must be made to operating systems and other software to accommodate the structure and function of the CPU.

  The only efficient way to run multiple applications is multithreading, and some applications do not function well with this concept.

  Sharing the same bus and memory increases bandwidth contention, limiting the speed of the processor.

Uses of Multi Core Processing:

  Multitasking and multithreading: virtually eliminates high latency while running several programs, or background applications such as anti-virus software.

  The silicon surface area is used more effectively, making better use of supplies and driving down costs.

  The speed of normal processes is increased.

  More processing power than standard CPUs provide.


80 Core Processor:

Intel demonstrated the processor in San Francisco and the company will present a paper on the

project during the International Solid State Circuits Conference in the city this week.

The chip is capable of producing 1 trillion floating-point operations per second, known as a teraflop. That's a level of performance that required 2,500 square feet of large computers a decade ago.

Intel first disclosed it had built a prototype 80-core processor during last fall's Intel Developer

Forum, when CEO Paul Otellini promised to deliver the chip within five years. The company's

researchers have several hurdles to overcome before PCs and servers come with 80-core

processors--such as how to connect the chip to memory and how to teach software developers

to write programs for it--but the research chip is an important step.

A company called ClearSpeed has put 96 cores on a single chip. ClearSpeed's chips are used as co-processors with supercomputers that require a powerful chip for a very specific purpose. Intel's research chip has 80 cores, or "tiles," Rattner said. Each tile has a computing element and a router, allowing it to crunch data individually and transport that data to neighboring tiles.
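The tile-and-router arrangement just described forms a 2-D mesh. A toy sketch of how each tile finds the neighbors its router can forward data to (the 10 × 8 grid shape is an assumption for illustration; the report does not give the layout):

```python
# Model the 80 tiles as a 2-D mesh; each tile's router can forward data
# to its immediate north/south/east/west neighbors.
ROWS, COLS = 10, 8   # assumed layout: 10 x 8 = 80 tiles

def neighbors(row, col):
    """Tiles one hop away from (row, col) in the mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < ROWS and 0 <= c < COLS]

print(len(neighbors(0, 0)))  # a corner tile has 2 neighbors
print(len(neighbors(5, 4)))  # an interior tile has 4 neighbors
```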

Intel used 100 million transistors on the chip, which measures 275 square millimeters. By comparison, its Core 2 Duo chip uses 291 million transistors and measures 143 square millimeters. The chip was built using Intel's 65-nanometer manufacturing technology, but any likely product based on the design would probably use a future process based on smaller transistors. A chip the size of the current research chip is likely too large for cost-effective manufacturing.

The computing elements are very basic and do not use the x86 instruction set used by Intel and

Advanced Micro Devices' chips, which means Windows Vista can't be run on the research chip.

Instead, the chip uses a VLIW (very long instruction word) architecture, a simpler approach to

computing than the x86 instruction set.

There's also no way at present to connect this chip to memory. Intel is working on a stacked memory chip that it could place on top of the research chip, and it's talking to memory companies about next-generation designs for memory chips. Intel's researchers will then have to figure out how to create general-purpose processing cores that can handle the wide variety of applications in the world. The company is still looking at a five-year timeframe for product delivery.


But the primary challenge for an 80-core chip will be figuring out how to write software that

can take advantage of all that horsepower. The PC software community is just starting to get its

hands around multicore programming, although its server counterparts are a little further

ahead. Still, Microsoft, Apple and the Linux community have a long way to go before they'll be

able to effectively utilize 80 individual processing units with their PC operating systems.

"The operating system has the most control over the CPU, and it's got to change," said Jim

McGregor, an analyst at In-Stat. "It has to be more intelligent about breaking things up," he

said, referring to how tasks are divided among multiple processing cores.

"I think we're sort of all moving forward here together," Intel said. "As the core count grows

and people get the skills to use them effectively, these applications will come." Intel hopes to

make it easier by training its army of software developers on creating tools and libraries, he

said.

Intel demonstrated the chip running an application created for solving differential equations. At

3.16GHz and with 0.95 volts applied to the processor, it can hit 1 teraflop of performance while

consuming 62 watts of power. Intel constructed a special motherboard and cooling system for

the demonstration in a San Francisco hotel.
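From the demo figures just quoted, the chip's energy efficiency is easy to work out; a back-of-the-envelope check in Python:

```python
# Demo figures from the text: 1 teraflop (1e12 FLOP/s) at 62 W.
flops = 1e12
watts = 62.0

gflops_per_watt = (flops / 1e9) / watts
print(round(gflops_per_watt, 1))  # ~16.1 GFLOPS per watt
```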


CONCLUSION:

Multi Core Processing has a large number of advantages over current microprocessors. Not only does it tend to satisfy the ever-increasing demand for speed, it also manages power better than any of its predecessors.

Multi Core Processing increases speed and provides newer and faster platforms.

The number of cores on a die will increase over the next few years, and so will the software supporting this new advancement in technology. This will also pave the way for the enhancement of multitasking and multiprogramming.

This has rightfully created a revolution in the modern IT world and paved the way for further advancements.


BIBLIOGRAPHY:

  Advanced Microprocessors – K. M. Bhurchandi

  www.intel.com

  www.google.com

  www.dualcoretechnology.com

  http://multicore.amd.com

  Wikipedia