
Lecture 17: Memory Hierarchy and Cache Coherence

Concurrent and Multicore Programming

Department of Computer Science and Engineering
Yonghong Yan
yan@oakland.edu
www.secs.oakland.edu/~yan

Parallelism in Hardware

• Instruction-Level Parallelism
  – Pipeline
  – Out-of-order execution
  – Superscalar
• Thread-Level Parallelism
  – Chip multithreading, multicore
  – Coarse-grained and fine-grained multithreading
  – SMT
• Data-Level Parallelism
  – SIMD/Vector
  – GPU/SIMT

Computer Architecture, A Quantitative Approach, 5th Edition, Morgan Kaufmann, September 30, 2011, by John L. Hennessy and David A. Patterson

Topics (Part 2)

• Parallel architectures and hardware
  – Parallel computer architectures
  – Memory hierarchy and cache coherency
• Manycore GPU architectures and programming
  – GPU architectures
  – CUDA programming
  – Introduction to the offloading model in OpenMP and OpenACC
• Programming on large scale systems (Chapter 6)
  – MPI (point to point and collectives)
  – Introduction to PGAS languages, UPC and Chapel
• Parallel algorithms (Chapters 8, 9 & 10)
  – Dense matrix and sorting

Outline

• Memory, locality of reference, and caching
• Cache coherence in shared memory systems

Memory until now…

• We have relied on a very simple model of memory for most of this class
  – Main memory is a linear array of bytes that can be accessed given a memory address
  – We have also used registers to store values
• Reality is more complex: there is an entire memory system
  – Different memories exist at different levels of the computer
  – Each varies in speed, size, and cost

Random-Access Memory (RAM)

• Key features
  – RAM is packaged as a chip.
  – The basic storage unit is a cell (one bit per cell).
  – Multiple RAM chips form a memory.
• Static RAM (SRAM)
  – Each cell stores a bit with a six-transistor circuit.
  – Retains its value indefinitely, as long as it is kept powered.
  – Relatively insensitive to disturbances such as electrical noise.
  – Faster and more expensive than DRAM.

Random-Access Memory (RAM)

• Dynamic RAM (DRAM)
  – Each cell stores a bit with a capacitor and a transistor.
  – The value must be refreshed every 10-100 ms.
  – Sensitive to disturbances.
  – Slower and cheaper than SRAM.

Memory Modules… real-life DRAM

• In reality, several DRAM chips are bundled into memory modules
  – SIMMs - Single Inline Memory Modules
  – DIMMs - Dual Inline Memory Modules
  – DDR - Double Data Rate
    • Reads twice every clock cycle
  – Quad Pump: simultaneous R/W

Source for pictures: http://en.kioskea.net/contents/pc/ram.php3

SDR, DDR, Quad Pump

Memory Speeds

• Processor speeds: a 1 GHz processor has a 1 nsec cycle time.
• Memory speeds: roughly 50 nsec.
• Access speed gap
  – Instructions that store to or load from memory pay this gap.

  DIMM Module   Chip Type   Clock Speed (MHz)   Bus Speed (MHz)   Transfer Rate (MB/s)
  PC1600        DDR200      100                 200               1600
  PC2100        DDR266      133                 266               2133
  PC2400        DDR300      150                 300               2400
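As a quick check on the table (assuming the usual 64-bit, i.e. 8-byte, DDR data path, which the slide does not state): the transfer rate is the effective bus speed times 8 bytes per transfer, e.g. PC1600: 200 MHz x 8 B = 1600 MB/s, and PC2100: 266 MHz x 8 B ≈ 2133 MB/s.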

Memory Hierarchy (Review)

Smaller, faster, and costlier (per byte) storage devices sit at the top; larger, slower, and cheaper (per byte) storage devices sit toward the bottom:

L0: registers. CPU registers hold words retrieved from the L1 cache.
L1: on-chip L1 cache (SRAM). Holds cache lines retrieved from the L2 cache.
L2: off-chip L2 cache (SRAM). Holds cache lines retrieved from main memory.
L3: main memory (DRAM). Holds disk blocks retrieved from local disks.
L4: local secondary storage (local disks). Local disks hold files retrieved from disks on remote network servers.
L5: remote secondary storage (distributed file systems, Web servers).

Cache Memories (SRAM)

• Cache memories are small, fast SRAM-based memories managed automatically in hardware.
  – They hold frequently accessed blocks of main memory.
• The CPU looks first for data in L1, then in L2, then in main memory.
• Typical bus structure: the processor chip (register file, ALU, L1 cache) connects over the cache bus to the L2 cache, and through the bus interface and system bus to the I/O bridge, which reaches main memory over the memory bus.

How to Exploit the Memory Hierarchy

• Availability of memory
  – Cost, size, speed
• Principle of locality
  – Memory references are bunched together
  – A small portion of the address space is accessed at any given time
• Keep this space in high-speed memory
  – Problem: not all of it may fit

Types of Locality

• Temporal locality
  – Tendency to access locations recently referenced
• Spatial locality
  – Tendency to reference locations around those recently referenced
  – After location x, later references tend to be to x-k or x+k

Sources of Locality

• Temporal locality
  – Code within a loop
  – The same instructions are fetched repeatedly
• Spatial locality
  – Data arrays
  – Local variables on the stack
  – Data allocated in chunks (contiguous bytes)

  for (i = 0; i < N; i++) {
      A[i] = B[i] + C[i] * a;
  }

What does locality buy?

• It addresses the gap between CPU speed and RAM speed.
• Spatial and temporal locality imply that a subset of instructions and data can fit in high-speed memory from time to time.
• The CPU can then access instructions and data from this high-speed memory.
• A small high-speed memory can make the computer faster and cheaper.
• Speeds of 1-20 nsec at a cost of $50 to $100 per MByte.
• This is caching!

Inserting an L1 Cache Between the CPU and Main Memory

• The tiny, very fast CPU register file has room for four 4-byte words; the transfer unit between the CPU register file and the cache is a 4-byte word.
• The small, fast L1 cache has room for two 4-word blocks (lines 0 and 1); the transfer unit between the cache and main memory is a 4-word block (16 bytes).
• The big, slow main memory has room for many 4-word blocks (e.g., blocks 10, 21, 30, ...).

What information does a cache need?

• Cache: a smaller, faster storage device that acts as a staging area for a subset of the data in a larger, slower device.
• You essentially allow a smaller region of memory to hold data from a larger region. It is not a 1-1 mapping.
• What kind of information do we need to keep?
  – The actual data
  – Where the data actually comes from
  – Whether the data is even considered valid

Cache Organization

• Map each region of memory to a smaller region of cache
• Discard address bits
  – Discard lower-order bits (a)
  – Discard higher-order bits (b)
• Cache address size is 4 bits
• Memory address size is 8 bits
• In case of (a):
  – 0000xxxx is mapped to 0000 in the cache
• In case of (b):
  – xxxx0001 is mapped to 0001 in the cache

Finding Data in the Cache

• Part of the memory address is applied to the cache as an index
• The remaining bits are stored as a tag in the cache
• Lower-order bits are discarded
• Need to check address 00010011:
  – Cache index is 0001
  – Tag is 0011
• If the tag matches, it is a hit: use the data
• If there is no match, it is a miss: fetch the data from memory
• Each cache entry holds a valid bit, a tag, and the data

General Organization of a Cache Memory

• The cache is an array of S = 2^s sets.
• Each set contains E lines (one or more lines per set).
• Each line holds a block of B = 2^b data bytes, plus 1 valid bit and t tag bits.
• Cache size: C = B x E x S data bytes.

Addressing Caches

• An m-bit address A is split into three fields, from most to least significant: t tag bits <tag>, s set index bits <set index>, and b block offset bits <block offset>.
• The word at address A is in the cache if the tag bits in one of the valid lines in set <set index> match <tag>.
• The word contents begin at offset <block offset> bytes from the beginning of the block.
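To make the field layout concrete, here is a small sketch (mine, not from the slides) that extracts the three fields from a 32-bit address; the widths B_BITS and S_BITS are example values only:

  #include <stdint.h>
  #include <stdio.h>

  #define B_BITS 5      /* b: block offset bits (32-byte blocks)  */
  #define S_BITS 11     /* s: set index bits (2048 sets)          */

  int main(void) {
      uint32_t addr = 0x12345678u;                       /* example address */
      uint32_t block_offset = addr & ((1u << B_BITS) - 1u);
      uint32_t set_index    = (addr >> B_BITS) & ((1u << S_BITS) - 1u);
      uint32_t tag          = addr >> (B_BITS + S_BITS);
      printf("tag=0x%x set=%u offset=%u\n", tag, set_index, block_offset);
      return 0;
  }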

Direct-Mapped Cache

• The simplest kind of cache.
• Characterized by exactly one line per set (E = 1).
• Each set therefore holds one valid bit, one tag, and one cache block.

Accessing Direct-Mapped Caches

• Set selection
  – Use the set index bits of the address to determine the set of interest.

Accessing Direct-Mapped Caches

• Line matching and word selection
  – Line matching: find a valid line in the selected set with a matching tag.
  – Word selection: then extract the word.
• For the selected set (i):
  (1) The valid bit must be set.
  (2) The tag bits in the cache line must match the tag bits in the address.
  (3) If (1) and (2) hold, it is a cache hit, and the block offset selects the starting byte.

Example: Direct-Mapped Cache

• 32-bit address, 64 KB cache, 32-byte blocks.
• How many sets? How many bits for the tag? How many bits for the offset?
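A worked answer, assuming the cache is direct-mapped (E = 1 line per set) as the slide title states:

  Block offset: 32-byte blocks, so b = log2(32) = 5 bits.
  Sets: 64 KB / 32 B = 2048 sets, so s = log2(2048) = 11 bits.
  Tag: t = 32 - s - b = 32 - 11 - 5 = 16 bits.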

Write-through vs. Write-back

• What to do when an update occurs?
• Write-through: write to memory immediately
  – Simple to implement, synchronous write
  – Uniform latency on misses
• Write-back: write when the block is replaced
  – Requires an additional dirty (modified) bit
  – Asynchronous writes
  – Non-uniform miss latency
  – Clean miss: read from the lower level
  – Dirty miss: write to the lower level, then read (fill)

Writes and the Cache

• Reading information from a cache is straightforward. What about writing?
  – What if you are writing data that is already cached (write hit)?
  – What if the data is not in the cache (write miss)?
• Dealing with a write hit:
  – Write-through: immediately write the data back to memory.
  – Write-back: defer the write to memory for as long as possible.
• Dealing with a write miss:
  – Write-allocate: load the block into the cache and update it there.
  – No-write-allocate: write directly to memory.
• Benefits? Disadvantages?
• Write-through caches are typically no-write-allocate.
• Write-back caches are typically write-allocate.
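A minimal sketch (my own, not from the slides) of the write-back, write-allocate path for a one-byte store; the line structure, helper names, and toy simulated memory are all assumptions for illustration:

  #include <stdbool.h>
  #include <stdint.h>

  #define B_BITS 5                          /* 32-byte blocks               */
  #define S_BITS 11                         /* 2048 sets, direct-mapped     */
  #define BLOCK_SIZE (1u << B_BITS)

  struct line {
      bool     valid;
      bool     dirty;
      uint32_t tag;
      uint8_t  data[BLOCK_SIZE];
  };

  /* Toy next level of the hierarchy; addresses assumed < 1 MB here. */
  static uint8_t memory_sim[1u << 20];
  static void mem_read_block(uint32_t a, uint8_t *buf) {
      for (uint32_t i = 0; i < BLOCK_SIZE; i++) buf[i] = memory_sim[a + i];
  }
  static void mem_write_block(uint32_t a, const uint8_t *buf) {
      for (uint32_t i = 0; i < BLOCK_SIZE; i++) memory_sim[a + i] = buf[i];
  }

  /* Write-back + write-allocate handling of a one-byte store. */
  void cache_store(struct line *sets, uint32_t addr, uint8_t value) {
      uint32_t offset = addr & (BLOCK_SIZE - 1u);
      uint32_t set    = (addr >> B_BITS) & ((1u << S_BITS) - 1u);
      uint32_t tag    = addr >> (B_BITS + S_BITS);
      struct line *ln = &sets[set];

      if (!ln->valid || ln->tag != tag) {                    /* write miss   */
          if (ln->valid && ln->dirty) {                      /* dirty victim */
              uint32_t victim = (ln->tag << (B_BITS + S_BITS)) | (set << B_BITS);
              mem_write_block(victim, ln->data);             /* write back   */
          }
          mem_read_block(addr & ~(BLOCK_SIZE - 1u), ln->data); /* allocate   */
          ln->valid = true;
          ln->tag   = tag;
      }
      ln->data[offset] = value;          /* update in the cache             */
      ln->dirty = true;                  /* memory update deferred          */
  }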

Multi-Level Caches

• Options: separate data and instruction caches, or a unified cache.
• Typical sizes, speeds, and costs (larger, slower, cheaper as you move down):

  Registers:                ~200 B     3 ns                 8 B line
  L1 i-cache / d-cache:     8-64 KB    3 ns                 32 B line
  Unified L2 cache (SRAM):  1-4 MB     6 ns    $100/MB      32 B line
  Main memory (DRAM):       128 MB     60 ns   $1.50/MB     8 KB line
  Disk:                     30 GB      8 ms    $0.05/MB

Cache Performance Metrics

• Miss rate
  – Fraction of memory references not found in the cache (misses / references)
  – Typical numbers:
    • 3-10% for L1
    • can be quite small (e.g., < 1%) for L2, depending on size, etc.
• Hit time
  – Time to deliver a line in the cache to the processor (includes the time to determine whether the line is in the cache)
  – Typical numbers:
    • 1 clock cycle for L1
    • 3-8 clock cycles for L2
• Miss penalty
  – Additional time required because of a miss
    • Typically 25-100 cycles for main memory
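A standard way to combine these metrics (not stated on the slide, but it follows directly from them) is the average memory access time:

  AMAT = hit time + miss rate x miss penalty

For example, a 1-cycle L1 hit time, a 5% miss rate, and a 100-cycle miss penalty give 1 + 0.05 x 100 = 6 cycles on average.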

Writing Cache-Friendly Code

• Repeated references to variables are good (temporal locality).
• Stride-1 reference patterns are good (spatial locality).
• Examples: cold cache, 4-byte words, 4-word cache blocks.

  int sumarrayrows(int a[M][N]) {
      int i, j, sum = 0;
      for (i = 0; i < M; i++)
          for (j = 0; j < N; j++)
              sum += a[i][j];
      return sum;
  }
  /* Miss rate = 1/4 = 25% */

  int sumarraycols(int a[M][N]) {
      int i, j, sum = 0;
      for (j = 0; j < N; j++)
          for (i = 0; i < M; i++)
              sum += a[i][j];
      return sum;
  }
  /* Miss rate = 100% */

Matrix Multiplication Example

• Major cache effects to consider
  – Total cache size
    • Exploit temporal locality and blocking
  – Block size
    • Exploit spatial locality
• Description:
  – Multiply N x N matrices
  – O(N^3) total operations
  – Accesses
    • N reads per source element
    • N values summed per destination
      – but the running sum may be held in a register

  /* ijk */
  for (i = 0; i < n; i++) {
      for (j = 0; j < n; j++) {
          sum = 0.0;
          for (k = 0; k < n; k++)
              sum += a[i][k] * b[k][j];
          c[i][j] = sum;        /* variable sum held in a register */
      }
  }

Miss Rate Analysis for Matrix Multiply

• Assume:
  – Line size = 32 bytes (big enough for four 64-bit words)
  – Matrix dimension (N) is very large
    • Approximate 1/N as 0.0
  – The cache is not even big enough to hold multiple rows
• Analysis method:
  – Look at the access pattern of the inner loop

Layout of C Arrays in Memory (review)

• C arrays are allocated in row-major order
  – each row occupies contiguous memory locations
• Stepping through columns in one row:
  – for (i = 0; i < N; i++) sum += a[0][i];
  – accesses successive elements
  – if block size (B) > 4 bytes, exploits spatial locality
    • compulsory miss rate = 4 bytes / B
• Stepping through rows in one column:
  – for (i = 0; i < n; i++) sum += a[i][0];
  – accesses distant elements
  – no spatial locality!
  – compulsory miss rate = 1 (i.e. 100%)
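A small sketch (mine, not from the slides) that makes the row-major rule concrete: element a[i][j] of an int a[M][N] lives (i*N + j) elements from the start of the array:

  #include <assert.h>

  #define M 4
  #define N 6

  int main(void) {
      int a[M][N];
      /* Row-major layout: &a[i][j] == &a[0][0] + (i*N + j). */
      for (int i = 0; i < M; i++)
          for (int j = 0; j < N; j++)
              assert(&a[i][j] == &a[0][0] + (i * N + j));
      return 0;
  }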

Matrix Multiplication (ijk)

Matrix Multiplication (jik)

Matrix Multiplication (kij)

Matrix Multiplication (ikj)

Matrix Multiplication (jki)

Matrix Multiplication (kji)

Summary of Matrix Multiplication

ijk (& jik): 2 loads, 0 stores; misses/iter = 1.25

  for (i = 0; i < n; i++) {
      for (j = 0; j < n; j++) {
          sum = 0.0;
          for (k = 0; k < n; k++)
              sum += a[i][k] * b[k][j];
          c[i][j] = sum;
      }
  }

kij (& ikj): 2 loads, 1 store; misses/iter = 0.5

  for (k = 0; k < n; k++) {
      for (i = 0; i < n; i++) {
          r = a[i][k];
          for (j = 0; j < n; j++)
              c[i][j] += r * b[k][j];
      }
  }

jki (& kji): 2 loads, 1 store; misses/iter = 2.0

  for (j = 0; j < n; j++) {
      for (k = 0; k < n; k++) {
          r = b[k][j];
          for (i = 0; i < n; i++)
              c[i][j] += a[i][k] * r;
      }
  }
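Blocking (tiling), mentioned on the matrix multiplication example slide and again in the concluding observations, goes beyond loop reordering: it reuses sub-blocks of a, b, and c while they are cache-resident. A minimal sketch (assuming c is zero-initialized, n is a multiple of the tile size BS, and BS is tuned so three BS x BS tiles of doubles fit in the cache):

  #define BS 32   /* tile size: tune so 3 * BS * BS * sizeof(double) fits in cache */

  void matmul_blocked(int n, double a[n][n], double b[n][n], double c[n][n]) {
      for (int ii = 0; ii < n; ii += BS)
          for (int kk = 0; kk < n; kk += BS)
              for (int jj = 0; jj < n; jj += BS)
                  /* multiply the (ii,kk) tile of a by the (kk,jj) tile of b
                     and accumulate into the (ii,jj) tile of c */
                  for (int i = ii; i < ii + BS; i++)
                      for (int k = kk; k < kk + BS; k++) {
                          double r = a[i][k];
                          for (int j = jj; j < jj + BS; j++)
                              c[i][j] += r * b[k][j];
                      }
  }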

Outline

• Memory, locality of reference, and caching
• Cache coherence in shared memory systems

Shared Memory Systems

• All processes have access to the same address space
  – e.g., a PC with more than one processor
• Data exchange between processes by writing/reading shared variables
  – Shared memory systems are easy to program
  – Current standard in scientific programming: OpenMP
• Two versions of shared memory systems are available today
  – Centralized shared memory architectures
  – Distributed shared memory architectures

Centralized Shared Memory Architecture

• Also referred to as Symmetric Multi-Processors (SMP)
• All processors share the same physical main memory
• Memory bandwidth per processor is the limiting factor for this type of architecture
• Typical size: 2-32 processors

Centralized Shared Memory System (I)

• Intel X7350 quad-core (Tigerton)
  – Private L1 cache: 32 KB instruction, 32 KB data
  – Shared L2 cache: 4 MB unified cache
  – Each L2 is shared by a pair of cores; the chip sits on a 1066 MHz FSB

Centralized Shared Memory Systems (II)

• Intel X7350 quad-core (Tigerton) multi-processor configuration
  – Four sockets, 16 cores in total; in each socket, pairs of cores share an L2 cache
  – Each socket connects to the Memory Controller Hub (MCH) over an 8 GB/s link
  – The MCH connects to four memory channels

Distributed Shared Memory Architectures

• Also referred to as Non-Uniform Memory Architectures (NUMA)
• Some memory is closer to a certain processor than other memory
  – The whole memory is still addressable from all processors
  – Depending on which data item a processor retrieves, the access time might vary strongly

NUMA Architectures (II)

• Reduces the memory bottleneck compared to SMPs
• More difficult to program efficiently
  – e.g., first-touch policy: a data item will be located in the memory of the processor which uses that data item first
• To reduce the effects of non-uniform memory access, caches are often used
  – ccNUMA: cache-coherent non-uniform memory access architectures
• Largest example as of today: SGI Origin with 512 processors

Distributed Shared Memory Systems

Cache Coherence

• Real-world shared memory systems have caches between memory and CPU
• Copies of a single data item can exist in multiple caches
• Modification of a shared data item by one CPU leads to outdated copies in the cache of another CPU
  – The original data item lives in memory; copies of it sit in the caches of CPU0 and CPU1

Cache Coherence (II)

• Typical solution:
  – Caches keep track of whether a data item is shared between multiple processes
  – Upon modification of a shared data item, a 'notification' of other caches has to occur
  – Other caches will have to reload the shared data item on the next access into their cache
• Cache coherence is only an issue when multiple tasks access the same item
  – Multiple threads
  – Multiple processes with a joint shared memory segment
  – A process being migrated from one CPU to another

Cache Coherence Protocols

• Snooping protocols
  – Send all requests for data to all processors
  – Processors snoop a bus to see if they have a copy and respond accordingly
  – Requires broadcast, since caching information is at the processors
  – Works well with a bus (natural broadcast medium)
  – Dominates for centralized shared memory machines
• Directory-based protocols
  – Keep track of what is being shared in a centralized location
  – Distributed memory => distributed directory for scalability (avoids bottlenecks)
  – Send point-to-point requests to processors via the network
  – Scales better than snooping
  – Commonly used for distributed shared memory machines
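A rough sketch (my own simplification, not from the slides) of the per-line state machine a snooping protocol such as MSI maintains; each cache updates a line's state both on its own processor's accesses and on transactions it snoops on the bus (the real protocol also issues the bus requests and supplies data, which is omitted here):

  typedef enum { INVALID, SHARED, MODIFIED } msi_state;

  /* State change when the local processor accesses the line.
     A write from INVALID or SHARED broadcasts an invalidation first. */
  msi_state on_processor_access(msi_state s, int is_write) {
      if (is_write)
          return MODIFIED;            /* gain exclusive ownership          */
      return (s == INVALID) ? SHARED  /* read miss: fetch a shared copy    */
                            : s;      /* read hit: state unchanged         */
  }

  /* State change when this cache snoops another processor's request. */
  msi_state on_snooped_request(msi_state s, int other_is_write) {
      if (other_is_write)
          return INVALID;             /* another CPU will write: drop copy */
      if (s == MODIFIED)
          return SHARED;              /* flush data, downgrade to shared   */
      return s;
  }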

Categories of Cache Misses

• Up to now:
  – Compulsory misses: the first access to a block cannot hit in the cache (cold-start misses)
  – Capacity misses: the cache cannot contain all blocks required for the execution
  – Conflict misses: a cache block has to be discarded because of the block replacement strategy
• In multi-processor systems:
  – Coherence misses: a cache block has to be discarded because another processor modified the content
    • true sharing miss: another processor modified the content of the requested element
    • false sharing miss: another processor invalidated the block, although the actual item of interest is unchanged

Bus Snooping Topology

Larger Shared Memory Systems

• Typically distributed shared memory systems
• Local or remote memory access via the memory controller
• A directory that tracks the state of every block in every cache
  – Which caches have a copy of the block, dirty vs. clean, ...
• Info per memory block vs. per cache block?
  – PLUS: in memory => simpler protocol (centralized / one location)
  – MINUS: in memory => directory size is f(memory size) vs. f(cache size)
• Prevent the directory from becoming a bottleneck? Distribute directory entries with the memory, each keeping track of which processors have copies of their blocks

Distributed Directory MPs

False Sharing in OpenMP

• False sharing
  – When at least one thread writes to a cache line while others access it
    • Thread 0: = A[1] (read)
    • Thread 1: A[0] = ... (write)
• Solution: use array padding, as in the second version below (an alternative using an aligned struct is sketched after this slide)

  int a[max_threads];
  #pragma omp parallel for schedule(static,1)
  for (int i = 0; i < max_threads; i++)
      a[i] += i;

  int a[max_threads][cache_line_size];
  #pragma omp parallel for schedule(static,1)
  for (int i = 0; i < max_threads; i++)
      a[i][0] += i;
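An alternative to the two-dimensional padding array (my own sketch; the 64-byte line size, the struct, and the array bound are assumptions for illustration) is to give each thread's element its own aligned, line-sized slot:

  #include <omp.h>

  #define CACHE_LINE 64                        /* assumed line size in bytes   */

  struct padded_int {
      _Alignas(CACHE_LINE) int value;          /* C11 alignment specifier      */
      char pad[CACHE_LINE - sizeof(int)];      /* keep slots on separate lines */
  };

  struct padded_int a[64];                     /* one slot per thread          */

  void update(int max_threads) {
      #pragma omp parallel for schedule(static,1)
      for (int i = 0; i < max_threads; i++)
          a[i].value += i;
  }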

False Sharing

(Source: Getting OpenMP Up To Speed, RvdP, Tutorial IWOMP 2010, CCS Univ. of Tsukuba, June 14, 2010)

• A store into a shared cache line invalidates the other copies of that line: the system is not able to distinguish between changes within one individual line.
• (Figure: threads T0 and T1 on different CPUs each cache the line holding array A.)

NUMA and First Touch Policy

• Data placement policy on NUMA architectures
• First Touch Policy
  – The process that first touches a page of memory causes that page to be allocated in the node on which the process is running

NUMA First-touch placement/1

(Figure: a generic cc-NUMA architecture; slides from Getting OpenMP Up To Speed, RvdP, Tutorial IWOMP 2010)

• int a[100]; only reserves the virtual memory addresses
• for (i = 0; i < 100; i++) a[i] = 0; executed by a single thread
  – First touch: all array elements a[0] : a[99] end up in the memory of the processor executing this thread

NUMA First-touch placement/2

• The same initialization, done in parallel:

  #pragma omp parallel for num_threads(2)
  for (i = 0; i < 100; i++) a[i] = 0;

  – First touch: both memories each have "their half" of the array (a[0] : a[49] and a[50] : a[99])

Work with First-Touch in OpenMP

• First-touch in practice
  – Initialize data consistently with the computations

  #pragma omp parallel for
  for (i = 0; i < N; i++) {
      a[i] = 0.0; b[i] = 0.0; c[i] = 0.0;
  }
  readfile(a, b, c);
  #pragma omp parallel for
  for (i = 0; i < N; i++) {
      a[i] = b[i] + c[i];
  }

Concluding Observations

• The programmer can optimize for cache performance
  – How data structures are organized
  – How data are accessed
    • Nested loop structure
    • Blocking is a general technique
• All systems favor "cache-friendly code"
  – Getting absolute optimum performance is very platform specific
    • Cache sizes, line sizes, associativities, etc.
  – Can get most of the advantage with generic code
    • Keep the working set reasonably small (temporal locality)
    • Use small strides (spatial locality)
  – Work with the cache coherence protocol and the NUMA first-touch policy

References

• Computer Architecture, A Quantitative Approach, 5th Edition, Morgan Kaufmann, September 30, 2011, by John L. Hennessy and David A. Patterson.
• A Primer on Memory Consistency and Cache Coherence, Daniel J. Sorin, Mark D. Hill, and David A. Wood, Synthesis Lectures on Computer Architecture (Mark D. Hill, Series Editor), 2011.
