59
Parallel Compu,ng in R BioC 2009, Sea7le, July 2009

Parallel Programing in R - Revolutions

  • Upload
    vanbao

  • View
    222

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Parallel Programing in R - Revolutions

ParallelCompu,nginR

BioC2009,Sea7le,July2009

Page 2: Parallel Programing in R - Revolutions

WhoareREvolu,onCompu,ng?

REvolu,onCompu,ng:–  Isacommercialopen‐sourcecompany,foundedin2007–  ProvidesservicesandproductsbasedonR

•  The“RedHat”®forR–  Producesfreeandsubscrip,on‐basedhigh‐performance,enhanceddistribu,onsofR

–  Offerssupport,training,valida,onandotherservicesaroundR

–  Hasexper,seinhigh‐performanceanddistributedcompu,ng

–  IsafinancialandtechnicalcontributortotheRcommunity–  Hasopera,onsinNewHaven,Sea7le,andSanFrancisco

Page 3: Parallel Programing in R - Revolutions

The People of REvolution

•  Mar$nSchultz,ChiefScien,ficOfficer(ArthurKWatsonProfessorofComputerScience,YaleUniversity;founderofScien,ficCompu,ngAssociates;researchinalgorithmdesign,parallelprogrammingenvironmentsandarchitectures)

•  DavidSmith,DirectorofCommunity&Rblogger(co‐authorofAnIntroduc+ontoR,ESS)

•  BryanLewis,AmbassadorofCool(akaDirectorofSystemsEngineering;appliedmathinterestsinnumericalanalysisofinverseproblems;formerCEOofRocketcalc)

•  DaneseCooper,OpenSourceDiva(boardofdirectors,OpenSourceIni,a,ve;member,ApacheSo`wareFounda,on;advisoryboard,Mozilla.org;previouslyseniordirector,opensourcestrategiesatIntelandSun)

•  SteveWeston,SeniorResearchScien,st,DirectorofEngineering(REvolu,onandScien,ficCompu,ngAssociates;developmentofNetWorkSpaces–parallelprogrammingwithR,Python,Ruby,andMatlab–NetworkLinda,Paradise,andPiranha).

•  JayEmerson,DeptSta,s,cs,YaleUniversity(authorofbigmemorypackage)

Page 4: Parallel Programing in R - Revolutions

WhatisREvolu,onR?

•  REvolu,onRisthefreedistribu,onofR–  Op,mizedforspeed–  Usesmul,pleCPUs/coresforperformance–  ForWindowsandMacOS(soon:Ubuntu)–  Supportviacommunityforums

•  REvolu,onREnterpriseisourenhanced,subscrip,on‐onlydistribu,onofR–  Telephone/emailsupportfromrealRexperts–  Suitableforuseinregulated/validatedenvironments–  IncludesproprietaryParallelRpackagesforreliabledistributedcompu,ngwithR

•  onclustersorinthecloud–  Supportedon64‐bitWindows,Linux

Page 5: Parallel Programing in R - Revolutions

Suppor,ngtheRCommunity

Weareanopensourcecompanysuppor,ngtheRcommunity:

•  BenefactorofRFounda$on•  FinancialsupporterofRconferencesandusergroups•  Newfunc$onalitydevelopedincoreRtocontributedunderGPL

•  64‐bitWindowssupport•  Step‐debuggingsupport

•  REvangelism

“Revolu$ons”Blog:blog.revolu,on‐compu,ng.comDailyNewsaboutR,sta1s1cs,andopen‐source

5

Page 6: Parallel Programing in R - Revolutions

Intoday’slab:

•  Introduc,ontoParallelProcessing•  Mul,‐ThreadedProcessing

– Compu,ngontheGPU

•  Iterators•  Theforeachloop•  Usingmul,plecores:SMP

•  ClusterCompu,ng

•  Mul,‐Stratumparallelism

•  Q&A/Exercises6

Page 7: Parallel Programing in R - Revolutions

GemngStarted

•  R2.10.x,and2.9.xonWindows:install.packages("foreach",type="source")

install.packages("iterators",type="source")

•  R2.9.x,Mac/Linuxonly:install.packages("doMC")

require(doMC)

•  Windows/Mac:–  InstallREvolu,onREnterprise2.0(R2.7.2)require(doNWS)

7

Page 8: Parallel Programing in R - Revolutions

8

Introduc,ontoParallelProcessing

WithanasidetoHigh‐PerformanceCompu,ng

Page 9: Parallel Programing in R - Revolutions

HPCo`enmeansefficientlyexploi,ngspecializedhardware

ImagescopyrightCray,Xlinix,NVIDIAfromupper‐left,clockwise.

WhatisHigh‐PerformanceCompu,ng(HPC)?

Page 10: Parallel Programing in R - Revolutions

Imagefromncbr.sdsc.edu

WhatisHigh‐PerformanceCompu,ng(HPC)?

•  Thesedays,HPCisfrequentlyassociatedwithCOTS*clustercompu,ngandwithSIMDvectoriza,onandpipelining(GPUs)

*Commodity,offtheshelf

•  New:cloudcompu,ng

Page 11: Parallel Programing in R - Revolutions

•  HPCiso`enconcernedwithmul,‐processing(parallel

processing),thecoördina,onof

mul,ple,simultaneously

running(sub)programs

–  Threads

–  Processes

–  Clusters

ImageCopyrightLawrenceLivermoreNationalLab

WhatisHigh‐PerformanceCompu,ng(HPC)?

Page 12: Parallel Programing in R - Revolutions

HPCo`eninvolveseffec,velymanaginghugedatasets

–  Parallelfilesystems(GPFS,PVFS2,Lustre,GFS2,S3…)

–  Paralleldataopera,ons(map‐reduce)

–  Workingwithhigh‐performancedatabases

–  bigmemorypackageinR

ImageCopyrightHP(amulti‐petabytestoragesystem)

WhatisHigh‐PerformanceCompu,ng(HPC)?

Page 13: Parallel Programing in R - Revolutions

ATaxonomyofParallelProcessing

•  Mul$‐node/cluster/cloudcompu$ng(heavyweightprocesses)–  Memorydistributedacrossnetwork–  Examples:foreach,SNOW,Rmpi,batchprocessing

•  Mul$‐core/mul$‐processorcompu$ng(heavyweightprocesses)–  SMP:SymmetricMul,‐Processing–  Independentmemoryinsharedspace–  Naturallyscalestomul,‐nodeprocessing–  Examples:mul,core(Windows/Unix),foreach

•  Mul$‐threadedprocessing(lightweightprocesses)–  Usuallysharedmemory–  Hardertoscaleoutacrossnetworks–  Examples:threadedlinear‐algebralibrariesforR(ATLAS,MKL);GPU

processors(CUDA/NVIDIA;ct/INTEL)

13

Page 14: Parallel Programing in R - Revolutions

14

Mul,‐ThreadedProcessing

Page 15: Parallel Programing in R - Revolutions

Whatisthreadedprogramming?

•  Athreadisakindofprocessthatsharesaddressspacewithitsparentprocess

•  Created,destroyed,managedandsynchronizedinCcode– POSIXthreads– OpenMPthreads

•  Fast,butdifficulttoprogram– Easytooverwritevariables– Needtoworryaboutsynchroniza,on

15

Page 16: Parallel Programing in R - Revolutions

Exploi,ngthreadswithR

•  RlinkstoBLAS(BasicLinearAlgebraSubprograms)librariesforefficientvector/matrixopera,ons

•  Linux:NeedtocompileandlinkwiththreadedBLAS(ATLAS)

•  Windows/Mac:REvolu,onRlinkedtoIntelMKLlibraries,usesasmanythreadsascores– Manyhigher‐levelopera,onsop,mizedaswell

•  MacOS:CRANbinaryusesveclibBLAS–  threaded,pre7ygoodperformance

16

Page 17: Parallel Programing in R - Revolutions

REvolution R SVD Performance

Exampledatamatrix150,000x500fast.svd

Quad‐coreIntelCore2CPU,WindowsVista64‐bitWorkstation

Revolu,onRPerformance

Mul,‐ThreadedProcessing

17

Page 18: Parallel Programing in R - Revolutions

18

GPUProgramming

Page 19: Parallel Programing in R - Revolutions

WhatisaGPU?

•  Dedicatedprocessingchip(orcard)dedicatedtofastfloa,ng‐pointopera,ons– Originallyfor3‐Dgraphicscalcula,ons

•  Highlyparallel:100’sofprocessorsonasinglechip,capableofrunning1000’softhreads

•  Usuallyincludesdedicatedhigh‐speedRAM,accessibleonlybyGPU– Needtotransferdatain/out

•  ProgrammeddirectlyusingcustomCdialect/compilers

•  >90%ofnewdesktops/laptopshaveanintegratedGPU

19

Page 20: Parallel Programing in R - Revolutions

GeForce8800GT

•  LaunchedOct29,2007•  512Mbof256‐bitmemory•  128processors•  512simultaneousthreads•  <$200

•  DownloadNVIDIACUDATools:– h7p://www.nvidia.com/object/cuda_home.html

•  Tutorial– h7p://www.ddj.com/architect/207200659

20

Page 21: Parallel Programing in R - Revolutions

PerformanceComparison

Method Time(seconds)

BaseRconvolvefunc,on 9.89

AMDACML 6.29

FFTW(8threads) 3.75

CUDAonGeForce8800GT 1.88(singleprecision)

21

  Convolve2vectorsoflength2^22  60Mbofdata

  Quaddual‐coreprocessor/GPU

Page 22: Parallel Programing in R - Revolutions

22

IntroducingIterators

Page 23: Parallel Programing in R - Revolutions

ThoughtExperiment:DrawingCards

•  Youaretheteacherof10grade‐schoolpupils.•  Classproject:draweachofthe52playingcardsasaposter.

•  Eachchildhassuppliesofposterpaperandcrayons,butrequiresareferencecardtocopy.

•  Howtoorganizethepupils?

23

Page 24: Parallel Programing in R - Revolutions

Iterators

> require(iterators) •  Generalizedloopvariable•  Valueneednotbeatomic

– Rowofamatrix– Randomdataset– Chunkofadatafile– Recordfromadatabase

•  Createwith:iter •  Getvalueswith:nextElem •  Usedasindexingargumentwithforeach

Page 25: Parallel Programing in R - Revolutions

Iteratorsarememoryfriendly

•  Allowdatatobesplitintomanageablepiecesonthefly

•  Helpsalleviateproblemswithprocessinglargedatastructures

•  Piecescanbeprocessedinparallel

Page 26: Parallel Programing in R - Revolutions

Iteratorsactasadaptors

•  Allowsyourdatatobeprocessedbyforeachwithoutbeingconverted

•  Caniterateovermatricesanddataframesbyroworbycolumn:

it <- iter(Boston, by="row") nextElem(it)

Page 27: Parallel Programing in R - Revolutions

NumericIterator

> i <- iter(1:3) > nextElem(i) [1] 1

> nextElem(i) [1] 2 > nextElem(i) [1] 3 > nextElem(i) Error: StopIteration

Page 28: Parallel Programing in R - Revolutions

Longsequences

> i <- icount(1e9) > nextElem(i) [1] 1 > nextElem(i) [1] 2 > nextElem(i) [1] 3 > nextElem(i) [1] 4 > nextElem(i) [1] 5

Page 29: Parallel Programing in R - Revolutions

Matrixdimensions

> M <- matrix(1:25,ncol=5) > r <- iter(M,by="row") > nextElem(r) [,1] [,2] [,3] [,4] [,5] [1,] 1 6 11 16 21 > nextElem(r) [,1] [,2] [,3] [,4] [,5] [1,] 2 7 12 17 22 > nextElem(r) [,1] [,2] [,3] [,4] [,5] [1,] 3 8 13 18 23

Page 30: Parallel Programing in R - Revolutions

DataFile

> rec <- iread.table("MSFT.csv",sep=",", header=T, row.names=NULL) > nextElem(rec) MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted 1 29.91 30.25 29.4 29.86 76935100 28.73 > nextElem(rec) MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted 1 29.7 29.97 29.44 29.81 45774500 28.68 > nextElem(rec) MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted 1 29.63 29.75 29.45 29.64 44607200 28.52 > nextElem(rec) MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted 1 29.65 30.1 29.53 29.93 50220200 28.8

30

Page 31: Parallel Programing in R - Revolutions

Database

> library(RSQLite) > m <- dbDriver('SQLite') > con <- dbConnect(m, dbname="arrests") > it <- iquery(con, 'select * from USArrests', n=10) > nextElem(it) Murder Assault UrbanPop Rape Alabama 13.2 236 58 21.2 Alaska 10.0 263 48 44.5 Arizona 8.1 294 80 31.0 Arkansas 8.8 190 50 19.5 California 9.0 276 91 40.6 Colorado 7.9 204 78 38.7 Connecticut 3.3 110 77 11.1 Delaware 5.9 238 72 15.8 Florida 15.4 335 80 31.9 Georgia 17.4 211 60 25.8

31

Page 32: Parallel Programing in R - Revolutions

Infinite&Irregularsequences

iprime <- function() { lastPrime <- 1 nextEl <- function() { lastPrime <<- as.numeric(nextprime(lastPrime)) lastPrime } it <- list(nextElem=nextEl) class(it) <- c('abstractiter','iter') it}

> require(gmp) > p <- iprime() > nextElem(p) [1] 2 > nextElem(p) [1] 3

Page 33: Parallel Programing in R - Revolutions

33

Loopingwithforeach

Page 34: Parallel Programing in R - Revolutions

Loopingwithforeach

foreach (var=iterator) %dopar% { statements }

  Evaluatestatementsuntiliteratorterminates  statementswillreferencevariablevar   Valuesof{ … }blockcollectedintoalist

  Runssequentially(bydefault)(orforcewith%do% )

34

Page 35: Parallel Programing in R - Revolutions

> foreach (j=1:4) %dopar% sqrt (j)

[[1]] [1] 1

[[2]] [1] 1.414214

[[3]] [1] 1.732051

[[4]] [1] 2

Warning message: executing %dopar% sequentially: no parallel backend registered

35

Page 36: Parallel Programing in R - Revolutions

CombiningResults> foreach(j=1:4, .combine=c) %dopar% sqrt(j) [1] 1.000000 1.414214 1.732051 2.000000

> foreach(j=1:4, .combine='+’, .inorder=FALSE) %dopar% sqrt(j)

[1] 6.146264

  Whenorderofevaluationisunimportant,use.inorder=FALSE

36

Page 37: Parallel Programing in R - Revolutions

Referencingglobalvariables> z <- 2 > f <- function (x) sqrt (x + z)

> foreach (j=1:4, .combine='+') %dopar% f(j)

[1] 8.417609

  foreachautomaticallyinspectscodeandensuresunboundobjectsarepropagatedtotheevaluationenvironment

37

Page 38: Parallel Programing in R - Revolutions

Nestedforeachexecu,on

•  foreach opera,onscanbenestedusing%:%operator

•  Allowsparallelexecu,onacrossmul,pleloopslevels,“unrolling”theinnerloops

foreach(i=1:3, .combine=cbind) %:% foreach(j=1:3, .combine=c) %dopar% (i + j)

Page 39: Parallel Programing in R - Revolutions

39

Speedingupcodewithforeach

SMPProcessing

Page 40: Parallel Programing in R - Revolutions

Quickreviewofparallelanddistributedcompu,nginR

•  NetWorkSpaces(packagenws;SMP,distributed)–  GPL,alsocommerciallysupportedbyREvolu,onCompu,ng–  Verycross‐pla�orm,distributedshared‐memoryparadigm–  Fault‐tolerant

•  Mul,Core(packagemulticore;SMPonly)–  Linux/MacOS(requiresPOSIX)–  UsesforktocreatenewRprocesses

•  Rmpi(packageRmpi;SMP,distributed)–  Fine‐grainedcontrolallowsveryhigh‐performancecalcula,ons–  Canbetrickytoconfigure–  LimitedWindowsandheterogeneousclustersupport

•  SNOW(packagesnow;SMP,distributed*)–  LimitedWindowssupport(*singlemachineonly)–  Meta‐package:supportsMPI,sockets,NWS,PVM

40

Page 41: Parallel Programing in R - Revolutions

Parallelbackendsforforeach

•  %dopar%behaviourdependsoncurrent“registered”parallelbackend

•  Modularparallelbackends

•  registerDoSEQ(default)•  registerDoNWS(NetWorkSpaces)•  registerDoMC(mul,core,MacOS/Windows)

•  FromTerminal/ESSonly!(R.appGUIwillcrash.)•  registerDoSNOW•  registerDoRMPI

41

Page 42: Parallel Programing in R - Revolutions

GemngStarted:Mul,‐coreProcessing

•  R2.10.x–waitun,lofficialrelease•  R2.9.x

require(doMC)

registerDoMC(cores=2)

•  REvolu,onREnterpriserequire(doNWS)

s <- sleigh(workerCount=2) registerDoNWS()

42

Page 43: Parallel Programing in R - Revolutions

Asimplesimulation:

birthday <- function(n) { ntests <- 1000 pop <- 1:365 anydup <- function(i) any(duplicated( sample(pop, n, replace=TRUE)))

sum(sapply(seq(ntests), anydup)) / ntests }

x <- foreach (j=1:100) %dopar% birthday (j)

43

Page 44: Parallel Programing in R - Revolutions

BirthdayExample‐,mings

Backend Time(s)registerDoSEQ() 41sregisterDoMC()#2cores 28sregisterDoNWS()#2workers 26s(*)

44

Dual‐core2.4GHzIntelMacBook:

system.time{ x <- foreach (j=1:100) %dopar% birthday (j) } # Elapsed

Page 45: Parallel Programing in R - Revolutions

BirthdaySimula,on:Mul,core/NWS

> x <- foreach (j=1:100) %dopar% birthday (j) > plot(1:100, unlist(x),type="l")

Page 46: Parallel Programing in R - Revolutions

46

UsingclusterswithNetworkSpaces

Page 47: Parallel Programing in R - Revolutions

SemngUpaCluster

1.  Iden,fymachinestoformnodesoncluster–  EasiestwithLinux/MacOS

–  PossiblewithWindows

2.  Selectaservermachine–  OKforthisonetobeonWindows

3.  Makesurepasswordlesssshenabledoneachworkernode

–  ssh nodename Revo --versionshouldwork

47

Page 48: Parallel Programing in R - Revolutions

Semngupacluster,part2

4.  Logintoserver,startREvolu,onR5.  Createasleigh

require(doNWS) s <- sleigh(nodeList=c( rep("localhost",2), rep("thor",8), rep("loki",4)), launch=sshcmd) registerDoNWS(s)

6.  Useforeachasbefore7.  (op,onal)usejoinSleightoaddnewnodes

48

Page 49: Parallel Programing in R - Revolutions

ParallelRandomForest

# a simple parallel random forest

library(randomForest) x <- matrix(runif(500), 100)

y <- gl(2, 50) wc <- 2

n <- ceiling(1000 / wc)

registerDoNWS(s) foreach(ntree=rep(n, wc), .combine=combine,

.packages='randomForest') %dopar% randomForest(x, y, ntree=ntree)

•  Easier: randomShrubberyNWS()

49

Page 50: Parallel Programing in R - Revolutions

Conver,ngexis,ngcode

•  Converttheseloopstoforeach:– for:makebodyreturnitera,onvalueand.combine

– apply:useiter(X, by="row”)and.combine •  Oriapply(X,1)

– lapply:useiter(mylist)

50

Page 51: Parallel Programing in R - Revolutions

ppiStatsExample

•  Sequen,al:bpMats1 <- lapply(bpList, function(x) { bpMatrix(x, symMat = TRUE, homodimer = FALSE, baitAsPrey = FALSE, unWeighted = FALSE, onlyRecip = FALSE, baitsOnly = FALSE) })

•  Parallel:bpMats1 <- foreach(x=iter(bpList), .packages = "ppiStats") %dopar% { bpMatrix(x, sysMat = TRUE, homodimer = FALSE, baitAsPrey = FALSE, unWeighted = FALSE, onlyRecip = FALSE, baitsOnly = FALSE) }

51

Page 52: Parallel Programing in R - Revolutions

ppiStatsExample

•  Sequen,al:bpGraphs <- lapply(bpMats1, function(x) {

genBPGraph(x, directed = TRUE, bp = FALSE)

})

•  Parallel:bpGraph <- foreach(x=iter(bpMat1),

.packages = "ppiStats") %dopar% {

genBPGraph(x, directed = TRUE, bp = FALSE) }

52

Page 53: Parallel Programing in R - Revolutions

Excercise

•  Findother“embarassinglyparallel”BioConductorexamples,andconverttoparallelwithforeach.

53

Page 54: Parallel Programing in R - Revolutions

54

Mul,‐StratumParallelism

Page 55: Parallel Programing in R - Revolutions

foreach(iterator)%dopar%{tasks}

foreach…

task task

foreach…

task task

CLUSTER

SMP

Anexampleofexplicitmulti‐stratum||ism

55

Page 56: Parallel Programing in R - Revolutions

require ("doNWS") require ("foreach") require ("doMC")

s <- sleigh(nodelist=c(rep("localhost",2), rep("bladeserver",8)) registerDoNWS(s)

foreach (iterator_i, .packages=c("foreach", "doMC"))%dopar% { registerDoMC() foreach (iteratorj_) %dopar% { tasks… } }

Mul,‐stratumtemplate:NWS/Mul,core

56

Page 57: Parallel Programing in R - Revolutions

Pi�allstoavoid

•  Sequen,alvsParallelProgramming•  RandomNumberGenera,on

–  library(sprngNWS) –  sleigh(workerCount=8, rngType=‘sprngLFG’)

•  Nodefailure•  CosmicRays

57

Page 58: Parallel Programing in R - Revolutions

Conclusions

•  Parallelcompu,ngiseasy!•  Writeloopswithforeach/%dopar%

– Worksfineinasingle‐processorenvironment

– Third‐partyuserscanregisterbackendsformul,processororclusterprocessing

– Speedbenefitswithoutmodifyingcode

•  Easyperformancegainsonmodernlaptops/desktops

•  Expandtoclustersformeatyjobs

58

Page 59: Parallel Programing in R - Revolutions

ThankYou!

•  DavidSmith–  david@revolu,on‐compu,ng.com,@revodavid

•  REvolu,onCompu,ng– www.revolu,on‐compu,ng.com

•  Revolu+ons,theRblog– blog.revolu,on‐compu,ng.com

•  Downloads:– Slides:h7p://,nyurl.com/R‐Bioc‐slides– Script:h7p://,nyurl.com/R‐Bioc‐script