Parallel Computing
Benson Muite
[email protected]
kodu.ut.ee/~benson
https://courses.cs.ut.ee/2016/paralleel/fall/Main/HomePage
24 October 2016
Juku from http://muuseum.at.mt.ut.ee/kogu/165.html Available under CC-BY license.. 1 / 18
Clustering, Accelerators and OpenCL
Clustering
Given a data set, decompose it into groups of similar items
Parallel computing is useful for large data sets
Many possible algorithms
Will look at the K-means algorithm
Presentation follows F. Nielsen, “Introduction to HPC with MPI for Data Science”
K-means
Consider grouping points in N-dimensional space
As an example, consider dimensions of passenger road vehicles
May wish to split them into cars, vans, buses, trains
What dimensions would be most useful?
K-means
Having chosen the dimensions, need an algorithm
Have already decided on 4 categories
Typically do not know anything more about the data
For simplicity, assume there is at least one representative in each category
K-means
a) Pick 4 cluster centroids randomly
i) Calculate the distance of each point to each centroid
ii) Put each point in the cluster of the centroid it is closest to
iii) Recalculate the centroid of each cluster
iv) Repeat i-iii until the sum of distances from the centroids stops decreasing
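The steps above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not code from the course: the names `squared_distance`, `assign` and `kmeans` are hypothetical, the initial centroids are drawn from the data points themselves (the slides only say "randomly"), and squared Euclidean distance is used.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Steps i) and ii): index of the closest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, max_iter=100, seed=0):
    """Hypothetical helper: returns (centroids, labels) for a list of tuples."""
    rng = random.Random(seed)
    # Step a): assumption, start from k randomly chosen data points.
    centroids = rng.sample(points, k)
    previous_energy = float("inf")
    for _ in range(max_iter):
        labels = assign(points, centroids)
        energy = sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))
        # Step iv): stop once the sum of squared distances no longer decreases.
        if energy >= previous_energy:
            break
        previous_energy = energy
        # Step iii): each centroid becomes the mean of its cluster members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assign(points, centroids)
```

For the vehicle example, `points` would hold one tuple per vehicle (for instance length, width, height) and `k` would be 4.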
K-means
Typically use the square of the Euclidean distance
Can use other distances, depending on the application; some may be better than others
Method converges because at each iteration the “energy”, the sum of squares of Euclidean distances from the points to their centroids, never increases and is bounded below by zero; the result is a fixed point of the iteration, which may be only a local minimum
Example at http://shiny.rstudio.com/gallery/kmeans-example.html
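Written out, the energy in the convergence argument looks as follows. The notation is an assumption made here, not taken from the slides: x_i are the data points, C_j the clusters, c_j their centroids, k the number of clusters and t the iteration index.

```latex
E^{(t)} = \sum_{j=1}^{k} \sum_{x_i \in C_j^{(t)}} \left\lVert x_i - c_j^{(t)} \right\rVert^2,
\qquad
0 \le E^{(t+1)} \le E^{(t)} .
```

Neither step can increase E: reassigning a point to its nearest centroid cannot enlarge that point's term, and the mean of a cluster minimizes the sum of squared distances to its members.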
K-means
Parallelization
Can parallelize calculating distances from centroids, no communication
Can parallelize calculating centroids, reduction and broadcast communications needed
Reduction and broadcast also needed to check for convergence
Can use parallel IO
Should weak and strong scale quite well
Examples at http://rbigdata.github.io/documentation/pmclust/01-pmclust_pkmeans.html and https://github.com/RBigData/pmclust/blob/master/demo/ex_kmeans.r
Can later compare speed to own code
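The communication pattern described above can be illustrated without MPI by treating each chunk of the data as if it lived on a separate process. This is a sketch under assumptions, not the pmclust implementation: all function names are hypothetical, the loop over chunks stands in for independent ranks, and `reduce_partials` stands in for the reduction (in MPI, reduction plus broadcast would commonly be a single allreduce).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def local_partial_sums(chunk, centroids):
    """Work of one process: needs only its own points, no communication."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    energy = 0.0
    for p in chunk:
        j = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
        counts[j] += 1
        energy += squared_distance(p, centroids[j])
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts, energy


def reduce_partials(partials):
    """Stand-in for the reduction: add up the per-process partial results."""
    k = len(partials[0][1])
    dim = len(partials[0][0][0])
    total_sums = [[0.0] * dim for _ in range(k)]
    total_counts = [0] * k
    total_energy = 0.0
    for sums, counts, energy in partials:
        for j in range(k):
            total_counts[j] += counts[j]
            for d in range(dim):
                total_sums[j][d] += sums[j][d]
        total_energy += energy
    return total_sums, total_counts, total_energy


def kmeans_step_over_chunks(chunks, centroids):
    """One iteration: local work per chunk, then reduce, then new centroids."""
    partials = [local_partial_sums(chunk, centroids) for chunk in chunks]
    sums, counts, energy = reduce_partials(partials)
    # The returned centroids and energy are what would be broadcast to all
    # processes; the energy is the value used for the convergence check.
    new_centroids = [
        tuple(s / counts[j] for s in sums[j]) if counts[j] else centroids[j]
        for j in range(len(centroids))
    ]
    return new_centroids, energy
```

Only k sums, k counts and one energy value cross process boundaries per iteration, regardless of how many points each process holds, which is why the method is expected to scale well.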
Accelerators
Heterogeneous architectures for high performance with low energy consumption
Many different kinds of hardware
Many programming models
Accelerators
Graphics Processing Units (GPU)
Field Programmable Gate Arrays (FPGA)
Xeon Phi
Massively Parallel Processor Array (MPPA)
Other specialized processing units, for example for encryption and signal processing
Nvidia GPUs
http://www.nvidia.com
https://en.wikipedia.org/wiki/Nvidia_Tesla
2 Tflops double precision performance
Programming APIs: CUDA, CUDA Fortran, OpenCL, OpenACC
For compute and graphics
AMD Firepro GPUs
http://www.amd.com
https://en.wikipedia.org/wiki/AMD_FirePro
2 Tflops double precision performance
Programming APIs: OpenCL, OpenACC, HCC and HSAIL
For compute and graphics
Intel Xeon Phi
http://www.intel.com
https://en.wikipedia.org/wiki/Xeon_Phi
1 Tflops double precision performance
Programming APIs: OpenCL (old versions), OpenMP, MPI, Cilk, Fortran, C
Latest versions can be self hosted
For compute and graphics
Parallella
http://www.parallella.org/
https://en.wikipedia.org/wiki/Adapteva
Programming APIs: OpenCL, C, pthreads
Embedded applications
Energy-efficient computing: 50 single precision Gflops/Watt
Latest version has 1024 cores
https://www.parallella.org/blog/
Nvidia Tegra K1 and X1
http://www.nvidia.com/
https://en.wikipedia.org/wiki/Tegra#Tegra_K1
0.19 Tflops double precision
Programming APIs: CUDA, OpenCL
AMD APU
http://www.amd.com
https://en.wikipedia.org/wiki/AMD_Accelerated_Processing_Unit
0.700 Tflops single precision
Programming APIs: OpenCL, OpenACC, Fortran, C
For compute and graphics
Intel HD graphics
http://www.intel.com
https://en.wikipedia.org/wiki/Intel_HD_and_Iris_Graphics
Programming API: OpenCL
For compute and graphics
Others
Massively Parallel Processor Array (Kalray, Pezy)
Field Programmable Gate Array (Xilinx, Altera)
Accelerators
Pattern Matching Example
OpenCL parallelization of Naive, Knuth-Morris-Pratt and Boyer-Moore-Horspool pattern matching algorithms
Report at http://ds.cs.ut.ee/courses/course-files/DS-seminar-Andrii-Rozumnyi.pdf
Code at https://github.com/JaakTree/pattern_matching/tree/test
Possible project based on work by Handre Elias
http://kodu.ut.ee/~handre/
Possible projects related to machine learning
http://www.oi.ut.ee/en/studies/towards-robot-judges
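To see why the naive pattern matching algorithm maps well onto an accelerator, note that every start position in the text can be tested independently. The sketch below expresses that in plain Python; it is not the code from the report or the repository above, and the names `match_at` and `naive_parallel_search` are hypothetical. On an OpenCL device each call to `match_at` would roughly correspond to the work of one work-item.

```python
def match_at(text, pattern, start):
    """Work of one work-item: test a single start position."""
    return text[start:start + len(pattern)] == pattern


def naive_parallel_search(text, pattern):
    """Return all start positions where pattern occurs in text."""
    n_positions = max(len(text) - len(pattern) + 1, 0)
    # Every iteration is independent of the others, so on a device these
    # tests could all run concurrently; here they simply run in a loop.
    flags = [match_at(text, pattern, start) for start in range(n_positions)]
    return [start for start, hit in enumerate(flags) if hit]
```

For example, `naive_parallel_search("ababa", "aba")` returns `[0, 2]`. Knuth-Morris-Pratt and Boyer-Moore-Horspool carry state from one position to the next, so parallelizing them typically requires splitting the text into overlapping blocks instead.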
References
Barlas, G. “Multicore and GPU Programming: An Integrated Approach” Morgan Kaufmann 2015
Nielsen, F. “Introduction to HPC with MPI for Data Science” Springer 2016
pbdR https://rbigdata.github.io/
Rozumnyi, A. http://ds.cs.ut.ee/courses/course-files/DS-seminar-Andrii-Rozumnyi.pdf
Elias, H. “Wave simulation in a computer game” Proc. European Seminar on Computing 2016 http://www.esco2016.femhub.com/media/ESCO2016_Book_of_Abstracts.pdf
Elias, H. “Simulation game in a web browser” http://comserv.cs.ut.ee/ati_thesis/datasheet.php?id=53653&year=2016