Haskell Accelerate

My slides from Haskell Hackers at Hacker Dojo on 10/16/2014.

GPU Programming with Haskell

Steve Severance sseverance@alphaheavy.com

Outline

Introduction to GPUs

When to use a GPU instead of a CPU

Using a GPU with accelerate

Building an options pricer

What is a GPU?

Graphics Processing Unit

Hundreds or Thousands of Cores

High Memory Throughput

Fully Programmable

GPU Architecture

Single Instruction Multiple Data (SIMD)

High Throughput Thread Scheduler

Interleaving Operations

GPU Architecture

[Diagram: CPU Memory and GPU; link labeled 16GB/s]

GPU Circa 1999

GeForce 256

Accelerated Graphics Port (AGP)

Hardware Transform and Lighting (TnL)

Fixed Function Pipeline

GPU Circa 2001

GeForce 3/R200/Xbox

First Pixel/Vertex Shaders

Limited C-like Language

GPU Circa 2014

Fully Programmable

Unified Memory

Rich High Level Languages/Tools

GPU Tradeoffs

Limited branching

Limited Memory

High Latency

GPU vs CPU

GPU is about throughput

CPU is about flexibility and latency

Programmability

CUDA

OpenCL

DirectCompute

GPU Problems

Non-branching algorithms

Matrix operations (cuBLAS)

Deep Learning

Options Pricing

Can I run GPU Programs?

accelerate requires CUDA

OpenCL (the Haskell package) is a low-level OpenCL wrapper

NVIDIA CUDA Tools (https://developer.nvidia.com/cuda-toolkit)

Introducing Accelerate

DSL for Parallel Code

Primarily CUDA, Also LLVM

Compiler lowers into CUDA code

Accelerate Basics

Acc is our DSL type. Holds the Abstract Syntax Tree (AST) of our computation

Familiar operations replace their Prelude counterparts (fold, map, zip, etc.)

Accelerate Basics

Creating a Computation

Acc (Array DIM1 Float) -> Acc (Array DIM1 Float)

Running a Computation

run :: Arrays a => Acc a -> a
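
The two steps above can be sketched as follows. This is a minimal example, not code from the talk: it assumes the reference interpreter backend (Data.Array.Accelerate.Interpreter) from the accelerate package so that it runs without a GPU, and the function name scale is made up for illustration. A GPU backend exposes a run function of the same shape.

```haskell
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Array, DIM1, Z(..), (:.)(..))
-- Assumption: the reference interpreter stands in for a GPU backend.
import Data.Array.Accelerate.Interpreter (run)

-- Creating a computation: a hypothetical function that doubles every element.
scale :: Acc (Array DIM1 Float) -> Acc (Array DIM1 Float)
scale = A.map (* 2)

main :: IO ()
main = do
  -- A host-side array with four elements.
  let xs = A.fromList (Z :. 4) [1, 2, 3, 4] :: Array DIM1 Float
  -- Running a computation: use embeds the array, run evaluates the expression.
  print (A.toList (run (scale (A.use xs))))  -- prints [2.0,4.0,6.0,8.0]
```

Swapping the import of run for one from a GPU backend should be the only change needed to execute the same scale function on a device.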

Arrays

data Array sh e

Comprised of both a Shape and an Element (Elt)

Elt instances for common numeric types and tuples

Arrays can be multi-dimensional, but not nested

Array Shapes

Z is the rank-0 shape (a single element)

:. Operator Increases the Rank by One Dimension

DIM1, DIM2, DIM3, etc…
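
A short sketch of how shapes are built from Z and :. may help here. The array names below are invented for illustration; fromList and toList are the host-side conversion functions exported by Data.Array.Accelerate.

```haskell
import Data.Array.Accelerate
  (Array, DIM0, DIM1, DIM2, Z(..), (:.)(..), fromList, toList)

-- Rank 0: a single value, shape Z.
single :: Array DIM0 Int
single = fromList Z [42]

-- Rank 1: a vector of three elements, shape Z :. 3.
vec :: Array DIM1 Int
vec = fromList (Z :. 3) [1, 2, 3]

-- Rank 2: two rows of three columns, shape Z :. 2 :. 3, filled in row-major order.
mat :: Array DIM2 Int
mat = fromList (Z :. 2 :. 3) [1 .. 6]

main :: IO ()
main = do
  print (toList single)
  print (toList vec)
  print (toList mat)
```

Each application of :. adds one dimension on the right, so DIM2 is DIM1 with one more Int appended.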

Computations

Acc is a computation on an array

Exp is a computation on an element

Exp can also be used to pass constants
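
The split between Acc and Exp can be illustrated with a small sketch. The names saxpyElem and saxpy are hypothetical, and the interpreter backend is assumed so that the example runs without a GPU.

```haskell
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Exp, Vector, Z(..), (:.)(..))
-- Assumption: the reference interpreter stands in for a GPU backend.
import Data.Array.Accelerate.Interpreter (run)

-- Exp: a computation on individual elements.
saxpyElem :: Exp Float -> Exp Float -> Exp Float -> Exp Float
saxpyElem a x y = a * x + y

-- Acc: a computation on whole arrays, built from the Exp function.
-- The host-side Float is passed in as a constant.
saxpy :: Float -> Acc (Vector Float) -> Acc (Vector Float) -> Acc (Vector Float)
saxpy a = A.zipWith (saxpyElem (A.constant a))

main :: IO ()
main = do
  let xs = A.fromList (Z :. 3) [1, 2, 3]    :: Vector Float
      ys = A.fromList (Z :. 3) [10, 20, 30] :: Vector Float
  print (A.toList (run (saxpy 2 (A.use xs) (A.use ys))))  -- prints [12.0,24.0,36.0]
```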

What run is going to do

Compile our Program

Copy Data to GPU

Execute Program

Copy Results Back to Memory

Black-Scholes

Partial Differential Equation to Compute the Price of an Option

Massive Performance Boost on a GPU

Bloomberg Uses GPUs to compute Options Prices

Black-Scholes Equation

Stolen from investopedia.com
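
For reference, the standard Black-Scholes closed form for the price of a European call is given below. It is an assumption that this is the form the slide showed. Here S is the spot price, K the strike, T the time to expiry in years, r the risk-free rate, sigma the volatility, and N the standard normal cumulative distribution function.

```latex
C = S\,N(d_1) - K e^{-rT} N(d_2), \qquad
d_1 = \frac{\ln(S/K) + \left(r + \tfrac{1}{2}\sigma^2\right) T}{\sigma\sqrt{T}}, \qquad
d_2 = d_1 - \sigma\sqrt{T}
```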

Code/Demo Time

Summary

lift/unlift

use adds an Array to the computation

constant wraps constants

map does what map always does
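
The four pieces summarised above can be seen together in a small European option pricer. This is a sketch under stated assumptions, not the code shown in the demo: the interest rate and volatility are illustrative values, the names cnd, priceOne and priceAll are made up, the normal CDF uses a common five-term polynomial approximation, and the interpreter backend is assumed so that no GPU is required.

```haskell
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Exp, Vector, Z(..), (:.)(..))
-- Assumption: the reference interpreter stands in for a GPU backend.
import Data.Array.Accelerate.Interpreter (run)

-- Assumed market parameters (illustrative values only).
riskFree, volatility :: Float
riskFree   = 0.05
volatility = 0.20

-- Polynomial approximation of the standard normal CDF.
cnd :: Exp Float -> Exp Float
cnd d = A.cond (d A.> 0) (1 - c) c
  where
    w    = 1 / (1 + 0.2316419 * abs d)
    poly = w * foldr1 (\a b -> a + w * b)
             [0.31938153, -0.356563782, 1.781477937, -1.821255978, 1.330274429]
    c    = 0.3989422804 * exp (-0.5 * d * d) * poly

-- Price one option: input (spot, strike, years), output (call, put).
priceOne :: Exp (Float, Float, Float) -> Exp (Float, Float)
priceOne opt = A.lift (call, put)            -- lift packs the results into one Exp
  where
    (s, k, t) = A.unlift opt :: (Exp Float, Exp Float, Exp Float)  -- unlift unpacks the tuple
    r      = A.constant riskFree             -- constant wraps host values
    v      = A.constant volatility
    vSqrtT = v * sqrt t
    d1     = (log (s / k) + (r + 0.5 * v * v) * t) / vSqrtT
    d2     = d1 - vSqrtT
    kExpRT = k * exp (negate r * t)
    call   = s * cnd d1 - kExpRT * cnd d2
    put    = kExpRT * (1 - cnd d2) - s * (1 - cnd d1)

-- map applies the per-option pricer to every element.
priceAll :: Acc (Vector (Float, Float, Float)) -> Acc (Vector (Float, Float))
priceAll = A.map priceOne

main :: IO ()
main = do
  let opts = A.fromList (Z :. 2) [(100, 100, 1), (100, 110, 0.5)]
               :: Vector (Float, Float, Float)
  -- use adds the host array to the computation.
  print (A.toList (run (priceAll (A.use opts))))
```

With these parameters the first option (spot 100, strike 100, one year) should price at roughly 10.45 for the call and 5.57 for the put, up to the error of the CDF approximation and single-precision arithmetic.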

What next?

accelerate has a rich API

Slices

Aggregation

Recursion

Stencils
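
As one example from this list, aggregation can be sketched with fold. The dot product below is a minimal illustration with a made-up name (dotp), again assuming the interpreter backend; the other items (slices, recursion, stencils) are not covered by this sketch.

```haskell
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Scalar, Vector, Z(..), (:.)(..))
-- Assumption: the reference interpreter stands in for a GPU backend.
import Data.Array.Accelerate.Interpreter (run)

-- Multiply element-wise, then reduce the vector to a single value.
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  let xs = A.fromList (Z :. 3) [1, 2, 3] :: Vector Float
      ys = A.fromList (Z :. 3) [4, 5, 6] :: Vector Float
  print (A.toList (run (dotp (A.use xs) (A.use ys))))  -- prints [32.0]
```

fold reduces along the innermost dimension, so a rank-1 input yields a rank-0 (Scalar) result.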

Thanks

Nathan Howell

The accelerate Team

You for listening

Further Reading

https://speakerdeck.com/tmcdonell/gpgpu-programming-in-haskell-with-accelerate

http://hackage.haskell.org/package/accelerate

http://quantlib-gpu.sourceforge.net/AcceleratingFinancialApplicationsOnTheGPU-paper.pdf