Julia A Fresh Approach to GPU Computing

Julia - NVIDIA...What is Julia? Technical computing language High-level like Python Performance of C function mandel(z) c = z maxiter = 80 for n = 1:maxiter if abs(z) > 2


Page 1

Julia: A Fresh Approach to GPU Computing

Page 2

What is Julia?

Technical computing language

High-level like Python

Performance of C

function mandel(z)
    c = z
    maxiter = 80
    for n = 1:maxiter
        if abs(z) > 2
            return n-1
        end
        z = z^2 + c
    end
    return maxiter
end
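A small usage sketch for the function above, assuming complex-valued input. The sample points and the grid ranges are illustrative choices, not taken from the slides.

```julia
# mandel as defined on the slide, repeated so the sketch is self-contained.
function mandel(z)
    c = z
    maxiter = 80
    for n = 1:maxiter
        if abs(z) > 2
            return n - 1
        end
        z = z^2 + c
    end
    return maxiter
end

# The origin never escapes, so the loop runs to maxiter.
inside = mandel(0.0 + 0.0im)

# abs(2 + 2im) is already greater than 2 on the first check.
outside = mandel(2.0 + 2.0im)

# Escape counts over a coarse grid (rows are imaginary parts, columns real parts).
grid = [mandel(complex(x, y)) for y in -1:0.5:1, x in -2:0.5:1]

println((inside, outside))
println(size(grid))
```

The same function accepts any number type that supports `abs`, `^` and `+`, which is the point the later slides build on.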

Page 3

Julia for GPU programming

Page 4

What is GPU programming?

From high-level to low-level:

Existing GPU stack: TensorFlow, Keras; ArrayFire, Thrust; cuBLAS, cuDNN; CUB, MGPU; CUDA C

Julia packages: Flux.jl, Knet.jl; GPUArrays.jl; CuArrays.jl; CUDAnative.jl

Page 5

Programming with libraries

  • cuBLAS.jl

  • cuFFT.jl

  • cuRAND.jl

  • cuSPARSE.jl

  • cuDNN.jl

  • cuSOLVER.jl

CuArrays.jl

a = CuArray(Float32,2)

b = curand(Float32, 2)

a*a

fft(a)

qrfact(a)

softmax(a)
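For readers without a GPU, here is a CPU-only sketch of the same array-level style, using ordinary `Array`s and the `LinearAlgebra` standard library in place of CuArrays.jl. This is an assumption-laden stand-in and not the CuArrays.jl API. On Julia 1.x the factorization is spelled `qr`; `qrfact` is the pre-1.0 name used on the slide.

```julia
using LinearAlgebra

# A plain CPU matrix stands in for a GPU array in this sketch.
a = rand(Float32, 2, 2)

# Matrix product, dispatched to a BLAS routine on the CPU.
product = a * a

# QR factorization (spelled `qrfact` before Julia 1.0).
F = qr(a)

# Multiplying the factors back together recovers the input up to rounding.
reconstructed = Matrix(F.Q) * F.R

println(size(product))
println(reconstructed ≈ a)
```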

Page 6

Programming with kernels

Much harder!

Page 7

Designed for performance

Multiple dispatch

function foo(x)
    if isa(x, Int64)
        …
    elseif isa(x, Float64)
        …
    end
end

foo(x::Int64) = …

foo(x::Float64) = …
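A minimal runnable sketch of the dispatch style above. The function name `describe` and its return strings are hypothetical; they only fill in the bodies that the slide elides.

```julia
# One method per argument type; Julia selects the most specific match.
describe(x::Int64) = "integer"
describe(x::Float64) = "float"

# Fallback for every other type.
describe(x) = "something else"

println(describe(Int64(1)))
println(describe(1.0))
println(describe("text"))
```

No `isa` chain is needed: adding support for a new type means adding a method, not editing an existing function.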

Page 8

Designed for performance

Type inference

function sigmoid(x)
    temp = exp(-x)
    return (1 / (1+temp))
end

Page 9

Designed for performance

Type inference

function sigmoid(x::Int)
    temp = exp(-x)::Float64
    return (1 / (1+temp))::Float64
end

Page 10

Designed for performance

Type inference

function sigmoid(x::Float32)
    temp = exp(-x)::Float32
    return (1 / (1+temp))::Float32
end

Machine-native types
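The annotated versions of `sigmoid` on these slides show what the compiler infers; the annotations are not written by hand. A small sketch of how one might check this, assuming `Base.return_types` as the inspection tool (the slides do not name one):

```julia
# The untyped definition from the slide.
function sigmoid(x)
    temp = exp(-x)
    return 1 / (1 + temp)
end

# Ask the compiler which return type it infers for each argument type.
println(Base.return_types(sigmoid, (Int,)))
println(Base.return_types(sigmoid, (Float32,)))

# The runtime types agree with the annotations shown on the slides.
from_int = sigmoid(1)
from_f32 = sigmoid(1f0)
println((typeof(from_int), typeof(from_f32)))
```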

Page 11

Designed for performance

Multiple dispatch

Type inference

Machine-native types

Specializing JIT compiler

High-quality stand-alone machine code

Page 12

Extensible language

Source → AST → Julia IR → LLVM IR → Machine code

Inspect & Inject

Configure

Source
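Each stage of this pipeline can be inspected from ordinary Julia code. The sketch below uses the standard reflection functions; the function `add1` is a hypothetical example and does not come from the slides.

```julia
using InteractiveUtils  # provides code_llvm and code_native outside the REPL

add1(x) = x + 1

# Source text -> AST: parsing returns an Expr that can be inspected or rewritten.
ast = Meta.parse("add1(x) = x + 1")

# AST -> lowered Julia IR, then Julia IR annotated with inferred types.
lowered = code_lowered(add1, (Int,))
typed = code_typed(add1, (Int,))

# Julia IR -> LLVM IR, captured as text.
llvm_io = IOBuffer()
code_llvm(llvm_io, add1, (Int,))
llvm_ir = String(take!(llvm_io))

# LLVM IR -> machine code for the host CPU, captured as text.
native_io = IOBuffer()
code_native(native_io, add1, (Int,))
native = String(take!(native_io))

println(ast)
println(typed[1].second)  # inferred return type
```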

Page 13

How does it look?

function vadd(a, b, c)
    i = threadIdx().x
    c[i] = a[i] + b[i]
    return
end

a = CuArray(randn(2,2))
b = CuArray(randn(2,2))
c = similar(a)

@cuda threads=4 vadd(a,b,c)

No DSL, no subset, just Julia

CUDA abstraction level

Performance parity
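To make the execution model concrete without a GPU, here is a CPU-only emulation of the launch above. Both `vadd_thread!` and `emulate_launch` are hypothetical helpers written for this sketch; they are not part of CUDAnative.jl, and the real `@cuda` launch runs the threads in parallel on the device.

```julia
# Hypothetical CPU stand-in for the kernel: `i` plays the role of threadIdx().x.
function vadd_thread!(i, a, b, c)
    c[i] = a[i] + b[i]
    return
end

# Hypothetical helper: run the kernel body once per "thread" index, sequentially.
function emulate_launch(kernel, threads, args...)
    for i in 1:threads
        kernel(i, args...)
    end
    return
end

a = randn(2, 2)
b = randn(2, 2)
c = similar(a)

# Four "threads" cover the four elements of a 2x2 array via linear indexing.
emulate_launch(vadd_thread!, 4, a, b, c)

println(c == a .+ b)
```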

Page 14

How does it run?

Page 15

How does it work?

function vadd(a, b, c)
    i = threadIdx().x
    c[i] = a[i] + b[i]
    return
end

a = CuArray(randn(2,2))
b = CuArray(randn(2,2))
c = similar(a)

@cuda threads=4 vadd(a,b,c)

Diagram: vadd, specialized for CuArray{Float64,2} → LLVM IR → PTX

Page 16

How does it work?

Run time JIT compiler

Fully transparent

No overhead!

Diagram: vadd is compiled once per argument type (CuArray{Float64,2}, CuArray{Int32,3}), each time through LLVM IR to PTX
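A rough CPU-side illustration of per-type specialization. It uses a scalar function `add2` and a helper `llvm_text` (both hypothetical) rather than `vadd` on a `CuArray`, and it looks at host LLVM IR rather than PTX, so it only mirrors the idea on the slide.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# One generic definition, no type annotations.
add2(x, y) = x + y

# Hypothetical helper: capture the LLVM IR for a given argument signature as text.
function llvm_text(f, types)
    io = IOBuffer()
    code_llvm(io, f, types)
    return String(take!(io))
end

# The compiler emits separate code for each signature.
ir_f64 = llvm_text(add2, (Float64, Float64))
ir_i32 = llvm_text(add2, (Int32, Int32))

println(occursin("fadd double", ir_f64))  # floating-point add instruction
println(occursin("add i32", ir_i32))      # 32-bit integer add instruction
```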

Page 21

High-level GPU programming

( (a .+ b) ./ d ) .- e

Great performance

Clean & concise

Generic code
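The broadcast expression above is plain Julia, so it can be tried on CPU arrays first. In this sketch the wrapper name `combine` and the sample values are assumptions; the slide's claim is that the same expression also runs on GPU arrays.

```julia
# Dotted operators fuse into a single elementwise loop over the inputs.
combine(a, b, d, e) = ((a .+ b) ./ d) .- e

a = Float32[1, 2]
b = Float32[3, 4]
d = Float32[2, 2]
e = Float32[1, 1]

result32 = combine(a, b, d, e)

# Generic code: the same definition works for another element type.
result64 = combine(Float64.(a), Float64.(b), Float64.(d), Float64.(e))

println(result32)
println(eltype(result64))
```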

Page 23

function vadd(a, b, c)
    i = threadIdx().x
    c[i] = a[i] + b[i]
    return
end

W = randn(2, 10)
b = randn(2)
f(x) = softmax(W * x .+ b)

model = Chain(
    Dense(10, 5, σ),
    Dense(5, 2),
    softmax)

From GPU kernels, to differentiable algorithms, to high-level layer stacking. All on one platform.
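The middle snippet on this slide can be run without any packages if `softmax` is written by hand. The definition below is an assumption made for this sketch; on the slide `softmax` comes from a library.

```julia
# Hand-written, numerically stable softmax (an assumption for this sketch).
function softmax(v)
    shifted = exp.(v .- maximum(v))
    return shifted ./ sum(shifted)
end

# Same shapes as on the slide: 10 inputs mapped to 2 outputs.
W = randn(2, 10)
b = randn(2)
f(x) = softmax(W * x .+ b)

y = f(randn(10))

println(length(y))
println(sum(y) ≈ 1)
```

The output is a length-2 probability vector, which is what the `Chain` on the same slide produces after its final `softmax` layer.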

Page 26

Pkg.add("Flux")

Page 27

Differential Equations

Machine Learning

CUDA

Automatic Differentiation

Everything You Build

The Julia Magic

Everything just works with everything else!

Page 29

All the HPC Tooling

Generic programming is extremely powerful.

StructOfArrays.jl

DistributedArrays.jl

JuliaDB.jl & DataFrames.jl

Deep Learning

Differential Equations

Operations Research

Page 31

function model(tree)
    if isleaf(tree)
        tree.value
    else
        model(tree.left) + model(tree.right)
    end
end
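A self-contained sketch of the recursive model above. The `Leaf` and `Node` types and the `isleaf` definition are hypothetical; the slide shows only the `model` function and does not say how trees are represented.

```julia
# Hypothetical tree representation for this sketch.
struct Leaf
    value::Float64
end

struct Node
    left::Union{Leaf,Node}
    right::Union{Leaf,Node}
end

isleaf(tree) = tree isa Leaf

# The model from the slide: ordinary recursion and control flow.
function model(tree)
    if isleaf(tree)
        tree.value
    else
        model(tree.left) + model(tree.right)
    end
end

tree = Node(Node(Leaf(1.0), Leaf(2.0)), Leaf(3.0))
total = model(tree)

println(total)
```

The structure of the computation follows the shape of the input, which is hard to express in a fixed layer stack.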

Page 34

Case Studies

Page 35

JuliaCon – 300 Attendees, 150 Talks

Page 36

https://github.com/JuliaGPU/

NVIDIA Parallel Forall blog

https://github.com/FluxML/

Julia