A mad recap of concurrency mechanisms in Ruby, and how to steal the best idioms from other languages
Concurrency: Rubies, Plural
Elise Huard & Eleanor McHugh
RubyConf 2010
Friday 12 November 2010
manifesto: concurrency matters
multicore: it’s a revolution in mainstream computing
and you want to exploit it in Ruby
MULTIPROCESSOR/MULTICORE
NETWORK ON CHIP (50..96..100 CORES)
diminishing returns
• communication takes finite time
• so doubling processors never doubles a system’s real-world performance
• real-time capacity ~70% of theoretical capacity
• or less!!!
• independence = performance
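A rough way to see these diminishing returns is Amdahl's law, which the slides do not name: if only a fraction of the work can run in parallel, the rest caps the speedup. The sketch below is an illustration under that assumption; `amdahl_speedup` and the 90% figure are hypothetical, and the model ignores the communication cost the slide mentions, so real systems do worse.

```ruby
# Amdahl's law: speedup = 1 / ((1 - p) + p / n)
# p = fraction of the work that can run in parallel (assumed, not measured)
# n = number of processors
def amdahl_speedup(parallel_fraction, processors)
  1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)
end

# With 90% of the work parallelisable, each doubling of cores buys less.
[1, 2, 4, 8, 16].each do |n|
  puts format("%2d processors: %.2fx", n, amdahl_speedup(0.9, n))
end
```

Only fully independent work (a parallel fraction of 1.0) scales linearly, which is the "independence = performance" point.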
“... for the first time in history, no one is building a much faster sequential processor. If you want your programs to run significantly faster (...) you’re going to have to parallelize your program.”
Hennessy and Patterson, “Computer Architecture: A Quantitative Approach” (4th edition, 2007)
concurrency: why it really matters
an aid to good design
• decouples independent tasks
• encourages data to flow efficiently
• supports parallel execution
• enhances scalability
• improves program comprehensibility
finding green pastures: adopting concurrency idioms from other languages
victims of choice
Erlang     Actors
Go         Communicating Sequential Processes
Clojure    Software Transactional Memory
Icon       Coexpressions
coroutines: synchronising via transfer of control
icon
• a procedural language
• with a single thread of execution
• generators are decoupled coexpressions
• and goal-directed evaluation
• creates flexible flow-of-control
icon coexpressions

procedure main()
  n := create(seq(1)\10)
  s := create(squares())
  c := create(cubes())
  while write(@n, "\t", @s, "\t", @c)
end

procedure seq(start)
  repeat {
    suspend start
    start +:= 1
  }
end

procedure squares()
  odds := 3
  sum := 1
  repeat {
    suspend sum
    sum +:= odds
    odds +:= 2
  }
end

procedure cubes()
  odds := 1
  sum := 1
  i := create(seq(2))
  repeat {
    suspend sum
    sum +:= 1 + (odds * 6)
    odds +:= @i
  }
end

1 1 1
2 4 8
3 9 27
4 16 64
5 25 125
6 36 216
7 49 343
8 64 512
9 81 729
10 100 1000
ruby fibers
• coexpressions
• bound to a single thread
• scheduled cooperatively
• the basis for enumerators
• library support for full coroutines
ruby coexpressions

def seq start = 1
  Fiber.new do
    loop do
      Fiber.yield start
      start += 1
    end
  end
end

n = seq()

s = Fiber.new do
  sum, odds = 1, 3
  loop do
    Fiber.yield sum
    sum += odds
    odds += 2
  end
end

c = Fiber.new do
  sum, odds, i = 1, 1, seq(2)
  loop do
    Fiber.yield sum
    sum += 1 + (odds * 6)
    odds += i.resume
  end
end

10.times do
  puts "#{n.resume}\t#{s.resume}\t#{c.resume}"
end

1 1 1
2 4 8
3 9 27
4 16 64
5 25 125
6 36 216
7 49 343
8 64 512
9 81 729
10 100 1000
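Since fibers are "the basis for enumerators", the squares generator can also be written as an Enumerator. This is a minimal sketch, not the speakers' code; the variable names are illustrative. In MRI, external iteration with `next` is what runs the block inside a fiber.

```ruby
# Squares built by summing successive odd numbers, as in the fiber version.
squares = Enumerator.new do |yielder|
  sum, odds = 1, 3
  loop do
    yielder << sum
    sum += odds
    odds += 2
  end
end

# Internal iteration: take stops the infinite generator after five values.
first_five = squares.take(5)
puts first_five.inspect

# External iteration: each call to next resumes the generator where it paused.
puts squares.next
puts squares.next
squares.rewind   # start again from the first value
```

The Enumerator hides the resume/yield bookkeeping that the explicit Fiber version spells out.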
ruby coroutines

require 'fiber'

def login
  Fiber.new do |server, name, password|
    puts "#{server.transfer(Fiber.current)} #{name}"
    puts "#{server.transfer name} #{password}"
    puts "login #{server.transfer password}"
  end
end

def authenticate
  Fiber.new do |client|
    name = client.transfer "name:"
    password = client.transfer "password:"
    if password == "ultrasecret" then
      client.transfer "succeeded"
    else
      client.transfer "failed"
    end
  end
end

login.transfer authenticate, "jane doe", "ultrasecret"
login.transfer authenticate, "john doe", "notsosecret"

name: jane doe
password: ultrasecret
login succeeded
name: john doe
password: notsosecret
login failed
icon revisited

procedure powers()
  repeat {
    while e := get(queue) do
      write(e, "\t", e^ 2, "\t", e^ 3)
    e @&source
  }
end

procedure process(L)
  consumer := get(L)
  every producer := !L do
    while put(queue, @producer) do
      if *queue > 3 then @consumer
  @consumer
end

global queue

procedure main()
  queue := []
  process{ powers(), 1 to 5, seq(6, 1)\5 }
end

1 1 1
2 4 8
3 9 27
4 16 64
5 25 125
6 36 216
7 49 343
8 64 512
9 81 729
10 100 1000
a ruby equivalent

require 'fiber'

def process consumer, *fibers
  q = consumer.transfer(Fiber.current)
  fibers.each do |fiber|
    while fiber.alive?
      q.push(fiber.resume)
      q = consumer.transfer(q) if q.length > 3
    end
  end
  consumer.transfer q
end

powers = Fiber.new do |caller|
  loop do
    caller.transfer([]).each do |e|
      puts "#{e}\t#{e ** 2}\t#{e ** 3}" rescue nil
    end
  end
end

low_seq = Fiber.new do
  5.times { |i| Fiber.yield i + 1 }
  nil
end

high_seq = Fiber.new do
  (6..10).each { |i| Fiber.yield i }
end

process powers, low_seq, high_seq

1 1 1
2 4 8
3 9 27
4 16 64
5 25 125
6 36 216
7 49 343
8 64 512
9 81 729
10 100 1000
processes: the traditional approach to concurrency
[diagram: layers labelled your program, language VM, OS (kernel processes, other processes), and multicore / multi-CPU hardware]
processes + threads
[diagram: Process 1 and Process 2, each with its own memory space in RAM and its own threads (thread1, thread2; t1, t2, t3), placed onto two CPUs by the OS scheduler]
schedulers

cooperative                        preemptive
active task has full control       scheduler controls task activity
runs until it yields control       switches tasks automatically
tasks transfer control directly    a task can still yield control
within one thread                  several threads
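The cooperative column can be sketched with fibers: each task runs until it yields, and a trivial round-robin loop decides who goes next. This is a minimal illustration under that reading; `run_cooperatively` and the task names are hypothetical.

```ruby
require 'fiber'

# Run each task as a fiber inside one thread, round-robin.
# tasks are [name, steps] pairs; the returned log shows the interleaving.
def run_cooperatively(*tasks)
  log = []
  fibers = tasks.map do |name, steps|
    Fiber.new do
      steps.times do |i|
        log << "#{name}#{i}"
        Fiber.yield            # hand control back voluntarily
      end
    end
  end

  until fibers.empty?
    fibers.each { |fiber| fiber.resume }   # active task has full control
    fibers.select!(&:alive?)               # drop tasks that have finished
  end
  log
end

puts run_cooperatively(["a", 2], ["b", 3]).inspect
```

Because nothing preempts a fiber, the interleaving is fully determined by where the tasks choose to yield. With threads the scheduler would pick the switch points instead.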
3 threads, 2 cores
[diagram: two example schedules of threads 1, 2 and 3 across core1 and core2]
feature comparison

                    process             thread
address space       private             shared
kernel resources    private + shared    shared
scheduling          kernel              varies
communication       IPC via kernel      in process
control             children            in process
process creation
• unix spawns
• windows cuts from whole cloth
• ruby wraps this many ways
• but we’re mostly interested in fork
fork
• creates a new process
• in the same state as its parent
• but executing a different code path
• forking is complicated by pthreads
• and ruby’s default garbage collector
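A small sketch of the first three bullets, assuming a Unix-like platform where `fork` is available: the child starts with a copy of the parent's state, runs its own code path, and its changes never reach the parent. The variable names are illustrative, and the exit status is used here only as a crude way to report back.

```ruby
counter = 1

pid = fork do
  # Child: starts with a copy of counter, then diverges from the parent.
  counter += 41
  exit! counter          # report the child's view through the exit status
end

# Parent: wait for the child and read the status it exited with.
_, status = Process.wait2(pid)
child_view = status.exitstatus

puts "child saw #{child_view}, parent still has #{counter}"
```

Anything richer than a small integer needs a real channel between the processes, such as a pipe.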
pipes
• creates an I/O channel between processes
• unnamed pipes join two related processes
• posix named pipes
• live in the file system
• have user permissions
• persist independently of processes
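Named pipes can be sketched in Ruby with `File.mkfifo` (available since Ruby 2.3, on POSIX systems). The helper name `fifo_round_trip` and the temporary path are assumptions for illustration; a fork is used here only so the example is self-contained, since any process with permission on the path could be the writer.

```ruby
require 'tmpdir'

# Send a message from a child process to the parent through a named pipe.
def fifo_round_trip(message)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "demo.fifo")
    File.mkfifo(path)                      # a node in the file system

    pid = fork do
      File.open(path, "w") { |pipe| pipe.write message }
      exit! 0                              # skip the parent's cleanup handlers
    end

    # Opening for reading blocks until the writer has opened the pipe too.
    received = File.open(path, "r") { |pipe| pipe.read }
    Process.waitpid pid
    received
  end
end

puts fifo_round_trip("hello from the child")
```

The pipe persists until it is deleted, which here happens when the temporary directory is removed.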
forking

require 'json'

# run the block in a child process and return its JSON-encoded result
def execute &block
  child_input, parent_input = IO.pipe     # read end, write end
  pid = fork do
    child_input.close                     # the child only writes
    result = block.call
    parent_input.write result.to_json
    parent_input.close
  end
  parent_input.close                      # the parent only reads
  sorted = JSON.parse child_input.read
  child_input.close
  Process.waitpid pid
  return sorted
end
context switching

Operating System   Benchmark                Operation          Time (ms)
Linux              spawn new process        fork() / exec()    6.000
Linux              clone current process    fork()             1.000
Linux              spawn new thread         pthread_create     0.300
Linux              switch current process   sched_yield()      0.019
Linux              switch current thread    sched_yield()      0.019
Windows NT         spawn new process        spawnl()           12.000
Windows NT         clone current process    N/A                ---
Windows NT         spawn new thread         pthread_create()   0.900
Windows NT         switch current process   Sleep(0)           0.010
Windows NT         switch current thread    Sleep(0)           0.006

C Benchmarks by Gregory Travis on a P200 MMX
http://cs.nmu.edu/~randy/Research/Papers/Scheduler/
shared state hurts
• non-determinism
• atomicity
• fairness/starvation
• race conditions
• locking
• transactional memory
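A minimal sketch of a race condition and the locking that cures it. The read-modify-write is deliberately split around `Thread.pass` to invite a context switch; `run_deposits` and its parameters are hypothetical names, and how many updates the unlocked version loses varies from run to run.

```ruby
# Each thread repeatedly reads the balance, pauses, then writes balance + 1.
# Without a lock, two threads can read the same value and one update is lost.
def run_deposits(threads: 8, deposits: 50, lock: nil)
  balance = 0
  workers = Array.new(threads) do
    Thread.new do
      deposits.times do
        update = lambda do
          current = balance        # read
          Thread.pass              # invite a context switch mid-update
          balance = current + 1    # write
        end
        lock ? lock.synchronize(&update) : update.call
      end
    end
  end
  workers.each(&:join)
  balance
end

unsafe = run_deposits
safe   = run_deposits(lock: Mutex.new)
puts "without lock: #{unsafe}, with lock: #{safe}, expected: #{8 * 50}"
```

The locked run always reaches the expected total because the mutex makes the three-step update atomic with respect to the other threads.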
semaphores
• exist independently of processes
• provide blocking access
• allowing processes to be synchronised
• nodes in the file system
• usable from Ruby with syscall
synchronising processes

# note: the dl library used here was removed in Ruby 2.2 (Fiddle replaces it)
require 'dl'
require 'fcntl'
libc = DL::dlopen 'libc.dylib'
open = libc['sem_open', 'ISII']
try_wait = libc['sem_trywait', 'II']
wait = libc['sem_wait', 'II']
post = libc['sem_post', 'II']
close = libc['sem_close', 'II']

process 1

s = open.call("/tmp/s", Fcntl::O_CREAT, 1911)[0]
wait.call s
puts "locked at #{Time.now}"
sleep 50
puts "posted at #{Time.now}"
post.call s
close.call s

process 2

s = open.call("/tmp/s")
t = Time.now
if try_wait.call(s)[0] == 0 then
  puts "locked at #{t}"
else
  puts "busy at #{t}"
  wait.call s
  puts "waited #{Time.now - t} seconds"
end

process 1 output

locked at Thu May 28 01:03:23 +0100 2009
posted at Thu May 28 01:04:13 +0100 2009

process 2 output

busy at Thu May 28 01:03:36 +0100 2009
waited 47.056508 seconds
complexity
• file locking
• shared memory
• message queues
• transactional data stores
threads: the popular approach to concurrency
threads under the hood
from http://www.igvita.com/2008/11/13/concurrency-is-a-myth-in-ruby/ @igrigorik
the global lock
• compatibility for 1.8 C extensions
• only one thread executes at a time
• scheduled fairly with a timer thread
• 10 μs for Linux
• 10 ms for Windows
synchronisation
• thread groups
• locks address race conditions
• mutex + monitor
• condition variable
• deadlocks
• livelocks
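Mutex plus condition variable can be sketched as a bounded buffer: producers wait while it is full, consumers wait while it is empty. `BoundedBuffer` is a hypothetical class written for this illustration, built only on Ruby's standard `Mutex` and `ConditionVariable`.

```ruby
class BoundedBuffer
  def initialize(capacity)
    @capacity  = capacity
    @items     = []
    @lock      = Mutex.new
    @not_full  = ConditionVariable.new
    @not_empty = ConditionVariable.new
  end

  # Block while the buffer is full, then append the item.
  def put(item)
    @lock.synchronize do
      @not_full.wait(@lock) while @items.size >= @capacity
      @items.push item
      @not_empty.signal
    end
  end

  # Block while the buffer is empty, then remove the oldest item.
  def take
    @lock.synchronize do
      @not_empty.wait(@lock) while @items.empty?
      item = @items.shift
      @not_full.signal
      item
    end
  end
end

buffer   = BoundedBuffer.new(2)
producer = Thread.new { (1..5).each { |i| buffer.put i } }
received = Array.new(5) { buffer.take }
producer.join
puts received.inspect
```

The `while` around each wait matters: a woken thread rechecks the condition before proceeding. Ruby's `SizedQueue` packages the same behaviour.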
threads + sockets

require 'socket'
require 'thread'

class ThreadGroup
  def all_but_me &block
    list.delete_if { |t|
      t.name == Thread.current.name
    }.each { |t| yield t }
  end

  def select *names, &block
    list.delete_if { |t|
      !names.include? t.name
    }.each { |t| yield t }
  end
end

class Thread
  def name= n
    raise ArgumentError if key? :name
    self[:name] = n
    self[:Q] = Queue.new
  end

  # reader for the name stored above (called throughout, but not defined on the slide)
  def name
    self[:name]
  end

  def send message
    self[:Q].enq message
  end

  def bind_transmitter socket
    raise ArgumentError if key? :XMIT
    self[:socket] = socket
    self[:XMIT] = Thread.new(socket, self[:Q]) do |s, q|
      loop do
        case message = q.deq
        when :EXIT then Thread.current.exit
        else s.puts message
        end
        Thread.pass
      end
    end
  end

  def disconnect
    send "Goodbye #{name}"
    send :EXIT
    self[:XMIT].join
    self[:socket].close
    exit
  end
end
threads + sockets

class ChatServer < TCPServer
  def initialize port
    @receivers = ThreadGroup.new
    @transmitters = ThreadGroup.new
    @deceased = ThreadGroup.new
    super
  end

  def register name, socket
    @receivers.add(t = Thread.current)
    t.name = name
    @transmitters.add t.bind_transmitter(socket)
    broadcast "#{name} has joined the conversation"
    t.send "Welcome #{name}"
  end

  def connect socket
    loop do
      begin
        socket.print "Please enter your name: "
        register socket.readline.chomp.downcase, socket
        break
      rescue ArgumentError
        socket.puts "That name is already in use"
      end
    end
  end

  def send message, name = Thread.current.name
    @receivers.select(name) { |t| t.send message }
  end

  def broadcast message
    @receivers.all_but_me { |t| t.send message }
  end

  def listen socket
    t = Thread.current
    loop do
      message = socket.readline.chomp
      case message.downcase
      when "bye" then raise EOFError
      when "known" then list_known_users   # not shown on the slide
      when "quit" then raise SystemExit
      else broadcast "#{t.name}: #{message}"
      end
    end
  end

  def run
    while socket = accept
      Thread.new(socket) do |socket|
        begin
          connect socket
          listen socket
        rescue SystemExit
          broadcast "#{Thread.current.name} terminated this conversation"
          broadcast :EXIT
          send :EXIT
          # wait for every other client's transmitter thread to drain
          @receivers.all_but_me { |t| t[:XMIT].join }
          Kernel.exit 0
        rescue EOFError
          Thread.current.disconnect
        end
      end
    end
  end
end

ChatServer.new(3939).run
the macruby twist
• grand central dispatch
• uses an optimal number of threads
• state is shared but not mutable
• object-level queues for atomic mutability
clojure
lisp dialect for the JVM
refs      software transactional memory
agents    independent, asynchronous change
vars      in-thread mutability
check out Tim Bray’s Concur.next series
parallel banking

(ns account)

; ref
(def transactional-balance (ref 0))

; transfer: within a transaction
; (the transfer function itself is not shown on the slide)
(defn parallel-transfer [amount]
  (dosync
    (alter transactional-balance transfer amount)))

; many threads adding 10 onto account
(defn parallel-stm [amount nthreads]
  (let [threads (for [x (range 0 nthreads)]
                  (Thread. #(parallel-transfer amount)))]
    (do
      (doall (map #(.start %) threads))
      (doall (map #(.join %) threads))))
  @transactional-balance)
ruby can do that too

require 'clojure'

include Clojure

def parallel_transfer(amount)
  Ref.dosync do
    @balance.alter { |b| b + amount }
  end
end

def parallel_stm(amount, nthreads)
  threads = []
  nthreads.times do
    threads << Thread.new do
      parallel_transfer(amount)
    end
  end
  threads.each { |t| t.join }
  @balance.deref
end

@balance = Ref.new(0)
puts parallel_stm(10, 10)
enumerables: everyday ruby code which is naturally concurrent
vector processing
• collections are first-class values
• single instruction multiple data
• each datum is processed independently
• successive instructions can be pipelined
• so long as there are no side-effects
map/reduce
• decompose into independent elements
• process each element separately
• use functional code without side-effects
• recombine the elements
• intrinsically suited to parallel execution
a two-phase operation

concurrent
(0..5).to_a.each { |i| puts i }
(0..5).to_a.map { |i| i ** 2 }

sequential
x = 0
(0..5).to_a.each { |i| x = x + i }
(0..5).to_a.inject { |sum, i| sum + i }
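The two phases can be combined in a few lines: a side-effect-free map that runs one thread per element, followed by a sequential reduce. This is a sketch with a hypothetical helper name; on MRI the global lock means the threads interleave rather than run in parallel, so it shows the structure rather than a speedup.

```ruby
# Phase 1 (concurrent): square each number in its own thread.
# Phase 2 (sequential): fold the results into a single sum.
def parallel_sum_of_squares(numbers)
  squares = numbers.map { |n| Thread.new { n ** 2 } }.map(&:value)
  squares.inject(0) { |sum, n| sum + n }
end

puts parallel_sum_of_squares((0..5).to_a)
```

`Thread#value` joins each thread and returns its block's result, so the mapped values come back in the original order regardless of which thread finishes first.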
parallel

require 'brute_force'
require 'parallel'

# can be run with :in_processes as well
mapped = Parallel.map((0..3).to_a, :in_threads => 4) do |num|
  map("english.#{num}") # hash the whole dictionary
end

hashed = "71aa27d3bf313edf99f4302a65e4c042"

puts reduce(hashed, mapped) # returns "zoned"
algebra, actors + events: synchronising concurrency via communication
process calculi
• mathematical model of interaction
• processes and events
• (a)synchronous message passing
• named channels with atomic semantics
go
• statically-typed compiled systems language
• class-free object-orientation
• garbage collection
• independent lightweight coroutines
• implicit cross-thread scheduling
• channels are a first-class datatype
a signal generator

package main
import "syscall"

func (c *Clock) Start() {
  if !c.active {
    go func() {
      c.active = true
      for i := int64(0); ; i++ {
        select {
        case status := <- c.Control:
          c.active = status
        default:
          if c.active {
            c.Count <- i
          }
          // syscall.Sleep is a pre-Go 1 API
          syscall.Sleep(c.Period)
        }
      }
    }()
  }
}

type Clock struct {
  Period int64
  Count chan int64
  Control chan bool
  active bool
}

func main() {
  c := Clock{1000, make(chan int64), make(chan bool), false}
  c.Start()

  for i := 0; i < 3; i++ {
    println("pulse value", <-c.Count, "from clock")
  }

  println("disabling clock")
  c.Control <- false
  syscall.Sleep(1000000)
  println("restarting clock")
  c.Control <- true
  println("pulse value", <-c.Count, "from clock")
}

produces:
pulse value 0 from clock
pulse value 1 from clock
pulse value 2 from clock
disabling clock
restarting clock
pulse value 106 from clock
homework
• write a signal generator in ruby
• hints:
• it’s very easy
• look at the thread + socket example
• use atomic queues
• and yes, ruby groks csp
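One possible answer to the homework, offered as a sketch rather than the speakers' solution. It follows the hints by using atomic queues as channels: a `SizedQueue` of capacity 1 stands in for Go's unbuffered `Count` channel and a plain `Queue` for `Control`. The class layout and method names are assumptions. Unlike the Go version, one stale pulse may already be waiting in the queue when the clock is disabled.

```ruby
class Clock
  attr_reader :count, :control

  def initialize(period)
    @period  = period              # seconds between ticks
    @count   = SizedQueue.new(1)   # pulses out
    @control = Queue.new           # true/false messages in
  end

  def start
    @thread ||= Thread.new do
      active = true
      tick = 0
      loop do
        # apply any pending on/off messages without blocking
        active = @control.pop until @control.empty?
        @count.push(tick) if active   # blocks while the reader is behind
        sleep @period
        tick += 1
      end
    end
    self
  end

  def stop
    @thread.kill if @thread
    @thread = nil
  end
end

clock = Clock.new(0.001).start
first_pulses = Array.new(3) { clock.count.pop }
puts "pulses #{first_pulses.inspect}"

clock.control.push false   # disable
sleep 0.05
clock.control.push true    # restart
later_pulse = clock.count.pop
puts "pulse value #{later_pulse} after restart"
clock.stop
```

The tick counter keeps advancing while the clock is disabled, as in the Go original, so pulses emitted after a restart skip ahead.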
actor model
• named actors
• fully independent of each other
• asynchronous message passing
• and no shared state
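The actor model can be approximated in plain Ruby with a thread that owns a mailbox. This is a minimal sketch, not the Rubinius or Revactor API; the `Actor` class, `tell`, and the `:stop` message are hypothetical. Plain Ruby cannot enforce "no shared state", so here it is only a convention: the actor touches nothing but its messages.

```ruby
class Actor
  def initialize(&behaviour)
    @inbox = Queue.new
    @thread = Thread.new do
      # Process one message at a time until asked to stop.
      while (message = @inbox.pop) != :stop
        behaviour.call(message)
      end
    end
  end

  # Asynchronous send: returns immediately, the actor handles it later.
  def tell(message)
    @inbox << message
    self
  end

  def stop
    @inbox << :stop
    @thread.join
  end
end

# Replies travel as messages too: the sender passes a queue to answer on.
results = Queue.new
doubler = Actor.new { |(n, reply_to)| reply_to << n * 2 }

doubler.tell([21, results])
answer = results.pop
puts answer
doubler.stop
```

Because the inbox serialises messages, the behaviour block never runs concurrently with itself and needs no locks of its own.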
erlang
• implements actors with green processes
• efficient SMP-enabled VM
• functional language
• hot code loading
• fault-tolerant
© ericsson 2007
erlang

-module(brute_force).
-import(plists).
-export([run/2]).

map(FileName) ->
  {ok, Binary} = file:read_file(FileName),
  Lines = string:tokens(erlang:binary_to_list(Binary), "\n"),
  lists:map(fun(I) -> {erlang:md5(I), I} end, Lines).

reduce(Hashed, Dictionary) ->
  dict:fetch(Hashed, Dictionary).

run(Hashed, Files) ->
  Mapped = plists:map(fun(I) -> map(I) end, Files),
  Values = lists:flatten(Mapped),
  Dict = dict:from_list(Values),
  reduce(Hashed, Dict).
erlang

pmap(F, L) ->
  S = self(),
  Pids = lists:map(fun(I) -> spawn(fun() -> pmap_f(S, F, I) end) end, L),
  pmap_gather(Pids).

pmap_gather([H|T]) ->
  receive
    {H, Ret} -> [Ret|pmap_gather(T)]
  end;
pmap_gather([]) ->
  [].

pmap_f(Parent, F, I) ->
  Parent ! {self(), (catch F(I))}.
rubinius actors
• implemented as ruby threads
• each thread has an inbox
• cross-VM communication
rubinius actors

require 'quick_sort'
require 'actor'

class RbxActorSort
  def execute(&block)
    current = Actor.current
    Actor.spawn(current) { |current| current.send(block.call) }
    Actor.receive # could have filter
  end
end

puts q = QuickSort.new([1,7,3,2,77,23,4,2,90,100,33,2,4], RbxActorSort).sort
ruby revactor
• erlang-like semantics
• actor spawn/receive
• filter
• uses fibers for cooperative scheduling
• so only works on ruby 1.9
• non-blocking network access in ruby 1.9.2
promises + futures
• abstraction for delayed execution
• a future is calculated in parallel
• a promise is calculated on demand
• caller will block when requesting result
• until that result is ready
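The blocking behaviour in the last two bullets can be sketched with nothing more than a thread. The `Future` class below is hypothetical and much simpler than Lazy.rb's `Lazy::Future`: the computation starts in the background straight away, and asking for the value blocks until it is ready.

```ruby
class Future
  def initialize(&computation)
    @thread = Thread.new(&computation)   # start working immediately
  end

  # Blocks until the computation has finished, then returns its result.
  def value
    @thread.value
  end
end

def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

first  = Future.new { fib(20) }
second = Future.new { fib(21) }
puts "futures created, now waiting for results"

total = first.value + second.value
puts total
```

Calling `value` a second time returns the same result without recomputing, because `Thread#value` remembers what the block returned.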
ruby futures

Lazy.rb gem (@mentalguy)

require 'lazy'
require 'lazy/futures'

def fib(n)
  return n if (0..1).include? n
  fib(n-1) + fib(n-2) if n > 1
end

puts "before first future"
future1 = Lazy::Future.new { fib(40) }
puts "before second future"
future2 = Lazy::Future.new { fib(40) }
puts "and now we're waiting for results ... getting futures fulfilled is blocking"

puts future1
puts future2
we didn’t cover
• tuple spaces (Rinda)
• event-driven IO
• petri nets
• ...
reinventing?
some of these problems have been solved before ...
to summarise
• beware of shared mutable state
• but: there are sane ways to handle concurrency
• they are all possible in Ruby
fun!
further reading
http://www.delicious.com/elisehuard/concurrency
http://www.ecst.csuchico.edu/~beej/guide/ipc/
http://wiki.netbsd.se/kqueue_tutorial
http://www.kegel.com/c10k.html
Elise Huard @elise_huard http://jabberwocky.eu
Eleanor McHugh @feyeleanor http://slides.games-with-brains.net