The actor model
Most of the problems with concurrency--from deadlocks to data corruption--result from having shared state
    Solution: Don’t share state!
An alternative to shared state is the actor model, in which independent processes send messages to and receive messages from one another
The actor model was popularized by the Erlang language, and is being incorporated into many new languages
Quoting Alex Miller, http://www.javaworld.com/javaworld/jw-02-2009/jw-02-actor-concurrency1.html?page=2:
The actor model consists of a few key principles:
    No shared state
    Lightweight processes
    Asynchronous message-passing
    Mailboxes to buffer incoming messages
    Mailbox processing with pattern matching
Basic concepts
An actor is an independent flow of control
    You can think of an actor as a Thread with extra features
An actor does not share its data with any other process
    This means you can write it as a simple sequential process, and avoid a huge number of problems that result from shared state
    However: It is possible to share state; it’s just a very bad idea
Any process can send a message to an actor with the syntax actor ! message
An actor has a “mailbox” in which it receives messages
An actor will process its messages one at a time, in the order that it receives them, and use pattern matching to decide what to do with each message
    Except: Messages which don’t match any pattern are ignored, but remain in the mailbox (this is bad)
An actor doesn’t do anything unless/until it receives a message
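To make the mailbox behavior concrete, here is a minimal single-threaded sketch. It is not the scala.actors implementation; MiniMailbox, processNext, and MiniMailboxDemo are hypothetical names invented for this illustration. It shows only the rule described above: the handler takes the oldest message it can match, and unmatched messages stay behind.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for an actor's mailbox (single-threaded, illustration only).
final class MiniMailbox {
  private val pending = ArrayBuffer.empty[Any]

  // Sending just appends to the mailbox; nothing is processed yet.
  def send(message: Any): Unit = pending += message

  // Runs the handler on the oldest message it accepts.
  // Returns false when no pending message matches any case.
  def processNext(handler: PartialFunction[Any, Unit]): Boolean = {
    val index = pending.indexWhere(handler.isDefinedAt)
    if (index < 0) false
    else {
      handler(pending.remove(index))
      true
    }
  }

  def size: Int = pending.size
}

object MiniMailboxDemo {
  // Returns the log of handled messages and the number left in the mailbox.
  def run(): (List[String], Int) = {
    val mailbox = new MiniMailbox
    val log = ArrayBuffer.empty[String]
    mailbox.send(3.14) // no case below matches a Double
    mailbox.send("hello")
    mailbox.send(42)
    val handler: PartialFunction[Any, Unit] = {
      case s: String => log += ("string " + s)
      case n: Int    => log += ("int " + n)
    }
    while (mailbox.processNext(handler)) ()
    (log.toList, mailbox.size)
  }
}
```

In this sketch the Double is never handled and is still in the mailbox at the end, which is why a catch-all case is usually worth having.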
A really short example
scala> import scala.actors.Actor._
import scala.actors.Actor._
scala> val me = self
me: scala.actors.Actor = scala.actors.ActorProxy@6013a567
    self is a method that returns the currently executing actor
    Since we didn’t call self from an actor, but just from a plain old Thread, it actually returns a proxy for the Thread
scala> me ! 42
    Sending myself the message 42
    Doesn’t wait for an answer--just continues with the next code
    Nothing is printed because the value of this expression is Unit
scala> receive { case x => println(x) }
42
    The pattern x is a simple variable, so it will match anything
    The message is received and printed
A longer example
    import scala.actors.Actor
    import scala.actors.Actor._

    object TGIF {
      val worker = actor {
        loop {
          receive {
            case "Friday" => println("Thank God it's Friday!")
            case "Saturday" => exit
            case x => println("It's " + x + " and I'm working hard.")
          }
        }
      }
      def main(args: Array[String]) {
        val days = "Monday Tuesday Wednesday Thursday Friday Saturday Sunday"
        for (day <- days.split(" ")) worker ! day
      }
    }
Output:
    It's Monday and I'm working hard.
    It's Tuesday and I'm working hard.
    It's Wednesday and I'm working hard.
    It's Thursday and I'm working hard.
    Thank God it's Friday!
    Process .../bin/scala exited with code 0
Actor is a trait
A Scala trait is used like a Java interface
    You can extend only one class, but you can mix in any number of traits using with
    Example: class Employee extends Person with Benefits
    Example: class Secretary extends Employee with Actor
However, if you don’t explicitly extend a class, use extends for the first trait
    Example: class Person extends Life with Liberty with Happiness
    I don’t know the reasons for this rather strange exception
A trait, like an interface, can require you to supply certain methods
    In an Actor, you must provide def act = ...
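The extends/with rules can be tried out in plain Scala with no actor library involved. In this sketch every name (Person, Benefits, Worker, Employee, Secretary, and the method bodies) is made up for illustration; Worker is a stand-in for a trait such as Actor that requires you to supply one method.

```scala
object TraitDemo {
  class Person(val name: String)

  // A trait may supply a default implementation...
  trait Benefits {
    def vacationDays: Int = 15
  }

  // ...or leave a method abstract, so that the class mixing it in must define it.
  trait Worker {
    def act(): String
  }

  // One class after extends, then any number of traits after with.
  class Employee(n: String) extends Person(n) with Benefits

  class Secretary(n: String) extends Employee(n) with Worker {
    def act(): String = name + " is answering the phone"
  }

  // No explicit superclass: the first trait follows extends.
  class Volunteer extends Worker with Benefits {
    def act(): String = "volunteering"
  }
}
```

Leaving out the definition of act in Secretary or Volunteer is a compile-time error, which is the sense in which a trait can "require" a method.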
Two ways to create an Actor
1. You can mix in the Actor trait
    Example: class Secretary extends Employee with Actor
    Example: class Worker extends Actor
    Your class extends a class, but mixes in a trait using with
        Exception: If you don’t explicitly extend some class, you must use extends for the first trait
        I have no clue what the reason is for this rule
    A trait, like a Java interface, can require you to supply certain methods
        The Actor trait requires you to define an act method (with no parameters)
2. You can use the actor factory method
    Example: val myWorker = actor { ...code for the actor to execute... }
    The code is what you would otherwise put in the act method
How to start an Actor
When you define an object that mixes in the Actor trait, you need to start it running explicitly
    Example:
        class Worker extends Actor { ... }
        val worker1 = new Worker
        worker1.start
When you use the actor factory method, the actor is started automatically
An actor doesn’t have to wait for messages before it starts doing work--you can write actors that already know what to do
How to tell an Actor to do one thing
Here’s an actor that does one thing, once:
    class Worker extends Actor {
      def act = receive {
        case true => println("I am with you 1000%.")
        case false => println("Absolutely not!")
        case _ => println("Well, it's complicated....")
      }
    }
    val worker = new Worker().start
    worker ! 43
Here’s another:
    val worker = actor {
      receive {
        case true => println("I am with you 1000%.")
        case false => println("Absolutely not!")
        case _ => println("Well, it's complicated....")
      }
    }
    worker ! 43
How to tell an Actor to do several things
When an actor finishes its task, it quits
To keep an actor going, put receive (or react) in a loop
Example:
    class Counter(id: Int) extends Actor {
      var yes, no = 0
      def act = loop {
        react {
          case true => yes += 1
          case false => no += 1
          case "printResults" => printf("Counter #%d got %d yes, %d no.\n", id, yes, no)
          case x => println("Counter " + id + " didn't understand " + x)
        }
      }
    }
This is a special kind of loop defined in the Actor object
    There is also a loopWhile(condition) {...} method
    Other kinds of loops (such as while) will work with receive, but not with react, because react never returns to its caller
Sending and receiving messages
To send a message, use actor ! message
    The thread sending the message keeps going--it doesn’t wait for a response
To receive a message (in an Actor), use either receive {...} or react {...}
    Both receive and react block until they get a message that they recognize (with case)
When receive finishes, it keeps its Thread
    Statements following receive{...} will then be executed
When react finishes, it returns its Thread to the thread pool
    Statements following the react{...} statement will not be executed
    The Actor’s variable stack will not be retained
    This (usually) makes react more efficient than receive
Hence: Prefer react to receive, but be aware of its limitations
Waiting for a message that never comes
If a (recognized) message never arrives, receive and react will block “forever”
    This is especially likely when waiting for a response from another computer
    Even on the same computer, the sending process may have crashed
Two additional methods, receiveWithin(ms) {...} and reactWithin(ms) {...}, will time out after the given number of milliseconds if no message is received
    On a timeout, the actor is handed the special message TIMEOUT, which you can match with case TIMEOUT
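The idea of waiting with a timeout can be sketched without any actor library, using a java.util.concurrent.LinkedBlockingQueue as a stand-in for a mailbox. TimedMailbox and its methods are hypothetical names for this illustration and are not part of scala.actors; the only library call relied on is the queue's poll(timeout, unit), which returns null when the time runs out.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

// Hypothetical mailbox of String messages (illustration only).
final class TimedMailbox {
  private val queue = new LinkedBlockingQueue[String]()

  def send(message: String): Unit = queue.put(message)

  // Waits at most timeoutMs milliseconds for a message.
  // Returns None on a timeout instead of blocking "forever".
  def receiveWithin(timeoutMs: Long): Option[String] =
    Option(queue.poll(timeoutMs, TimeUnit.MILLISECONDS))
}
```

A caller can then decide what to do on a timeout, for example `mailbox.receiveWithin(500).getOrElse("no reply")`.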
Getting a result back from an Actor
An Actor does not “return” a result, but you can ask it to send a result
    import scala.actors.Actor
    import Actor._

    object SimpleActorTest {
      def main(args: Array[String]) {
        val caller = self
        val adder = actor {
          var sum = 0
          loop {
            receive {
              case (x: Int, y: Int) => sum = x + y
              case "sum" => caller ! sum
            }
          }
        }
        adder ! (2, 2)
        adder ! "sum" // This must be done before calling receive!
        receive {
          case x => println("I got: " + x)
        }
      }
    }
Output:
    I got: 4
Actors and shared state
There’s nothing to prevent Actors from sharing state
    import scala.actors.Actor
    import Actor._

    object SimpleActorTest {
      def main(args: Array[String]) {
        var sum = 0 // this variable is modified by the actor
        val adder = actor {
          loop {
            receive {
              case (x: Int, y: Int) => sum = x + y // updating sum
            }
          }
        }
        adder ! (2, 2)
        println("I got: " + sum)
      }
    }
Output:
    I got: 0
But it’s not a good idea!
    Sending is asynchronous, so the main thread usually prints sum before the actor has updated it
Counting true/false values: Outline
    import scala.actors.Actor

    object ActorTest {
      def main(args: Array[String]) {
        // Create and start some actors
        // Send votes to the actors
        // Tell the actors to quit
      }
      class Counter(id: Int) extends Actor {
        def act = loop {
          react {
            // Handle each case
          }
        }
      }
    }
The main method
    def main(args: Array[String]) {
      // Create and start some actors
      val actors = (1 to 5) map (new Counter(_))
      for (actor <- actors) { actor.start }

      // Send votes to the actors (1000 votes each)
      val random = new scala.util.Random
      for (i <- 1 to 5000) {
        actors(i % actors.length) ! random.nextBoolean
      }

      // Tell the actors to quit
      actors foreach(_ ! "quit")
    }
The Counter class
    class Counter(id: Int) extends Actor {
      var yes, no = 0
      def act = loop {
        react {
          case true => yes += 1
          case false => no += 1
          case "quit" => printf("Counter #%d got %d yes, %d no.\n", id, yes, no)
          case x => println("Counter " + id + " didn't understand " + x)
        }
      }
    }
Typical results
    Counter #4 got 509 yes, 491 no.
    Counter #1 got 468 yes, 532 no.
    Counter #2 got 492 yes, 508 no.
    Counter #3 got 501 yes, 499 no.
    Counter #5 got 499 yes, 501 no.
Counting 3s
In Principles of Parallel Programming, Lin and Snyder use the example of counting how many times the number 3 occurs in a large array
This can be done by creating a number of actors, each of which counts 3s in part of the array
The partial counts are then added to get the total count
My version, with timing information, starts out like this:
    import scala.actors.Actor
    import scala.actors.Actor._

    object Count3s {
      val random = new java.util.Random() // to make up data
      val numberOfActors = 4 // because I have a quad-core machine
Main method
    def main(args: Array[String]) {
      val Size = 1000000
      var seqCount, conCount = 0
      val array = new Array[Int](Size)
      for (i <- 0 until Size) { array(i) = 1 + random.nextInt(3) }

      var startTime = System.currentTimeMillis
      for (runs <- 1 to 1000) seqCount = count3sSequentially(array)
      var finishTime = System.currentTimeMillis
      printf("%5d ms. to find %d threes\n", finishTime - startTime, seqCount)

      startTime = System.currentTimeMillis
      for (runs <- 1 to 1000) conCount = count3sConcurrently(array)
      finishTime = System.currentTimeMillis
      printf("%5d ms. to find %d threes\n", finishTime - startTime, conCount)
    }
We go through the million-element array 1000 times, in order to slow down the program and get more accurate timings
count3sSequentially
    def count3sSequentially(array: Array[Int]) = {
      var count = 0
      for (n <- array; if n == 3) count += 1
      count
    }
In the next slide, the segment method is used to determine a range of indices (“bottom” to “top”) for each actor to work on
count3sConcurrently
    def count3sConcurrently(array: Array[Int]) = {
      val caller = self
      for ((bottom, top) <- segment(array.length, numberOfActors)) {
        val counter = actor {
          // These actors just start; no need to wait for a message
          var count = 0
          for (i <- bottom to top; if array(i) == 3) count += 1
          caller ! count
        }
      }
      var total = 0
      // Get a number from each and every actor before continuing
      for (i <- 1 to numberOfActors) {
        receive {
          case n: Int => total += n
          case _ =>
        }
      }
      total
    }
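For readers without the scala.actors library, the same split/count/combine pattern can be sketched with scala.concurrent.Future from the standard library. This is a swapped-in technique, not the actor-based code above; Count3sWithFutures, count3s, and the parts parameter are names assumed for this illustration, and the ranges here are half-open rather than the inclusive (bottom, top) pairs used by segment.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object Count3sWithFutures {
  // Splits the array into `parts` contiguous ranges, counts the 3s in each
  // range in its own Future, then adds up the partial counts.
  def count3s(array: Array[Int], parts: Int): Int = {
    val n = array.length
    val partials = (0 until parts).map { i =>
      // Half-open range [from, until); Long arithmetic avoids Int overflow.
      val from = (i.toLong * n / parts).toInt
      val until = ((i + 1).toLong * n / parts).toInt
      Future {
        var count = 0
        var j = from
        while (j < until) {
          if (array(j) == 3) count += 1
          j += 1
        }
        count
      }
    }
    // Wait for every partial count before continuing, as the receive loop does.
    Await.result(Future.sequence(partials), 1.minute).sum
  }
}
```

Each Future only reads the shared array and returns its own count, so no mutable state is shared between the workers.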
The segment method
The segment method breaks an array of n locations into k approximately equal parts
    Example: segment(1000, 3) returns Vector((0,333), (334,667), (668,999))
This is just routine programming, but I present it here because it’s surprisingly difficult to get right
    def segment(problemSize: Int, numberOfSegments: Int) = {
      val segmentSize = (problemSize + 1).toDouble / numberOfSegments
      def intCeil(d: Double) = d.ceil.toInt
      for {
        i <- 0 until numberOfSegments
        bottom = intCeil(i * segmentSize)
        top = intCeil((i + 1) * segmentSize - 1) min (problemSize - 1)
      } yield (bottom, top)
    }
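One way to sidestep the floating-point rounding is to use integer arithmetic only. The following is an alternative sketch, not the version above, and it produces slightly different boundaries; the object name Segments is an assumption made for this example.

```scala
object Segments {
  // Splits indices 0 until problemSize into numberOfSegments contiguous,
  // inclusive (bottom, top) ranges whose sizes differ by at most one.
  def segment(problemSize: Int, numberOfSegments: Int): IndexedSeq[(Int, Int)] =
    for (i <- 0 until numberOfSegments) yield {
      // Long arithmetic avoids overflow when i * problemSize exceeds Int range.
      val bottom = (i.toLong * problemSize / numberOfSegments).toInt
      val top = ((i + 1).toLong * problemSize / numberOfSegments).toInt - 1
      (bottom, top)
    }
}
```

Here `Segments.segment(1000, 3)` gives `Vector((0,332), (333,665), (666,999))`. If there are more segments than elements, some pairs have top < bottom, which `bottom to top` treats as an empty range.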
Typical results
[Figure: CPU usage, running sequentially vs. running concurrently]
You can see: One core maxed out versus four cores almost maxed out
Typical results:
    11075 ms. to find 333469 threes
    9146 ms. to find 333469 threes
This is about a 21% speedup
I have four cores! Where’s my 400% speedup?!
Analysis of results
Almost all of the lack of speedup is due to threading overhead
A small part of the problem is having to index explicitly into the array:
    for (i <- bottom to top; if array(i) == 3) count += 1
instead of the more efficient
    for (n <- array; if n == 3) count += 1
In this program, the amount of non-concurrent code is probably not a significant factor
Concurrency does work to speed up programs (on a multicore machine), but don’t expect great benefits
Minimizing Thread creation
I have an array of one million items, and I count the threes in it one thousand times
    Each time I did the counting, I created four Actors (Threads), which were subsequently discarded
What if I created four Actors once, and reused them, thus saving 3996 Thread creations/destructions?
I won’t show you the code
    It takes a significant rewrite, not a minor revision, to do it with reusable Actors
Typical results:
    11054 ms. to find 333253 threes
    8888 ms. to find 333253 threes
We’ve gone from a 21% speedup to a 24% speedup
    That’s not a lot, but it’s pretty consistent
Using conventional shared state
    var sharedTotal = 0

    def adder(n: Int) = synchronized { sharedTotal += n }

    def count3sConcurrently(array: Array[Int]): Int = {
      var counters = List[Counter]()
      for ((bottom, top) <- segment(array.length, numberOfActors)) {
        counters = new Counter(bottom, top) :: counters
      }
      for (counter <- counters) { counter.start }
      for (counter <- counters) { counter.join }
      sharedTotal
    }

    class Counter(val bottom: Int, val top: Int) extends Thread {
      override def run = {
        var count = 0
        for (i <- bottom to top; if array(i) == 3) count += 1
        adder(count)
      }
    }
Result:
    9247 ms. to find 333304000 threes
Essentially the same time as my first attempt
Has a trivial bug (left as an exercise for the reader)
Doing it right
Martin Odersky gives some rules for using Actors effectively
    Actors should not block while processing a message
    Communicate with actors only via messages
        Scala does not prevent actors from sharing state, so it’s (unfortunately) very easy to do
    Prefer immutable messages
        Mutable data in a message is shared state
    Make messages self-contained
        When you get a response from an actor, it may not be obvious what is being responded to
        If the request is immutable, it’s very inexpensive to include the request as part of the response
        The use of case classes often makes messages more readable
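As an illustration of the last two rules, here is a small sketch of immutable, self-contained messages built from case classes. Add, Sum, and handle are hypothetical names chosen for this example; handle is the kind of pattern match an actor's receive or react block might contain, written as an ordinary function so it can be run without an actor library.

```scala
object Messages {
  // Immutable request: safe to pass between actors.
  final case class Add(x: Int, y: Int)

  // Self-contained response: it carries the request it answers.
  final case class Sum(request: Add, value: Int)

  // Returns a response for messages it understands, None otherwise.
  def handle(message: Any): Option[Sum] = message match {
    case request @ Add(x, y) => Some(Sum(request, x + y))
    case _                   => None
  }
}
```

With this shape, a caller that receives `Sum(Add(2, 2), 4)` can tell which request is being answered, even if several requests are outstanding.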