Main features
Expression-oriented
List-oriented, garbage-collected, heap-based
Functional
Functions are first-class values
Largely side-effect free
Strongly, statically typed
Polymorphic type system
Automatic type inference
Pattern matching
Exceptions
Modules
Highly regular and expressive
History
Designed as a Meta Language for an automatic theorem proving system in the mid 70’s by Milner et al.
Standard ML: 1986
SML’97: 1997
Caml: a French version of ML, mid 80’s
O’Caml: an object-oriented extension of Caml, late 90’s
Interpreter interface Read-eval-print loop
Read input expression Reading ends with semicolon (not needed in files) = prompt indicates continuing expression on next line
Evaluate expression it (re)bound to result, in case you want to use it again
Print result repeat
- 3 + 4;
val it = 7 : int
- it + 5;
val it = 12 : int
- it + 5;
val it = 17 : int
Basic ML data types and operations ML is organized around types
each type defines some set of values of that type each type defines a set of operations on values of that type
int ~, +, -, *, div, mod; =, <>, <, >, <=, >=; real, chr
real ~, +, -, *, /; <, >, <=, >= (no equality);
floor, ceil, trunc, round bool: different from int
true, false; =, <>; orelse, andalso string
e.g. "I said \"hi\"\tin dir C:\\stuff\\dir\n" =, <>, ^
char e.g. #"a", #"\n" =, <>; ord, str
Variables and binding Variables declared and initialized with a val binding
- val x:int = 6;
val x = 6 : int
- val y:int = x * x;
val y = 36 : int
Variable bindings cannot be changed! Variables can be bound again, but this shadows the previous definition
- val y:int = y + 1;
val y = 37 : int (* a new, different y *)
Variable types can be omitted they will be inferred by ML based on the type of the r.h.s.
- val z = x * y + 5;
val z = 227 : int
Strong, static typing ML is statically typed: it will check for type
errors statically when programs are entered, not when they’re run
ML is strongly typed: it will catch all type errors (a.k.a. it's type-safe)
But which errors are type errors? Can have weakly, statically typed languages,
and strongly, dynamically typed languages
Type errors Type errors can look weird, given ML’s fancy type system
- asd;
Error: unbound variable or constructor: asd
- 3 + 4.5;
Error: operator and operand don't agree
  operator domain: int * int
  operand:         int * real
  in expression: 3 + 4.5
- 3 / 4;
Error: overloaded variable not defined at type
  symbol: /
  type: int
Records ML records are like C structs
allow heterogeneous element types, but fixed # of elements
A record type: {name:string, age:int} field order doesn’t matter
A record value: {name="Bob Smith", age=20} Can construct record values from expressions
for field values as with any value, can bind record values to variables
- val bob = {name="Bob " ^ "Smith",
=            age=18+num_years_in_college};
val bob = {age=20,name="Bob Smith"} : {age:int,name:string}
Accessing parts of records Can extract record fields using #fieldname function like C’s -> operator, but a regular function
- val bob' = {name = #name(bob),
=             age = #age(bob)+1};
val bob' = {age=21,name="Bob Smith"}
  : {...}
Cannot assign/change a record’s fields an immutable data structure
Tuples Like records, but fields ordered by position, not
label Useful for pairs, triples, etc.
A tuple type: string * int order does matter
A tuple value: ("Joe Stevens", 45) Can construct tuple values from expressions
for elements as with any value, can bind tuple values to variables
- val joe = ("Joe "^"Stevens", 25+num_jobs*10);
val joe = ("Joe Stevens",45) : string * int
Accessing parts of tuples Can extract tuple fields using #n function
- val joe' = (#1(joe), #2(joe)+1);
val joe' = ("Joe Stevens",46)
  : string * int
Cannot assign/change a tuple’s components another immutable data structure
Lists ML lists are built-in, singly-linked lists
homogeneous element types, but variable # of elements
A list type: int list in general: T list, for any type T
A list value: [3, 4, 5] Empty list: [] or nil
null(lst): tests if lst is nil Can create a list value using the […] notation
elements are expressions
- val lst = [1+2, 8 div 2, #age(bob)-15];
val lst = [3,4,5] : int list
Basic operations on lists Add to front of list, non-destructively: :: (an infix operator)
- val lst1 = 3::(4::(5::nil));
val lst1 = [3,4,5] : int list
- val lst2 = 2::lst1;
val lst2 = [2,3,4,5] : int list
[Diagram: lst2 points to a new cell holding 2, whose tail is the existing list lst1 (3, 4, 5, nil)]
Basic operations on lists Adding to the front allocates a new link; the original list is unchanged and still available
- lst1;
val it = [3,4,5] : int list
- lst2;
val it = [2,3,4,5] : int list
More on lists Lists can be nested:
- (3 :: nil) :: (4 :: 5 :: nil) :: nil;
val it = [[3],[4,5]] : int list list
Lists must be homogeneous:
- [3, "hi there"];
Error: operator and operand don't agree
  operator domain: int * int list
  operand:         int * string list
  in expression:
    (3 : int) :: "hi there" :: nil
Manipulating lists Look up the first (“head”) element: hd
- hd(lst1) + hd(lst2);
val it = 5 : int
Extract the rest (“tail”) of the list: tl
- val lst3 = tl(lst1);
val lst3 = [4,5] : int list
- val lst4 = tl(tl(lst3));
val lst4 = [] : int list
- tl(lst4); (* or hd(lst4) *)
uncaught exception Empty
Cannot assign/change a list’s elements another immutable data structure
First-class values All of ML’s data values are first-class
there are no restrictions on how they can be created, used, passed around, bound to names, stored in other data structures, ….
One consequence: can nest records, tuples, lists arbitrarily
an example of orthogonal design
{foo=(3, 5.6, "seattle"), bar=[[3,4], [5,6,7,8], [], [1,2]]}
  : {bar:int list list, foo:int*real*string}
Another consequence: can create initialized, anonymous values directly, as expressions
instead of using a sequence of statements to first declare (allocate named space) and then assign to initialize
Reference data model A variable refers to a value (of whatever type),
uniformly A record, tuple, or list refers to its element values,
uniformly all values are implicitly referred to by pointer
A variable binding makes the l.h.s. variable refer to its r.h.s. value
No implicit copying upon binding, parameter passing,returning from a function, storing in a data structure
like Java, Scheme, Smalltalk, … (all high-level languages) unlike C, where non-pointer values are copied
C arrays?
Reference-oriented values are heap-allocated (logically) scalar values like ints, reals, chars, bools, nil optimized
Garbage collection ML provides several ways to allocate & initialize new
values (…), {…}, […], ::
But it provides no way to deallocate/free values that are no longer being used
Instead, it provides automatic garbage collection when there are no more references to a value (either from
variables or from other objects), it is deemed garbage, and the system will automatically deallocate the value
dangling pointers impossible(could not guarantee type safety without this!)
storage leaks impossible simpler programming can be more efficient! less ability to carefully manage memory use & reuse
GCs exist even for C & C++, as free libraries
Functions Some function definitions:
- fun square(x:int):int = x * x;
val square = fn : int -> int
- fun swap(a:int, b:string):string*int = (b,a);
val swap = fn : int * string -> string * int
Functions are values with types of the form Targ -> Tresult
use tuple type for multiple arguments use tuple type for multiple results (orthogonality!) * binds tighter than ->
Some function calls:
- square(3); (* parens not needed! *)
val it = 9 : int
- swap(3 * 4, "billy" ^ "bob"); (*parens needed*)
val it = ("billybob",12) : string * int
Expression-orientation Function body is a single expression
fun square(x:int):int = x * x not a statement list no return keyword
Like equality in math a call to a function is equivalent to its body,
after substituting the actuals in the call for its formals
square(3) = (x*x)[x→3] = 3*3
There are no statements in ML, only expressions
simplicity, regularity, and orthogonality in action What would be statements in other languages
are recast as expressions in ML
If expression General form: if test then e1 else e2
return value of either e1 or e2,based on whether test is true or false
cannot omit else part
- fun max(x:int, y:int):int =
= if x >= y then x else y;
val max = fn : int * int -> int
Like ?: operator in C don’t need a distinct if statement
Static typechecking of if expression What are the rules for typechecking an if expression?
What’s the type of the result of if?
Some basic principles of typechecking: values are members of types the type of an expression must include all the values that might
possibly result from evaluating that expression at run-time
Requirements on each if expression: the type of the test expression must be bool the type of the result of the if must include whatever values
might be returned from the if the if might return the result of either e1 or e2
A solution: e1 and e2 must have the same type,and that type is the type of the result of the if expression
Let expression let: an expression that introduces a new nested scope
with local variable declarations unlike { … } statements in C, which don’t compute results
like a gcc extension? General form:
let val id1:type1 = e1
    ...
    val idn:typen = en
in ebody end
typei are optional; they’ll be inferred from the ei
Evaluates each ei and binds it to idi, in turn each ei can refer to the previous id1..id(i-1) bindings
Evaluates ebody and returns it as the result of the let expression
ebody can refer to all the id1..idn bindings The idi bindings disappear after ebody is evaluated
they’re in a nested, local scope
Example scopes
- val x = 3;
val x = 3 : int
- fun f(y:int):int =
=     let val z = x + y
=         val x = 4
=     in (let val y = z + x
=         in x + y + z end)
=        + x + y + z
=     end;
val f = fn : int -> int
- val x = 5;
val x = 5 : int
- f(x);
???
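One way to answer the ??? in the scope example is to trace the bindings by hand. The sketch below repeats that example with comments; the name result is introduced here only for illustration. Because ML is statically scoped, f keeps using the x = 3 that was visible when f was declared, so the later val x = 5 only affects the argument.

```sml
val x = 3;

fun f (y : int) : int =
  let
    val z = x + y            (* outer x = 3, so z = 3 + y *)
    val x = 4                (* shadows x for the rest of this let *)
  in
    (let val y = z + x       (* inner y shadows the parameter y *)
     in x + y + z end)       (* 4 + (z + 4) + z *)
    + x + y + z              (* here y is the parameter again *)
  end;

val x = 5;                   (* a new x; f still sees x = 3 *)

(* f 5: z = 8, inner y = 12, inner sum = 24, outer sum = 4 + 5 + 8 = 17 *)
val result = f x;            (* 41 *)
```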
“Statements” For expressions that have no useful result, return empty tuple, of type unit:
- print("hi\n");
hi
val it = () : unit
Expression sequence operator: ; (an infix operator, like C's comma operator) evaluates both “arguments”, returns second one
- val z = (print("hi "); print("there\n"); 3);
hi there
val z = 3 : int
Type inference for functions Declaration of function result type can be omitted
infer function result type from body expression result type
- fun max(x:int, y:int) =
=     if x >= y then x else y;
val max = fn : int * int -> int
Can even omit formal argument type declarations
infer all types based on how arguments are used in body
constraint-based algorithm to do type inference
- fun max(x, y) =
=     if x >= y then x else y;
val max = fn : int * int -> int
Functions with many possible types Some functions could be used on arguments of different
types Some examples:
null: can test an int list, or a string list, or …; in general, works on a list of any type T
null: T list -> bool
hd: similarly works on a list of any type T, and returns an element of that type:
hd: T list -> T
swap: takes a pair of an A and a B, returns a pair of a B and an A:
swap: A * B -> B * A
How to define such functions in a statically-typed language?
in C: can’t (or have to use casts)
in C++: can use templates (but can’t check separately)
in ML: allow functions to have polymorphic types
Polymorphic types A polymorphic type contains one or more type variables
an identifier starting with a quote
'a list
'a * 'b * 'a * 'c
{x:'a, y:'b} list * 'a -> 'b
A polymorphic type describes a set of possible types, where each type variable is replaced with some type
each occurrence of a type variable must be replaced with the same type
('a * 'b * 'a * 'c)['a→int, 'b→string, 'c→real->real] = (int * string * int * (real->real))
Polymorphic functions Functions can have polymorphic types:
null : 'a list -> bool
hd : 'a list -> 'a
tl : 'a list -> 'a list
(op ::): 'a * 'a list -> 'a list
swap : 'a * 'b -> 'b * 'a
Calling polymorphic functions When calling a polymorphic function, need to find
the instantiation of the polymorphic type into a regular type that's appropriate for the actual arguments
caller knows types of actual arguments can compute how to replace type variables so that the
replaced function type matches the argument types derive type of result of call
Example: hd([3,4,5]) type of argument: int list type of function: 'a list -> 'a replace 'a with int to make a match instantiated type of hd for this call: int list -> int type of result of this call: int
Polymorphic values Regular values can be polymorphic, too
nil: 'a list
Each reference to nil finds the right instantiation for that use, separately from other references
(3 :: 4 :: nil) :: (5 :: nil) :: nil
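As a small illustration of instantiation, the sketch below uses hd, a swap function, and nil at several different types. The value names (first_int, first_str, p, q, nested) are invented here for illustration; the comments note which types replace the type variables at each use.

```sml
(* swap is polymorphic: 'a * 'b -> 'b * 'a *)
fun swap (i, j) = (j, i);

(* hd : 'a list -> 'a is instantiated separately at each call *)
val first_int = hd [3, 4, 5];        (* 'a replaced by int *)
val first_str = hd ["a", "b"];       (* 'a replaced by string *)

val p = swap (1, "one");             (* 'a = int,  'b = string *)
val q = swap (true, 2.5);            (* 'a = bool, 'b = real *)

(* Each nil is instantiated on its own: the first two as int list,
   the last one as int list list. *)
val nested = (3 :: 4 :: nil) :: (5 :: nil) :: nil;
```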
Polymorphism versus overloading Polymorphic function: same function usable for many different types
- fun swap(i,j) = (j,i);
val swap = fn : 'a * 'b -> 'b * 'a
Overloaded function: several different functions, but with same name
the name + is overloaded a function of type int*int->int a function of type real*real->real
Resolve overloading to particular function based on:
static argument types (in ML) dynamic argument classes (in object-oriented
languages)
Example of overload resolution
- 3 + 4;
val it = 7 : int
- 3.0 + 4.5;
val it = 7.5 : real
- (op +); (* which? default to int *)
val it = fn : int*int -> int
- (op +):real*real->real;
val it = fn : real*real -> real
Equality types Built-in = is polymorphic over all types that “admit
equality” i.e., any type except those containing reals or functions
Use ''a, ''b, etc. to stand for these equality types
- fun is_same(x, y) = if x = y then "yes" else "no";
val is_same = fn : ''a * ''a -> string
- is_same(3, 4);
val it = "no" : string
- is_same({l=[3,4,5],h=("a","b"),w=nil}, {l=[3,4,5],h=("a","b"),w=nil});
val it = "yes" : string
- is_same(3.4, 3.4);
Error: operator and operand don't agree [equality type required]
  operator domain: ''Z * ''Z
  operand:         real * real
  in expression: is_same (3.4,3.4)
Loops, using recursion ML has no looping statement or expression
Instead, use recursion to compute a result
fun append(l1, l2) =
  if null(l1) then l2
  else hd(l1) :: append(tl(l1), l2)
val lst1 = [3, 4]
val lst2 = [5, 6, 7]
val lst3 = append(lst1, lst2)
Tail recursion Tail recursion: recursive call is last operation before returning
can be implemented just as efficiently as iteration, in both time and space, since tail-caller isn’t needed after callee returns
Some tail-recursive functions:
fun last(lst) =
  let val tail = tl(lst)
  in if null(tail) then hd(lst) else last(tail) end
fun includes(lst, x) =
  if null(lst) then false
  else if hd(lst) = x then true
  else includes(tl(lst), x)
append?
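The append defined earlier is not tail-recursive, because the :: is applied after the recursive call returns. One possible tail-recursive variant, sketched below, reverses the first list onto an accumulator and then reverses that onto the second list. The names rev_onto and append_tr are hypothetical and not part of the original slides; the price of the constant stack depth is a second pass over the first list.

```sml
(* rev_onto(lst, acc): push the elements of lst onto acc one by one,
   so they end up in reverse order in front of acc. The recursive
   call is the last operation, so this is tail-recursive. *)
fun rev_onto (lst, acc) =
  if null lst then acc
  else rev_onto (tl lst, hd lst :: acc);

(* Reverse l1, then reverse it again onto l2. *)
fun append_tr (l1, l2) = rev_onto (rev_onto (l1, nil), l2);

val lst3 = append_tr ([3, 4], [5, 6, 7]);   (* [3,4,5,6,7] *)
```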
Converting to tail-recursive form Can often rewrite a recursive function into a tail-recursive one
introduce a helper function (usually nested)
the helper function has an extra accumulator argument
the accumulator holds the partial result computed so far
accumulator returned as full result when base case reached
This isn’t tail-recursive:
fun fact(n) =
  if n <= 1 then 1
  else fact(n-1) * n
This is:
fun fact(n0) =
  let fun fact_helper(n, res) =
        if n <= 1 then res
        else fact_helper(n-1, res*n)
  in fact_helper(n0, 1) end
Pattern matching Pattern-matching: a convenient syntax for extracting
components of compound values (tuple, record, or list) A pattern looks like an expression to build a compound
value, but with variable names to be bound in some places
cannot use the same variable name more than once Use pattern in place of variable on l.h.s. of val binding
anywhere val can appear: either at top-level or in let (orthogonality & regularity)
- val x = (false, 17);
val x = (false,17) : bool*int
- val (a, b) = x;
val a = false : bool
val b = 17 : int
- val (root1, root2) = quad_roots(3.0, 4.0, ~5.0);
val root1 = 0.786299647847 : real
val root2 = ~2.11963298118 : real
More patterns List patterns:
- val [x,y] = 3::4::nil;
val x = 3 : int
val y = 4 : int
- val (x::y::zs) = [3,4,5,6,7];
val x = 3 : int
val y = 4 : int
val zs = [5,6,7] : int list
Constants (ints, bools, strings, chars, nil) can be patterns:
- val (x, true, 3, "x", z) = (5.5, true, 3, "x", [3,4]);
val x = 5.5 : real
val z = [3,4] : int list
If don’t care about some component, can use a wildcard: _
- val (_::_::zs) = [3,4,5,6,7];
val zs = [5,6,7] : int list
Patterns can be nested, too orthogonality
Function argument patterns Formal parameter of a fun declaration can be a pattern
- fun swap (i, j) = (j, i);
val swap = fn : 'a * 'b -> 'b * 'a
- fun swap2 (p : 'a * 'b) = (#2 p, #1 p); (* #n needs the tuple type to be known *)
val swap2 = fn : 'a * 'b -> 'b * 'a
- fun swap3 p = let val (a,b) = p in (b,a) end;
val swap3 = fn : 'a * 'b -> 'b * 'a
- fun best_friend {student={name=n,age=_}, grades=_,
=                  best_friends={name=f,age=_}::_} =
=     n ^ "'s best friend is " ^ f;
val best_friend = fn
  : {best_friends:{age:'a, name:string} list, grades:'b, student:{age:'c, name:string}} -> string
In general, patterns allowed wherever binding occurs
Multiple cases Often a function’s implementation can be broken down
into several different cases, based on the argument value ML allows a single function to be declared via several
cases Each case identified using pattern-matching
cases checked in order, until first matching case
- fun fib 0 = 0
=   | fib 1 = 1
=   | fib n = fib(n-1) + fib(n-2);
val fib = fn : int -> int
- fun null nil = true
=   | null (_::_) = false;
val null = fn : 'a list -> bool
- fun append(nil, lst) = lst
=   | append(x::xs,lst) = x :: append(xs,lst);
val append = fn : 'a list * 'a list -> 'a list
The function has a single type all cases must have same argument and result types
Missing cases What if we don’t provide enough cases?
ML gives a warning message “match nonexhaustive” when function is declared (statically)
ML raises an exception “nonexhaustive match failure” if invoked and no existing case applies (dynamically)
- fun first_elem (x::xs) = x;
Warning: match nonexhaustive
  x :: xs => ...
val first_elem = fn : 'a list -> 'a
- first_elem [3,4,5];
val it = 3 : int
- first_elem [];
uncaught exception nonexhaustive match failure
How would you provide an implementation of this missing case for nil?
- fun first_elem (x::xs) = x
=   | first_elem nil = ???
Exceptions If get in a situation where you can’t produce a normal value of the right type, then can raise an exception
aborts out of normal execution
can be handled by some caller
reported as a top-level “uncaught exception” if not handled
Step 1: declare an exception that can be raised
- exception EmptyList;
exception EmptyList
Step 2: use the raise expression where desired
- fun first_elem (x::xs) = x
=   | first_elem nil = raise EmptyList;
val first_elem = fn : 'a list -> 'a (* no warning! *)
- first_elem [3,4,5];
val it = 3 : int
- first_elem [];
uncaught exception EmptyList
Handling exceptions Add handler clause to expressions to handle (some) exceptions raised in that expression
expr handle exn_name1 => expr1
          | exn_name2 => expr2
          ...
          | exn_namen => exprn
if expr raises exn_namei, then evaluate and return expri instead
- fun second_elem l = first_elem (tl l);
val second_elem = fn : 'a list -> 'a
- (second_elem [3] handle EmptyList => ~1) + 5;
val it = 4 : int
Exceptions with arguments Can have exceptions with arguments
- exception IOError of int;
exception IOError of int
- (... raise IOError(~3) ...)
    handle IOError(code) => ... code ...
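A complete version of that skeleton might look like the sketch below. The functions read_block and safe_read are hypothetical stand-ins for the elided code, invented only to show the exception argument travelling from raise to handle. Note that negative literals in ML are written with ~, not -.

```sml
exception IOError of int;

(* Hypothetical operation: fails with error code ~3 on negative input. *)
fun read_block (n : int) : int =
  if n < 0 then raise IOError (~3) else n * 2;

(* The handler pattern binds the exception's argument to code. *)
fun safe_read (n : int) : int =
  read_block n handle IOError code => code;

val ok = safe_read 4;          (* 8: no exception raised *)
val failed = safe_read (~1);   (* ~3: the code carried by IOError *)
```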
Type synonyms Can give a name to a type, for convenience
name and type are equivalent, interchangeable
- type person = {name:string, age:int};
type person = {age:int, name:string}
- val p:person = {name="Bob", age=18};
val p = {age=18,name="Bob"} : person
- val p2 = p;
val p2 = {age=18,name="Bob"} : person
- val p3:{name:string, age:int} = p;
val p3 = {age=18,name="Bob"}
: {age:int, name:string}
Polymorphic type synonyms Can define polymorphic synonyms
- type 'a stack = 'a list;
type 'a stack = 'a list
- val emptyStack:'a stack = nil;
val emptyStack = [] : 'a stack
Synonyms can have multiple type parameters
- type (''key, 'value) assoc_list =
=     (''key * 'value) list;
type ('a,'b) assoc_list = ('a * 'b) list
- val grades:(string,int) assoc_list =
=     [("Joe", 84), ("Sue", 98), ("Dude", 44)];
val grades = [("Joe",84),("Sue",98),("Dude",44)]
  : (string,int) assoc_list
Datatypes Users can define their own (polymorphic) data
structures a new type, unlike type synonyms
Simple example: ML’s version of enumerated types
- datatype sign = Positive | Zero | Negative;
datatype sign = Negative | Positive | Zero
declares a type (sign) and a set of alternative constructor values of that type (Positive etc.)
order of constructors doesn’t matter
Another example: bool
- datatype bool = true | false;
datatype bool = false | true
Using datatypes Can use constructor values as
regular values Their type is a regular type
- fun signum(x) =
= if x > 0 then Positive
= else if x = 0 then Zero
= else Negative;
val signum = fn : int -> sign
Datatypes and pattern-matching Constructor values can be used in patterns, too
- fun signum(Positive) = 1
= | signum(Zero) = 0
= | signum(Negative) = ~1;
val signum = fn : sign -> int
Datatypes with data Each constructor can have data of particular type stored with it
constructors with data are functions that allocate & initialize new values with that “tag”
- datatype LiteralExpr =
=     Nil |
=     Integer of int |
=     String of string;
datatype LiteralExpr =
  Integer of int | Nil | String of string
- Nil;
val it = Nil : LiteralExpr
- Integer(3);
val it = Integer 3 : LiteralExpr
- String("xyz");
val it = String "xyz" : LiteralExpr
Pattern-matching on datatypes The only way to access components of a
value of a datatype is via pattern-matching Constructor “calls” can be used in patterns
to test for and take apart values with that “tag”
- fun toString(Nil) = "nil"
= | toString(Integer(i)) = Int.toString(i)
= | toString(String(s)) = "\"" ^ s ^ "\"";
val toString = fn : LiteralExpr -> string
Recursive datatypes Many datatypes are recursive: one or more constructors are defined in terms of the datatype itself
- datatype Expr =
=     Nil |
=     Integer of int |
=     String of string |
=     Variable of string |
=     Tuple of Expr list |
=     BinOpExpr of {arg1:Expr, operator:string, arg2:Expr} |
=     FnCall of {function:string, arg:Expr};
datatype Expr = ...
- val e1 = Tuple [Integer(3), String("hi")]; (* (3,"hi") *)
val e1 = Tuple [Integer 3,String "hi"] : Expr
(Nil, Integer, and String of LiteralExpr are shadowed)
Another example Expr value
(* f(3+x, "hi") *)
- val e2 =
= FnCall {
= function="f",
= arg=Tuple [
= BinOpExpr {arg1=Integer(3),
= operator="+",
= arg2=Variable("x")},
= String("hi")]};
val e2 = … : Expr
Recursive functions over recursive datatypes Often manipulate recursive datatypes with recursive functions
pattern of recursion in function matches pattern of recursion in datatype
- fun toString(Nil) = "nil"
=   | toString(Integer(i)) = Int.toString(i)
=   | toString(String(s)) = "\"" ^ s ^ "\""
=   | toString(Variable(name)) = name
=   | toString(Tuple(elems)) =
=       "(" ^ listToString(elems) ^ ")"
=   | toString(BinOpExpr{arg1,operator,arg2}) =
=       toString(arg1) ^ " " ^ operator ^ " " ^ toString(arg2)
=   | toString(FnCall{function,arg}) =
=       function ^ "(" ^ toString(arg) ^ ")"
=   …;
val toString = fn : Expr -> string
Mutually recursive functions and datatypes If two or more functions are defined in terms of each other, recursively, then must be declared together, and linked with and
fun toString(...) = ... listToString ...
and listToString([]) = ""
  | listToString([elem]) = toString(elem)
  | listToString(e::es) = toString(e) ^ "," ^ listToString(es);
If two or more mutually recursive datatypes, then declare them together, linked by and
datatype Stmt = ... Expr ...
and Expr = ... Stmt ...
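Putting the fragments from the last few slides together gives the sketch below: the Expr datatype, then toString and listToString declared as one mutually recursive group. It assumes the slide cases are the complete definition (the elided "…" is treated as empty), and e2 is the FnCall example value shown earlier.

```sml
datatype Expr =
    Nil
  | Integer of int
  | String of string
  | Variable of string
  | Tuple of Expr list
  | BinOpExpr of {arg1 : Expr, operator : string, arg2 : Expr}
  | FnCall of {function : string, arg : Expr};

(* toString and listToString call each other, so they are linked by and. *)
fun toString Nil = "nil"
  | toString (Integer i) = Int.toString i
  | toString (String s) = "\"" ^ s ^ "\""
  | toString (Variable name) = name
  | toString (Tuple elems) = "(" ^ listToString elems ^ ")"
  | toString (BinOpExpr {arg1, operator, arg2}) =
      toString arg1 ^ " " ^ operator ^ " " ^ toString arg2
  | toString (FnCall {function, arg}) =
      function ^ "(" ^ toString arg ^ ")"
and listToString [] = ""
  | listToString [elem] = toString elem
  | listToString (e :: es) = toString e ^ "," ^ listToString es;

(* f(3+x, "hi") as an Expr value *)
val e2 =
  FnCall {function = "f",
          arg = Tuple [BinOpExpr {arg1 = Integer 3,
                                  operator = "+",
                                  arg2 = Variable "x"},
                       String "hi"]};

val s2 = toString e2;
```

Because the argument here is itself a Tuple, the function-call case and the tuple case each add a pair of parentheses to the result.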
A convenience: record pattern syntactic sugar Instead of writing {a=a, b=b, c=c} as a pattern, can write {a,b,c}
E.g.
... BinOpExpr{arg1,operator,arg2} ...
is short-hand for
... BinOpExpr{arg1=arg1,
              operator=operator,
              arg2=arg2} ...
Polymorphic datatypes Datatypes can be polymorphic
- datatype 'a List = Nil
=                  | Cons of 'a * 'a List;
datatype 'a List = Cons of 'a * 'a List | Nil
- val lst = Cons(3, Cons(4, Nil));
val lst = Cons (3, Cons (4, Nil)) : int List
- fun Null(Nil) = true
=   | Null(Cons(_,_)) = false;
val Null = fn : 'a List -> bool
- fun Hd(Nil) = raise Empty
=   | Hd(Cons(h,_)) = h;
val Hd = fn : 'a List -> 'a
- fun Sum(Nil) = 0
=   | Sum(Cons(x,xs)) = x + Sum(xs);
val Sum = fn : int List -> int
Modules for name-space management A file full of types and functions can be cumbersome to
manage Would like some hierarchical organization to names
Modules allow grouping declarations to achieve a hierarchical name-space
ML structure declarations create modules
- structure Assoc_List = struct
=   type (''k,'v) assoc_list = (''k*'v) list
=   val empty = nil
=   fun store(alist, key, value) = ...
=   fun fetch(alist, key) = ...
= end;
structure Assoc_List : sig
  type ('a,'b) assoc_list = ('a*'b) list
  val empty : 'a list
  val store : (''a*'b) list * ''a * 'b -> (''a*'b) list
  val fetch : (''a*'b) list * ''a -> 'b
end
Using structures To access declarations in a structure, can use dot notation
- val league = Assoc_List.empty;
val league = [] : 'a list
- val league = Assoc_List.store(league, "Mariners", {..});
val league = [("Mariners", {..})] : (string * {..}) list
- ...
- Assoc_List.fetch(league, "Mariners");
val it = {losses=4,wins=78} : {..}
Other definitions of empty, store, fetch, etc. don’t clash
Common names can be reused by different structures
The open declaration To avoid typing a lot of structure names, can use the open struct_name declaration to introduce local synonyms for all the declarations in a structure
usually in a let, local, or within some other structure
fun add_first_team(name) =
  let
    open Assoc_List (* imports assoc_list, empty, store, fetch *)
    val init = {wins=0,losses=0}
  in
    store(empty,name,init)
    (* Assoc_List.store(Assoc_List.empty, name, init) *)
  end
Modules for encapsulation Want to hide details of data structure implementations from
clients, i.e., data abstraction simplify interface to clients allow implementation to change without affecting clients
In C++ and Java, use public/private annotations In ML:
define a signature that specifies the desired interface specify the signature with the structure declaration
E.g. a signature that hides the implementation of assoc_list:
- signature ASSOC_LIST = sig
=   type (''k,'v) assoc_list (* no rhs! *)
=   val empty : (''k,'v) assoc_list
=   val store : (''k,'v) assoc_list * ''k * 'v ->
=               (''k,'v) assoc_list
=   val fetch : (''k,'v) assoc_list * ''k -> 'v
= end;
signature ASSOC_LIST = sig ... end
Specifying the signatures of structures Specify desired signature of structure when declaring it:
- structure Assoc_List :> ASSOC_LIST = struct
=   type (''k,'v) assoc_list = (''k*'v) list
=   val empty = nil
=   fun store(alist, key, value) = ...
=   fun fetch(alist, key) = ...
=   fun helper(...) = ...
= end;
structure Assoc_List : ASSOC_LIST
The structure’s interface is the given one, not the default interface that exposes everything
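The slides leave the bodies of store and fetch elided. The sketch below fills them in with one plausible implementation so the signature and the opaque ascription can be tried end to end. The function bodies and the KeyNotFound exception are assumptions made for illustration, not the original author's code.

```sml
signature ASSOC_LIST = sig
  type (''k, 'v) assoc_list
  val empty : (''k, 'v) assoc_list
  val store : (''k, 'v) assoc_list * ''k * 'v -> (''k, 'v) assoc_list
  val fetch : (''k, 'v) assoc_list * ''k -> 'v
end;

(* Hypothetical: raised by fetch when the key is absent. *)
exception KeyNotFound;

structure Assoc_List :> ASSOC_LIST = struct
  type (''k, 'v) assoc_list = (''k * 'v) list
  val empty = nil
  (* Assumed behaviour: replace an existing binding, else add at the end. *)
  fun store (nil, key, value) = [(key, value)]
    | store ((k, v) :: rest, key, value) =
        if k = key then (key, value) :: rest
        else (k, v) :: store (rest, key, value)
  fun fetch (nil, key) = raise KeyNotFound
    | fetch ((k, v) :: rest, key) =
        if k = key then v else fetch (rest, key)
end;

val league =
  Assoc_List.store (Assoc_List.empty, "Mariners", {wins = 78, losses = 4});
val record = Assoc_List.fetch (league, "Mariners");
```

Because of the :> ascription, clients can only build and inspect league through empty, store, and fetch; the fact that it is a list of pairs is not visible outside the structure.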
Hidden implementation Now clients can’t see implementation, nor guess it
- val teams = Assoc_List.empty;
val teams = - : (''a,'b) Assoc_List.assoc_list
- val teams' = "Mariners"::"Yankees"::teams;
Error: operator and operand don't agree
  operator: string * string list
  operand:  string * (''Z,'Y) Assoc_List.assoc_list
- Assoc_List.helper(…);
Error: unbound variable helper in path Assoc_List.helper
- type Records = (string,…) Assoc_List.assoc_list;
type Records = (string,…) Assoc_List.assoc_list
- fun sortStandings(nil:Records):Records = nil
=   | sortStandings(pivot::rest) = ...;
Error: pattern and constraint don't agree
  pattern:    'Z list
  constraint: Records
  in pattern: nil : Records
An extended example:binary trees Stores elements in sorted order
enables faster membership testing, printing out in sorted order
datatype 'a BTree =
EmptyBTree
| BTNode of 'a * 'a BTree * 'a BTree
Some functions on binary trees
fun insert(x, EmptyBTree) =
BTNode(x, EmptyBTree, EmptyBTree)
| insert(x, n as BTNode(y,t1,t2)) =
if x = y then n
else if x < y then
BTNode(y, insert(x, t1), t2)
else BTNode(y, t1, insert(x, t2))
fun member(x, EmptyBTree) = false
| member(x, BTNode(y,t1,t2)) =
if x = y then true
else if x < y then member(x, t1)
    else member(x, t2)
What are the types of these functions?
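One way to check an answer to that question is to bind the functions to annotated names and let the typechecker confirm the annotation. In the sketch below, insert_int, member_int, and t are names introduced here for illustration. The underlying reasoning is that < is overloaded rather than polymorphic, and under the SML'97 defaulting rule an unresolved overloaded operator is taken at int, so both functions end up working on int trees only.

```sml
datatype 'a BTree =
    EmptyBTree
  | BTNode of 'a * 'a BTree * 'a BTree;

fun insert (x, EmptyBTree) = BTNode (x, EmptyBTree, EmptyBTree)
  | insert (x, n as BTNode (y, t1, t2)) =
      if x = y then n
      else if x < y then BTNode (y, insert (x, t1), t2)
      else BTNode (y, t1, insert (x, t2));

fun member (x, EmptyBTree) = false
  | member (x, BTNode (y, t1, t2)) =
      if x = y then true
      else if x < y then member (x, t1)
      else member (x, t2);

(* These annotations typecheck because < defaulted to int. *)
val insert_int : int * int BTree -> int BTree = insert;
val member_int : int * int BTree -> bool = member;

val t = insert (3, insert (7, insert (5, EmptyBTree)));
```

This restriction to int is what motivates passing the comparison functions in as arguments.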
First-class functions Can make code more reusable by parameterizing it by
functions as well as values and types Simple technique: treat functions as first-class values
function values can be created, used, passed around, bound to names, stored in other data structures, etc., just like all other ML values
- fun int_lt(x:int, y:int) = x < y;
val int_lt = fn : int * int -> bool
- int_lt(3,4);
val it = true : bool
- val f = int_lt;
val f = fn : int * int -> bool
- f(3,4);
val it = true : bool
Passing functions to functions A function can often be made more flexible if takes another
function as an argument Example:
parameterize binary tree insert & member functions by the = and < comparisons to use
parameterize the quicksort algorithm by the < comparison to use parameterize a list search function by the pattern being searched
for
(* find(test_fn:'a -> bool, lst:'a list):'a *)
- exception NotFound;
- fun find(test_fn, nil) = raise NotFound
=   | find(test_fn, elem::elems) =
=       if test_fn(elem) then elem
=       else find(test_fn, elems);
val find = fn : ('a -> bool) * 'a list -> 'a
- fun is_good_grade(g) = g >= 90;
val is_good_grade = fn : int -> bool
- find(is_good_grade, [85,72,92,98,84]);
val it = 92 : int
Binary tree functions, revisited
- fun insert(x, EmptyBTree, eq, lt) =
=       BTNode(x, EmptyBTree, EmptyBTree)
=   | insert(x, n as BTNode(y,t1,t2), eq, lt) =
=       if eq(x,y) then n
=       else if lt(x,y) then BTNode(y, insert(x, t1, eq, lt), t2)
=       else BTNode(y, t1, insert(x, t2, eq, lt));
val insert = fn : 'a * 'a BTree * ('a * 'a -> bool) * ('a * 'a -> bool) -> 'a BTree
- fun member(x, EmptyBTree, eq, lt) = false
=   | member(x, BTNode(y,t1,t2), eq, lt) =
=       if eq(x,y) then true
=       else if lt(x,y) then member(x, t1, eq, lt)
=       else member(x, t2, eq, lt);
val member = fn : 'a * 'a BTree * ('a * 'a -> bool) * ('a * 'a -> bool) -> bool
Calling binary tree functions
- val t = insert(5, EmptyBTree, op=, op<);
val t = BTNode (5,EmptyBTree,EmptyBTree) : int BTree
- val t = insert(2, t, op=, op<);
- val t = insert(3, t, op=, op<);
- val t = insert(7, t, op=, op<);
- member(2, t, op=, op<);
val it = true : bool
- member(4, t, op=, op<);
val it = false : bool
- ... definitions of person type, person_eq and person_lt functions, and p1 value
- val pt = insert(p1, EmptyBTree, person_eq, person_lt);
val pt = ... : person BTree
Storing functions in data structures It’s a pain to keep passing around the eq and lt functions to all
calls of insert and member It’s unreliable to depend on clients to pass in the right
functions
Idea: store the functions in the tree itself
local
  datatype 'a BT = EmptyBT | BTNode of 'a * 'a BT * 'a BT
  fun ins(x, tree, eq, lt) = ... previous insert ...
  fun mbr(x, tree, eq, lt) = ... previous member ...
in
  datatype 'a BTree = BTree of {tree:'a BT, eq:'a * 'a -> bool, lt:'a * 'a -> bool}
  fun emptyBTree(eq,lt) =
    BTree{tree=EmptyBT, eq=eq, lt=lt}
  fun insert(x, BTree{tree, eq, lt}) =
    BTree{tree=ins(x, tree, eq, lt), eq=eq, lt=lt}
  fun member(x, BTree{tree, eq, lt}) =
    mbr(x, tree, eq, lt)
end
Records containing functions are ML’s version of objects!
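With the elided ins and mbr bodies filled in from the earlier parameterized insert and member, the whole idea can be sketched as below. The comparison functions size_eq and size_lt, which order strings by length, are hypothetical and chosen only to show that the tree carries its own notion of equality. The sketch assumes a compiler that, like SML/NJ, lets the locally declared 'a BT type appear inside the exported datatype.

```sml
local
  datatype 'a BT = EmptyBT | BTNode of 'a * 'a BT * 'a BT
  fun ins (x, EmptyBT, eq, lt) = BTNode (x, EmptyBT, EmptyBT)
    | ins (x, n as BTNode (y, t1, t2), eq, lt) =
        if eq (x, y) then n
        else if lt (x, y) then BTNode (y, ins (x, t1, eq, lt), t2)
        else BTNode (y, t1, ins (x, t2, eq, lt))
  fun mbr (x, EmptyBT, eq, lt) = false
    | mbr (x, BTNode (y, t1, t2), eq, lt) =
        if eq (x, y) then true
        else if lt (x, y) then mbr (x, t1, eq, lt)
        else mbr (x, t2, eq, lt)
in
  datatype 'a BTree =
    BTree of {tree : 'a BT, eq : 'a * 'a -> bool, lt : 'a * 'a -> bool}
  fun emptyBTree (eq, lt) = BTree {tree = EmptyBT, eq = eq, lt = lt}
  fun insert (x, BTree {tree, eq, lt}) =
    BTree {tree = ins (x, tree, eq, lt), eq = eq, lt = lt}
  fun member (x, BTree {tree, eq, lt}) = mbr (x, tree, eq, lt)
end;

(* Hypothetical comparisons: strings are "equal" when equally long. *)
fun size_eq (a : string, b : string) = size a = size b;
fun size_lt (a : string, b : string) = size a < size b;

val st = insert ("kiwi", insert ("fig", emptyBTree (size_eq, size_lt)));
val found = member ("pear", st);      (* true: same length as "kiwi" *)
val missing = member ("banana", st);  (* false: no 6-letter entry *)
```

Clients pass the comparisons once, to emptyBTree, and every later insert and member call reuses them.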
A common pattern: map Pattern: take a list and produce a new list, where each
element of the output is calculated from the corresponding element of the input
map captures this pattern
map: ('a -> 'b) * 'a list -> 'b list
[not quite the type of ML’s predefined map; stay tuned]
Example: have a list of Fahrenheit temperatures for Seattle days want to give a list of temps to friend in England
- fun f2c(f_temp) = (f_temp - 32.0) * 5.0/9.0;
val f2c = fn : real -> real
- val f_temps = [56.4, 72.2, 68.4, 78.4, 45.0];
val f_temps = [56.4,72.2,68.4,78.4,45.0] : real list
- val c_temps = map(f2c, f_temps);
val c_temps = [13.556,22.333,20.222,25.778,7.222] : real list
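A map with the tupled type given above is not predefined, but a minimal version can be written with two pattern-matching cases, as sketched here. This definition is an illustration rather than the library's; it shadows the predefined curried map for the rest of the session.

```sml
(* map : ('a -> 'b) * 'a list -> 'b list *)
fun map (f, nil) = nil
  | map (f, x :: xs) = f x :: map (f, xs);

fun f2c f_temp = (f_temp - 32.0) * 5.0 / 9.0;

val f_temps = [56.4, 72.2, 68.4, 78.4, 45.0];
val c_temps = map (f2c, f_temps);

(* The same map works at other types, e.g. int -> string. *)
val labels = map (Int.toString, [1, 2, 3]);   (* ["1","2","3"] *)
```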
Another common pattern: filter Pattern: take a list and produce a new list of all the
elements of the first list that pass some test (a predicate)
filter captures this pattern
filter: ('a -> bool) * 'a list -> 'a list
[not quite the type of ML’s predefined filter; stay tuned]
Example: have a list of day temps want a list of nice days
- fun is_nice_day(temp) = temp >= 70.0;
val is_nice_day = fn : real -> bool
- val nice_days = filter(is_nice_day, f_temps);
val nice_days = [72.2,78.4] : real list
Another common pattern: find
Pattern: take a list and return the first element that passes some test, raising an exception if no element passes the test
find captures this pattern
find: ('a -> bool) * 'a list -> 'a
exception NotFound
[not quite the type of ML’s predefined find; stay tuned]
Example: find first nice day
- val a_nice_day = find(is_nice_day, f_temps);
val a_nice_day = 72.2 : real
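The three patterns above are given only by type. One possible set of definitions with those tupled types is sketched below; these are assumed definitions for illustration, not the predefined ones, and they shadow the top-level map for the rest of the file. The name no_hot_day is also an assumption.

```sml
exception NotFound

(* Tupled versions, matching the types on the slides. *)
fun map (f, nil) = nil
  | map (f, x :: xs) = f x :: map (f, xs)

fun filter (pred, nil) = nil
  | filter (pred, x :: xs) =
      if pred x then x :: filter (pred, xs) else filter (pred, xs)

fun find (pred, nil) = raise NotFound
  | find (pred, x :: xs) = if pred x then x else find (pred, xs)

fun f2c f_temp = (f_temp - 32.0) * 5.0 / 9.0
fun is_nice_day temp = temp >= 70.0

val f_temps = [56.4, 72.2, 68.4, 78.4, 45.0]
val c_temps = map (f2c, f_temps)
val nice_days = filter (is_nice_day, f_temps)
val a_nice_day = find (is_nice_day, f_temps)

(* No element passes the test, so find raises NotFound. *)
val no_hot_day =
  (find (fn t => t > 100.0, f_temps); false) handle NotFound => true
```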
Anonymous functions
Map functions and predicate functions are often pretty simple, only used as an argument to map, etc.; they don’t merit their own name
Can directly write anonymous function expressions:
fn pattern_formal => expr_body
Examples:
- fn(x)=> x + 1;
val it = fn : int -> int
- (fn(x)=> x + 1)(8);
val it = 9 : int
- map(fn(f)=> (f - 32.0) * 5.0/9.0, f_temps);
val it = [13.556,...] : real list
- filter(fn(t)=> t < 60.0, f_temps);
val it = [56.4,45.0] : real list
Fun vs. fn
fn expressions are a primitive notion
val declarations are a primitive notion
fun declarations are just a convenient syntax for val + fn
  fun f arg = expr
is syntactic sugar for
  val rec f = (fn arg => expr)
  fun succ(x) = x + 1
is syntactic sugar for
  val rec succ = (fn(x) => x + 1)
Explains why the type of a fun declaration prints like a val declaration with a fn value
  val succ = fn : int -> int
Attributes of good design:
orthogonality of primitives
syntactic sugar for common combinations
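The role of rec only shows up when the function calls itself, which succ does not. A small sketch with a recursive function written both ways; the names fact, fact', and len are illustrative assumptions.

```sml
(* fun form *)
fun fact n = if n = 0 then 1 else n * fact (n - 1)

(* The same function as val rec + fn; rec puts fact' in scope inside its own body. *)
val rec fact' = fn n => if n = 0 then 1 else n * fact' (n - 1)

(* Several fun clauses correspond to one fn with several match rules. *)
val rec len = fn nil => 0 | _ :: xs => 1 + len xs
```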
Nested functions
An example:
- fun good_days(good_temp:real, temps:real list):real list =
    filter(fn(temp)=> temp >= good_temp, temps);
val good_days = fn : real * real list -> real list
(* good days in Seattle: *)
- good_days(70.0, f_temps);
val it = [72.2,78.4] : real list
(* good days in Fairbanks: *)
- good_days(32.0, f_temps);
val it = [56.4,72.2,68.4,78.4,45.0] : real list
What’s interesting about the anonymous function expression
fn(temp)=> temp >= good_temp ?
Nested functions and scoping
If functions can be written nested within other functions (whether named in a let expression, or anonymous), then they can reference local variables in the enclosing function’s scope
Variables used in a function but declared outside it are called free variables
Makes nested functions a lot more useful in practice
More than just hiding helper functions
Beyond what can be done with function pointers in C/C++
C functions only have globals as free variables
Akin to inner classes in Java
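A sketch of the named, let-bound case: the nested function below uses lo and hi, which are formals of the enclosing function and therefore free variables of the nested one. The names count_in_range, in_range, and n are assumptions for illustration.

```sml
fun count_in_range (lo : int, hi : int, xs : int list) : int =
  let
    (* in_range is nested; lo and hi are its free variables *)
    fun in_range x = lo <= x andalso x <= hi
  in
    length (List.filter in_range xs)
  end

val n = count_in_range (3, 6, [1, 3, 5, 7, 6])
```

A C function pointer passed to a filtering routine could not see lo and hi unless they were globals or were passed through an extra argument.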
Returning functions from functions
If functions are first-class, then should be able to create and return them
Example: function composition
- fun compose(f,g) = (fn(x) => f(g(x)));
val compose = fn : ('b -> 'c) * ('a -> 'b) -> ('a -> 'c)
- fun square(x) = x*x;
val square = fn : int -> int
- fun double(y) = y+y;
val double = fn : int -> int
- val double_square = compose(double, square);
val double_square = fn : int -> int
- double_square(3);
val it = 18 : int
- (compose(square,double))(3);
val it = 36 : int
The infix o operator is ML’s predefined compose:
- map(square o double, [3,4,5]);
val it = [36,64,100] : int list
Currying
A curried function takes some arguments and then computes & returns a function which takes additional arguments
The result function can be applied to many different arguments, without having to pass in the first arguments again
Example: a curried version of map:
- fun map(f) =
    (fn(nil) => nil
      | (x::xs) => f(x)::map(f)(xs));
val map = fn : ('a->'b) -> 'a list -> 'b list
- map(square)([3,4,5]);
val it = [9,16,25] : int list
- val squares = map(square);   (* "partial application" *)
val squares = fn : int list -> int list
- squares([3,4,5]);
val it = [9,16,25] : int list
- squares([9,10]);
val it = [81,100] : int list
Clean syntactic sugar for currying
Allowing multiple formal argument patterns gives a curried function
Application ("function calling") is written as juxtaposition, without parentheses
juxtaposition associates left-to-right; higher precedence than infix operators
Function type (->) associates right-to-left; lower precedence than e.g. *, list
- fun map f nil = nil
    | map f (x::xs) = f x :: map f xs;   (* parenthesization? *)
val map = fn : ('a->'b) -> 'a list -> 'b list   (* parenthesization? *)
- fun filter pred nil = nil
    | filter pred (x::xs) =
        let val rest = filter pred xs
        in if pred x then x::rest else rest end;
val filter = fn : ('a->bool) -> 'a list -> 'a list
- fun find pred nil = raise NotFound
    | find pred (x::xs) =
        if pred x then x else find pred xs;
val find = fn : ('a->bool) -> 'a list -> 'a
Curried is the normal way to define ML functions
syntactically cleaner
semantically more flexible
ML’s predefined map, filter, and find are curried like this (the Basis Library’s List.find returns an 'a option instead of raising an exception)
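A short sketch of partial application with the predefined curried functions, using the top-level map and the Basis Library's List.filter and List.find. The value names (to_celsius, nice_days, and so on) are assumptions for illustration.

```sml
val f_temps = [56.4, 72.2, 68.4, 78.4, 45.0]

(* Partial application: supply only the first curried argument. *)
val to_celsius = map (fn f => (f - 32.0) * 5.0 / 9.0)   (* real list -> real list *)
val nice_days = List.filter (fn t => t >= 70.0)         (* real list -> real list *)

val c_temps = to_celsius f_temps
val nice = nice_days f_temps

(* List.find returns an option rather than raising an exception. *)
val first_cold = List.find (fn t => t < 50.0) f_temps
val first_hot = List.find (fn t => t > 100.0) f_temps
```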
First-class functions and scoping
Lexical scoping is interesting if returning a function with free variables
how to remember bindings of free variables?
- fun compose(f,g) = (fn(x) => f(g(x)));
val compose = fn : ('b -> 'c) * ('a -> 'b) -> 'a -> 'c
- val double_square = compose(double, square);
- val square_double = compose(square, double);
- double_square(3);
val it = 18 : int
- square_double(3);
val it = 36 : int
How are these two calls distinguished?
Where do bindings for f and g come from?
All curried functions have free variables like this
Many anonymous fn args (to map et al.) have free variables
Closures
To support lexically nested procedures which can be returned out of their enclosing scope, must represent as a closure: a pair of code address and an environment
environment records bindings of free variables
closure no longer dependent on enclosing scope
pair and environment must be heap-allocated
e.g. ML, Scheme, Haskell, Smalltalk, Cecil
Restricted versions
If only allow to pass nested procedures down, not return them, then can implement more cheaply
environment can be stack-allocated, not heap-allocated
e.g. Pascal, Modula-3
If allow nested procedures but not first-class procedures, then cheaper still
do not need pair, just extra implicit environment argument
e.g. Ada
If allow first-class procedures but no nesting, then can implement with just a code address
e.g. C, C++
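The code-plus-environment pair can be modelled by hand inside ML. The sketch below is only a model of the idea, not how any particular compiler lays out closures: the names env, Closure, compose_code, and apply are assumptions, and it is restricted to int -> int functions to keep the types simple.

```sml
(* The environment records the bindings of the free variables f and g. *)
type env = {f : int -> int, g : int -> int}

(* A closure pairs code (which takes its environment explicitly) with an environment. *)
datatype closure = Closure of (env * int -> int) * env

(* The code of fn(x) => f(g(x)); f and g are read from the environment. *)
fun compose_code ({f, g} : env, x) = f (g x)

(* compose builds a fresh environment and pairs it with the shared code. *)
fun compose (f, g) = Closure (compose_code, {f = f, g = g})

(* Calling a closure passes its own environment to its code. *)
fun apply (Closure (code, env), x) = code (env, x)

fun square x = x * x
fun double y = y + y

val double_square = compose (double, square)
val square_double = compose (square, double)
val r1 = apply (double_square, 3)   (* double (square 3) *)
val r2 = apply (square_double, 3)   (* square (double 3) *)
```

Both closures share compose_code; they differ only in the environment, which is how the two calls are told apart.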
A general pattern: fold
The general pattern over lists simply abstracts the standard pattern of recursion
Recursion pattern:
fun f(…, nil, …) = …   (* base case *)
  | f(…, x::xs, …) = … x … f(…, xs, …) …   (* inductive case *)
Parameters of this pattern, for a list argument of type 'a list:
what to return as the base case result ('b)
how to compute the inductive result from the head and the recursive call ('a * 'b -> 'b)
fold captures this pattern
foldl, foldr: ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
3 curried arguments
iterate over elements left-to-right: foldl
iterate over elements right-to-left: foldr
for associative and commutative combining operators (e.g. +), order doesn’t matter; for others (e.g. ^ on strings) it does
[which is the recursive pattern above?]
Examples using fold
foldl, foldr: ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
Summing all the elements of a list
- val rainfall = [0.0, 1.2, 0.0, 0.4, 1.3, 1.1];
val rainfall = […] : real list
- val total_rainfall =
    foldl (fn(rain,subtotal) => rain+subtotal)
          0.0 rainfall;
val total_rainfall = 4.0 : real
Reusable sum function?
What do these do?
- foldl (fn(x,ls)=>x::ls) nil [3,4,5];
- foldr (fn(x,ls)=>x::ls) nil [3,4,5];
- foldr (fn(x,ls)=>x::ls) [1,2,3] [4,5,6];
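For reference, here is one way the two folds could be defined, together with a reusable sum obtained by partial application. The names my_foldr, my_foldl, and sum_reals are assumptions chosen to avoid shadowing the predefined foldl and foldr, whose behaviour they are meant to match.

```sml
(* Combines the head with the result of the recursive call on the tail. *)
fun my_foldr f base nil = base
  | my_foldr f base (x :: xs) = f (x, my_foldr f base xs)

(* Threads an accumulator through the list from the left. *)
fun my_foldl f acc nil = acc
  | my_foldl f acc (x :: xs) = my_foldl f (f (x, acc)) xs

(* A reusable sum: partially apply foldl to the operator and the base value. *)
val sum_reals = foldl (op +) 0.0
val total_rainfall = sum_reals [0.0, 1.2, 0.0, 0.4, 1.3, 1.1]

(* String concatenation is associative but not commutative, so order matters. *)
val ab = foldr (op ^) "" ["a", "b"]
val ba = foldl (op ^) "" ["a", "b"]
```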
Polymorphic type inference
ML infers types of expressions automatically, as follows:
assign each declared variable & subexpression a fresh type variable
result of function is another type variable
share argument and result type variables across function cases
for each subexpression, generate constraints on types of its operands
constraint: one type expression must equal another
before applying a polymorphic function, replace quantified type variables with fresh ones for that application
solve constraints by unifying type expressions
can partially refine types, e.g. 'a refined to 'b list, then 'b refined to ''c
fail for cyclic constraints, e.g. 'a = 'a list
If overloaded operator is unresolved after constraint solving, default to int version
Overconstrained (unsatisfiable constraints): type error
Underconstrained (still some type variables): a polymorphic result
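Three small functions that exercise the outcomes listed above. The function names are assumptions for illustration, and the types in the comments are what the inference steps should yield, printed up to renaming of type variables.

```sml
(* Underconstrained: nothing fixes the element type, so len is polymorphic.
   len : 'a list -> int *)
fun len nil = 0
  | len (_ :: xs) = 1 + len xs

(* Overloaded + is still unresolved after constraint solving, so it defaults to int.
   dbl : int -> int *)
fun dbl x = x + x

(* Partial refinement: the second argument starts as a fresh variable, becomes a
   list type because of the :: pattern, and its element type becomes an equality
   type because of =.
   has : ''a * ''a list -> bool *)
fun has (y, nil) = false
  | has (y, x :: xs) = (x = y) orelse has (y, xs)

(* Overconstrained, so rejected by the type checker (left commented out):
   fun bad x = (x + 1, x ^ "!")        x would have to be both int and string *)
```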
Let-bound polymorphism
ML type inference supports only let-bound polymorphism
only val-/fun-declared names can be polymorphic, not names of formals
implicit quantifiers of polymorphic variables are at outer level (“prenex form”)
- fun id(x) = x;
val id = fn : 'a -> 'a
(* with explicit quantifier: val id = fn : ∀'a.'a->'a *)
- fun g(f) = (f 3, f "hi");
(* type error in ML; f cannot be given a polymorphic type *)
(* this (legal) ML type wouldn’t allow the two different f calls: val g = fn : ∀'a.(('a->'a) -> int*string) *)
What if ML allowed explicitly quantified polymorphic types for formals?
- fun g(f:∀'a.'a->'a) = (f 3, f "hi");
val g = fn : (∀'a.'a->'a) -> int*string
- g(id);
val it = (3, "hi") : int * string
Type inference precludes first-class polymorphic values
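What does typecheck in ML is the let-bound version, where the polymorphic name is declared rather than passed in. The second half is one assumed workaround for the formal-parameter case, not something the slides propose; the names pair1, pair2, f_int, and f_str are illustrative.

```sml
(* id is let-bound, so each use gets a fresh instantiation of 'a. *)
val pair1 =
  let
    fun id x = x
  in
    (id 3, id "hi")
  end

(* Assumed workaround: pass the function once per use, so that each formal
   gets its own monomorphic type. *)
fun g (f_int, f_str) = (f_int 3, f_str "hi")

fun id x = x
val pair2 = g (id, id)
```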
Polymorphic vs. monomorphic recursion
When analyzing the body of a polymorphic function, what do we do when we encounter a recursive call?
fun f(x) =
  ... f(hd(x)) ... f(tl(x)) ...
If allow polymorphic recursion, then f is considered polymorphic in body, and each recursive call uses a fresh instantiation (like any call to a polymorphic function)
If only monomorphic recursion, then force recursive call to pass same argument types as formals (don’t make a fresh instantiation)
Type inference under polymorphic recursion is undecidable
but only in obscure cases
ML uses monomorphic recursion
Nested polymorphic functions
After doing type inference for a function, if any type variables remain in its type, then make the function polymorphic over them
But what about a nested function?
fun f(x) =
  let fun g(u, v) = ([x,u], [v,v])
  in ... g(x, 5) ...        (* does this work? *)
     ... g([x], true) ...   (* does this? *)
  end
Type of f: 'a -> '...
Type of g: 'a * 'b -> 'a list * 'b list
but 'a and 'b act differently…
'a is a non-generalizable type variable
don’t replace with a fresh type variable when g called
Handles monomorphic recursion restriction, too
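A runnable variant of the example, as a sketch: the body of f is filled in with the two calls that do typecheck, and the rejected call is left in a comment. The result name r is an assumption.

```sml
fun f x =
  let
    (* g : 'a * 'b -> 'a list * 'b list, where 'a is the type of f's formal x *)
    fun g (u, v) = ([x, u], [v, v])
  in
    (* 'b is generalized: instantiated at int for the first call, bool for the second *)
    (g (x, 5), g (x, true))
    (* g ([x], true) would be rejected: u must have x's type 'a, not 'a list *)
  end

val r = f "a"
```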
Properties of ML type inference
Hindley-Milner type inference
allows let-bound polymorphism only
universal parametric polymorphism, no constrained polymorphism (other than equality types)
Type inference yields principal type for expression
single most general type that can be inferred
Worst-case complexity of type inference: exponential time
Average case complexity: linear time
References
Support side-effects (mutation) through explicit reference values:
ref : 'a -> 'a ref
! : 'a ref -> 'a
(op :=) : 'a ref * 'a -> unit
- val v = ref 0;
val v = ref 0 : int ref
- v := !v + 1;
val it = () : unit
- !v;
val it = 1 : int
Arrays: indexable mutable locations
Must say which things are mutable
Mutation is compartmentalized
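Two small sketches of compartmentalized mutation: a ref cell hidden inside a closure, and an array from the Basis Library's Array structure. The names make_counter, next_a, next_b, arr, and middle are assumptions for illustration.

```sml
(* Each call to make_counter allocates its own private ref cell. *)
fun make_counter () =
  let
    val count = ref 0
  in
    fn () => (count := !count + 1; !count)
  end

val next_a = make_counter ()
val next_b = make_counter ()
val a1 = next_a ()
val a2 = next_a ()
val b1 = next_b ()   (* next_b has its own cell, unaffected by next_a *)

(* Arrays: indexable mutable locations. *)
val arr = Array.array (3, 0)         (* three cells, each initially 0 *)
val () = Array.update (arr, 1, 42)
val middle = Array.sub (arr, 1)
```

Only count and arr are mutable here; everything else is an ordinary immutable binding.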
References to polymorphic values? Try this:
- fun id(x) = x;
val id = fn : 'a -> 'a
- val fp = ref id;
(* error in real SML; pretend it’s not *)
val fp = ref fn : ('a -> 'a) ref
- (!fp true, !fp 5);
val it = (true,5) : bool * int
- fp := not;
hmmmm...
- !fp 5
CRASH!!!
The "value restriction"
Cannot allow references to polymorphic values
exception arguments similarly cannot be polymorphic
In general, only syntactic values (literals, variables, fn expressions, constructor applications other than ref) can be given polymorphic types in val/fun bindings, not arbitrary expressions such as function calls
get “non-generalizable type variable” error otherwise
SML'90 had “weakly polymorphic types” instead
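A sketch of what the restriction allows and rejects. The eta-expansion shown is a commonly used workaround and is an assumption here, not something the slides prescribe; all names are illustrative.

```sml
(* Right-hand sides that are syntactic values can be generalized. *)
val id = fn x => x                 (* 'a -> 'a *)
val empty = nil                    (* 'a list *)
val both = (id 3, id "hi")

(* val id_all = map id
   The right-hand side is a function call, not a syntactic value, so its
   type variable cannot be generalized. *)

(* Assumed workaround (eta-expansion): make it a function declaration again. *)
fun id_all xs = map id xs          (* 'a list -> 'a list *)
val ints = id_all [1, 2, 3]
val strs = id_all ["a", "b"]

(* A ref cell has to be created at a single monomorphic type. *)
val fp : (int -> int) ref = ref id
val () = fp := (fn n => n + 1)
val six = !fp 5
```

With fp fixed at (int -> int) ref, the assignment fp := not from the previous slide is a compile-time type error instead of a run-time crash.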
Functors
Can parameterize structures by other structures
functor AListUser(AL:ASSOC_LIST) = struct
  ... AL.store ... AL.fetch ...
end
only know aspects of AL that are defined by ASSOC_LIST
Instantiate functors to build regular structures:
- structure ALU1 = AListUser(Assoc_List);
- structure ALU2 = AListUser(Hash_Assoc_List);
Functors for bounded quantification
Define a signature representing the operations needed
signature ORDERED = sig
  type T
  val eq: T * T -> bool
  val lt: T * T -> bool
end
Define quantified algorithms as elements of functors parameterized by required signature
functor Sort(O:ORDERED) = struct
  fun min(x,y) = if O.lt(x,y) then x else y
  fun sort(lst) = ... O.lt(x, y) ...
end
An instantiation of Sort
Create specialized sorter by instantiating functor with appropriate operations
- structure IntOrder:ORDERED = struct
    type T = int
    val lt = (op <)
    val eq = (op =)
  end;
structure IntOrder : ORDERED = …
- structure IntSort = Sort(IntOrder);
structure IntSort = … val sort:IntOrder.T list -> IntOrder.T list …
- IntSort.sort([3,5,~2]);
val it = [~2,3,5] : IntOrder.T list
Use IntOrder:ORDERED, not IntOrder:>ORDERED
Using : instead of :> allows type binding (T=int) to bleed through to users of IntOrder
IntOrder is a view/extension of an existing type, int; it isn’t creating a new ADT w/ only 2 operations
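A self-contained sketch that puts the signature, functor, and instantiation together. The slides elide the body of sort; insertion sort is used here purely as an assumption, and insert, OpaqueIntOrder, OpaqueIntSort, and sorted are illustrative names.

```sml
signature ORDERED = sig
  type T
  val eq : T * T -> bool
  val lt : T * T -> bool
end

functor Sort (O : ORDERED) = struct
  fun min (x, y) = if O.lt (x, y) then x else y

  (* Assumed body: insertion sort driven by O.lt. *)
  fun insert (x, nil) = [x]
    | insert (x, y :: ys) =
        if O.lt (x, y) then x :: y :: ys else y :: insert (x, ys)
  fun sort lst = foldl insert nil lst
end

(* Transparent ascription (:): clients can still see that T = int. *)
structure IntOrder : ORDERED = struct
  type T = int
  val eq = (op =)
  val lt = (op <)
end

structure IntSort = Sort (IntOrder)
val sorted = IntSort.sort [3, 5, ~2]

(* Opaque ascription (:>): T becomes abstract. *)
structure OpaqueIntOrder :> ORDERED = struct
  type T = int
  val eq = (op =)
  val lt = (op <)
end

structure OpaqueIntSort = Sort (OpaqueIntOrder)
(* OpaqueIntSort.sort [3, 5, ~2] would be a type error: an int list is not an
   OpaqueIntOrder.T list, and the signature offers no way to build a T. *)
```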
Another instantiation of Sort
Can create nested, multiply parameterized functors:
functor PairOrder(
  structure First:ORDERED;
  structure Second:ORDERED):ORDERED =
struct
  type T = First.T * Second.T
  (* lexicographic order, so that lt is a total order suitable for sorting *)
  fun lt((x1,x2),(y1,y2)) = First.lt(x1,y1) orelse
        (First.eq(x1,y1) andalso Second.lt(x2,y2));
  fun eq((x1,x2),(y1,y2)) = ...;
end
(* to sort (int*string) lists: *)
structure IntStringSort = Sort(
  PairOrder(structure First = IntOrder;
            structure Second = StringOrder))
Signature “subtyping”
Signature specifies a particular interface
Any structure that satisfies that interface can be used where that interface is expected
e.g. in functor application
Structure can have
more operations
more polymorphic operations
more details of implementation of types
than required by signature
Some limitations of ML modules
Structures are not first-class values
must be named or be argument to functor application
must be declared at top-level or nested inside another structure or signature
Cannot instantiate functors at run-time to create “objects”
cannot simulate classes and object-oriented programming
No type inference for functor arguments
These constraints are to enable type inference of core and static typechecking (at all) of structures that contain types
Modules vs. classes
Classes (abstract data types) implicitly define a single type, with associated constructors, observers, and mutators
Modules can define 0, 1, or many types in same module, with associated operations over several types
no new types if adding operations to existing type(s)
e.g. a library of integer or array functions
hard to do in C++
multiple types can share private data & operations
requires friend declarations in C++
one new type requires a name for the type (e.g. T)
class name is also type name in C++, conveniently
Functors similar to parameterized classes
C++’s public/private is simpler than ML’s separate signatures, but C++ doesn’t have a simple way of describing just an interface
See Moby: modules + classes, cleanly