What is a programming language?
• Medium for communicating our intentions to machines, and to other people, and to ourselves
• A language should express computation:
  • precisely
  • at a high level
  • so we (and the machine) can reason about it
• Make it easier to write programs that really work
Why study languages?
• Learn new ways of thinking about programming
• Understanding the tools helps you avoid nasty surprises
• Become a sophisticated, skeptical consumer of languages
• Learn to reason about your programs
• Get a job
Story: buffer overflows
• Oldest software defect in the book
    copy(Array a, Array b) {
      for (int i = 0; i < a.length; i++)
        a[i] = b[i];
    }
• blows up if b.length < a.length (the loop reads past the end of b)
Not only fail-stop
    #include <string.h>

    void foo(char *bar) {
      char c[12];
      memcpy(c, bar, strlen(bar));  // no bounds checking...
    }

    int main(int argc, char **argv) {
      foo(argv[1]);
    }
Possible fixes
• Don’t execute code on the stack or heap
  • There are ways to work around this (return-to-libc)
• Use safe libraries
  • How to enforce?
• Use safe languages
  • Java, C#, ML, Cyclone, ...
  • What about legacy code?
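As a rough illustration of the "safe libraries" option, here is a minimal C sketch of a bounds-checked copy. The names `safe_copy` and `foo_checked` are hypothetical and not taken from any real library. The only idea shown is that the destination size travels with the call, and oversized input is rejected instead of being written past the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the string src into dst only if it fits,
 * terminator included. Returns 0 on success, -1 if it was rejected. */
int safe_copy(char *dst, size_t dst_size, const char *src) {
    size_t n;
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;
    n = strlen(src);
    if (n >= dst_size)          /* would overflow: refuse instead of writing */
        return -1;
    memcpy(dst, src, n + 1);    /* n + 1 also copies the '\0' */
    return 0;
}

/* The foo from the earlier slide, rewritten to check before copying. */
int foo_checked(const char *bar) {
    char c[12];
    return safe_copy(c, sizeof c, bar);
}
```

The enforcement question from the slide still stands: nothing in C stops a caller from bypassing such a helper and calling memcpy directly.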
CCured
• George Necula et al. 2002, 2003
• source-to-source translator for C
• determines smallest number of run-time checks that must be inserted to (statically) guarantee no memory safety violations
• resulting program is memory safe
• but: need to store and check array bounds information
  • => can hurt performance
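To make the cost concrete, here is a minimal hand-written sketch of a bounds-carrying ("fat") pointer of the kind such a translator might introduce. It is an illustration under assumptions, not CCured's actual representation or output; the type `fat_ptr` and the functions `fat_read` and `fat_write` are hypothetical.

```c
#include <stddef.h>

/* Hypothetical "fat pointer": the raw pointer plus the bounds that
 * must be stored alongside it so accesses can be checked at run time. */
typedef struct {
    int   *base;   /* start of the underlying array */
    size_t len;    /* number of elements in the array */
} fat_ptr;

/* Checked read: returns 0 and stores the element in *out,
 * or returns -1 if index i is out of bounds. */
int fat_read(fat_ptr p, size_t i, int *out) {
    if (p.base == NULL || i >= p.len)   /* the inserted run-time check */
        return -1;
    *out = p.base[i];
    return 0;
}

/* Checked write: returns 0 on success, -1 if i is out of bounds. */
int fat_write(fat_ptr p, size_t i, int value) {
    if (p.base == NULL || i >= p.len)   /* the inserted run-time check */
        return -1;
    p.base[i] = value;
    return 0;
}
```

In this sketch every pointer occupies two words and every access pays for a comparison, which is one way to see where the performance cost on the slide comes from.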
Why study different languages?
• Some languages are more powerful than others
• Happy user of Blub:
  • “Blub beats Cobol and assembly.”
  • “Use Haskell, Scheme, ML? Hell no! ’Cause they are all equivalent to Blub plus some bizarre stuff no one uses.”
• Habit blinds us to power
• Only valid reason to use an inferior language is backward compatibility with legacy libraries and tools
Red-black tree insertion (C)
    void LeftRotate(rb_red_blk_tree* tree, rb_red_blk_node* x) {
      rb_red_blk_node* y;
      rb_red_blk_node* nil=tree->nil;

      y=x->right;
      x->right=y->left;

      if (y->left != nil) y->left->parent=x;

      y->parent=x->parent;

      if( x == x->parent->left) {
        x->parent->left=y;
      } else {
        x->parent->right=y;
      }
      y->left=x;
      x->parent=y;
    }

    void RightRotate(rb_red_blk_tree* tree, rb_red_blk_node* y) {
      rb_red_blk_node* x;
      rb_red_blk_node* nil=tree->nil;

      x=y->left;
      y->left=x->right;

      if (nil != x->right) x->right->parent=y;
      x->parent=y->parent;
      if( y == y->parent->left) {
        y->parent->left=x;
      } else {
        y->parent->right=x;
      }
      x->right=y;
      y->parent=x;
    }

    rb_red_blk_node * RBTreeInsert(rb_red_blk_tree* tree, void* key, void* info) {
      rb_red_blk_node * y;
      rb_red_blk_node * x;
      rb_red_blk_node * newNode;

      x=(rb_red_blk_node*) SafeMalloc(sizeof(rb_red_blk_node));
      x->key=key;
      x->info=info;

      TreeInsertHelp(tree,x);
      newNode=x;
      x->red=1;
      while(x->parent->red) {
        if (x->parent == x->parent->parent->left) {
          y=x->parent->parent->right;
          if (y->red) {
            x->parent->red=0;
            y->red=0;
            x->parent->parent->red=1;
            x=x->parent->parent;
          } else {
            if (x == x->parent->right) {
              x=x->parent;
              LeftRotate(tree,x);
            }
            x->parent->red=0;
            x->parent->parent->red=1;
            RightRotate(tree,x->parent->parent);
          }
        } else { /* case for x->parent == x->parent->parent->right */
          y=x->parent->parent->left;
          if (y->red) {
            x->parent->red=0;
            y->red=0;
            x->parent->parent->red=1;
            x=x->parent->parent;
          } else {
            if (x == x->parent->left) {
              x=x->parent;
              RightRotate(tree,x);
            }
            x->parent->red=0;
            x->parent->parent->red=1;
            LeftRotate(tree,x->parent->parent);
          }
        }
      }
      tree->root->left->red=0;
      return(newNode);
    }

    void TreeInsertHelp(rb_red_blk_tree* tree, rb_red_blk_node* z) {
      rb_red_blk_node* x;
      rb_red_blk_node* y;
      rb_red_blk_node* nil=tree->nil;
      z->left=z->right=nil;
      y=tree->root;
      x=tree->root->left;
      while( x != nil) {
        y=x;
        if (1 == tree->Compare(x->key,z->key)) { /* x.key > z.key */
          x=x->left;
        } else { /* x.key <= z.key */
          x=x->right;
        }
      }
      z->parent=y;
      if ( (y == tree->root) || (1 == tree->Compare(y->key,z->key))) { /* y.key > z.key */
        y->left=z;
      } else {
        y->right=z;
      }
    }
Red-black tree insertion (Scala)
    abstract class RBMap[K: Ordered, V] {
      protected def blacken(n: RBMap[K,V]) = n match {
        case L() => n
        case T(_,l,k,v,r) => T(B,l,k,v,r)
      }
      protected def balance (c: Color) (l: RBMap[K,V]) (k: K) (v: Option[V]) (r: RBMap[K,V]) =
        (c,l,k,v,r) match {
          case (B,T(R,T(R,a,xK,xV,b),yK,yV,c),zK,zV,d) => T(R,T(B,a,xK,xV,b),yK,yV,T(B,c,zK,zV,d))
          case (B,T(R,a,xK,xV,T(R,b,yK,yV,c)),zK,zV,d) => T(R,T(B,a,xK,xV,b),yK,yV,T(B,c,zK,zV,d))
          case (B,a,xK,xV,T(R,T(R,b,yK,yV,c),zK,zV,d)) => T(R,T(B,a,xK,xV,b),yK,yV,T(B,c,zK,zV,d))
          case (B,a,xK,xV,T(R,b,yK,yV,T(R,c,zK,zV,d))) => T(R,T(B,a,xK,xV,b),yK,yV,T(B,c,zK,zV,d))
          case (c,a,xK,xV,b) => T(c,a,xK,xV,b)
        }

      private[map] def modWith (k: K, f: (K, Option[V]) => Option[V]): RBMap[K,V]

      def modifiedWith (k: K, f: (K, Option[V]) => Option[V]): RBMap[K,V] =
        blacken(modWith(k,f))

      def insert (k: K, v: V) = modifiedWith (k, (_,_) => Some(v))
    }

    private case class L[K: Ordered, V] extends RBMap[K,V] {
      private[map] def modWith (k: K, f: (K, Option[V]) => Option[V]) =
        T(R, this, k, f(k,None), this)
    }

    private case class T[K: Ordered, V](c: Color, l: RBMap[K,V], k: K, v: Option[V], r: RBMap[K,V]) extends RBMap[K,V] {
      private[map] def modWith (k: K, f: (K, Option[V]) => Option[V]): RBMap[K,V] = {
        if (k < this.k) (balance (c) (l.modWith(k,f)) (this.k) (this.v) (r))
        else if (k == this.k) (T(c,l,k,f(this.k,this.v),r))
        else (balance (c) (l) (this.k) (this.v) (r.modWith(k,f)))
      }
    }
Agenda
• Intellectual tools to understand and evaluate programming languages
  • focus is on language features
• Learn by doing
  • write mostly short programs, thought required
  • implement language features to understand how they work
Language features
• Choose abstractions (i.e., language features) to suit the needs
• Some features to help build your vocabulary:
  • higher-order functions
  • polymorphism
  • pattern matching
  • data for symbolic computing: lists, tables, sets
  • abstract data types, encapsulation
  • objects and subtyping
  • modules and parameterization
  • searching and backtracking
What features?
“Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.”
–The Scheme Report
Theory
• Functional programming
• Type systems
• Formal semantics (defining the language precisely)
  • Operational semantics (tools of the trade)
  • Denotational semantics (for mathematicians)
  • Axiomatic semantics (for logicians)
Productivity
• Programming methodologies
• Software engineering
• Goals
  • fewer bugs, easier to isolate
    • including performance bugs
  • code reuse
• Techniques
  • strong typing (static or dynamic)
  • abstract data types
  • modules (including generics)
  • objects and inheritance
  • separate compilation
  • automatic memory management
Implementation
• Cannot design a language without considering how it will be implemented
• Techniques
  • Parser generators
  • Memory allocation
  • Garbage collection
  • Runtime typing, tagging
  • Reflection
• Performance
  • fast execution, fast compilation, low footprint, predictability
Design dimensions
• Typing
  • strong vs. weak
  • static vs. dynamic
  • monomorphic vs. polymorphic
• First-class values
  • structures?
  • procedures?
  • are built-in types different?
• Safety
  • no unexplained crashes
  • security?
• Control flow
  • stack-based
  • heap-based (closures and continuations)
  • search-based (logic programming, unification)
In the beginning...
• 1800s Jacquard loom: punch cards => cloth designs
• 1830-40s Charles Babbage: difference engine
  • finally built in 1991
  • http://www.youtube.com/watch?v=Lcedn6fxgS0
  • Ada Lovelace wrote some notes on how to program Babbage’s Analytical Engine
• 1941 Z3: first programmable digital computer (electromechanical)
• 1943 ENIAC: first general-purpose electronic computer
  • programmed by rewiring
• 1945 EDVAC: von Neumann architecture (program is data)
Machine code and Assembly
• Machine code: bit sequences
  • 00000 00001 00010 00110 00000 10000
• Assembly language
  • symbolic representation of machine code
  • machine-specific
    • ld r2, 0[r1]
    • addi r3, r2, 1
    • st 0[r1], r3
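Reading the three instructions above as load, add-immediate, and store (an assumption about this illustrative instruction set, with r1 holding an address), the sequence increments a word in memory. A C sketch of the same computation, using the hypothetical function name `increment_at`:

```c
/* Assumed meaning of the assembly, one C statement per instruction.
 * The parameter p plays the role of register r1 (an address). */
void increment_at(int *p) {
    int r2 = *p;        /* ld   r2, 0[r1]  : load the word at address r1 */
    int r3 = r2 + 1;    /* addi r3, r2, 1  : r3 = r2 + 1                 */
    *p = r3;            /* st   0[r1], r3  : store r3 back at address r1 */
}
```

In ordinary C this would just be `*p += 1;`, and the compiler chooses the registers and instructions for whatever machine it targets.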
Towards more abstraction
• Fortran
  • John Backus et al. (IBM) 1954-57
  • arrays, loops, if statement
• Cobol
  • Grace Murray Hopper (DoD) 1959-60
  • record structure, separate data structures from execution
• Algol60
  • type declarations, block structure, recursion
• LISP
  • John McCarthy 1960
  • first-class functions, garbage collection, eval
• CLU
  • Liskov et al. 1972-80
  • abstract data types, iterators, exceptions
Functional languages
• LISP (John McCarthy 1960)
  • Scheme (Guy Steele and Gerald Sussman 1975)
• ML (Milner)
  • OCaml (Leroy), F# (Syme)
• Haskell
  • pure functional language (no mutable state), lazy evaluation
• Key features
  • first-class functions (aka higher-order functions, closures)
    • val square = fun (x) => x*x
    • val xs = [1,2,3,4]
    • val ys = map square xs (* [1,4,9,16] *)
    • val zs = foldl (+) 0 xs (* 0+1+2+3+4 = 10 *)
  • pattern matching
    • fun length(xs) = case (xs) [] => 0 | y::ys => 1 + length(ys) end
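For contrast, here is a rough C sketch of the same map and foldl computations using function pointers. The names `map_int`, `foldl_int`, `square`, and `add` are made up for this illustration. C function pointers are not closures (they cannot capture local variables), which is part of what the functional languages above provide directly.

```c
#include <stddef.h>

/* Apply f to each of the n elements of xs, writing the results to ys. */
void map_int(int (*f)(int), const int *xs, int *ys, size_t n) {
    size_t i;
    for (i = 0; i < n; i++)
        ys[i] = f(xs[i]);
}

/* Left fold: f(...f(f(init, xs[0]), xs[1])..., xs[n-1]). */
int foldl_int(int (*f)(int, int), int init, const int *xs, size_t n) {
    int acc = init;
    size_t i;
    for (i = 0; i < n; i++)
        acc = f(acc, xs[i]);
    return acc;
}

/* Named top-level functions stand in for the anonymous ones. */
int square(int x) { return x * x; }
int add(int a, int b) { return a + b; }
```

With `int xs[4] = {1, 2, 3, 4}`, calling `map_int(square, xs, ys, 4)` fills `ys` with the squares, and `foldl_int(add, 0, xs, 4)` sums the list. Each element type would need its own copy of these helpers (or `void *` casts), which polymorphism avoids.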
Logic languages
• Prolog
  • search-based evaluation

    sibling(A,B) :- parent(A,X), parent(B,X).
    sister(A,B) :- sibling(A,B), female(B).
    brother(A,B) :- sibling(A,B), male(B).
    parent("Apollo", "Zeus").   male("Apollo").
    parent("Artemis", "Zeus").  female("Artemis").
    parent("Ares", "Zeus").     male("Ares").

    ?- brother("Artemis", X).
    --> X="Apollo", X="Ares"
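Search-based evaluation can be pictured as trying every candidate binding and keeping those that satisfy all the goals. The C sketch below hard-codes the facts above and answers the brother query by brute-force enumeration. It only conveys the idea; it is not how a Prolog engine (unification plus backtracking) is actually implemented, and all names in it are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Facts from the slide, stored as tables. */
struct parent_fact { const char *child; const char *parent; };
static const struct parent_fact parents[] = {
    { "Apollo", "Zeus" }, { "Artemis", "Zeus" }, { "Ares", "Zeus" }
};
static const char *males[] = { "Apollo", "Ares" };

#define COUNT(a) (sizeof (a) / sizeof (a)[0])

/* parent(C, P): is this pair in the fact table? */
static int is_parent(const char *c, const char *p) {
    size_t i;
    for (i = 0; i < COUNT(parents); i++)
        if (strcmp(parents[i].child, c) == 0 &&
            strcmp(parents[i].parent, p) == 0)
            return 1;
    return 0;
}

/* male(X): is X in the fact table? */
static int is_male(const char *x) {
    size_t i;
    for (i = 0; i < COUNT(males); i++)
        if (strcmp(males[i], x) == 0)
            return 1;
    return 0;
}

/* sibling(A,B) :- parent(A,X), parent(B,X).  Try every known parent X. */
static int is_sibling(const char *a, const char *b) {
    size_t i;
    for (i = 0; i < COUNT(parents); i++) {
        const char *x = parents[i].parent;   /* candidate binding for X */
        if (is_parent(a, x) && is_parent(b, x))
            return 1;
    }
    return 0;
}

/* Query brother(a, B), where brother(A,B) :- sibling(A,B), male(B).
 * Tries each known child as B, stores up to max answers in out,
 * and returns how many were found. */
size_t find_brothers(const char *a, const char **out, size_t max) {
    size_t i, n = 0;
    for (i = 0; i < COUNT(parents) && n < max; i++) {
        const char *b = parents[i].child;    /* candidate binding for B */
        if (is_sibling(a, b) && is_male(b))
            out[n++] = b;
    }
    return n;
}
```

As with the rules on the slide, nothing here prevents a person from counting as their own sibling, so a query about "Apollo" would list "Apollo" among the answers.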
OO languages
• Simula67
  • Kristen Nygaard, Ole-Johan Dahl
  • objects, classes, inheritance, virtual methods, coroutines
• Smalltalk
  • 1972-80 Alan Kay, Dan Ingalls, et al. (Xerox PARC)
  • pure OO (everything is an object)
  • classes, metaclasses
  • blocks (closures)
• C++ - C with Simula67 classes
• Java
  • 1995 James Gosling et al. (Sun)
  • C++ syntax but portable, type-safe, GC, rich libraries
• C# - Microsoft’s Java with a few improvements
Multi-paradigm languages

• Scala
  • Martin Odersky et al. (EPFL) 2005-10
  • Runs on JVM and CLR
  • OO + FP features
    • pure OO (first-class functions are objects)
    • type inference
    • implicit conversions
• Functional logic languages
Scripting languages
• Perl, Python, PHP, Ruby, Tcl, Groovy, JavaScript
  • dynamically typed
  • high-level data structures (list, map, etc.) [many borrowed from functional languages]
Domain-specific languages
• SQL - databases
• TeX, LaTeX, troff - text processing
• Matlab, Mathematica - math
• AutoLisp - CAD
• Processing, NodeBox - graphics
Hot topics in PL
• Effects
  • How to reason about side-effects (state, I/O, concurrency)
• Concurrency
  • abstractions for concurrent programming
  • type systems for eliminating concurrency-related bugs
  • much more later
• Security
  • enforcing security policies in the language
• Bug-finding
  • program analysis to find bugs
Multicore
• Moore’s law still holds:
  • 2x transistors every 18 months
  • Intel: 32nm in early 2010, 4nm in 2022
  • Others: about one generation behind (IBM @ 45nm late 2008)
• Use transistors to add smaller, simpler cores
  • IBM shifted to multicore in 2002, Intel in 2004
  • Intel scrapped Prescott (3.4GHz P4)
• Run at lower clock frequency
  • Less work per transistor ⇒ less heat
Another trend: hybrid architectures
• Cell
  • 1 Power processor
  • 8-16 “synergistic processing elements” (vector processors)
• GPGPU
  • many vector processors (NVIDIA GTX 480 = 480 cores)
• Can take advantage of these architectures if you can express your computation as vector operations
  • e.g. CUDA for NVIDIA GPUs
    • no recursion, no virtual dispatch
    • limited memory
    • must manually manage movement of data to/from GPU
Concurrency for the masses
• Parallelism is the way to get high performance on modern architectures
• Parallel programming is becoming mainstream
• No longer the domain of the expert
Concurrent programming
• ... is hard:
  • data races
  • deadlock
  • livelock
  • overlocking
  • underlocking
  • priority inversion
  • how to parallelize effectively?
Concurrent programming languages
• Need new languages to hide the complexity
  • abstractions for concurrent programming
  • type systems to rule out errors, improve performance
My research projects
• Thorn
  • a scalable concurrent scripting language
  • http://www.thorn-lang.org
• X10
  • a concurrent OO language for HPC
  • http://www.x10-lang.org
• Firepile
  • a Scala library for GPU programming
• Polyglot
  • an extensible compiler framework
  • http://www.cs.cornell.edu/Projects/polyglot
• See me for more
Grading
• three exams: 15% each
• 10 homeworks: 45% total
  • can drop lowest grade
• term paper: 10%
  • must get a passing grade on the term paper to pass the course
Assignments
• 10 small assignments
  • some writing
  • some (short, but sometimes tricky) programming
  • late penalty: 100%
  • readability and speling counts!
Working together
• Collaborate! (up to a point)
  • that’s what professionals do
  • vital to your success
  • discuss problems, techniques, ideas
  • all discussions must be acknowledged
  • if in doubt, ask me
  • if still in doubt, don’t collaborate
• Must not collaborate on code
  • don’t even look!
Method of study
• focus on
  • semantics, not syntax
  • the unusual, not the usual (weird but powerful)
• case studies of interpreters
  • learn foundations of languages by studying and modifying implementations
• study abstracted “essentials” of languages
• supplement by
  • descriptive tools: operational semantics, lambda calculus, type systems