
2-1

2 Values and Types

© 2004, D.A. Watt, University of Glasgow

Types of values.

Primitive, composite, recursive types.

Type systems: static vs dynamic typing, type completeness.

Expressions.

Implementation notes.

2-2

Types (1)

Values are grouped into types according to the operations that may be performed on them.

Different PLs support different types of values (according to their intended application areas):

• Ada: booleans, characters, enumerands, integers, real numbers, records, arrays, discriminated records, objects (tagged records), strings, pointers to data, pointers to procedures.

• C: enumerands, integers, real numbers, structures, arrays, unions, pointers to variables, pointers to functions.

• Java: booleans, integers, real numbers, arrays, objects.

• Haskell: booleans, characters, integers, real numbers, tuples, disjoint unions, lists, recursive types.

2-3

Types (2)

Roughly speaking, a type is a set of values:

• v is a value of type T if v ∈ T.

• E is an expression of type T if E is guaranteed to yield a value of type T.

But only certain sets of values are types:

• {false, true} is a type, since the operations not, and, and or operate uniformly over the values false and true.

• {…, –2, –1, 0, +1, +2, …} is a type, since operations such as addition and multiplication operate uniformly over all these values.

• {13, true, Monday} is not considered to be a type, since there are no useful operations over this set of values.

2-4

Types (3)

More precisely, a type is a set of values, equipped with one or more operations that can be applied uniformly to all these values.

The cardinality of a type T, written #T, is the number of values of type T.

2-5

Primitive types

A primitive value is one that cannot be decomposed into simpler values.

A primitive type is one whose values are primitive.

Every PL provides built-in primitive types. Some PLs also allow programs to define new primitive types.

2-6

Built-in primitive types (1)

Typical built-in primitive types:

Boolean = {false, true}

Character = {…, ‘A’, …, ‘Z’, …, ‘0’, …, ‘9’, …} (PL- or implementation-defined set of characters: ASCII, ISO-Latin, or Unicode)

Integer = {…, –2, –1, 0, +1, +2, …} (PL- or implementation-defined set of whole numbers)

Float = {…, –1.0, …, 0.0, …, +1.0, …} (PL- or implementation-defined set of real numbers)

Names of types vary from one PL to another: not significant.

2-7

Built-in primitive types (2)

Cardinalities:

#Boolean = 2

#Character = 128 (ASCII), 256 (ISO-Latin), or 65536 (16-bit Unicode)

#Integer = max integer – min integer + 1

Note: In some PLs (such as C), booleans and characters are just small integers.

2-8

Defined primitive types

In Ada we can define new numeric types.

In Ada and C we can define new enumeration types simply by enumerating their values (called enumerands).

2-9

Example: Ada numerics

Type declaration:

type Population is range 0 .. 1e10;

Set of values:

Population = {0, 1, …, 10^10}

Cardinality:

#Population = 10^10 + 1

2-10

Example: Ada enumerations

Type declaration:

type Color is (red, green, blue);

Set of values:

Color = {red, green, blue}

Cardinality:

#Color = 3

2-11

Composite types

A composite value is one that is composed from simpler values.

A composite type is a type whose values are composite.

PLs support a huge variety of composite types.

All these can be understood in terms of a few concepts:

• Cartesian products (tuples, structures, records)

• mappings (arrays)

• disjoint unions (algebraic data types, discriminated records, objects)

• recursive types (lists, trees, etc.)

2-12

Cartesian products (1)

In a Cartesian product, values of several types are grouped into tuples.

Let (x, y) stand for the pair whose first component is x and whose second component is y.

Let S × T stand for the set of all pairs (x, y) such that x is chosen from set S and y is chosen from set T:

S × T = { (x, y) | x ∈ S; y ∈ T }

Cardinality:

#(S × T) = #S × #T (hence the “×” notation)
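The cardinality rule can be checked by enumeration. The following is a minimal Python sketch; the sets S and T are hypothetical values chosen only for illustration.

```python
from itertools import product

# Two small example sets (hypothetical values, chosen only for illustration).
S = {"red", "green", "blue"}
T = {False, True}

# S × T: every pair (x, y) with x drawn from S and y drawn from T.
S_times_T = set(product(S, T))

# The cardinality of the product is the product of the cardinalities.
assert len(S_times_T) == len(S) * len(T)
print(len(S_times_T))
```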

2-13

Cartesian products (2)

We can generalise from pairs to tuples. Let S1 × S2 × … × Sn stand for the set of all n-tuples such that the ith component is chosen from Si:

S1 × S2 × … × Sn = { (x1, x2, …, xn) | x1 ∈ S1; x2 ∈ S2; …; xn ∈ Sn }

Basic operations on tuples:

• construction of a tuple from its component values

• selection of an explicitly-designated component of a tuple (so we can select the 1st or 2nd, but not the ith, component).

Records (Ada), structures (C), and tuples (Haskell) can all be understood in terms of Cartesian products.

2-14

Example: Ada records (1)

Type declarations:

type Month is (jan, feb, mar, apr, may, jun,
               jul, aug, sep, oct, nov, dec);

type Day_Number is range 1 .. 31;

type Date is record
   m: Month;
   d: Day_Number;
end record;

Application code:

someday: Date := (jan, 1);        -- record construction
…
put(Month'Pos(someday.m) + 1);    -- component selection; Month'Pos gives the enumerand's position (jan is 0)
put("/"); put(someday.d);         -- component selection
someday.d := 29; someday.m := feb;

2-15

Example: Ada records (2)

Set of values:

Date = Month × Day-Number = {jan, feb, …, dec} × {1, …, 31}

viz: (jan, 1) (jan, 2) … (jan, 30) (jan, 31)
     (feb, 1) (feb, 2) … (feb, 30) (feb, 31)
     …
     (dec, 1) (dec, 2) … (dec, 30) (dec, 31)

Cardinality:

#Date = #Month × #Day-Number = 12 × 31 = 372

NB: this set includes pairs, such as (feb, 30) and (feb, 31), that do not correspond to real dates.

2-16

Example: Haskell tuples

Declarations:

data Month = Jan | Feb | Mar | Apr
           | May | Jun | Jul | Aug
           | Sep | Oct | Nov | Dec
           deriving (Eq, Enum)   -- Enum provides succ, used below

type Date = (Month, Int)

Set of values:

Date = Month × Integer
     = {Jan, Feb, …, Dec} × {…, –1, 0, 1, 2, …}

Application code:

someday = (Jan, 1)           -- tuple construction
(m, d) = someday             -- component selection (by pattern matching)
anotherday = (succ m, d)     -- tuple construction; succ m is the month after m

2-17

Mappings (1)

We write m : S → T to state that m is a mapping from set S to set T. In other words, m maps every value in S to some value in T.

If m maps value x to value y, we write y = m(x). The value y is called the image of x under m.

Some of the mappings in {u, v} → {a, b, c}:

m1 = {u → a, v → c}

m2 = {u → c, v → c}

m3 = {u → c, v → b} (the image of u is c, the image of v is b)

2-18

Mappings (2)

Let S → T stand for the set of all mappings from S to T:

S → T = { m | x ∈ S ⇒ m(x) ∈ T }

What is the cardinality of S → T?

There are #S values in S.

Each value in S has #T possible images under a mapping in S → T.

So there are #T × #T × … × #T possible mappings (#S copies of #T multiplied together). Thus:

#(S → T) = (#T)^(#S)

For example, in {u, v} → {a, b, c} there are 3^2 = 9 possible mappings.
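The count of 9 can be confirmed by generating the mappings. This is a sketch only; the helper all_mappings is a hypothetical name, and each mapping is represented as a Python dict.

```python
from itertools import product

def all_mappings(S, T):
    """Return every mapping from S to T, each represented as a dict."""
    domain = sorted(S)
    # Choose one image in T for each value of S, in every possible way.
    return [dict(zip(domain, images))
            for images in product(sorted(T), repeat=len(domain))]

mappings = all_mappings({"u", "v"}, {"a", "b", "c"})

# (#T)^(#S) = 3^2 mappings in total.
assert len(mappings) == 3 ** 2
```

The mapping m3 from the earlier slide appears in the result as {"u": "c", "v": "b"}.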

2-19

Arrays (1)

Arrays (found in all imperative and OO PLs) can be understood as mappings.

If an array’s components are of type T and its index values are of type S, the array has one component of type T for each value in type S. Thus the array’s type is S → T.

An array’s length is the number of components, #S.

Basic operations on arrays:

• construction of an array from its components

• indexing – using a computed index value to select a component (so we can select the ith component).

2-20

Arrays (2)

An array of type S → T is a finite mapping.

Here S is nearly always a finite range of consecutive values {l, l+1, …, u}, where l is the lower bound and u is the upper bound. This is called the array’s index range.

In C and Java, the index range must be {0, 1, …, n–1}. In Ada, the index range may be any primitive (sub)type other than Float.

We can generalise to n-dimensional arrays. If an array has index ranges of types S1, …, Sn, the array’s type is S1 × … × Sn → T.

2-21

Example: Ada arrays (1)

Type declarations:

type Color is (red, green, blue);
type Pixel is array (Color) of Boolean;

Application code:

p: Pixel := (true, false, true);   -- array construction
c: Color;
…
p(c) := not p(c);                  -- indexing (on both sides)

2-22

Example: Ada arrays (2)

Set of values:

Pixel = Color → Boolean = {red, green, blue} → {false, true}

viz: {red → false, green → false, blue → false}
     {red → false, green → false, blue → true}
     {red → false, green → true, blue → false}
     {red → false, green → true, blue → true}
     {red → true, green → false, blue → false}
     {red → true, green → false, blue → true}
     {red → true, green → true, blue → false}
     {red → true, green → true, blue → true}

Cardinality:

#Pixel = (#Boolean)^(#Color) = 2^3 = 8

2-23

Example: Ada 2-dimensional arrays

Type declarations:

type Xrange is range 0 .. 511;
type Yrange is range 0 .. 255;
type Window is array (Yrange, Xrange) of Pixel;

Set of values:

Window = Yrange × Xrange → Pixel = {0, 1, …, 255} × {0, 1, …, 511} → Pixel

Cardinality:

#Window = (#Pixel)^(#Yrange × #Xrange) = 8^(256 × 512)

2-24

Functions as mappings

Functions (found in all PLs) can also be understood as mappings. A function maps its argument(s) to its result.

If a function has a single argument of type S and its result is of type T, the function’s type is S → T.

Basic operations on functions:

• construction (or definition) of a function

• application – calling the function with a computed argument.

We can generalise to functions with n arguments. If a function has arguments of types S1, …, Sn and its result type is T, the function’s type is S1 × … × Sn → T.

2-25

Example: Ada functions

Definition:

function is_even (n: Integer) return Boolean is
begin
   return (n mod 2 = 0);   -- or any other code that achieves the same effect
end;

Type: Integer → Boolean

Value: {…, 0 → true, 1 → false, 2 → true, 3 → false, …}

Other functions of same type: is_odd, is_prime, etc.

2-26

Disjoint unions (1)

In a disjoint union, a value is chosen from one of several different types.

Let S + T stand for a set of disjoint-union values, each of which consists of a tag together with a variant chosen from either type S or type T. The tag indicates the type of the variant:

S + T = { left x | x ∈ S } ∪ { right y | y ∈ T }

• left x is a value with tag left and variant x chosen from S

• right y is a value with tag right and variant y chosen from T.

Let us write left S + right T (instead of S + T) when we want to make the tags explicit.

2-27

Disjoint unions (2)

Cardinality:

#(S + T) = #S + #T (hence the “+” notation)

Basic operations on disjoint-union values in S + T:

• construction of a disjoint-union value from its tag and variant

• tag test, to determine whether the variant was chosen from S or T

• projection, to recover either the variant in S or the variant in T.

Algebraic data types (Haskell), discriminated records (Ada), and objects (Java) can all be understood in terms of disjoint unions.

We can generalise to multiple variants: S1 + S2 + … + Sn.
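The three basic operations can be sketched in Python, using one small class per tag. The class names Left and Right, the helper describe, and the sets S and T are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Left:
    value: object    # variant chosen from S

@dataclass(frozen=True)
class Right:
    value: object    # variant chosen from T

S = [False, True]
T = ["a", "b", "c"]

# Construction: pair each variant with its tag.
all_values = [Left(x) for x in S] + [Right(y) for y in T]

def describe(v):
    if isinstance(v, Left):            # tag test
        return f"left {v.value}"       # projection of the S variant
    return f"right {v.value}"          # projection of the T variant

# #(S + T) = #S + #T
assert len(all_values) == len(S) + len(T)
```

Because the tag is part of the value, Left(1) and Right(1) are different values even though their variants are equal. That is what makes the union disjoint.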

2-28

Example: Haskell algebraic data types (1)

Type declaration:

data Number = Exact Int | Inexact Float

Each Number value consists of a tag, together with either an Integer variant (if the tag is Exact) or a Float variant (if the tag is Inexact).

Set of values:

Number = Exact Integer + Inexact Float

viz: … Exact(–2) Exact(–1) Exact 0 Exact 1 Exact 2 …
     … Inexact(–1.0) … Inexact 0.0 … Inexact 1.0 …

Cardinality:

#Number = #Integer + #Float

2-29

Example: Haskell algebraic data types (2)

Application code:

pi = Inexact 3.1416            -- disjoint-union construction

rounded :: Number -> Int
rounded num =
  case num of                  -- tag test
    Exact i   -> i             -- projection (by pattern matching)
    Inexact r -> round r       -- projection (by pattern matching)

2-30

Example: Ada discriminated records (1)

Type declarations:

type Accuracy is (exact, inexact);

type Number (acc: Accuracy := exact) is
   record
      case acc is
         when exact   => ival: Integer;
         when inexact => rval: Float;
      end case;
   end record;

Each Number value consists of a tag field named acc, together with either an Integer variant field named ival (if the tag is exact) or a Float variant field named rval (if the tag is inexact).

2-31

Example: Ada discriminated records (2)


Set of values:

Number = exact Integer + inexact Float

viz: … exact(–2) exact(–1) exact 0 exact 1 exact 2 …

… inexact(–1.0) … inexact 0.0 … inexact 1.0 …

Cardinality:

#Number = #Integer + #Float

2-32

Example: Ada discriminated records (3)

Type declarations:

type Form is (pointy, circular, rectangular);

type Figure (f: Form := pointy) is
   record
      x, y: Float;
      case f is
         when pointy      => null;
         when circular    => r: Float;
         when rectangular => w, h: Float;
      end case;
   end record;

Each Figure value consists of a tag field named f, together with a pair of Float fields named x and y, together with either an empty variant or a Float variant field named r or a pair of Float variant fields named w and h.

2-33

Example: Ada discriminated records (4)

Set of values:

Figure = pointy(Float × Float)
       + circular(Float × Float × Float)
       + rectangular(Float × Float × Float × Float)

e.g.: pointy(1.0, 2.0) represents the point (1, 2)
      circular(0.0, 0.0, 5.0) represents a circle of radius 5 centered at (0, 0)
      rectangular(1.5, 2.0, 3.0, 4.0) represents a 3×4 rectangle centered at (1.5, 2)
      …

2-34

Example: Ada discriminated records (5)

Application code:

box: Figure := (rectangular, 1.5, 2.0, 3.0, 4.0);   -- discriminated-record construction

function area (fig: Figure) return Float is
begin
   case fig.f is                      -- tag test
      when pointy =>
         return 0.0;
      when circular =>
         return 3.1416 * fig.r**2;    -- projection
      when rectangular =>
         return fig.w * fig.h;        -- projection
   end case;
end;

2-35

Example: Java objects (1)

Type declarations:

class Point {
    private float x, y;
    … // methods
}

class Circle extends Point {      // inherits x and y from Point
    private float r;
    … // methods
}

class Rectangle extends Point {   // inherits x and y from Point
    private float w, h;
    … // methods
}

2-36

Example: Java objects (2)

Set of objects in this program:

Point(Float × Float)
+ Circle(Float × Float × Float)
+ Rectangle(Float × Float × Float × Float)
+ …

The set of objects is open-ended. It is augmented by any further class declarations.

2-37

Example: Java objects (3)

Methods:

class Point {
    …
    public float area() { return 0.0f; }
}

class Circle extends Point {
    …
    public float area() { return 3.1416f * r * r; }   // overrides Point’s area() method
}

class Rectangle extends Point {
    …
    public float area() { return w * h; }             // overrides Point’s area() method
}

2-38

Example: Java objects (4)

Application code:

Rectangle box = new Rectangle(1.5f, 2.0f, 3.0f, 4.0f);

float a1 = box.area();

Point it = …;            // it can refer to a Point, Circle, or Rectangle object
float a2 = it.area();    // calls the appropriate area() method

2-39

Recursive types

A recursive type is one defined in terms of itself.

Examples of recursive types:

• lists

• trees

[Figure: Drawing Hands (1948), lithograph, M. C. Escher. http://en.wikipedia.org/wiki/Drawing_Hands, http://en.wikipedia.org/wiki/M._C._Escher]

2-40

Lists (1)

A list is a sequence of 0 or more component values.

The length of a list is its number of components. The empty list has no components.

A non-empty list consists of a head (its first component) and a tail (all but its first component).

A list is homogeneous if all its components are of the same type. Otherwise it is heterogeneous.

2-41

Lists (2)

Typical list operations:

• length

• emptiness test

• head selection

• tail selection

• concatenation.

2-42

Lists (3)

For example, an integer-list may be defined recursively to be either empty or a pair consisting of an integer (its head) and a further integer-list (its tail):

Integer-List = nil Unit + cons(Integer × Integer-List)

or

Integer-List = { nil } ∪ { cons(i, l) | i ∈ Integer; l ∈ Integer-List }

where Unit is a type with only one (empty) value.

Solution:

Integer-List = { nil }
  ∪ { cons(i, nil) | i ∈ Integer }
  ∪ { cons(i, cons(j, nil)) | i, j ∈ Integer }
  ∪ { cons(i, cons(j, cons(k, nil))) | i, j, k ∈ Integer }
  ∪ …
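The recursive definition can be mirrored in Python. This is a sketch under assumptions: None plays the role of nil, and the names Cons and length are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Cons:
    head: int
    tail: "Optional[Cons]"   # the type refers to itself: this is the recursion

def length(l):
    """Length of an integer-list: 0 for nil, otherwise 1 + length of the tail."""
    return 0 if l is None else 1 + length(l.tail)

# cons(2, cons(3, cons(5, cons(7, nil))))
primes = Cons(2, Cons(3, Cons(5, Cons(7, None))))
print(length(primes))
```

Functions over a recursive type are naturally recursive themselves, with one case per variant of the type.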

2-43

Example: Haskell lists

Type declaration for integer-lists:

data IntList = Nil | Cons Int IntList   -- recursive

Some IntList constructions:

Nil

Cons 2 (Cons 3 (Cons 5 (Cons 7 Nil)))

Actually, Haskell has built-in list types:

[Int] [String] [[Int]]

Some list constructions:

[] [2,3,5,7] ["cat","dog"] [[1],[2,3]]

2-44

Example: Ada lists

Type declarations for integer-lists:

type IntNode;
type IntList is access IntNode;   -- mutually recursive
type IntNode is record            -- mutually recursive
   head: Integer;
   tail: IntList;
end record;

An IntList construction:

new IntNode'(2,
   new IntNode'(3,
      new IntNode'(5,
         new IntNode'(7, null))))

2-45

Example: Java lists (1)

Class declarations for integer-lists:

class IntList {

    public int head;
    public IntList tail;   // recursive

    public IntList (int h, IntList t) {
        head = h; tail = t;
    }

}

An integer-list construction:

new IntList(2,
    new IntList(3,
        new IntList(5,
            new IntList(7, null))));

2-46

Example: Java lists (2)

Class declarations for object-lists:

class List {

    public Object head;
    public List tail;

    public List (Object h, List t) {
        head = h; tail = t;
    }

}

Note that List objects are heterogeneous lists (since head can refer to an object of any class).

By contrast, IntList objects are homogeneous lists.

2-47

Strings

A string is a sequence of 0 or more characters.

Some PLs (ML, Python) treat strings as primitive.

Haskell treats strings as lists of characters. Strings are thus equipped with general list operations (length, head selection, tail selection, concatenation, …).

Ada treats strings as arrays of characters. Strings are thus equipped with general array operations (length, indexing, slicing, concatenation, …).

Java treats strings as objects, of class String.

2-48

Type systems

A PL’s type system groups values into types:

• to enable programmers to describe data effectively

• to help prevent type errors.

A type error occurs if a program performs a nonsensical operation such as multiplying a string by a boolean.

Possession of a type system distinguishes high-level PLs from low-level languages (such as assembly languages). In the latter, the only “types” are bytes and words, so nonsensical operations cannot be prevented.

2-49

Static vs dynamic typing (1)

Before any operation is performed, its operands must be type-checked to prevent a type error. E.g.:

• mod operation: check that both operands are integers

• and operation: check that both operands are booleans

• indexing operation: check that the left operand is an array, and that the right operand is a value of the array’s index type.

2-50

Static vs dynamic typing (2)

In a statically typed PL:

• all variables and expressions have fixed types (either stated by the programmer or inferred by the compiler)

• all operands are type-checked at compile-time.

Most PLs are statically typed, including Ada, C, C++, Java, Haskell.

2-51

Static vs dynamic typing (3)

In a dynamically typed PL:

• values have fixed types, but variables and expressions do not

• operands must be type-checked when they are computed at run-time.

Some PLs and many scripting languages are dynamically typed, including Smalltalk, Lisp, Prolog, Perl, Python.

2-52

Example: Ada static typing

Ada function definition:

function is_even (n: Integer) return Boolean is
begin
   return (n mod 2 = 0);
end;

The compiler doesn’t know the value of n. But, knowing that n’s type is Integer, it infers that the type of “n mod 2 = 0” will be Boolean.

Call:

p: Integer;
…
if is_even(p+1) …

The compiler doesn’t know the value of p. But, knowing that p’s type is Integer, it infers that the type of “p+1” will be Integer.

Even without knowing the values of variables and parameters, the Ada compiler can guarantee that no type errors will happen at run-time.

2-53

Example: Python dynamic typing (1)

Python function definition:

def even(n):
    return n % 2 == 0

The type of n is unknown. So the “%” (mod) operation must be protected by a run-time type check.

The types of variables and parameters are not declared, and cannot be inferred by the Python compiler. So run-time type checks are needed to detect type errors.
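The run-time check can be observed directly. In this sketch the wrapper check is a hypothetical helper that reports a failed check instead of letting the exception propagate.

```python
def even(n):
    return n % 2 == 0

def check(value):
    """Apply even() and report a failed run-time type check as a string."""
    try:
        return even(value)
    except TypeError:
        # Raised by the interpreter when the operand does not support %.
        return "type error"

print(check(10))          # an int operand passes the check
print(check([1, 2, 3]))   # a list operand fails it, but only when the call runs
```

Nothing about the definition of even is rejected in advance; the error surfaces only when an unsuitable value actually reaches the % operation.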

2-54

Example: Python dynamic typing (2)

Python function definition:

def respond(prompt):
    # Print prompt and return the user’s response,
    # as an integer if possible, otherwise as a string.
    try:
        response = input(prompt)   # yields a string (raw_input in Python 2)
        return int(response)       # converts the string to an integer, or raises ValueError if impossible
    except ValueError:
        return response

Application code:

m = respond("Month? ")
if m == "Jan": m = 1
elif m == "Feb": m = 2

2-55

Static vs dynamic typing (4)

Pros and cons of static and dynamic typing:

• Static typing is more efficient. Dynamic typing requires run-time type checks (which make the program run slower), and forces all values to be tagged (to make the type checks possible). Static typing requires only compile-time type checks, and does not force values to be tagged.

• Static typing is more secure: the compiler can guarantee that the object program contains no type errors. Dynamic typing provides no such security.

• Dynamic typing is more flexible. This is needed by some applications where the types of the data are not known in advance.

2-56

Type completeness (1)

In principle, a value of any type can be:

• assigned

• composed with other values (as components of composite values)

• passed as an argument (to a procedure or function)

• returned as a function result.

But some (mainly older) PLs restrict which of these operations are applicable to certain types of values.

First-class values are values that are not restricted in which operations can be applied to them.
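Python is an example of a PL in which functions are first-class values. The sketch below applies each of the four operations to a function; compose and the variable names are hypothetical.

```python
import math

def compose(f, g):
    """Return the function x -> f(g(x)): a function returned as a result."""
    return lambda x: f(g(x))

trig = math.cos                                        # assigned
table = {"sin": math.sin, "cos": math.cos}             # composed into a composite value
half_then_sqrt = compose(math.sqrt, lambda x: x / 2)   # passed as arguments

result = half_then_sqrt(8.0)   # sqrt(8.0 / 2)
print(result)
```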

2-57

Type completeness (2)

C:                        primitive  structure  array  function
can be assigned?              √          √        ✗       ✗
can be composed?              √          √        √       ✗
can be argument?              √          √        ✗       ✗
can be function result?       √          √        ✗       ✗

Pascal:                   primitive  record     array  function
can be assigned?              √          √        √       ✗
can be composed?              √          √        √       ✗
can be argument?              √          √        √       √
can be function result?       √          ✗        ✗       ✗

2-58

Type completeness (3)

Ada:                      primitive  record     array  function
can be assigned?              √          √        √       ✗
can be composed?              √          √        √       ✗
can be argument?              √          √        √       ✗
can be function result?       √          √        √       ✗

Haskell:                  primitive  tuple      list   function
can be composed?              √          √        √       √
can be argument?              √          √        √       √
can be function result?       √          √        √       √

2-59

Example: type completeness (1)

Ada function and application code:

type Complex is
   record x, y: Float; end record;

function sum (c1, c2: Complex) return Complex is
begin
   return (c1.x+c2.x, c1.y+c2.y);
end;

-- Print the complex sum of p, q, and r:
put(sum(sum(p, q), r));

2-60

Example: type completeness (2)

What if Ada function results were restricted to primitive values?

procedure add (c1, c2: in Complex; c3: out Complex) is
begin
   c3 := (c1.x+c2.x, c1.y+c2.y);
end;

-- Print the complex sum of p, q, and r:
declare
   t1, t2: Complex;
begin
   add(p, q, t1);
   add(t1, r, t2);
   put(t2);
end;

2-61

Type Completeness Principle

Some PLs are more class-conscious than others:

• C and Pascal are very class-conscious.

• Ada is moderately class-conscious.

• Haskell is not class-conscious at all (all values are first-class).

PL designers should bear in mind the Type Completeness Principle:

No operation should be arbitrarily restricted in the types of its operands.

Examples:

• Restricting function results to be primitive is arbitrary.

• Restricting the operands of and to be booleans is reasonable.

2-62

Expressions

An expression is a PL construct that may be evaluated to yield a value.

Forms of expressions:

• literals (trivial)

• constant/variable accesses (trivial)

• constructions

• function calls

• conditional expressions

• iterative expressions.

2-63

Constructions

A construction is an expression that constructs a composite value from its component values.

In C, the component values are restricted to be literals. In Ada, Java, and Haskell, the component values are computed by evaluating subexpressions.

2-64

Example: Ada record and array constructions

Record constructions:

type Date is record
   m: Month;
   d: Day_Number;
end record;

today: Date := (Dec, 25);
tomorrow: Date := (today.m, today.d+1);

Array construction:

leap: Integer range 0 .. 1;
…
month_length: array (Month) of Integer :=
   (31, 28+leap, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);

2-65

Example: Haskell tuple and list constructions

Tuple constructions:

today = (Dec, 25)
(m, d) = today
tomorrow = (m, d+1)

List construction:

monthLengths =
  [31, if isLeap y then 29 else 28, 31, 30, 31, 30,
   31, 31, 30, 31, 30, 31]

2-66

Example: Java object constructions

Assume:

class Date {

    public int m, d;

    public Date (int m, int d) {
        this.m = m; this.d = d;
    }

    …
}

Object constructions:

Date today = new Date(12, 25);
Date tomorrow = new Date(today.m, today.d+1);

2-67

Function calls (1)

A function call computes a result by applying a function to some arguments.

If the function has a single argument, a function call typically has the form “F(E)”, or just “F E”, where F determines the function to be applied, and the expression E is evaluated to determine the argument.

In most PLs, F is just the identifier of a specific function. However, in PLs where functions are first-class values, F may be any expression yielding a function. E.g., this Haskell function call:

(if … then sin else cos)(x)
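The same form of call can be written in Python, where functions are also first-class values. This is a sketch; sin_or_cos and its parameter use_sin are hypothetical names standing in for the elided condition.

```python
import math

def sin_or_cos(use_sin, x):
    # The function position holds a conditional expression that yields a function,
    # which is then applied to x.
    return (math.sin if use_sin else math.cos)(x)

print(sin_or_cos(True, 0.0))
print(sin_or_cos(False, 0.0))
```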

2-68

Function calls (2)

If a function has n parameters, the function call typically has the form “F(E1, …, En )”. We can view this function call as passing a single argument that is an n-tuple.

2-69

Function calls (3)

An operator may be thought of as denoting a function.

Applying a unary operator ⊖ to its operand is essentially a function call with one argument:

⊖ E is essentially equivalent to ⊖(E)

Applying a binary operator ⊗ to its operands is essentially a function call with two arguments:

E1 ⊗ E2 is essentially equivalent to ⊗(E1, E2)

Thus a conventional arithmetic expression is essentially equivalent to a composition of function calls:

a * b + c / d is essentially equivalent to +(*(a, b), /(c, d))
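Python's standard operator module exposes the arithmetic operators as ordinary functions, so the equivalence can be checked directly. The operand values here are arbitrary, chosen only for illustration.

```python
from operator import add, mul, truediv

a, b, c, d = 2, 3, 8, 4

infix = a * b + c / d                     # conventional operator notation
prefix = add(mul(a, b), truediv(c, d))    # the same computation as nested function calls

assert infix == prefix
print(prefix)
```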

2-70

Conditional expressions

A conditional expression chooses one of its subexpressions to evaluate, depending on a condition.

An if-expression chooses from two subexpressions, using a boolean condition.

A case-expression chooses from several subexpressions.

Conditional expressions are commonplace in functional PLs, but less common in imperative/OO PLs.

2-71

Example: Java if-expressions

Java if-expression:

x>y ? x : y

Conditional expressions tend to be more elegant than conditional commands. Compare:

int max1 (int x, int y) {
    return (x>y ? x : y);
}

int max2 (int x, int y) {
    if (x>y)
        return x;
    else
        return y;
}

2-72

Example: Haskell if- and case-expressions

Haskell if-expression:

if x>y then x else y

Haskell case-expression:

case m of
  Feb -> if isLeap y then 29 else 28
  Apr -> 30
  Jun -> 30
  Sep -> 30
  Nov -> 30
  _   -> 31

2-73

Iterative expressions

An iterative expression is one that performs a computation over a series of values (typically the components of an array or list), yielding some result.

Iterative expressions are uncommon, but they are supported by Haskell in the form of list comprehensions.

2-74

Example: Haskell list comprehensions

Given a list of characters cs, convert all lowercase letters to uppercase, yielding a modified list of characters:

[if isLower c then toUpper c else c | c <- cs]   -- isLower and toUpper come from Data.Char

Given a list of year numbers ys, compute a list (in the same order) of those year numbers in ys that are not leap years:

[y | y <- ys, not(isLeap y)]
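Python also supports list comprehensions, so both examples carry over almost word for word. In this sketch is_leap is a hypothetical helper implementing the Gregorian leap-year rule, and the input lists are made up for illustration.

```python
def is_leap(y):
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

cs = list("Hello, World")
# Convert lowercase letters to uppercase, leaving other characters unchanged.
upper = [c.upper() if c.islower() else c for c in cs]

ys = [1900, 1996, 2000, 2003, 2004]
# Keep, in the same order, the years that are not leap years.
non_leap = [y for y in ys if not is_leap(y)]

print("".join(upper), non_leap)
```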

2-75

Implementation notes

Values and types are mathematical abstractions.

In a computer, each value is represented by a bit sequence stored in one or more bytes or words.

Important principle: all values of the same type must be represented in a uniform way. (But values of different types can be represented in different ways.)

Sometimes the representation of a type is PL-defined (e.g., Java primitive types).

More commonly, the representation is implementation-defined, i.e., chosen by the compiler.

2-76

Representation of primitive types (1)

Each primitive type T is typically represented by single or multiple bytes: usually 8 bits, 16 bits, 32 bits, or 64 bits.

The choice of representation is constrained by the type’s cardinality, #T:

• With n bits we can represent at most 2^n different values.

• So the smallest possible representation is ⌈log2(#T)⌉ bits.
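The bound can be computed exactly with integer arithmetic. This is a minimal sketch; min_bits is a hypothetical helper name.

```python
def min_bits(cardinality):
    """Smallest n such that 2**n >= cardinality, i.e. the ceiling of log2(cardinality)."""
    if cardinality < 1:
        raise ValueError("a type must have at least one value")
    # The largest code needed is cardinality - 1; bit_length() is its width in bits.
    return (cardinality - 1).bit_length()

print(min_bits(2))    # Boolean
print(min_bits(12))   # an enumeration with 12 values, such as Month
```

Using bit_length avoids the rounding problems that a floating-point logarithm could introduce for large cardinalities.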

2-77

Representation of primitive types (2)

Booleans can in principle be represented by a single bit (0 for false and 1 for true). In practice, the compiler is likely to choose a whole byte.

Characters have a representation determined by the character set:

• ASCII or ISO-Latin characters have an 8-bit representation

• Unicode characters have a 16-bit representation.

Enumerands are typically represented by unsigned integers starting from 0.

• E.g., the enumerands of type Month above would be represented by the integers {0, …, 11}. The representation must have at least 4 bits. In practice the compiler is likely to choose a whole byte.

2-78

Representation of primitive types (3)

Integers have a representation influenced by the desired range. Assuming two’s complement representation, in n bits we can represent the integers {–2^(n–1), …, 2^(n–1) – 1}:

• In a PL where the compiler gets to choose the number of bits n, from that we can deduce the range of integers.

• In a PL where the programmer defines the range of integers, the compiler must use that range to determine the minimum n. E.g., if the range is {0, …, 10^10}, the representation must have at least 35 bits (34 bits for the magnitude plus a sign bit). In practice the compiler is likely to choose 64 bits.

Real numbers have a representation influenced by the desired range and precision. Nowadays most compilers adopt the IEEE floating-point standard (either 32 or 64 bits).
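Both directions of the integer calculation can be sketched in Python. The function names are hypothetical, and the search for the minimum n is written for clarity rather than speed.

```python
def twos_complement_range(n):
    """Range of integers representable in n bits, assuming two's complement."""
    return (-(2 ** (n - 1)), 2 ** (n - 1) - 1)

def min_twos_complement_bits(low, high):
    """Smallest n whose two's complement range covers {low, ..., high}."""
    n = 1
    while True:
        lo, hi = twos_complement_range(n)
        if lo <= low and high <= hi:
            return n
        n += 1

print(twos_complement_range(8))
print(min_twos_complement_bits(0, 10 ** 10))
```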

2-79

Representation of Cartesian products

Tuples, records, and structures are represented by juxtaposing the components in a fixed order.

Example (Ada):

type Date is record
   y: Year_Number;
   m: Month;
   d: Day_Number;
end record;

[Figure: two Date records, each laid out as consecutive components y, m, d: (2000, jan, 1) and (2004, dec, 25).]

Implementation of component selection:

• Let r be a record or structure.

• Each component r.f has a fixed offset (determined by the compiler) relative to the base address of r.
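Fixed offsets can be observed from Python with the standard ctypes module, which lays out structures the way a C compiler would. This is only a sketch: the field widths (16 bits for y, 8 bits each for m and d) are assumptions made for illustration, not what any particular Ada compiler chooses.

```python
import ctypes

class Date(ctypes.Structure):
    # Assumed widths for this sketch: 16-bit year, 8-bit month, 8-bit day.
    _fields_ = [
        ("y", ctypes.c_int16),
        ("m", ctypes.c_uint8),
        ("d", ctypes.c_uint8),
    ]

# Each component's offset relative to the base of the record is fixed by the layout.
offsets = {name: getattr(Date, name).offset for name, _ in Date._fields_}

today = Date(2004, 11, 25)   # month stored as its position, dec = 11
raw = bytes(today)           # the record's underlying bytes

# Component selection is a read at base + fixed offset.
assert raw[offsets["d"]] == today.d
print(offsets, ctypes.sizeof(Date))
```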

2-80

Representation of arrays (1)

The values of an array type are represented by juxtaposing the components in ascending order of indices.

Example (Ada):

type Vector is array (1 .. 3) of Float;

[Figure: two Vector values, each laid out as consecutive components at indices 1, 2, 3: (3.0, 4.0, 0.0) and (1.0, 1.0, 0.5).]

2-81

Representation of arrays (2)

Implementation of array indexing:

• Let a be an array with index range {l, …, u}.

• Assume that each component occupies s bytes (determined by the compiler).

• Then a(i) has offset s × (i – l) bytes relative to the base address of a. (In C and Java l = 0, so this simplifies to s × i bytes.)

• The offset computation must be done at run-time (since the value of i is not known until run-time).

• A range check must also be done at run-time, to ensure that l ≤ i ≤ u.
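The offset computation and range check above can be sketched as a small Python function. The name element_offset and its parameter names are hypothetical.

```python
def element_offset(i, lower, upper, size):
    """Byte offset of a(i) from the base address of a, with a run-time range check."""
    if not (lower <= i <= upper):
        raise IndexError(f"index {i} outside {lower} .. {upper}")
    return size * (i - lower)

# A Vector indexed 1 .. 3 whose components occupy 4 bytes each.
print([element_offset(i, 1, 3, 4) for i in (1, 2, 3)])
```

With a lower bound of 0, as in C and Java, the subtraction disappears and the offset is simply size * i.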

2-82

Representation of disjoint unions (1)

Each value of a disjoint-union type is represented by juxtaposing a tag with one of the possible variants. The type (and therefore representation) of the variant depends on the current value of the tag.

Example (Haskell):

data Number = Exact Int | Inexact Float

[Figure: two Number values, each a tag followed by a variant: tag Exact with variant 2, and tag Inexact with variant 3.1416.]

2-83

Representation of disjoint unions (2)

Example (Ada):

type Accuracy is (exact, inexact);

type Number (acc: Accuracy := exact) is
   record
      case acc is
         when exact   => ival: Integer;
         when inexact => rval: Float;
      end case;
   end record;

[Figure: two Number records: acc = exact with ival = 2, and acc = inexact with rval = 3.1416.]

2-84

Representation of disjoint unions (3)

Example (Ada):

type Form is (pointy, circular, rectangular);

type Figure (f: Form := pointy) is
   record
      x, y: Float;
      case f is
         when pointy      => null;
         when circular    => r: Float;
         when rectangular => w, h: Float;
      end case;
   end record;

[Figure: three Figure records: f = pointy with x = 1.0, y = 2.0; f = circular with x = 0.0, y = 0.0, r = 5.0; f = rectangular with x = 1.5, y = 2.0, w = 3.0, h = 4.0.]

2-85

Representation of objects (simplified)

Example (Java):

class Point {
    private float x, y;
    … // methods
}

class Circle extends Point {
    private float r;
    … // methods
}

class Rectangle extends Point {
    private float w, h;
    … // methods
}

[Figure: three objects, each starting with a tag: tag Point with x = 1.0, y = 2.0; tag Circle with x = 0.0, y = 0.0, r = 5.0; tag Rectangle with x = 1.5, y = 2.0, w = 3.0, h = 4.0.]

2-86


Implementation of tag test and projection:

• Let u be a disjoint-union value/object.

• The tag of u has an offset of 0 relative to the base of u.

• Each variant of u has a fixed offset (determined by the compiler) relative to the base of u.

2-87

Representation of recursive types (1)

Each value of a recursive type is represented by a pointer (whether the PL has explicit pointers or not).

Example (Ada):

type IntNode;
type IntList is access IntNode;
type IntNode is record
   head: Integer;
   tail: IntList;
end record;

[Figure: a chain of four IntNode records with heads 2, 3, 5, 7, each tail pointing to the next node.]

2-88

Representation of recursive types (2)

Example (Haskell):

data IntList = Nil | Cons Int IntList

[Figure: a chain of four Cons cells holding 2, 3, 5, 7, ending in Nil.]

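2-89

Representation of recursive types (3)

Example (Java):

class IntList {
    public int head;
    public IntList tail;
    …
}

[Figure: a chain of four IntList objects, each with a tag, a head (2, 3, 5, 7), and a tail referring to the next object.]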
