What do we really know about the differences between static and dynamic types?


Slides from the talk for Devnology, held 15 January 2014 at the Delft University of Technology. Presentation by Stefan Hanenberg.


Stefan Hanenberg
University of Duisburg-Essen, Germany

Delft, NL, 15.01.2014

Initial Notes

I like static type systems
– Elegant specification
– I have been teaching type systems since 2006

I like Squeak/Smalltalk
– Nice programming environment
– Straightforward syntax

I have no personal interest in arguing for / against static types

I have a personal interest in understanding whether a type system improves or worsens software development

Personal background (1)

● PhD in Aspect-Oriented Software Development (2006)

● While doing PhD / after PhD:
  ● Serious doubts about usefulness of current AO languages / AO in general

● Personal feeling: „There is something wrong in how people argue for or against given artefacts.“
  ● Started reading about scientific methods (philosophy, mainly Popper)

Personal background (2)

● Personal conclusion (1)
  ● We always argue why something should in principle be good for developers.
  ● We never take developers into account in our research methods

● Applied research methods are completely inappropriate for arguing for or against the usefulness of a given artefact

(by the way ... are we really applying any research method?)

Personal conclusion (3)

● I want to do „empirical studies“
  ● Test whether something has a measurable effect on developers

● Why not test type systems?
  ● Not that many studies so far ... (amazing! How come?)

Claim: State of the Art in Usability

● Current dominating approach

(1) Find example

(2) Build construct

(3) Claim that construct helps developers

This leads nowhere
● Research methods needed that consider developers / users … involved humans
● Empirical Method!


Empirical SE

• Following the approach of Karl Popper

– Falsification of hypothesis (use of statically typed language decreases development time)

– NO PROOFS / NO GENERALIZABILITY

• But always the hope that repeated observations reveal some truth


Empirical SE - Example

• Hypothesis

• Using tool X reduces development time in comparison to tool Y

• Approach

• Measure development time for X, measure time for Y, do comparison

• Falsification

• ...in case development time for Y was less...
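The hypothesis / approach / falsification scheme above could be sketched as follows. This is only a minimal illustration in Python and not the analysis used in the experiments: the permutation test is one possible way to do the comparison, and the function name `permutation_test` as well as the completion times are made up.

```python
import random
from statistics import mean


def permutation_test(times_x, times_y, rounds=10_000, seed=0):
    """One-sided permutation test for the hypothesis 'X is faster than Y'.

    Returns the observed difference of means (Y minus X) and the share of
    random relabelings whose difference is at least as large.
    """
    rng = random.Random(seed)
    observed = mean(times_y) - mean(times_x)
    pooled = list(times_x) + list(times_y)
    n_x = len(times_x)
    at_least_as_large = 0
    for _ in range(rounds):
        # Randomly reassign the measured times to the two tools.
        rng.shuffle(pooled)
        diff = mean(pooled[n_x:]) - mean(pooled[:n_x])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / rounds


# Hypothetical completion times in minutes, for illustration only.
tool_x = [31, 28, 35, 40, 27, 33]
tool_y = [45, 38, 50, 41, 47, 39]

observed, p_value = permutation_test(tool_x, tool_y)
```

A negative `observed` value (development time for Y was less) would speak against the hypothesis. A small `p_value` only says that the measured difference is unlikely to be a product of chance. It is no proof.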


Context: CS Research Methods

[Hanenberg, Onward 2010]

Taken from [Hanenberg, Faith, Hope, Love, Onward'10]

Now, let's put the focus on type systems

Type Systems.....

● … in Teaching
  ● Formal Approaches
    – Lambda Calculus, Featherweight Java, ...
    – Type soundness proofs, ...

● What about Usability?
  – Static Types vs. Massive Testing?
  – Complexity of Static Type System?
  – ...

Questions for Industry

● Is it a rewarding investment to migrate software to a new type system?
  Java Generics, ...

● Should you invest money in the development of a static type system?
  Statically typed Ruby, ...

● Should you switch to a statically typed language?
  JavaScript vs. TypeScript, Groovy vs. Java, ...

State of Discussion Static vs. Dynamic Types


● Many fights, many arguments, lots of anecdotes

● Arguments built on „personal impressions“

● Arguments (hypotheses!) never actually tested

Overall Goal

Let's test the given arguments

(well, ok, the initial motivation was different)

Results so far....

It looks like (Java-like) static type systems (in Java-like languages) really help in development!

10 Tested Statements and Results (1)

Naive Experiment: [OOPSLA'10] Dynamic type systems are great ... almost ...

Do type casts matter? [DLS'11] Not really.

Are dynamic TS as quick for fixing type errors as static TS? No, not even close! But no difference for semantic errors. [unpublished'11, ICPC'12]

10 Tested Statements and Results (2)

Are statically typed APIs faster to use?: [OOPSLA'12, ICPC'12] Yes

Is the previous finding only a matter of syntax? [AOSD'13] Yes, but in case there is an error in the (unchecked) type it is worse than having no type declaration at all!

Can documentation compensate the positive effect of static types? No. [submitted to ICSE'14]

10 Tested Statements and Results (3)

Do generics really help?: [OOPSLA'13] Yes, if they occur in API interface. No, if application has additional constraints because of generics.

Is the previous finding only a matter of syntax? [AOSD'14] Yes, but in case there is an error in the (unchecked) type it is worse than having no type declaration at all!

Can documentation compensate the positive effect of static types? No. [submitted to ICSE'14]

Do current IDEs (for dynamic TSs) compensate the previously measured positive effect of static types? [unpublished'14] No

Summary of Statements

● Don't argue with type casts – they do not matter

● Don't say that type error fixing time is the same for dynamically typed languages

● Don't say that good IDE support compensates the positive effect of static types – it doesn't

Summary of Statements

● In case dynamic languages have a benefit, it has nothing to do with the absence of the type system.

● In case they do have a benefit, it is despite the absence of the type system!

Let's take a look at the experiments (...and let's skip the statistical parts)

Related Work

● Two experiments available: Gannon'77, PrecheltTichy'98

● Both showed positive effect of static type systems (measured development time)

● Idea
  ● Ok, let's do just another experiment (... still in the learning phase of experimentation ...)

First Experiment - Naive (1) [OOPSLA'10]

● Idea
  ● Experiment similar to Gannon'77, PrecheltTichy'98
  ● Measure number of errors / time to completion
  ● Make programming task larger (more generalizable?)

● How
  ● ~50 subjects write parser / scanner
  ● Measure time required for minimal scanner / final test case coverage for parser
  ● ~40 hours / subject = 1000 subject hours

● Results
  ● Opposite to Gannon'77, PrecheltTichy'98

First Experiment - Naive (2) [OOPSLA'10]

● Scanner development took less time using dynamic types
● No difference for parser ...

First Experiment - Naive (3) [OOPSLA'10]

● Interpretation
  ● There is at least one situation where static TS was counterproductive
  ● Falsification of „run an experiment and see the benefit of TS“

● Personal conclusion
  ● Experiment much too expensive
  ● Relatively few insights
  ● Unclear what the additional insights are

● What's next?
  ● Try to identify often mentioned statements in literature
    – Type casts are bad for programmers, type error fixing time better with TS

Second Experiment – Casts (1) [DLS'11]

● Idea
  ● Test „type casts are bad“
  ● Only time to completion as dependent variable
  ● More tasks, smaller tasks

● How
  ● ~21 subjects write very small programs (3-10 LOC)
  ● All programs in statically typed variant required type casts
  ● ~4 hours / subject = 85 subject hours

● Results
  ● For small tasks casts matter (decrease productivity)
  ● For larger tasks (10 LOC) no difference measured

Second Experiment – Casts (2) [DLS'11]

● Results
  ● Differences only for completely trivial tasks
  ● Our interpretation: Type casts are not that important

Second Experiment – Casts (3) [DLS'11]

● Interpretation
  ● Casts are not relevant enough for further studies

● Personal conclusion
  ● Small experiments work
  ● The more measurements the better
  ● Change in experimental design worked well

● What's next?
  ● Go on with often mentioned statements in literature
    – Type error fixing time better with TS

Third Experiment – Type Errors (1) [Unpublished'11]

● Idea
  ● Measure time until type error is fixed
  ● Time to completion as dependent variable
  ● Again more tasks, smaller tasks

● How
  ● ~30 subjects, 120 subject hours

● Results
  ● Clear benefit in fixing time

Third Experiment – Type Errors (2) [Unpublished'11]

● Results

Really, really large differences in favor of Java! (For the first task, the runtime error stops at exactly the same position as the type error!)
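To make concrete what kind of error is meant: in a dynamically typed language a wrongly typed value is only reported when it is used, and that can be somewhere other than where it was introduced. The following Python sketch is a hypothetical illustration and not one of the experiment tasks. All function names are made up. Note that it shows the case where the two positions differ, whereas in the first task of the experiment they coincided.

```python
import traceback


def load_timeout(config):
    # Bug introduced here: the value stays the string "30".
    # Nothing complains at this point in a dynamically typed language.
    return config["timeout"]


def remaining_seconds(timeout, elapsed):
    # The error only shows up here, when a str and an int are subtracted.
    return timeout - elapsed


def run():
    timeout = load_timeout({"timeout": "30"})
    return remaining_seconds(timeout, 12)


try:
    run()
    failing_function = None
except TypeError as error:
    # Name of the function in which the runtime error was raised.
    failing_function = traceback.extract_tb(error.__traceback__)[-1].name
```

Here `failing_function` names the place where the value is used, not `load_timeout`, where the wrong value came from. A static type checker that knows the declared types could report the mismatch before the program runs.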

Third Experiment – Type Errors (3) [Unpublished'11]

● Interpretation
  ● Type error fixing time validated without doubt
  ● No idea how often this situation occurs in programming (controlled experiments won't help here)

● Personal conclusion
  ● Fixing time considered as stable knowledge
  ● Go on with different experiment, check fixing time from now on from time to time

● What's next?
  ● Go on with often mentioned statements in literature
    – TS as documentation

4th Experiment - API Usage (1) [OOPSLA'12]

● Idea
  ● 5 programming tasks on undocumented API (only source code)
  ● Time to completion as dependent variable

● How
  ● ~30 subjects, 210 subject hours

● Results
  ● No clear results, 3 tasks show benefit of TS, 2 benefit of dynamic types (!?!)

4th Experiment - API Usage (2) [OOPSLA'12]

● Results

Task 2 & 3 seem to show the opposite!

4th Experiment - API Usage (3) [OOPSLA'12]

● Interpretation
  ● Oops ... no clear interpretation
  ● What about „bad luck“?

● Personal conclusion
  ● Try to build up experiment from scratch, re-run it
  ● There are situations where TS seems to be counterproductive

● What's next?
  ● Re-run experiment

5th Experiment – API usage (1) [ICPC'12]

● Idea
  ● 9 programming tasks, 2 type error fixing tasks, (2 semantic error fixing tasks), 5 documentation tasks

● How
  ● ~30 subjects, 120 subject hours

● Results
  ● Type error fixing time confirmed, now clear results in documentation pro TS

5th Experiment – API usage (2) [ICPC'12]

● Results

Shows what was expected (+ replication of type error + semantic error tests)

5th Experiment – API usage (3) [ICPC'12]

● Interpretation
  ● Not the same as 4th experiment, maybe „something is different“
  ● What about „bad luck“?

● Personal conclusion
  ● Consider positive documentation effect as proven
  ● Keep in mind that „there might be still something out there ...“

● What's next?
  ● What about different type systems?
  ● What about different languages?
  ● Has documentation anything to do with type systems at all?

6th Experiment – Generics (1) [OOPSLA'13]

● Idea
  ● 3 programming tasks on API usage (raw vs. generic)
  ● One extension task for strategy implementation
  ● One type error fixing task (strategy)

● How
  ● Analysis on only ~16 subjects

● Results
  ● API usage better with generics, terrible extension time for generic strategy, no difference in type error fixing!

6th Experiment – Generics (2) [OOPSLA'13]

● Results

Task 5 is extension task (in strategy) – almost all subjects failed to do that in 55 minutes!

7th Experiment – Type Declaration vs. Type Checking (1)

● Idea
  ● 3 programming tasks on API usage (repetition of previous experiments, but no type checking!)
  ● 1 programming task, where a wrong type name is in the API – code needs to be corrected

● How
  ● Analysis on only ~20 subjects

● Results
  ● Type names already help ... but wrong type names reduce usability

7th Experiment – Type Declaration vs. Type Checking (2)

● Results
  ● Type names already help ... but wrong type names reduce usability
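The difference between a type declaration and a type check can be illustrated with Python annotations, which are not enforced when the program runs. This is only an analogy under that assumption, not the setup of the experiment. The functions `parse_port` and `parse_host` are made up for illustration.

```python
def parse_port(text: str) -> int:
    # Correct declaration: the type name tells the caller what comes back.
    return int(text)


def parse_host(text: str) -> int:
    # Wrong declaration (on purpose): the function really returns a str.
    # Nothing checks the annotation at run time, so nobody notices here.
    return text.strip()


port = parse_port("8080")
host = parse_host(" example.org ")

declared = parse_host.__annotations__["return"]  # what the API claims
actual = type(host)                              # what the caller really gets
```

`declared` and `actual` disagree for `parse_host`. A reader who trusts the type name is misled, which is the situation of the fourth task above.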

8th Experiment – Documentation (1)

● Idea
  ● One programming task, 2 variables (static vs. dynamic type system + with vs. without documentation)

● How
  ● Analysis on only ~25 subjects

● Results
  ● Type names help more than documentation!

8th Experiment – Documentation (2)

● Results
  ● Type names help more than documentation!

Personal conclusion (1)

● Go on measuring

● Hopefully, we come up with a theory

● Follow rigorous methods

● Use small sample sizes (!!!) - not convincing, but helps doing more experiments!

● Still only a few experiments so far ... hopefully other people start doing experiments on type systems

Personal conclusion (2)

● Let's contribute to the type system war!

● Let's use facts as arguments!

● Let's stop collecting anecdotes!

● Let's say more aggressively that we do not accept anecdotes as arguments

Personal conclusion (3)

● There are still plenty of experiments waiting to be done

● Think about whether you would like to contribute to the experiment series – all additional measurements help!

Summary of Statements

● In case dynamic languages have a benefit, it has nothing to do with the absence of the type system.

● In case they do have a benefit, it is despite the absence of the type system!

Conclusion

● It is possible to collect data about language constructs

● Controlled experiments are really a way to extract information / gather knowledge

● Maybe small experiments more useful than larger experiments

● Try to do not only a single experiment, but a collection of experiments in order to understand the topic

What do we really know about the differences between static and dynamic types?

Stefan Hanenberg
University of Duisburg-Essen, Germany

Delft, NL, 15.01.2014
