Seminar „Komponenten" (Components)
Empiricism in Computer Science
Eike Send
Freie Universität Berlin, Institut für Informatik
http://www.inf.fu-berlin.de/inst/ag-se/

Part 1
• Definition: Empiricism
• Definition: Experiment
• Quantitative vs. Qualitative
• Data Analysis

Part 2
• State of Empiricism in CS
• Object Oriented vs. Procedural
• Sample Experiment


Part 1: Empiricism in general

Gaining knowledge

• How do we gain knowledge?
  • by intuition (direct insight)
  • from some authority (tradition, teacher, book etc.)
  • by rational thought (reasoning, deduction)
  • by direct observation
  • via the scientific method

• No method can guarantee that the information is correct

• All of these methods can lead to a correct result
  • The scientific method is the most likely to do so
  • It will also be the most convincing one

• What is the scientific method, though?

Definition: Empiricism

• Empiricism (Greek εμπειρισμός, from empirical, Latin experientia - the experience) [...] is generally regarded as being at the heart of the modern scientific method, that our theories should be based on our observations of the world rather than on intuition or faith

The scientific method

• Empiricism is:
  • based on observation
  • opposed to:
    • being based on theoretical considerations
    • intuition
    • random selection

• The purpose is typically making some kind of decision:
  • Selecting among solution candidates
  • Understanding priorities
  • Deciding yes or no

Falsification and Karl Popper

• Falsification:
  • Falsification is the refutation of hypotheses or theories through empirical statements (for example by observation or experiment)
  • Note: Even if a theory is wrong, it may still predict the results of an experiment correctly

• This view of science was suggested by Karl Popper (1902-1994)
  • It is the prevalent scientific paradigm today
  • In this view theories cannot be directly confirmed, only refuted
  • If a theory cannot be refuted for a long time, it will gradually be accepted as confirmed

Different kinds of experiments

• Empirical research can be conducted in different ways; Tichy has identified seven methods:
  • Case study
  • Field experiment
  • Controlled experiment
  • Survey
  • Meta study
  • Simulation
  • Benchmark

• The scientifically most credible method is the controlled experiment / formal experiment

Quantitative vs. qualitative

• A word on another way to distinguish kinds of research:
  • Quantitative research is the numerical representation of observations for describing phenomena
  • Qualitative research describes how individuals and groups view and understand the world and construct meaning out of their experiences
    • It is essentially narrative-oriented
  • It is a good idea to combine both:
    • Quantitative: How much time is spent on a type of work?
    • Qualitative: On what activities is it spent?

Terminology of experiments Part 1

• The terminology of a formal experiment in Computer Science (CS):
  • Treatments are activities, methods or tools to be compared
    • e.g. object oriented programming vs. procedural technology
  • Control is used to help isolate the effects of a treatment
    • e.g. using a limited inheritance depth or not
  • Trials are individual test runs, using only one treatment
  • Subjects are the people applying the treatment

Terminology of experiments Part 2

• Dependent variables are those factors that are expected to change or differ as a result of applying a treatment
  • e.g. the time it takes to find an error

• Independent variables are those variables that may influence the application of a treatment and thus indirectly the result
  • e.g. the subjects in a study

• Error is the failure of two identically treated experimental units to yield identical results
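
The terminology can be made concrete with a small Python sketch. The `Trial` record, the subject identifiers and all timing values are hypothetical and invented for illustration only. The experimental error shows up as the spread of the dependent variable among identically treated trials.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Trial:
    """One test run: a single subject applying a single treatment."""
    subject: str            # independent variable: who applied the treatment
    treatment: str          # the method being compared
    minutes_to_fix: float   # dependent variable: time to find an error


def summarize(trials):
    """Group trials by treatment; return (mean, spread) of the dependent variable."""
    by_treatment = defaultdict(list)
    for trial in trials:
        by_treatment[trial.treatment].append(trial.minutes_to_fix)
    return {name: (mean(v), pstdev(v)) for name, v in by_treatment.items()}


# Hypothetical measurements, not taken from any study.
trials = [
    Trial("s1", "object oriented", 30.0),
    Trial("s2", "object oriented", 34.0),
    Trial("s3", "procedural", 40.0),
    Trial("s4", "procedural", 44.0),
]
summary = summarize(trials)
```

Within each treatment the subjects were treated identically, so a non-zero spread is what the slide calls error.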

The Formal Experiment

• But what is a formal experiment?
  • A controlled empirical investigation into some phenomenon with
    • A clearly stated hypothesis and
    • A random assignment of subjects to different treatments
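
Random assignment of subjects can be sketched in a few lines of Python. The function name `assign_randomly`, the subject labels and the treatment labels are assumptions made for this example. The fixed seed only serves to make the illustration reproducible.

```python
import random


def assign_randomly(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for index, subject in enumerate(shuffled):
        groups[treatments[index % len(treatments)]].append(subject)
    return groups


# Hypothetical subject labels.
subjects = [f"subject-{i:02d}" for i in range(1, 21)]
groups = assign_randomly(subjects, ["treatment A", "treatment B"], seed=42)
```

Dealing the shuffled subjects out in turn keeps the group sizes equal (or within one of each other), while the shuffle ensures that no property of a subject decides which treatment it receives.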


Example of an experiment

• In 1990 Henry and Humphrey compared an object oriented language with a procedural language:
  • Treatments: C, Objective-C
  • Control: Using either one or the other
  • Subjects: Twenty students
  • Dependent variables: maintenance time, error counts, change counts, programmer's subjective impression
  • Independent variables: subject (student identifier), group (A or B), language, modification task

• We'll get back to this experiment later on

Six steps of an Experiment (in CS)

• Conception: Deciding what we wish to learn more about
  • Define the goals of the experiment
  • State clearly and precisely the objective of the study

• Design: Translate the objective into a formal hypothesis
  • Re-express the goal as a hypothesis
    • The hypothesis is a tentative theory or supposition that we think explains the behaviour we want to explore
    • Often there will be two or more hypotheses
    • The null hypothesis states that there is no difference between the treatments
    • The alternative hypothesis states that there is a significant difference

• Preparation: Make the subjects and the environment ready. If possible, a pilot study of the experiment should be conducted.
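
The null and alternative hypotheses from the design step can be illustrated with a permutation test. This is one of several possible tests and is not taken from the source. The function name `permutation_test`, the 0.05 threshold and all measurements below are assumptions made for this sketch.

```python
import random


def permutation_test(a, b, rounds=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b.

    Under the null hypothesis the treatment labels are interchangeable, so
    we reshuffle the pooled values and count how often a difference at
    least as large as the observed one arises by chance.
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        difference = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if difference >= observed:
            hits += 1
    return hits / rounds


# Hypothetical maintenance times in minutes, invented for illustration.
time_oo = [30, 32, 35, 31, 33]
time_po = [40, 42, 41, 45, 39]
p_value = permutation_test(time_oo, time_po)
reject_null = p_value < 0.05  # conventional threshold, assumed here
```

A small p-value means the observed difference would rarely arise if the treatments were equivalent, which is the usual ground for rejecting the null hypothesis in favour of the alternative.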

Six steps of an Experiment (in CS) Part 2

• Execution

• Analysis: This phase consists of two parts.
  • All the measurements taken must be reviewed in order to ensure that they are valid and useful.
  • The sets of data are analysed according to the usual statistical principles.


• Dissemination and Decision-making: Document the experimental materials and conclusions in a way that will allow others to replicate and confirm the conclusions in a similar setting. The results can be used to
  • Support decisions about software development and maintenance
  • Suggest improvements to development environments
  • Perform similar experiments with different subjects or dependent variables

Data Analysis

• There are several different kinds of general goals when analysing data:
  • Exploring something: You do not know in advance what to expect in the data. You try to get an overview of the data you have and to find interesting structure in the data
  • Measuring something: You know exactly what aspect of an object you are interested in
  • Modelling something for explanation: You want to describe the mechanism that has produced the data
  • Modelling something for prediction: You consider your data to be examples. You want to find out how to predict output values given the input values
  • Comparing something: You have two or more “things” and want to compare them with respect to one or more attributes.

Data analysis Part 2

• Data analysis has to support the primary quality attributes of the empirical study overall:
  • Credibility
  • Relevance

• In order to gain credibility the data has to be
  • Correct: Neither mis-collected nor mis-processed, so that we can trust the analysis (and hence its results)
  • Illustrative: Easy to understand what the results say and how they came to be, given the data. The analysis makes us understand the data itself
  • Informative: The analysis reports results that are relevant and helpful for answering the study question

Four steps for data analysis

• Prechelt provides four steps to carry out a proper data analysis:
  • Make data available
    • E.g. making it machine readable
  • Validate data
    • Compare it with expected results
    • Check for impossible or unlikely values
  • Explore data
    • Get an overview of the whole data set
    • Look at individual variables
    • Look at pairs of variables
    • Quick-check specific expectations
  • Perform analysis: measure, model, or compare
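
The first three steps can be sketched with the Python standard library. The CSV layout, the column names and every value in `RAW` are hypothetical and invented for illustration. The negative duration stands in for an "impossible value" that validation should catch.

```python
import csv
import io
from statistics import mean, median

# Hypothetical raw data, not taken from any study.
RAW = """subject,language,minutes
s1,C,45
s2,C,50
s3,Objective-C,38
s4,Objective-C,-5
"""


def load(text):
    """Step 1: make the data available in machine-readable form."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row, minutes=float(row["minutes"])) for row in reader]


def validate(rows):
    """Step 2: separate plausible rows from rows with impossible values."""
    valid = [row for row in rows if row["minutes"] >= 0]
    rejected = [row for row in rows if row["minutes"] < 0]
    return valid, rejected


def explore(rows):
    """Step 3: get an overview of one variable, grouped by language."""
    by_language = {}
    for row in rows:
        by_language.setdefault(row["language"], []).append(row["minutes"])
    return {
        language: {"n": len(v), "mean": mean(v), "median": median(v)}
        for language, v in by_language.items()
    }


rows = load(RAW)
valid, rejected = validate(rows)
overview = explore(valid)
# Step 4 (measure, model, or compare) would start from `valid`.
```

Keeping the rejected rows instead of silently dropping them makes it possible to report how much data was excluded and why.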

Part 2: The state of Empiricism in Computer Science

The State of Empiricism in Computer Science

• “Large parts of CS may not meet standards long established in the natural and engineering sciences”

• Knowledge is based more on “intuitive feelings and anecdotal evidence, than on empirical and quantitative evidence”

• “Students of computer science are not educated as scientists. They are trained as programmers”

• “The state of experimentation in software engineering is poor. [...] On the whole, we consider this situation as unacceptable, even alarming”

• “A survey of over 400 recent research articles suggests that computer scientists publish relatively few papers with experimentally validated results.”

• Experts developed a consensus “that favoured fostering a culture closer to the traditional sciences as far as experimentation is concerned.”

Theory, Construction, Empiricism

• The modern sciences consist of:
  • Theory
  • Construction
  • Empiricism

• Computer science is a mixture of the purely theoretical mathematics and the mostly constructive but also theoretical and empirical electrical engineering

• All three are essential for successful work

• Empiricism is slowly gaining ground

Example: Meta Study

Meta study: Object oriented vs. procedural

• Deligiannis et al. have conducted a meta study:
  • The question was: “What have studies found out about object oriented technology?”
  • They took 27 studies into account, of which 18 met their requirements of a formal experiment or were case studies
  • 10 of these were on the question “Is there a significant advantage of object oriented technology over procedural technology?”
  • The other studies were on the subject of inheritance, design principles, design patterns or inspection techniques

Evaluation framework

• For the evaluation of the experiments they used a framework provided by Wohlin et al. which separates the analysis into the following parts:
  • Motivation
  • Hypotheses
  • Variables
  • Participants
  • Experiment design
  • Results and interpretation
  • Critique

• The main criticism was that most papers used only students as participants, although expertise is important in object oriented technology (OOT)

• Further criticism: Designs were flawed or too simple, or the setups lacked documentation of the domain.

• Of the 10 experiments on OOT vs. procedural technology (PT), only 2 found an advantage for OOT

Example: Study of Object Oriented Technology vs. Procedural Technology

Summary: Henry and Humphrey

• One of the experiments that found a difference was the study by Henry and Humphrey mentioned earlier

• Motivation
  • Lack of scientific evidence for OOT superiority

• Hypotheses (alternative)
  • H0: It is easier to maintain OO programs than structured programs
  • H1: OO programmers take less time to perform a maintenance task
  • H2: OO maintenance requires fewer changes to the code
  • H3: OO programmers perceive the changes as conceptually easier
  • H4: OO programmers make fewer errors during the maintenance task

Summary: Henry & Humphrey Part 2

• Variables
  • There were four independent variables:
    • Subject (students)
    • Group (A or B)
    • Programming language
    • Modification task
  • The dependent variables were:
    • Maintenance times
    • Error counts
    • Change counts
    • Programmers' subjective impression

• Participants
  • “Twenty students participated.”

Summary: Henry & Humphrey Part 3

• Experiment design
  • Using a counterbalancing procedure, the subjects were to perform enhancement maintenance tasks. The subjects performed each task twice, once in C and once in Objective-C, and completed a post-experimental questionnaire.

• Results and interpretation
  • “This experiment supports the hypothesis that subjects produce more maintainable code with an OO language than with a PO language.”

• Critique
  • The measurement of maintenance time relied solely “upon the accuracy of subjects reporting minutes of thinking time” and it is also noted “that inexperienced subjects were used and design documentation was not provided.”
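
The counterbalancing procedure mentioned under experiment design can be sketched as follows. The source does not describe the exact procedure, so this sketch assumes the simplest two-group crossover: group A uses C first and Objective-C second, group B the reverse. The function name `counterbalance` and the student labels are hypothetical.

```python
import random


def counterbalance(subjects, treatments=("C", "Objective-C"), seed=None):
    """Split subjects into two random halves that apply the two treatments
    in opposite order, so that order effects cancel out across groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = treatments
    schedule = {}
    for subject in shuffled[:half]:
        schedule[subject] = ("A", (first, second))
    for subject in shuffled[half:]:
        schedule[subject] = ("B", (second, first))
    return schedule


# Twenty hypothetical student labels, matching the number of participants.
students = [f"student-{i:02d}" for i in range(1, 21)]
schedule = counterbalance(students, seed=1)
```

Because every subject works with both languages, differences between individual subjects affect both treatments equally. Reversing the order in one group guards against learning effects favouring whichever language comes second.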

Summary


• Empiricism:
  • Is good for gaining valid and credible knowledge
  • Is an important component of scientific work
  • Works through falsification
  • Can use different methods
  • Can be quantitative or qualitative or both

• There is a formal way to carry out an experiment which uses a formal terminology

• Data analysis can have different goals using different methods

• Empiricism is a part of computer science

• Currently empiricism in CS does not meet scientific standards

• There is some empirical research being conducted, although it still needs to be validated

Thank you!
