Selected Topics of Software Technology 3

Talking about apples and oranges
Static Analysis Techniques – Part II

Birgit Hofer
Institute for Software Technology

www.tugraz.at

“Risk comes from not knowing what you’re doing.”

Warren Buffett

Outline

• Static Analysis Techniques
  • Code smells repetition
  • Header and Unit inference
  • Data Debugging
• Spreadsheet Environment specific remarks
  • Recognizing identical formulas
  • Useful libraries
• Spectrum-based fault localization
• Practical

Spreadsheet Quality Assurance Techniques

• Visualization
• Static Analysis
• Debugging
• Testing
• Modeling
• Design & Maintenance Support

Source: Jannach et al., “Avoiding, Finding and Fixing Spreadsheet Errors – A Survey of Automated Approaches for Spreadsheet QA”, Journal of Systems and Software, 2014.

Visualization: Patrick Koch, Diploma Seminar, TU Graz, 2015.

Static Analysis

• Code Smells
• Static Checker

What are Code Smells?

Any symptom in the source code of a program that possibly indicates a deeper problem.

• Not bugs, but they increase the risk of introducing bugs.
• Indicate the need for refactoring.

Refactoring

Process of changing a system in such a way that it does not alter the external behavior of the system but improves the internal structure. (Martin Fowler)

Code Smells in Software

Code Smells within Classes

• Long Methods
• Long Parameter List
• Duplicated Code
• Large Classes
• Dead Code

Code Smells between Classes

• Inappropriate Intimacy
• Data Clumps
• Indecent Exposure
• Feature Envy
• Message Chains
• Shotgun Surgery
• Lazy Class
• Middle Man

Coding Standard

Spreadsheet Smells

Formula Smells

• Multiple References
• Multiple Operators
• Conditional Complexity
• Duplicated Formulas
• Long Calculation Chains

Interworksheet smells

• Inappropriate Intimacy
• Feature Envy
• Shotgun Surgery
• Middle Man

• Standard Deviation
• Empty Cell
• Pattern Finder
• Reference to Empty Cell
• String Distance
• Quasi functional dependency

Identify the formula smells!

Source: Patrick Koch, Diploma Seminar, TU Graz, 2015.

Formula Smells - Solution

Source: Patrick Koch, Diploma Seminar, TU Graz, 2015.

Relevant Literature

• Abraham and Erwig: “UCheck: A Spreadsheet Type Checker for End Users”, Journal of Visual Languages and Computing, Volume 18, 2007.
• Erwig and Burnett: “Adding Apples and Oranges”, 4th Int. Symposium on Practical Aspects of Declarative Languages (PADL), 2002.
• Antoniu et al.: “Validating the Unit Correctness of Spreadsheet Programs”, 26th Int. Conference on Software Engineering (ICSE), 2004.

Ariane 5

Ariane 5's first test flight on 4 June 1996 failed, with the rocket self-destructing 37 seconds after launch because of a malfunction in the control software. A data conversion from 64-bit floating point value to 16-bit signed integer value to be stored in a variable representing horizontal bias caused a processor trap (operand error) because the floating point value was too large to be represented by a 16-bit signed integer. …

Source: Wikipedia, https://en.wikipedia.org/wiki/Ariane_5, 2015-10-19

Classical Type Checker

Purpose

• Find incorrect applications of operations/assignments
• E.g. a multiplication of a number and a string

Types

• Runtime errors
• Incorrect results

Spreadsheets: Label-based Type Checking to find more faults!

UCheck - A Spreadsheet Type Checker from Oregon State University

Talking about apples and oranges

A Harvest Example

Source: Erwig and Burnett: “Adding Apples and Oranges”, 4th Int. Symposium on Practical Aspects of Declarative Languages (PADL) 2002.

Month
Month[May]
Month[June]
Month[May] | Month[June] = Month[May|June]

Fruit
Fruit[Apples]
Fruit[Oranges]
Fruit[Apples] | Fruit[Oranges] = Fruit[Apples|Oranges]

Month[May] & Fruit[Apples]
Month[June] & Fruit[Apples]

Month[May] & Fruit[Apples] | Month[June] & Fruit[Apples]
= (Month[May] | Month[June]) & Fruit[Apples]
= Month[May|June] & Fruit[Apples]

Rules for Unit Propagation

1. Every value without header is a well-formed unit.
2. If a cell has a value v and a header u, then u[v] is a well-formed unit.
3. If two units have no common root unit, you can link the units using &.
4. If two units have a common root unit, you can link the units using |.

If you cannot derive meaningful unit expressions, you might have found an incorrect formula!
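
The rules above can be tried out with a small sketch. It is not UCheck's implementation: the representation (a unit as a mapping from root header to a set of values), the function names, and the simplification that | is only accepted when the two units differ in at most one root (so that the common part can be factored out, as in the harvest example) are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Optional

# Assumed representation: root header -> set of values,
# e.g. Month[May] & Fruit[Apples] is {"Month": {"May"}, "Fruit": {"Apples"}}.
Unit = Dict[str, FrozenSet[str]]


def unit(root: str, value: str) -> Unit:
    """Rule 2: a header applied to a value gives the unit root[value]."""
    return {root: frozenset([value])}


def and_units(a: Unit, b: Unit) -> Optional[Unit]:
    """Rule 3: link with & only if the two units share no root."""
    if set(a) & set(b):
        return None
    return {**a, **b}


def or_units(a: Unit, b: Unit) -> Optional[Unit]:
    """Rule 4 (simplified): link with | if both units have the same roots
    and differ in at most one of them, so the rest can be factored out."""
    if set(a) != set(b):
        return None
    differing = [root for root in a if a[root] != b[root]]
    if len(differing) > 1:
        return None  # no meaningful unit expression: possibly a faulty formula
    merged = dict(a)
    for root in differing:
        merged[root] = a[root] | b[root]
    return merged


may_apples = and_units(unit("Month", "May"), unit("Fruit", "Apples"))
june_apples = and_units(unit("Month", "June"), unit("Fruit", "Apples"))
may_oranges = and_units(unit("Month", "May"), unit("Fruit", "Oranges"))

total_apples = or_units(may_apples, june_apples)  # Month[May|June] & Fruit[Apples]
suspicious = or_units(june_apples, may_oranges)   # None: units cannot be combined
```

In this model, adding the May and June apple cells yields a well-formed unit, while adding June apples to May oranges yields none, which is the kind of formula the slide suggests inspecting.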

Identify all types and units!

Modified Harvest Example

Month[June] & Fruit[Apples]
Month[May] & Fruit[Oranges]

Month[June] & Fruit[Apples] | Month[May] & Fruit[Oranges]

Month[May] & Fruit[Apples|Oranges]

Identify all types and units!

Another Type Checking Example

Source: Elisabeth Getzner: “Survey of Fault Localization Techniques in Spreadsheets”, Diploma Seminar, TU Graz 2014.

Hours & Worker[Jones]
Hours & Worker[Smith]
Bonus & Worker[Smith]
Salary & Hours & Worker[Jones]
Salary & Hours & Worker[Smith]
Salary Hours & Worker[Jones]

The devil is in the details.

Header Inference is not that easy!

• Spatial layout (top down, left to right)
• Heuristics required

Header: cells to label the data

• do not contain formulas
• are not input to other cells

Footer: aggregation formulas

• at the end of rows or columns, or
• formulas that reference core cells or other formula cells

Core: data cells

• do not contain formulas
• are input to other cells

Filler: blank cells or cells with special formatting (to separate tables within sheets)
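
The content-based part of these heuristics can be sketched in a few lines. This is only an illustration under stated assumptions: the sheet is given as a plain dictionary from A1-style addresses to contents, formulas start with "=", every formula cell is treated as a footer, and the spatial-layout heuristics and special formatting are ignored. The sample values are made up and are not taken from the harvest figure.

```python
import re
from typing import Dict, Set

CELL = re.compile(r"\$?([A-Z]+)\$?([0-9]+)")
RANGE = re.compile(r"(\$?[A-Z]+\$?[0-9]+):(\$?[A-Z]+\$?[0-9]+)")


def col_to_num(col: str) -> int:
    """Convert a column name such as 'A' or 'AB' to its 1-based index."""
    n = 0
    for ch in col:
        n = n * 26 + ord(ch) - ord("A") + 1
    return n


def num_to_col(n: int) -> str:
    """Convert a 1-based column index back to a column name."""
    col = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        col = chr(ord("A") + rem) + col
    return col


def references(formula: str) -> Set[str]:
    """Collect the cells a formula reads, expanding ranges like B2:B3."""
    refs: Set[str] = set()
    for start, end in RANGE.findall(formula):
        c1, r1 = CELL.match(start).groups()
        c2, r2 = CELL.match(end).groups()
        for c in range(col_to_num(c1), col_to_num(c2) + 1):
            for r in range(int(r1), int(r2) + 1):
                refs.add(f"{num_to_col(c)}{r}")
    for col, row in CELL.findall(formula):
        refs.add(f"{col}{row}")
    return refs


def classify(cells: Dict[str, object]) -> Dict[str, str]:
    """Assign each cell one of the roles header, footer, core, or filler."""
    def is_formula(value: object) -> bool:
        return isinstance(value, str) and value.startswith("=")

    inputs: Set[str] = set()
    for value in cells.values():
        if is_formula(value):
            inputs |= references(value)

    roles: Dict[str, str] = {}
    for address, value in cells.items():
        if value is None or value == "":
            roles[address] = "filler"
        elif is_formula(value):
            roles[address] = "footer"   # simplification: any formula cell
        elif address in inputs:
            roles[address] = "core"     # no formula, but input to other cells
        else:
            roles[address] = "header"   # no formula, not input to other cells
    return roles


# Made-up sheet in the spirit of the harvest example.
sheet = {
    "A1": "", "B1": "Apples", "C1": "Oranges", "D1": "Total",
    "A2": "May", "B2": 8, "C2": 11, "D2": "=B2+C2",
    "A3": "June", "B3": 20, "C3": 50, "D3": "=B3+C3",
    "A4": "Total", "B4": "=SUM(B2:B3)", "C4": "=SUM(C2:C3)", "D4": "=SUM(D2:D3)",
}
roles = classify(sheet)
```

A constant that is never referenced would be classified as a header here, which is one reason the slide stresses that additional heuristics are required.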

Validating Unit Correctness

Source: Antoniu et al.: “Validating the Unit Correctness of Spreadsheet Programs”, ICSE 2004.

Relevant Literature

• Barowy, Gochev and Berger: “CheckCell: Data Debugging for Spreadsheets”, Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2014.

Programmer's proverb

“Garbage in, garbage out”

Data Debugging

Input data errors

• Data entry errors
• Measurement errors
• Data integration errors

Techniques

• Statistical (Gaussian distribution, std. deviation)
• Identify cells that have an unusual impact on result (CheckCell)

Source: Barowy, Gochev and Berger: “CheckCell: Data Debugging for Spreadsheets”, Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2014.
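
As a small illustration of the statistical technique, the sketch below flags values that lie more than a chosen number of standard deviations from the mean of their column. The function name, the threshold of 2.0, and the sample numbers are assumptions made for this example; the check is only reasonable if the data is roughly normally distributed.

```python
from statistics import mean, stdev
from typing import List, Sequence


def outlier_indices(values: Sequence[float], threshold: float = 2.0) -> List[int]:
    """Return the positions whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation, needs at least 2 values
    if sigma == 0:
        return []          # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up column with one suspicious entry at position 9.
column = [10, 12, 11, 9, 10, 11, 10, 12, 9, 100]
flagged = outlier_indices(column)
```

A single extreme value also inflates the standard deviation itself, so in short columns the threshold has to be chosen with care.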

CheckCell

Premises on input vector:

• Values in vector exchangeable
• Function input is homogeneous
• Computed value changes significantly when an erroneous input value is corrected

Computation Trees

• Root node = formula, leaves = input values
• Dependency Graph == Computation Forest

Bootstrap procedure

• To determine effect of a particular input on formulas
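
The idea of measuring the effect of a single input can be illustrated with a much simpler stand-in for the bootstrap procedure. The sketch below is not the CheckCell algorithm: it relies on the exchangeability premise, replaces one input at a time by each of the other values in the vector, and ranks inputs by the average change this causes in the formula result. The function name and the sample data are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def impact_ranking(
    inputs: Sequence[float],
    formula: Callable[[Sequence[float]], float],
) -> List[Tuple[int, float]]:
    """Rank input positions by how much the result moves when the value
    at that position is swapped for another value from the same vector.
    Requires at least two inputs."""
    original = formula(inputs)
    scores = []
    for i in range(len(inputs)):
        others = [v for j, v in enumerate(inputs) if j != i]
        changes = []
        for replacement in others:
            trial = list(inputs)
            trial[i] = replacement
            changes.append(abs(formula(trial) - original))
        scores.append((i, mean(changes)))
    # Highest average impact first.
    return sorted(scores, key=lambda score: score[1], reverse=True)


# Made-up grades; 710 is a data entry error for 71.
grades = [85, 92, 79, 710, 88]
ranking = impact_ranking(grades, mean)
most_suspicious = ranking[0][0]
```

A real bootstrap would draw many random resamples and compare the observed result against the resulting distribution; the exhaustive replacement used here keeps the example deterministic.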

Example CheckCell

Source: Barowy, Gochev and Berger: “CheckCell: Data Debugging for Spreadsheets”, Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2014.

“Zahlendreher” (German for transposed digits): reversal of two neighboring digits when writing down a number

How to recognize equivalent formulas?

The R1C1 cell reference system

Absolute References

Relative References

Reference   Meaning
R[-2]C      Relative reference to the cell two rows up and in the same column
R[2]C[2]    Relative reference to the cell two rows down and two columns to the right
R2C2        Absolute reference to the cell in the second row and the second column
R[-1]       Relative reference to the entire row above the active cell
R           Reference to the current row

Source: http://office.microsoft.com/en-us/help/about-cell-and-range-references-HP005198323.aspx
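
One way to recognize formulas that were copied from one cell to another is to rewrite their A1-style references in R1C1 notation relative to the cell that holds the formula; copies then become textually identical. The sketch below assumes uppercase references on a single worksheet and does not handle sheet names, string literals, or whole-row and whole-column references. The function names are chosen for this example.

```python
import re

# An A1-style reference with optional $ markers; the lookarounds keep
# function names such as LOG10( from being mistaken for cell references.
REF = re.compile(r"(?<![A-Z0-9$])(\$?)([A-Z]+)(\$?)([0-9]+)(?![0-9(])")


def col_to_num(col: str) -> int:
    """Convert a column name such as 'A' or 'AB' to its 1-based index."""
    n = 0
    for ch in col:
        n = n * 26 + ord(ch) - ord("A") + 1
    return n


def to_r1c1(formula: str, cell: str) -> str:
    """Rewrite the references in `formula` relative to the cell `cell`."""
    base = REF.fullmatch(cell)
    if base is None:
        raise ValueError(f"not a cell address: {cell}")
    base_col, base_row = col_to_num(base.group(2)), int(base.group(4))

    def convert(match: "re.Match[str]") -> str:
        col_abs, col, row_abs, row = match.groups()
        c, r = col_to_num(col), int(row)
        if row_abs:
            row_part = f"R{r}"                    # absolute row
        else:
            row_part = "R" if r == base_row else f"R[{r - base_row}]"
        if col_abs:
            col_part = f"C{c}"                    # absolute column
        else:
            col_part = "C" if c == base_col else f"C[{c - base_col}]"
        return row_part + col_part

    return REF.sub(convert, formula)


def same_formula(f1: str, cell1: str, f2: str, cell2: str) -> bool:
    """Two formulas are copy-equivalent if their R1C1 forms are equal."""
    return to_r1c1(f1, cell1) == to_r1c1(f2, cell2)
```

For example, "=B2+C2" in D2 and "=B3+C3" in D3 both become "=RC[-2]+RC[-1]", whereas "=B2+C3" in D3 does not, which makes it stand out as a possible inconsistency.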

R1C1 with fixed references

Helpful libraries

Java:

• Apache POI: http://poi.apache.org
• Generic Framework for Spreadsheet Analysis (extends POI): http://ssaapp.di.uminho.pt

.NET:

• Gembox: www.gemboxsoftware.com/spreadsheet/overview

Google GData API: https://developers.google.com/gdata

Practical

No lecture for the next month

PART 1 - Testing (individual)

Read a scientific paper and present its content in class (30 minutes/person, November 23rd)

In case of questions email