Selected Topics of Software Technology 3
Static Analysis Techniques – Part II
SCIENCE PASSION TECHNOLOGY
www.tugraz.at
Talking about apples and oranges
Static analysis techniques – Part II
Birgit Hofer
Institute for Software Technology
Warren Buffett
“Risk comes from not knowing what you’re doing”
Outline
Static Analysis Techniques
Code smells repetition
Header and Unit inference
Data Debugging
Spreadsheet Environment specific remarks
Recognizing identical formulas
Useful libraries
Spectrum-based fault localization
Practical
Spreadsheet Quality Assurance Techniques
• Visualization
• Static Analysis
• Debugging
• Testing
• Modeling
• Design & Maintenance Support
Source: Jannach et al. “Avoiding, Finding and Fixing Spreadsheet Errors – A Survey of
Automated Approaches for Spreadsheet QA”, in Journal of Systems and Software, 2014.
Visualization: Patrick Koch, Diploma Seminar, TU Graz, 2015.
Static Analysis
• Code Smells
• Static Checker
What are Code Smells?
Any symptom in the source code of a program that
possibly indicates a deeper problem.
Not bugs, but they increase the risk
of introducing bugs.
Indicate the need for refactoring.
Refactoring
Process of changing a system in such a way that it does not alter the external behavior of the system but improves the internal structure. (Martin Fowler)
Code Smells in Software
Coding Standard
Code Smells within Classes
• Long Methods
• Long Parameter List
• Duplicated Code
• Large Classes
• Dead Code
Code Smells between Classes
• Inappropriate Intimacy
• Data Clumps
• Indecent Exposure
• Feature Envy
• Message Chains
• Shotgun Surgery
• Lazy Class
• Middle Man
Spreadsheet Smells
Formula Smells
• Multiple References
• Multiple Operators
• Conditional Complexity
• Duplicated Formulas
• Long Calculation Chains
Inter-worksheet Smells
• Inappropriate Intimacy
• Feature Envy
• Shotgun Surgery
• Middle Man
Standard Deviation
Empty Cell
Pattern Finder
Reference to Empty Cell
String Distance
Quasi functional dependency
Identify the formula smells!
Source: Patrick Koch, Diploma Seminar, TU Graz, 2015.
Formula Smells - Solution
Source: Patrick Koch, Diploma Seminar, TU Graz, 2015.
Relevant Literature
• Abraham and Erwig: “UCheck: A Spreadsheet Type Checker for End Users”, Journal of Visual Languages and Computing, Volume 18, 2007.
• Erwig and Burnett: “Adding Apples and Oranges”, 4th Int. Symposium on Practical Aspects of Declarative Languages (PADL), 2002.
• Antoniu et al.: “Validating the Unit Correctness of Spreadsheet Programs”, 26th Int. Conference on Software Engineering (ICSE), 2004.
Ariane 5
Ariane 5's first test flight on 4 June 1996 failed, with the rocket self-destructing 37 seconds after launch because of a malfunction in the control software. A data conversion from 64-bit floating point value to 16-bit signed integer value to be stored in a variable representing horizontal bias caused a processor trap (operand error) because the floating point value was too large to be represented by a 16-bit signed integer. ….
Source: Wikipedia, https://en.wikipedia.org/wiki/Ariane_5, 2015-10-19
Classical Type Checker
Purpose
Find incorrect applications of operations/assignments
E.g. a multiplication of a number and a string
Types
Runtime errors
Incorrect results
Spreadsheets:
Label-based Type Checking to find more faults!
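The idea of a classical type checker can be illustrated with a minimal sketch. It assumes formulas are given as nested tuples; the tuple encoding and the function name `infer_type` are hypothetical and chosen only for this illustration.

```python
def infer_type(expr):
    """Return the type of an expression tree or raise TypeError.

    Assumed encoding (hypothetical): ("num", 3), ("str", "abc"),
    ("add", left, right), ("mul", left, right).
    """
    kind = expr[0]
    if kind == "num":
        return "number"
    if kind == "str":
        return "string"
    if kind in ("add", "mul"):
        left = infer_type(expr[1])
        right = infer_type(expr[2])
        # Arithmetic is only accepted on two numbers.
        if left != "number" or right != "number":
            raise TypeError(f"{kind} applied to {left} and {right}")
        return "number"
    raise ValueError(f"unknown expression kind: {kind}")


ok = ("mul", ("num", 2), ("add", ("num", 1), ("num", 3)))
bad = ("mul", ("num", 2), ("str", "apples"))
```

Calling `infer_type(ok)` yields `"number"`, while `infer_type(bad)` raises a `TypeError` before any value is computed. A check like this cannot tell apples from oranges, since both are numbers, which is the gap label-based checking aims to close.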
UCheck - A Spreadsheet Type Checker from Oregon State University
Talking about apples and oranges
A Harvest Example
Source: Erwig and Burnett: “Adding Apples and Oranges”, 4th Int. Symposium on Practical Aspects of
Declarative Languages (PADL) 2002.
Month
Month[May]
Month[June]
Month[May] | Month[June]
= Month[May|June]
Fruit
Fruit[Apples]
Fruit[Oranges]
Fruit[Apples] | Fruit[Oranges]
= Fruit[Apples|Oranges]
Month[May] & Fruit[Apples]
Month[June] & Fruit[Apples]
Month[May] & Fruit[Apples] | Month[June] & Fruit[Apples]
= (Month[May] | Month[June]) & Fruit[Apples]
= Month[May|June] & Fruit[Apples]
Rules for Unit Propagation
1. Every value without header is a well-formed unit.
2. If a cell has a value v and a header u, then u[v] is
a well-formed unit.
3. If two units have no common root unit, you can link
the units using &.
4. If two units have a common root unit, you can link
the units using |.
If you cannot derive meaningful unit expressions, you might have found an incorrect formula!
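The propagation rules can be sketched in a few lines. This is a simplified illustration, not UCheck's actual algorithm: it assumes every unit is a conjunction of `root[value]` parts stored as a dictionary, and it treats an `|` combination as meaningful only if both units have the same roots and differ in at most one of them. The names `unit` and `or_units` are hypothetical.

```python
from typing import Dict, FrozenSet, Optional

# A unit maps each root header to its set of values,
# e.g. Month[May] & Fruit[Apples] -> {"Month": {"May"}, "Fruit": {"Apples"}}.
Unit = Dict[str, FrozenSet[str]]


def unit(**parts: str) -> Unit:
    """Build an &-unit; unit(Month="May|June") stands for Month[May|June]."""
    return {root: frozenset(value.split("|")) for root, value in parts.items()}


def or_units(a: Unit, b: Unit) -> Optional[Unit]:
    """Combine two units with |, or return None if no meaningful unit results.

    Simplifying assumption: the result is meaningful only when both units
    share the same roots and differ in at most one root's values.
    """
    if a.keys() != b.keys():
        return None
    differing = [root for root in a if a[root] != b[root]]
    if len(differing) > 1:
        return None
    return {root: a[root] | b[root] for root in a}


# Harvest example: Month[May] & Fruit[Apples] | Month[June] & Fruit[Apples]
total = or_units(unit(Month="May", Fruit="Apples"),
                 unit(Month="June", Fruit="Apples"))

# Mixing both month and fruit at once cannot be factored into one unit.
suspicious = or_units(unit(Month="June", Fruit="Apples"),
                      unit(Month="May", Fruit="Oranges"))
```

Here `total` corresponds to Month[May|June] & Fruit[Apples], while `suspicious` is `None`, which under the stated assumption flags the formula for inspection.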
Identify all types and units!
Modified Harvest Example
Month[June] & Fruit[Apples]
Month[May] & Fruit[Oranges]
Month[June] & Fruit[Apples] | Month[May] & Fruit[Oranges]
Month[May] & Fruit[Apples|Oranges]
Identify all types and units!
Another Type Checking Example
Source: Elisabeth Getzner: “Survey of Fault Localization Techniques in Spreadsheets”, Diploma Seminar, TU Graz 2014.
Hours & Worker[Jones]
Hours & Worker[Smith]
Bonus & Worker[Smith]
Salary & Hours & Worker[Jones]
Salary & Hours & Worker[Smith]
Salary Hours & Worker[Jones]
The devil is in the details.
Header inference is not that easy!
Spatial layout (top down, left right)
Heuristics required
Header: cells to label the data
• do not contain formulas
• are not input to other cells
Footer: aggregation formulas
• at the end of rows or columns, or
• formulas that reference core cells or other formula cells
Core: data cells
• do not contain formulas
• are input to other cells
Filler: blank cells or cells with special formatting (to separate tables within sheets)
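The cell classification heuristics can be sketched as a small decision function. This is a deliberately reduced illustration: it assumes three boolean facts per cell are already known, and it treats every formula cell as a footer, ignoring the positional condition (end of row or column). The `Cell` class and `classify` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    is_blank: bool = False       # blank or purely decorative cell
    has_formula: bool = False    # cell contains a formula
    is_referenced: bool = False  # cell is input to some other cell


def classify(cell: Cell) -> str:
    """Assign one of the roles header, footer, core, filler."""
    if cell.is_blank:
        return "filler"
    if cell.has_formula:
        # Simplification: any formula cell counts as an aggregation (footer).
        return "footer"
    # Non-formula cells: data if something uses them, label otherwise.
    return "core" if cell.is_referenced else "header"


roles = [
    classify(Cell()),                      # a label such as "May"
    classify(Cell(is_referenced=True)),    # a harvested amount
    classify(Cell(has_formula=True)),      # a SUM at the end of a row
    classify(Cell(is_blank=True)),         # a separator cell
]
```

In a real spreadsheet the three facts would come from parsing the formulas and building the dependency graph; a constant that no formula uses would be misread as a header by this sketch, which is one reason additional heuristics are required.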
Validating Unit Correctness
Source: Antoniu et al.: “Validating the Unit Correctness of Spreadsheet Programs”, ICSE 2004.
Relevant Literature
• Barowy, Gochev and Berger: “CheckCell: Data Debugging for Spreadsheets”, Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2014.
Programmer's proverb
Garbage in, garbage out
Data Debugging
Input data errors
• Data entry errors
• Measurement errors
• Data integration errors
Techniques
• Statistical (Gaussian distribution, std. deviation)
• Identify cells that have an unusual impact on the result: CheckCell
Source: Barowy, Gochev and Berger: “CheckCell: Data Debugging for Spreadsheets”, Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2014.
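The statistical technique can be sketched as a simple standard-deviation test. This is only an illustration under the assumption that the inputs are roughly Gaussian; the function name `flag_outliers`, the threshold of two standard deviations, and the sample data are all hypothetical.

```python
import statistics


def flag_outliers(values, threshold=2.0):
    """Return indices of values more than `threshold` sample standard
    deviations away from the mean."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values)
            if abs(v - mean) / spread > threshold]


# Hypothetical column of inputs; the last entry looks like a typing error.
column = [10, 12, 11, 13, 12, 11, 10, 12, 120]
flagged = flag_outliers(column)
```

Note that with very few values a single outlier can never exceed the threshold, because it inflates the standard deviation itself. This weakness is one motivation for looking at the impact on the result instead.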
CheckCell
Premises on input vector:
Values in vector exchangeable
Function input is homogeneous
Computed value changes significantly when an
erroneous input value is corrected
Computation Trees
Root node = formula, leaves = input values
Dependency Graph == Computation Forest
Bootstrap procedure
To determine effect of a particular input on formulas
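The impact idea can be approximated with a much simpler resubstitution experiment. This is not CheckCell's bootstrap procedure, only a sketch under the exchangeability premise: each input is replaced in turn by every other value from the same vector, and the average change of the formula result is recorded. The function name `impact_scores` and the sample data are hypothetical.

```python
import statistics


def impact_scores(inputs, formula):
    """For each input, average the absolute change of formula(inputs)
    when that input is swapped for each other value of the vector."""
    baseline = formula(inputs)
    scores = []
    for i in range(len(inputs)):
        changes = []
        for j, replacement in enumerate(inputs):
            if i == j:
                continue
            trial = list(inputs)
            trial[i] = replacement  # values are assumed exchangeable
            changes.append(abs(formula(trial) - baseline))
        scores.append(statistics.mean(changes))
    return scores


# Hypothetical grades; 870 could be a mistyped 87.
grades = [85, 90, 88, 92, 870]
scores = impact_scores(grades, statistics.mean)
most_suspicious = scores.index(max(scores))
```

The input whose replacement changes the result the most is reported first. The vector needs at least two values, and a high score is only a hint: a correct but extreme value is flagged just like a typing error.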
Example CheckCell
Source: Barowy, Gochev and Berger: “CheckCell: Data Debugging for Spreadsheets”, Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2014.
“Zahlendreher” (German for a transposition error): reversal of two neighboring digits when writing down a number
How to recognize equivalent formulas?
The R1C1 cell reference system
Absolute References
Relative References
Reference   Meaning
R[-2]C      Relative reference to the cell two rows up and in the same column
R[2]C[2]    Relative reference to the cell two rows down and two columns to the right
R2C2        Absolute reference to the cell in the second row and the second column
R[-1]       Relative reference to the entire row above the active cell
R           Reference to the current row
Source: http://office.microsoft.com/en-us/help/about-cell-and-range-references-HP005198323.aspx
R1C1 with fixed references
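Why R1C1 notation helps to recognize copied formulas can be shown with a small converter. This is a sketch under strong assumptions: it handles only plain uppercase A1-style cell references, with `$` marking fixed parts, and it ignores sheet names, string literals and function names that end in digits (such as LOG10). The function names are hypothetical.

```python
import re

# Optional "$", column letters, optional "$", row digits, e.g. A1 or $B$12.
_REF = re.compile(r"(\$?)([A-Z]+)(\$?)([0-9]+)")


def col_to_number(letters):
    """Convert column letters to a 1-based index: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def to_r1c1(formula, row, col):
    """Rewrite the A1 references of a formula located at (row, col)."""
    def convert(match):
        col_abs, letters, row_abs, digits = match.groups()
        r, c = int(digits), col_to_number(letters)
        # "$" keeps the absolute index; otherwise store the offset.
        row_part = f"R{r}" if row_abs else ("R" if r == row else f"R[{r - row}]")
        col_part = f"C{c}" if col_abs else ("C" if c == col else f"C[{c - col}]")
        return row_part + col_part
    return _REF.sub(convert, formula)


first = to_r1c1("A1+B1", row=1, col=3)    # formula stored in C1
second = to_r1c1("A2+B2", row=2, col=3)   # its copy in C2
fixed = to_r1c1("$A$1*B2", row=2, col=3)  # mixes a fixed and a relative reference
```

Both `first` and `second` become `RC[-2]+RC[-1]`, so a plain string comparison identifies them as the same formula, while `fixed` becomes `R1C1*RC[-1]`.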
Helpful libraries
Java:
Apache POI
http://poi.apache.org
Generic Framework for Spreadsheet Analysis
extends POI
http://ssaapp.di.uminho.pt
.NET:
GemBox
www.gemboxsoftware.com/spreadsheet/overview
Google GData API
https://developers.google.com/gdata
Practical
No lecture for the next month
PART 1 - Testing (individual)
Read a scientific paper and present its content
in class (30 minutes/person, November 23rd)
In case of questions email