Graph-Based Source Code Analysis of JavaScript Repositories. Budapest University of Technology and Economics, Department of Measurement and Information Systems, Fault Tolerant Systems Research Group. Dániel Stein, Gábor Szárnyas

Page 1: Graph-Based Source Code Analysis of JavaScript Repositories

Graph-Based Source Code Analysis of JavaScript Repositories

Budapest University of Technology and Economics

Department of Measurement and Information Systems

Fault Tolerant Systems Research Group

Dániel Stein, Gábor Szárnyas

Page 2: Graph-Based Source Code Analysis of JavaScript Repositories

Content

1. Context

2. Tooling

3. Use Cases

4. Neo4j Observations


Page 3: Graph-Based Source Code Analysis of JavaScript Repositories

Continuous Integration (CI)

– Developers working together

– Prevent integration problems

– Examples

  – Jenkins

  – Hudson

  – Travis CI

Figure: CI workflow stages: Development, Version Control System, Compilation, Unit and Integration Tests.

Page 6: Graph-Based Source Code Analysis of JavaScript Repositories

whoops

Source: Apple, https://blog.codecentric.de/en/2014/02/curly-braces/

Page 10: Graph-Based Source Code Analysis of JavaScript Repositories

Static Analysis

– No need for compilation or execution of the application

– Formatting, structural and semantic rule checking

– Can extend and improve the continuous integration workflow

– In this research we used code analysis based on pattern matching


– Java

  – FindBugs

  – PMD

  – CheckStyle

– JavaScript

  – ESLint

  – Facebook Infer, Flow

  – Tern

  – TAJS

Figure: CI workflow extended with an analysis stage: Development, Version Control System, Compilation, Unit and Integration Tests, Static Analysis.

Page 13: Graph-Based Source Code Analysis of JavaScript Repositories

Problems to Solve

– Thorough code analysis is time-consuming and resource-intensive

– For large projects it can be too slow

– Temporary solution: batching

Present results as soon and as fast as possible.

Figure: unit tests and static analysis placed on a day (☼) and night (☾) timeline.

Page 14: Graph-Based Source Code Analysis of JavaScript Repositories

Problems to Solve

– Memory limits appear when...

  – Global rules are checked

  – Storing the structure in-memory

  – For large code repositories

– Not being incremental

  – Batched execution simply does not cut it

  – A small change induces a complete recheck

Page 15: Graph-Based Source Code Analysis of JavaScript Repositories

Our Approach

– Incremental methodology

  – Instead of batched execution

– Update the prepared results with the effects of the change

– Only store the required parts in memory

Figure: the analyzer receiving a change (Δ).

Page 16: Graph-Based Source Code Analysis of JavaScript Repositories

Architecture overview

Figure: architecture diagram. Stages shown: VCS, Workspace, Abstract Syntax Tree, Abstract Semantic Graph, Well-formedness Rules, Query Execution, Database (neo4j), Validation Report.

Activities shown: Querying and Transformation; Automatic Well-formedness Rule Evaluation; Manual Execution and Data Extraction.

Example change set: Main.js | ++----, Dependency.js | +++++-, FIterator.js | ----, Parser.js | ++

Example workspace entries: discoverer, ChangeProcessor.js, CommandParser.js, FileIterator.js, iterators, DepCollector.js, FileDiscoverer.js, InitIterator.js, Main.js, whitepages, ConnectionMgr.js, DependencyMgr.js

Example AST fragment: Module with the edges items, declaration, declarators, binding, init, left, right.

Page 23: Graph-Based Source Code Analysis of JavaScript Repositories

Code Processing Steps

source code → tokenizer → tokens → parser → AST → scope analyzer → ASG

source code – a sequence of statements formalized in a given language.

Page 27: Graph-Based Source Code Analysis of JavaScript Repositories

Code Processing Steps

token – the shortest character sequence still having meaning.

Token        Token type
VAR          Keyword
IDENTIFIER   Ident
ASSIGN       Punctuator
NUMBER       NumericLiteral
DIV          Punctuator
NUMBER       NumericLiteral
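The tokenizing step can be sketched as follows. This is only an illustration, not the tokenizer the framework uses: the token names come from the table on this slide, while the regular expressions, the `tokenize` helper and the input statement `var foo = 1 / 0` (inferred from the AST shown on the following slides) are assumptions.

```javascript
// Minimal tokenizer sketch. Token names follow the slide's table;
// the patterns and the helper itself are hypothetical.
const TOKEN_RULES = [
  { token: "VAR", type: "Keyword", pattern: /^var\b/ },
  { token: "IDENTIFIER", type: "Ident", pattern: /^[A-Za-z_$][A-Za-z0-9_$]*/ },
  { token: "NUMBER", type: "NumericLiteral", pattern: /^\d+(\.\d+)?/ },
  { token: "ASSIGN", type: "Punctuator", pattern: /^=/ },
  { token: "DIV", type: "Punctuator", pattern: /^\// },
];

function tokenize(source) {
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    // Skip whitespace between tokens.
    const whitespace = /^\s+/.exec(rest);
    if (whitespace) {
      rest = rest.slice(whitespace[0].length);
      continue;
    }
    // The first rule that matches at the current position wins.
    const rule = TOKEN_RULES.find((r) => r.pattern.test(rest));
    if (!rule) {
      throw new SyntaxError(`Unexpected character: ${rest[0]}`);
    }
    const text = rule.pattern.exec(rest)[0];
    tokens.push({ token: rule.token, type: rule.type, text });
    rest = rest.slice(text.length);
  }
  return tokens;
}
```

Calling `tokenize("var foo = 1 / 0")` yields six tokens in the same order as the rows of the table. The keyword rule is listed before the identifier rule so that `var` is not classified as an identifier.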

Page 29: Graph-Based Source Code Analysis of JavaScript Repositories

Code Processing Steps

Abstract Syntax Tree (AST)

– Tree representation of the grammar structure of the sequence of tokens.

Module
  items: VariableDeclarationStatement
    declaration: VariableDeclaration
      declarators: VariableDeclarator
        binding: BindingIdentifier (name = `foo`)
        init: BinaryExpression (operator = `Div`)
          left: LiteralNumericExpression (value = 1.0)
          right: LiteralNumericExpression (value = 0.0)
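A hedged sketch of the parsing step for this one statement shape: it turns a token list into a tree with the node and edge names shown on the slide. The parser itself, the `parseVarDeclaration` name and the hard-coded token list are assumptions for illustration; the framework relies on an existing parser rather than code like this.

```javascript
// Sketch of a parser for `var <name> = <number> / <number>`.
// Node and edge names mirror the slide; the logic is hypothetical.
function parseVarDeclaration(tokens) {
  let pos = 0;
  // Consume the next token, failing if it is not the expected kind.
  const expect = (token) => {
    const current = tokens[pos];
    if (!current || current.token !== token) {
      throw new SyntaxError(`Expected ${token} at position ${pos}`);
    }
    pos += 1;
    return current;
  };
  const numericLiteral = () => ({
    type: "LiteralNumericExpression",
    value: Number(expect("NUMBER").text),
  });

  expect("VAR");
  const name = expect("IDENTIFIER").text;
  expect("ASSIGN");
  const left = numericLiteral();
  expect("DIV");
  const right = numericLiteral();

  return {
    type: "Module",
    items: [{
      type: "VariableDeclarationStatement",
      declaration: {
        type: "VariableDeclaration",
        declarators: [{
          type: "VariableDeclarator",
          binding: { type: "BindingIdentifier", name },
          init: { type: "BinaryExpression", operator: "Div", left, right },
        }],
      },
    }],
  };
}

// Tokens assumed for the statement `var foo = 1 / 0`.
const exampleTokens = [
  { token: "VAR", text: "var" },
  { token: "IDENTIFIER", text: "foo" },
  { token: "ASSIGN", text: "=" },
  { token: "NUMBER", text: "1" },
  { token: "DIV", text: "/" },
  { token: "NUMBER", text: "0" },
];
const exampleAst = parseVarDeclaration(exampleTokens);
```

The resulting object is a strict tree: every node has exactly one parent, which is what distinguishes it from the semantic graph introduced next.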

Page 34: Graph-Based Source Code Analysis of JavaScript Repositories

Code Processing Steps

Abstract Semantic Graph (ASG)

– Graph, not necessarily a tree.

– Semantic information besides the syntactic structure.

– Contains cross-edges.

Figure: the AST of the example (Module, VariableDeclarationStatement, VariableDeclaration, VariableDeclarator, BindingIdentifier with name = `foo`, BinaryExpression, LiteralNumericExpression) extended with the scope nodes GlobalScope, Scope, Variable (name = `foo`), Reference (accessibility = `Write`) and Declaration (kind = `Var`), connected by the edges variables, references, children, declarations, node and astNode.
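The step from tree to graph can be illustrated with a very small scope analysis. The node and edge names are taken from the figure; the `analyzeScope` function is a hypothetical sketch that only handles top-level `var` declarators and is not the scope analyzer used by the framework.

```javascript
// Sketch: derive scope nodes from an AST and link them back to it.
// Only top-level `var` declarators are handled (an assumption).
function analyzeScope(moduleNode) {
  const variables = new Map();
  for (const statement of moduleNode.items) {
    if (statement.type !== "VariableDeclarationStatement") continue;
    for (const declarator of statement.declaration.declarators) {
      const binding = declarator.binding;
      let variable = variables.get(binding.name);
      if (!variable) {
        variable = { type: "Variable", name: binding.name, declarations: [], references: [] };
        variables.set(binding.name, variable);
      }
      // Cross-edges: `node` points at an object that already lives in the AST.
      variable.declarations.push({ type: "Declaration", kind: "Var", node: binding });
      if (declarator.init) {
        variable.references.push({ type: "Reference", accessibility: "Write", node: binding });
      }
    }
  }
  const moduleScope = {
    type: "Scope",
    astNode: moduleNode,
    children: [],
    variables: [...variables.values()],
  };
  return { type: "GlobalScope", astNode: moduleNode, children: [moduleScope], variables: [] };
}

const fooBinding = { type: "BindingIdentifier", name: "foo" };
const sampleModule = {
  type: "Module",
  items: [{
    type: "VariableDeclarationStatement",
    declaration: {
      type: "VariableDeclaration",
      declarators: [{
        type: "VariableDeclarator",
        binding: fooBinding,
        init: {
          type: "BinaryExpression",
          operator: "Div",
          left: { type: "LiteralNumericExpression", value: 1 },
          right: { type: "LiteralNumericExpression", value: 0 },
        },
      }],
    },
  }],
};
const scopeGraph = analyzeScope(sampleModule);
```

After this step the `BindingIdentifier` object is reachable both from its `VariableDeclarator` parent and from the `Declaration` and `Reference` nodes, so the structure is no longer a tree.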

Page 38: Graph-Based Source Code Analysis of JavaScript Repositories

AST vs ASG

1 SLOC: 20-40-50 nodes

Page 39: Graph-Based Source Code Analysis of JavaScript Repositories

Overview of the Approach

Figure: CI workflow with the analysis stage: Development, Version Control System, Compilation, Unit and Integration Tests, Static Analysis.

Page 50: Graph-Based Source Code Analysis of JavaScript Repositories

Overview of the Approach

Figure: workflow stages and the tools used for each.

Stage                                                          Tools
Version Control System, Integrated Development Environment    Git, Visual Studio Code
tokenizer, parser, scope analyzer (source code to ASG)        Shape Security Shift
transformation                                                 Java, Cypher
graph database                                                 Neo4j
result processing                                              (none listed)

Page 54: Graph-Based Source Code Analysis of JavaScript Repositories

Graph Pattern Matching

– Graph pattern

  – A declarative, graph-like formalism expressing constraints.

Figure: graph fragment matched by the pattern: VariableDeclarator, BindingIdentifier (name = `foo`), BinaryExpression (operator = `Div`), LNExpression (value = 1.0), LNExpression (value = 0.0); labels on the pattern: binding, be, right.

Page 55: Graph-Based Source Code Analysis of JavaScript Repositories

Graph Pattern Matching

– Graph pattern

  – A declarative, graph-like formalism expressing constraints.

Figure: graph pattern query expressed in Cypher, looking for a division by zero.

Figure: results of the pattern matching; labels: BindingIdentifier (name = `foo`), binding.
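The Cypher query from the slide is not reproduced in this transcript. As a rough in-memory analogue, the same constraint (a declarator whose initializer divides by a literal zero) can be checked by walking the tree. The function below is a hypothetical sketch, not the authors' query, and it assumes the input is a tree without cycles.

```javascript
// Sketch: find declarators initialized with `<anything> / 0`.
// Node shapes follow the AST slide; the matcher is hypothetical.
function findDivisionsByZero(root) {
  const matches = [];
  const visit = (node) => {
    if (node === null || typeof node !== "object") return;
    if (Array.isArray(node)) {
      node.forEach(visit);
      return;
    }
    const init = node.init;
    if (
      node.type === "VariableDeclarator" &&
      init &&
      init.type === "BinaryExpression" &&
      init.operator === "Div" &&
      init.right &&
      init.right.type === "LiteralNumericExpression" &&
      init.right.value === 0
    ) {
      // Report the declared name, like the `binding` column of the result.
      matches.push({ binding: node.binding.name });
    }
    // Recurse into child nodes; primitive property values are skipped above.
    Object.values(node).forEach(visit);
  };
  visit(root);
  return matches;
}

const patternExample = {
  type: "VariableDeclarator",
  binding: { type: "BindingIdentifier", name: "foo" },
  init: {
    type: "BinaryExpression",
    operator: "Div",
    left: { type: "LiteralNumericExpression", value: 1 },
    right: { type: "LiteralNumericExpression", value: 0 },
  },
};
const patternMatches = findDivisionsByZero(patternExample);
```

In the graph database the constraint is stated declaratively and the engine does the traversal; here the traversal is written by hand, which is the visitor-style work a pattern language is meant to avoid.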

Page 56: Graph-Based Source Code Analysis of JavaScript Repositories

Use Cases: static analysis

– Searching for local bad smells (linter warnings)

  – without a case

  – value set more than once

  – Unused variable

– Global rules

  – Unreachable code parts

– Framework

  – Freely extendable

  – User-defined rules

  – Easier to use than visitor pattern solutions

Page 67: Graph-Based Source Code Analysis of JavaScript Repositories

Use Cases: transformation

Control Flow Graph (CFG)

– Graph representation of every possible statement sequence during code execution.

Figure: example control flow graph built from statement nodes, an if node with its condition, a return node and an error node.
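To make the definition concrete, the sketch below stores a small control flow graph as an adjacency list and enumerates every statement sequence from the entry node. The node names and the `enumeratePaths` helper are made up for illustration, and the recursion assumes a graph without loops.

```javascript
// Sketch: a CFG as an adjacency list (node names are hypothetical).
const exampleCfg = {
  entry: ["s1"],
  s1: ["ifStmt"],
  ifStmt: ["thenStmt", "elseStmt"],
  thenStmt: ["returnStmt"],
  elseStmt: ["error"],
  returnStmt: [],
  error: [],
};

// List every path from `start` to a node without successors.
// Assumes the graph is acyclic; a loop would recurse forever.
function enumeratePaths(cfg, start) {
  const successors = cfg[start] || [];
  if (successors.length === 0) return [[start]];
  const paths = [];
  for (const next of successors) {
    for (const tail of enumeratePaths(cfg, next)) {
      paths.push([start, ...tail]);
    }
  }
  return paths;
}

const allPaths = enumeratePaths(exampleCfg, "entry");
```

For this graph the branch at `ifStmt` produces two sequences, one ending in `returnStmt` and one ending in `error`.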

Page 77: Graph-Based Source Code Analysis of JavaScript Repositories

Use Cases: test generation

– Inspecting control flows

  – Is the given statement reachable given the constraints on the edges?

  – Which one is the shortest route?

– Producing test input for dynamic testing

Figure: the example control flow graph with statement, if, condition, return and error nodes.
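The two questions above can be sketched as a breadth-first search that also collects the edge conditions along the route. Everything in this example is an assumption made for illustration: the graph, the condition strings and the `shortestRoute` helper. It does not check whether the collected conditions are satisfiable; producing real test input would need a constraint solver on top.

```javascript
// Sketch: CFG edges carry an optional `condition` label (hypothetical data).
const guardedCfg = {
  start: [{ to: "check", condition: null }],
  check: [
    { to: "compute", condition: "x > 0" },
    { to: "fail", condition: "x <= 0" },
  ],
  compute: [{ to: "done", condition: null }],
  fail: [],
  done: [],
};

// Breadth-first search: the first time `target` is dequeued,
// the recorded path has the fewest edges.
function shortestRoute(cfg, from, target) {
  const queue = [{ node: from, path: [from], conditions: [] }];
  const visited = new Set([from]);
  while (queue.length > 0) {
    const { node, path, conditions } = queue.shift();
    if (node === target) return { path, conditions };
    for (const edge of cfg[node] || []) {
      if (visited.has(edge.to)) continue;
      visited.add(edge.to);
      queue.push({
        node: edge.to,
        path: [...path, edge.to],
        conditions: edge.condition ? [...conditions, edge.condition] : conditions,
      });
    }
  }
  return null; // target is not reachable from `from`
}

const routeToFailure = shortestRoute(guardedCfg, "start", "fail");
```

The `conditions` list of the result is the set of constraints an input has to satisfy to drive execution along that route, which is the starting point for generating a test input.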

Page 79: Graph-Based Source Code Analysis of JavaScript Repositories

Use Cases: type inference

– Supporting dynamically typed languages

  – Python

  – JavaScript / ECMAScript

http://marijnhaverbeke.nl/blog/tern.html

Page 80: Graph-Based Source Code Analysis of JavaScript Repositories

Use Cases: impact analysis

– Adapting to the continuous integration workflow

– Handling multiple branches

– Following the modifications in a branch

– File-level incremental granularity

– Giving differential reports to the developers
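File-level incrementality and differential reporting can be sketched with a per-file result cache. This is a simplified illustration under stated assumptions: `createIncrementalAnalyzer`, the `analyzeFile` callback and the toy rule are hypothetical, and the real framework keeps its results in a graph database rather than in a map.

```javascript
// Sketch: keep violations per file and re-analyze only changed files.
// `analyzeFile(path, content)` is a hypothetical stand-in that returns
// an array of violation messages for one file.
function createIncrementalAnalyzer(analyzeFile) {
  const resultsByFile = new Map();
  return {
    // `changes` maps a file path to its new content, or to null if deleted.
    applyChanges(changes) {
      const report = { added: [], resolved: [] };
      for (const [path, content] of Object.entries(changes)) {
        const before = resultsByFile.get(path) || [];
        const after = content === null ? [] : analyzeFile(path, content);
        // Differential report: what appeared and what disappeared.
        report.added.push(
          ...after.filter((v) => !before.includes(v)).map((v) => `${path}: ${v}`)
        );
        report.resolved.push(
          ...before.filter((v) => !after.includes(v)).map((v) => `${path}: ${v}`)
        );
        if (content === null) resultsByFile.delete(path);
        else resultsByFile.set(path, after);
      }
      return report;
    },
    allViolations() {
      return [...resultsByFile.entries()].flatMap(([path, violations]) =>
        violations.map((v) => `${path}: ${v}`)
      );
    },
  };
}

// Toy rule for the example: flag files containing the text "/ 0".
const toyRule = (path, content) =>
  content.includes("/ 0") ? ["division by zero"] : [];
```

A commit that touches one file then costs one call to `analyzeFile`, and the developer sees only the violations that the commit introduced or fixed.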

Page 81: Graph-Based Source Code Analysis of JavaScript Repositories

Why Neo4j?

Pros:

– Quick prototyping

– Supporting transactions

– Great tooling

Cons:

– Not scaling well

– Only disk-based

Page 82: Graph-Based Source Code Analysis of JavaScript Repositories

Remarks: MERGE

– MATCH or CREATE

– Great for the lazy

– Can be expensive

– Possible solutions:

  – Less MERGE

    – Separating queries

    – Create first if not present

    – Use MATCH instead of MERGE

  – Prevention

    – Prepare the structure when inserting the data

Page 84: Graph-Based Source Code Analysis of JavaScript Repositories

Remarks: if-then-else

– Not a language element in Cypher

– Can be solved with a trick

  – Very slow

– Solution:

  – Two smaller, disjoint cases

Page 92: Graph-Based Source Code Analysis of JavaScript Repositories

Remarks: if-then-else

∞ vs 15 sec

These are not chickens.

Page 93: Graph-Based Source Code Analysis of JavaScript Repositories

Remarks: reachability

– Transitive closure without length constraints is slow.

– Transitive closure over a repeating node/edge pattern is only possible using tricks.

Figure: nodes A and B connected by a variable-length (*) path.

Page 96: Graph-Based Source Code Analysis of JavaScript Repositories

Conclusions

– Source code analyzer framework

– Searching for global error patterns

– Close to real time feedback

– Type inference possible

– Test input generation possible

– Approach for both dynamically and statically typed languages

– Using Neo4j for

– Storing

– Pattern matching

– Transforming

– Version control

– Storing metadata

Page 97: Graph-Based Source Code Analysis of JavaScript Repositories

Project Details

– The framework prototype is open-source.

https://github.com/ftsrg/codemodel-rifle

– Our work was supported by:

  – ÚNKP*

  – Microsoft Azure for Research

  – MTA-BME Lendület Program

*Supported by the ÚNKP-16-2-I. New National Excellence Program of the Ministry of Human Capacities.

Page 98: Graph-Based Source Code Analysis of JavaScript Repositories

Project Details

– Supervisors

  – Ádám Lippai

  – Dávid Honfi

  – Gábor Szárnyas

– Helped my research

  – Tamás Soma Lucz

– Industrial case study