
A New Parsing Language for GUI and Visually Structured Documents


Page 1: A New Parsing Language for GUI and Visually Structured Documents

© Copyright 2010 Hewlett-Packard Development Company, L.P.

David Lehavi, HP Labs Israel

A NEW PARSING LANGUAGE FOR GUI AND VISUALLY STRUCTURED DOCUMENTS

Page 2: A New Parsing Language for GUI and Visually Structured Documents

WHY BOTHER?

UNIVERSAL INTERFACE FOR ALL GRAPHICAL APPLICATIONS

– New GUI for legacy apps (additional functionality, hiding sensitive data).

– Software testing (record and replay).

– Accessibility (speech-activated apps).

STANDARD APPROACH: USE THE DOM

DOM inspector

• Object type

• Set/Get properties

AND IF THERE IS NO DOM, OR A HYBRID ENVIRONMENT?

• Mobile devices

• Web 2.0 (Flash, fragmented toolkit environment)

• Hybrid environments

Page 3: A New Parsing Language for GUI and Visually Structured Documents

VISUAL LANGUAGES

What images do we need to understand?

– A two-dimensional pixel word: a bitmap

– A two-dimensional picture word (constructed from graphical tokens)

– Formal presentation: A•→(B•↓C)

– We only parse objects which are “cut by lines”.

– Less restrictive than it seems at first: we may generalize and parse objects which are “cut by curves” (overcome the X)

[Figure: a picture divided into regions labeled A, B, and C]
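One way to read "cut by lines" is as a recursive split of token bounding boxes by straight lines that cross no box. The sketch below is an illustration under that assumption, not the method from the talk: the box layout `(name, left, top, right, bottom)`, the function names, and the use of "→" and "↓" as tags are all hypothetical.

```python
def split(boxes, lo, hi):
    """Group boxes into runs separated by a line that crosses no box.

    lo/hi are the tuple indices of the start/end coordinate on one axis.
    """
    groups, reach = [], None
    for box in sorted(boxes, key=lambda b: b[lo]):
        if reach is None or box[lo] >= reach:
            groups.append([box])          # a clean cut lies before this box
            reach = box[hi]
        else:
            groups[-1].append(box)        # overlaps the current run
            reach = max(reach, box[hi])
    return groups


def decompose(boxes):
    """Return a nested (tag, parts) structure, or a bare name for a single box."""
    if len(boxes) == 1:
        return boxes[0][0]
    # Try a left-to-right split first (indices 1, 3), then top-to-bottom (2, 4).
    for tag, lo, hi in (("→", 1, 3), ("↓", 2, 4)):
        groups = split(boxes, lo, hi)
        if len(groups) > 1:
            return (tag, [decompose(g) for g in groups])
    raise ValueError("layout is not cut by lines")


# Hypothetical layout: A on the left, B above C on the right.
A = ("A", 0, 0, 10, 20)
B = ("B", 12, 0, 30, 9)
C = ("C", 12, 11, 30, 20)
layout = decompose([A, B, C])
```

For this layout the result has the same shape as the formal presentation A•→(B•↓C). A pinwheel arrangement of four boxes has no such cut and raises `ValueError`.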

Page 4: A New Parsing Language for GUI and Visually Structured Documents

INTERMEZZO: USING LANGUAGE CONSTRUCTS

Following Ken Thompson's work on regular expressions

[Diagram labels: Language definition, Compiler, bytecode, Universal machine, Visual lexer, requests, tokens, characters]
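The pipeline in the diagram (a language definition compiled to bytecode that a universal machine runs against lexer tokens) can be illustrated with a small regex virtual machine in the spirit of Thompson's compiled regular expressions. This is a minimal sketch over a one-dimensional token stream. The instruction names (`char`, `split`, `jmp`, `match`) and the hand-written program are assumptions for illustration, not the bytecode used in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    op: str            # "char", "split", "jmp", or "match"
    arg: object = None # token kind for "char"
    x: int = 0         # first jump target
    y: int = 0         # second jump target (for "split")


def add_thread(prog, pc, threads, seen):
    """Follow jmp/split instructions and record the resulting thread states."""
    if pc in seen:
        return
    seen.add(pc)
    inst = prog[pc]
    if inst.op == "jmp":
        add_thread(prog, inst.x, threads, seen)
    elif inst.op == "split":
        add_thread(prog, inst.x, threads, seen)
        add_thread(prog, inst.y, threads, seen)
    else:
        threads.append(pc)


def run(prog, tokens):
    """Advance all threads in lockstep, one token at a time."""
    threads = []
    add_thread(prog, 0, threads, set())
    for tok in tokens:
        nxt, seen = [], set()
        for pc in threads:
            inst = prog[pc]
            if inst.op == "char" and inst.arg == tok:
                add_thread(prog, pc + 1, nxt, seen)
        threads = nxt
    return any(prog[pc].op == "match" for pc in threads)


# Hand-compiled program for (Radio Text)*
PROG = [
    Inst("split", x=1, y=4),
    Inst("char", "Radio"),
    Inst("char", "Text"),
    Inst("jmp", x=0),
    Inst("match"),
]
```

Running all threads in lockstep keeps the work per token bounded by the program size, which is the property that makes this style of machine attractive for matching.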

Page 5: A New Parsing Language for GUI and Visually Structured Documents

CHALLENGES IN GUI PARSING

Finding useful language constructs

• Expressibility

– Regular languages are too weak to describe recursive structures.

• Decidability & performance

– Context-free languages are too strong: basic questions about them, such as equivalence and inclusion, are undecidable.

• Ease of maintenance: there are many GUIs, and they change constantly.

• Robustness to “lexing noise”: input may originate from screenshot analysis.

Page 6: A New Parsing Language for GUI and Visually Structured Documents


RADIO-BUTTON-SET EXAMPLE

A Naïve representation: (Radio•→Text)*↓

Problems: alignment, distances.

RTitled_E<Object X> = [ X C 0..50 Text ]

RTitled_M<Object X> = [ X C 0..50 Text
                        L L 0..50
                        X C 0..50 Text ]

RBS := V{RTitled_M<Radio>*RTitled_E<Radio>}
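The alignment and distance problems can be made concrete with bounding-box checks. The sketch below assumes one reading of the notation: `X C 0..50 Text` means Text sits to the right of X, center-aligned vertically, with a horizontal gap of 0 to 50 pixels, and `L L 0..50` means the next row's left edge lines up with this row's left edge, 0 to 50 pixels below. The `Box` type, the function names, and the tolerance `tol` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    kind: str
    left: float
    top: float
    right: float
    bottom: float


# Anchor points used for alignment (assumed meaning of the letters).
V_ANCHOR = {
    "T": lambda b: b.top,
    "B": lambda b: b.bottom,
    "C": lambda b: (b.top + b.bottom) / 2,
}
H_ANCHOR = {
    "L": lambda b: b.left,
    "R": lambda b: b.right,
    "C": lambda b: (b.left + b.right) / 2,
}


def row_ok(a, b, align, gap, tol=2.0):
    """b lies to the right of a, aligned on `align`, with a gap inside `gap`."""
    lo, hi = gap
    aligned = abs(V_ANCHOR[align](a) - V_ANCHOR[align](b)) <= tol
    return aligned and lo <= b.left - a.right <= hi


def col_ok(upper, lower, align, gap, tol=2.0):
    """lower lies below upper; align[0] applies to upper, align[-1] to lower."""
    lo, hi = gap
    offset = H_ANCHOR[align[0]](upper) - H_ANCHOR[align[-1]](lower)
    return abs(offset) <= tol and lo <= lower.top - upper.bottom <= hi


# Hypothetical lexer output for two stacked radio buttons with a title.
radio1 = Box("Radio", 10, 10, 22, 22)
text1 = Box("Text", 30, 8, 90, 24)
radio2 = Box("Radio", 10, 40, 22, 52)
far_radio = Box("Radio", 10, 200, 22, 212)
```

With these boxes, `row_ok(radio1, text1, "C", (0, 50))` and `col_ok(radio1, radio2, "LL", (0, 50))` hold, while `far_radio` fails the column check because it is more than 50 pixels below.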

Page 7: A New Parsing Language for GUI and Visually Structured Documents

USING A VISIBLY PUSHDOWN META LANGUAGE

EBNF FOR REGULAR EXPRESSION

VPL = REGEX + FUNCTION CALLS AND DEFINITIONS

ADDING DISTANCES AND ALIGNMENTS

<RE>=<union>|<simple>

<union>=<RE>"|"<simple>

<simple>=<concat>|<basic>

<concat>=<simple><basic>

<basic>=<star>|<elementary>

<star>=<elementary>"*"

<elementary>= <group>|<token> 

<group>="("<RE>")"

<group>=[V>]"("<RE> ")"

<name>= standard

<call>=<name>"<"<values>">"

<values> = comma separated <value>

<value>=<call>|<token>|<name>

<rule>=<name>"<"<params>">=" (<group>|<col>|<call>)

<params>= comma separated <param>

<param>="Object" <name>

<elementary>=<group>|<call>|<col>|<token>

<range>= <int>".."<int>

<west>=[TBC]<range>(<call>|<token>|<name>)

<row>=(<call>|<token>|<name>)<west>?

<south>= [RLC][RLC]?<range><row>

<col>= "["<row><south>?"]"
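The `<col>`, `<row>`, `<west>`, `<south>`, and `<range>` rules are small enough to parse by hand. The sketch below follows the rules as printed, with some assumptions: tokens are separated by whitespace, items (`<call>`, `<token>`, `<name>`) are kept as opaque strings, and a `T`, `B`, or `C` followed by a range is always read as the start of `<west>`. The class and function names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

RANGE = re.compile(r"^(\d+)\.\.(\d+)$")


@dataclass
class Row:
    item: str
    west: Optional[Tuple[str, Tuple[int, int], str]] = None  # (align, range, item)


@dataclass
class Col:
    row: Row
    south: Optional[Tuple[str, Tuple[int, int], Row]] = None  # (align, range, row)


def parse_range(tok):
    m = RANGE.match(tok)
    if not m:
        raise ValueError(f"expected a range like 0..50, got {tok!r}")
    return int(m.group(1)), int(m.group(2))


def parse_row(toks, i):
    """<row> = item <west>?  Returns the row and the index after it."""
    if i >= len(toks) - 1:
        raise ValueError("expected an item")
    row = Row(item=toks[i])
    i += 1
    if i + 2 < len(toks) and toks[i] in {"T", "B", "C"} and RANGE.match(toks[i + 1]):
        row.west = (toks[i], parse_range(toks[i + 1]), toks[i + 2])
        i += 3
    return row, i


def parse_col(text):
    """<col> = "[" <row> <south>? "]" """
    toks = text.split()
    if len(toks) < 2 or toks[0] != "[" or toks[-1] != "]":
        raise ValueError("a column must be wrapped in [ ]")
    row, i = parse_row(toks, 1)
    col = Col(row)
    if toks[i] != "]":
        align = toks[i]
        if align not in {"R", "L", "C"}:
            raise ValueError(f"expected R, L, or C, got {align!r}")
        i += 1
        if toks[i] in {"R", "L", "C"}:     # optional second alignment letter
            align += toks[i]
            i += 1
        rng = parse_range(toks[i])
        south_row, i = parse_row(toks, i + 1)
        col.south = (align, rng, south_row)
    if i != len(toks) - 1:
        raise ValueError("unexpected tokens before ]")
    return col


col = parse_col("[ Radio C 0..50 Text L L 0..50 Radio C 0..50 Text ]")
```

Here `col.row.west` is `("C", (0, 50), "Text")` and `col.south` carries the alignment `"LL"`, the range `(0, 50)`, and the second row.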

Page 8: A New Parsing Language for GUI and Visually Structured Documents


LANGUAGE & COMPILATION - EXAMPLE

RTitled_M<Radio > = [ Radio C 0..50 Text
                      L L 0..50
                      Radio C 0..50 Text ]

RTitled_E<Radio > = [ Radio C 0..50 Text ]

RBS := V{RTitled_M<Radio>*RTitled_E<Radio>}

Each node is a function

[Diagram: compiled graph with nodes Concat, Kleene-*, Col, Row, Radio, Text]
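"Each node is a function" can be illustrated with closures: every graph node becomes a function from a start position to the set of positions where it can stop. This is a simplified sketch over a one-dimensional token list, so it ignores the geometry that Col and Row carry. The names `token`, `concat`, and `star` are hypothetical.

```python
def token(kind):
    """Leaf node: consume one token of the given kind."""
    def node(toks, i):
        return {i + 1} if i < len(toks) and toks[i] == kind else set()
    return node


def concat(*nodes):
    """Concat node: run the children one after another."""
    def node(toks, i):
        ends = {i}
        for child in nodes:
            ends = {e for start in ends for e in child(toks, start)}
        return ends
    return node


def star(inner):
    """Kleene-* node: zero or more repetitions of the child."""
    def node(toks, i):
        ends, frontier = {i}, {i}
        while frontier:
            frontier = {e for start in frontier for e in inner(toks, start)} - ends
            ends |= frontier
        return ends
    return node


# One-dimensional stand-in for RBS: repeated titled radios, then a final one.
titled = concat(token("Radio"), token("Text"))
rbs = concat(star(titled), titled)

toks = ["Radio", "Text"] * 3
ends = rbs(toks, 0)
```

Returning a set of end positions rather than a single one lets `star` followed by `titled` succeed without explicit backtracking.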

Page 9: A New Parsing Language for GUI and Visually Structured Documents

RUNNING THE VPL CODE - EXAMPLE

[Diagram: the universal VPL machine running RBS; graph nodes Line, Concat, Kleene-*, Col, Radio, Text]

Page 10: A New Parsing Language for GUI and Visually Structured Documents

GLOBAL ROBUSTNESS TO LOCAL AMBIGUITIES

• Visual lexer returns atoms.

• Lexer assigns a likelihood to any pair (atom, bounding box).

• We use conditional likelihood to avoid consistent errors.

• A “compound object” has a heuristic “likelihood”.

• VPL graph vertices are no longer functions, but co-routines (user-space threads). They:

– sit on a (heuristic-based) priority queue, and are paused when their priority is low.

– can be forked when they get multiple return values.

[Figure label: 50% LO, 50% scroller]
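The scheduling idea can be sketched as a best-first search. The code below uses explicit parse states on a heap as a stand-in for true co-routines: a state is "paused" while it waits on the heap, and "forked" when the lexer offers several labels for one atom. It assumes independent per-atom likelihoods (no conditional likelihood), and the function names and the sample lexer output are hypothetical.

```python
import heapq
import math


def best_parse(candidates, accepts_prefix, accepts):
    """Return (labels, likelihood) of the most likely accepted labeling, or None.

    candidates[i] is a list of (label, likelihood) pairs for atom i.
    """
    heap = [(0.0, 0, ())]  # (negative log-likelihood, position, labels so far)
    while heap:
        cost, pos, labels = heapq.heappop(heap)   # resume the most promising state
        if pos == len(candidates):
            if accepts(labels):
                return labels, math.exp(-cost)
            continue
        for label, p in candidates[pos]:          # fork on ambiguous atoms
            extended = labels + (label,)
            if p > 0 and accepts_prefix(extended):
                heapq.heappush(heap, (cost - math.log(p), pos + 1, extended))
    return None


# Toy language: alternating Radio, Text, with at least one pair.
def accepts_prefix(labels):
    return all(l == ("Radio", "Text")[i % 2] for i, l in enumerate(labels))


def accepts(labels):
    return len(labels) > 0 and len(labels) % 2 == 0 and accepts_prefix(labels)


# Hypothetical lexer output: the two middle atoms are 50/50 ambiguous.
candidates = [
    [("Radio", 0.9), ("Text", 0.1)],
    [("Text", 0.5), ("Radio", 0.5)],
    [("Radio", 0.5), ("Text", 0.5)],
    [("Text", 0.95), ("Radio", 0.05)],
]
result = best_parse(candidates, accepts_prefix, accepts)
```

Because every step adds a non-negative cost, the first accepted state popped from the heap is the most likely one, so the surrounding structure resolves the locally ambiguous atoms.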

Page 11: A New Parsing Language for GUI and Visually Structured Documents


QUESTIONS ?