
SherLog: Error Diagnosis by Connecting Clues from Run-time Logs

Ding Yuan¹, Haohui Mai¹, Weiwei Xiong¹, Lin Tan¹, Yuanyuan Zhou², Shankar Pasupathy³

¹ University of Illinois at Urbana-Champaign
² University of California, San Diego
³ NetApp, Inc.

ASPLOS’10, March 13-17, 2010, Pittsburgh, Pennsylvania, USA.

Lisong Guo (LIP6/REGAL), July 18, 2013

Introduction

Scenario: Postmortem In-Production Debugging with Logs
- postmortem vs. prediction (model checking, static analysis, etc.)
- in-production vs. in-house (runtime instrumentation, etc.)
- logs vs. others (bug reports, deployment configuration, etc.)

Subtasks of Debugging
- reproduce the bug (a procedure unrelated to the source code)
- infer the failure-inducing execution path
- infer the conditions along the failure-inducing execution path

Research Question
How can we help a developer debug in the scenario above?

Approach

Idea
Manual Inspection → Automatic Inference

A tool that takes the run-time logs and the source code as inputs, and then produces debugging hints for developers (i.e., connecting the dots):
- all possible and valid failure-inducing execution paths
- the evolution of the values of certain variables along the inferred paths

Usage Scenario
- run the tool to get a list of interesting paths
- examine the values of certain suspicious variables along some path
- repeat the previous step until the root cause of the bug is found

Design

Components
- Log Parsing: locate the logging statements in the source code
- Path Inference: infer the failure execution paths and their constraints
- Value Inference: infer the evolution of variable values along the given paths

Log Parsing

Objectives
Identify the logging points and the logged variables in the source code.

Simple Logging Statements
- solution: regular-expression matching (e.g., grep)
- e.g. error(0, 0, _("removing directory, %s"), path);
- rule: {error(), 3, 4}

Complicated Logging Facilities
- hierarchical wrappers around standard printing APIs (alternative: Coccinelle)
- e.g. error() -> strerrno()
- rule 1: ’%s’: %{serrno}
- rule 2: [{ "specifier": serrno; "regex": Regex; "val_func": ErrMsgToErrno }]
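The rule for simple logging statements can be illustrated with a minimal Python sketch. It assumes that {error(), 3, 4} reads as "the function is error(), its 3rd argument is the format string, and the format parameters start at the 4th argument", which agrees with the glibc signature error(status, errnum, format, ...). The helper names format_to_regex and match_log_line and the sample log line are hypothetical, and this is not SherLog's actual implementation; it only shows how a format string can be turned into a regular expression that maps a log message back to its logging point and variable values.

```python
import re

# Assumed mapping from printf conversion specifiers to regex fragments.
# Only %s and %d are handled; "%%", widths and flags are ignored in this sketch.
SPECIFIER_PATTERNS = {
    "%s": r"(.+?)",
    "%d": r"([-+]?\d+)",
}


def format_to_regex(fmt):
    """Turn a printf-style format string into an anchored regular expression."""
    # re.split with a capture group keeps the specifiers in the result list.
    parts = re.split(r"(%[sd])", fmt)
    pattern = "".join(
        SPECIFIER_PATTERNS[part] if part in SPECIFIER_PATTERNS else re.escape(part)
        for part in parts
    )
    return re.compile("^" + pattern + "$")


def match_log_line(fmt, arg_names, line):
    """Return {argument name: logged text} if `line` was produced by `fmt`, else None."""
    match = format_to_regex(fmt).match(line)
    if match is None:
        return None
    return dict(zip(arg_names, match.groups()))


# Format string and argument taken from the error(...) call on the slide;
# the log line itself is a made-up example.
print(match_log_line("removing directory, %s", ["path"],
                     "removing directory, dir1/dir2"))
```

A real log line from error() would also carry the program-name prefix, and wrappers that append an errno description would need an extra rule like rule 2 above; both are left out here to keep the sketch short.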

Path Inference

Objectives
- infer the failure-inducing execution path
- infer the constraints on variables along the path

Constrained Sequence Matching Problem (NP-Complete?)
- based on Saturn, a static analysis framework for C programs
- match the control & data flow with the sequence of log messages
- convert the path searching problem into a set of declarative rules

Glance of Implementation
- customized control-flow: main → log@4 → b1@10 → c@16 → log@25
- conjunctive constraints: strchr ≠ NULL ∧ verbose ≠ 0 ∧ rmdir()@17 ≠ 0
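The sequence-matching idea can be sketched in Python as a brute-force search. This is a deliberately simplified stand-in: SherLog expresses the search as declarative rules on top of Saturn and solves it with a SAT solver, whereas the sketch below just enumerates paths depth-first and ignores the data-flow constraints. The graph reuses the node names from the slide, but the extra branch (b2@12, log@30), the message texts, and the function name find_matching_paths are hypothetical.

```python
def find_matching_paths(cfg, log_points, entry, observed, max_len=50):
    """Enumerate paths from `entry` whose logging points emit exactly `observed`.

    cfg        : dict mapping a node to the list of its successors
    log_points : dict mapping a logging node to the message it prints
    observed   : list of messages, in the order they appear in the log
    A path is accepted as soon as the last observed message is matched.
    """
    if not observed:
        return []
    results = []

    def dfs(node, path, idx):
        path = path + [node]
        if node in log_points:
            if log_points[node] != observed[idx]:
                return  # this logging point contradicts the log: prune
            idx += 1
            if idx == len(observed):
                results.append(path)
                return
        if len(path) >= max_len:
            return  # crude bound so that loops in the graph terminate
        for successor in cfg.get(node, []):
            dfs(successor, path, idx)

    dfs(entry, [], 0)
    return results


# Hypothetical control-flow graph built around the path shown on the slide.
cfg = {
    "main": ["log@4"],
    "log@4": ["b1@10", "b2@12"],
    "b1@10": ["c@16"],
    "b2@12": ["log@30"],
    "c@16": ["log@25"],
}
log_points = {
    "log@4": "removing all entries",
    "log@25": "removing directory",
    "log@30": "skipping entry",
}
observed = ["removing all entries", "removing directory"]

for path in find_matching_paths(cfg, log_points, "main", observed):
    print(" → ".join(path))
```

The branch through b2@12 is pruned because log@30 would have printed a message that is absent from the log, which is the same reasoning a developer applies by hand when reading a log next to the source code.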

Path Inference

Technique Summary
- SAT-based path searching
- constraint programming

Limitations
- skips the analysis of functions that do not generate logs
  - therefore it might return incorrect results
- no alias analysis for pointers
  - although the underlying framework, Saturn, supports alias analysis
- special treatment of some external routines/functions
  - abort, exit, setjmp, longjmp, etc.

Value Inference

Objective
Infer the value-flow of certain variables, given the execution paths.

Algorithm
- model the assignment relationships among memory locations as a guarded points-to graph
  - predicate(position, variable, value, constraint)
- symbolically execute the inferred failure path forwards
  - refine the constraint according to the scope of variables
  - incrementally update the graph at each step
- generate the sequence of value evolution (value-flow)
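The forward walk above can be sketched in Python under strong simplifying assumptions: the inferred path is already given as a straight-line list of steps, values are plain strings, and only copy propagation is modelled, so this is not Saturn's guarded points-to graph. The Fact tuple mirrors predicate(position, variable, value, constraint) from the slide; the step encoding ("assign" / "assume"), the function name value_flow, and the sample path are hypothetical.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    position: str    # where on the path the value was established
    variable: str
    value: str       # symbolic value, kept as plain text in this sketch
    constraint: str  # conjunction of branch conditions assumed so far


def value_flow(path, tracked):
    """Walk `path` forwards and record every value taken by `tracked`."""
    env = {}     # variable -> current symbolic value
    guards = []  # branch conditions collected along the path
    flow = []
    for position, kind, *operands in path:
        if kind == "assume":
            guards.append(operands[0])
        elif kind == "assign":
            variable, expr = operands
            # Copy propagation only: replace a known variable by its value.
            value = env.get(expr, expr)
            env[variable] = value
            if variable == tracked:
                constraint = " and ".join(guards) or "true"
                flow.append(Fact(position, variable, value, constraint))
    return flow


# Hypothetical steps along one inferred path.
path = [
    ("main@2", "assign", "dir", "argv[1]"),
    ("main@3", "assume", "verbose != 0"),
    ("b1@10", "assign", "path", "dir"),
    ("c@16", "assume", "rmdir(path) != 0"),
    ("c@17", "assign", "path", "NULL"),
]

for fact in value_flow(path, "path"):
    print(fact)
```

Each recorded fact answers the question a developer asks during diagnosis: what value did this variable hold at this point, and under which conditions was that value reached.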

Evaluation

Methodology
- manually reproduce and diagnose the real-world bugs
- collect path summaries at runtime
- compare the results of SherLog with the reproduction information

Metrics
- useful: SherLog infers a subset of the bug reproduction information
- complete: SherLog infers all of the bug reproduction information
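The two metrics can be read as a set comparison, sketched below in Python. This is one possible interpretation and not the evaluation script used in the paper: it assumes the reproduction information can be broken into comparable items, that "a subset" means a non-empty one, and it introduces a third label, "neither", which does not appear on the slide. The function name classify and the sample items are hypothetical.

```python
def classify(inferred, reproduction_info):
    """Label a diagnosis result as 'complete', 'useful', or 'neither'."""
    inferred = set(inferred)
    reproduction_info = set(reproduction_info)
    if reproduction_info <= inferred:
        return "complete"  # everything needed to reproduce the bug was inferred
    if inferred & reproduction_info:
        return "useful"    # only part of the reproduction information was inferred
    return "neither"


# Made-up items standing in for manually collected reproduction information.
needed = {"path: main -> c@16", "verbose != 0", "rmdir() fails"}
print(classify({"path: main -> c@16", "verbose != 0"}, needed))
print(classify(needed, needed))
```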

Assumptions/Limitations

Assumptions
- sufficient logging messages
  - reasonable density distribution of logging statements
  - reasonable amount of logging statements being activated
- good match between the bug manifestation path and the log trace
- sequential, single-threaded log messages
  - cannot handle multi-threaded concurrent programs

Technical Limitations
- skips functions that are not involved in log production
- does not parse complex constructs of the C programming language, such as pointer arithmetic

More Pointers...

Ding Yuan, Soyeon Park, Yuanyuan Zhou: Characterizing logging practices in open-source software. ICSE 2012

Adam J. Oliner, Archana Ganapathi, Wei Xu: Advances and challenges in log analysis. Commun. ACM 2012

Ding Yuan, Jing Zheng, Soyeon Park, Yuanyuan Zhou, Stefan Savage: Improving Software Diagnosability via Log Enhancement. ASPLOS 2011

Wei Xu, Ling Huang, Armando Fox, David A. Patterson, Michael I. Jordan: Detecting Large-Scale System Problems by Mining Console Logs. ICML 2010

Thomas Reidemeister, Mohammad Ahmad Munawar, Miao Jiang, Paul A. S. Ward: Diagnosis of recurrent faults using log files. CASCON 2009

Trishul M. Chilimbi, Ben Liblit, Krishna Mehra, Aditya V. Nori, Kapil Vaswani: HOLMES: Effective statistical debugging via efficient path profiling. ICSE 2009
