
LTTng Quick Start Guide (for version 2.3)

Daniel U. Thibault, DRDC Valcartier Research Centre

The contents are published as open-access documentation distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0, http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Defence Research and Development Canada External Literature DRDC-RDDC-2016-N012 June 2016

IMPORTANT INFORMATIVE STATEMENTS

Work conducted for project 15bl, HeLP-CCS: Host-Level Protection of Critical Computerized Systems.

Template in use: SR Advanced (2010) Template_EN (2015-11-05).dotm

© Her Majesty the Queen in Right of Canada, as represented by the Minister of National Defence, 2016. LTTng: Quick Start Guide (for version 2.3) is made available under the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0, http://creativecommons.org/licenses/by/4.0).

© Sa Majesté la Reine (en droit du Canada), telle que représentée par le ministre de la Défense nationale, 2016. LTTng: Quick Start Guide (for version 2.3) est rendu disponible sous la licence publique Creative Commons Attribution 4.0 International (CC-BY 4.0, http://creativecommons.org/licenses/by/4.0/deed.fr).


Abstract

The Linux Trace Toolkit Next Generation (LTTng) software tracer was thoroughly documented from a user’s perspective. The behaviour of each command was tested for a wide range of input parameter values, installation scripts were written and tested, and a variety of demonstration use cases were created and tested. A number of bugs were documented as a result (ranging from “simply broken” to “what exactly are we trying to do here?”); these are being addressed as the project keeps advancing in capability, reliability, and ease of use.

Whereas the companion LTTng Comprehensive User’s Guide delves into the nitty-gritty of using LTTng in all sorts of ways, the LTTng Quick Start Guide assumes the simplest installation and use scenario. It details step-by-step instructions to quickly install and start using LTTng.

Significance to defence and security

The Linux Trace Toolkit Next Generation (LTTng) offers reliable, low-impact, high-resolution, system-wide, dynamically adaptable, near-real-time tracing of Linux systems. Being able to observe what goes on within the operating system and any applications running on it is an essential prerequisite for the cyber security of governmental and military computer systems of all scales. As the LTTng project has now reached a reasonable level of maturity and can be used in production contexts, it was felt necessary to create a document detailing in practical terms how to deploy, use, and extend it. Large parts of this document have already been used to create the LTTng on-line documentation (http://lttng.org/docs/#); this publication can only make LTTng more accessible to all interested parties.


Résumé

Le traceur logiciel Linux Trace Toolkit Next Generation (LTTng) a été documenté exhaustivement du point de vue de l’utilisateur. Le comportement de chaque commande a été testé pour une grande variété de valeurs de paramètres, des scripts d’installation ont été écrits et testés, et bon nombre de démonstrations utilitaires créées et testées. Une quantité de bogues ont été documentées par conséquent (allant de « c’est cassé » à « qu’est-ce qu’on essaie d’accomplir ici ? »), lesquelles succombent progressivement au passage du temps tandis que le projet continue de s’améliorer en capacité, fiabilité, et facilité d’utilisation.

Alors que le document complémentaire LTTng Comprehensive User’s Guide plonge dans le détail de l’utilisation de LTTng de toutes sortes de façons, le LTTng Quick Start Guide suppose le cas le plus simple d’installation et d’utilisation. Il donne des instructions étape-par-étape et détaillées expliquant comment rapidement installer et commencer à utiliser LTTng.

Importance pour la défense et la sécurité

Le traceur Linux Trace Toolkit Next Generation (LTTng) est en mesure de tracer les systèmes Linux en temps presque réel de façon flexible (dynamiquement adaptable), universelle, détaillée, fiable et économe en coût de performance. Être capable d’observer ce qui se passe à l’intérieur du système d’exploitation et de ses applications est une condition sine qua non de la cyber sécurité des systèmes informatiques gouvernementaux et militaires de toute échelle. Comme le projet LTTng a atteint un niveau raisonnable de maturité et peut être utilisé dans des contextes de production, il nous a semblé nécessaire de créer un document détaillant en termes pratiques comment le déployer, l’utiliser et l’augmenter. Une large part de ce document a déjà été utilisée pour créer la documentation en ligne de LTTng (http://lttng.org/docs/#); cette publication ne peut que rendre LTTng encore plus accessible aux parties intéressées.


Table of contents

Abstract
Significance to defence and security
Résumé
Importance pour la défense et la sécurité
Table of contents
List of figures
List of tables
Acknowledgements
1 Introduction
  1.1 Plan
2 Installation
  2.1 Setting up the drdc software repository
  2.2 Installing the packages
    2.2.1 From Synaptic
    2.2.2 From the Ubuntu Software Centre
    2.2.3 From the command line
  2.3 Verifying the installation
3 Operational concepts
  3.1 Architecture
    3.1.1 Client
    3.1.2 Babeltrace
    3.1.3 Tracepoint providers
    3.1.4 Session daemon
    3.1.5 Consumer daemons
    3.1.6 Relay daemon
  3.2 Tracing sessions
    3.2.1 Traces
    3.2.2 Session folder structure
    3.2.3 Snapshots
  3.3 Domains
  3.4 Channels
    3.4.1 Context
    3.4.2 Buffering schemes
  3.5 Events
    3.5.1 Kernel events
    3.5.2 User-space events
4 Use cases
  4.1 Tracing the kernel
  4.2 Tracing user-space
  4.3 Using babeltrace
    4.3.1 Examples
  4.4 Instrumenting a C application
    4.4.1 Writing the tracepoint provider
    4.4.2 The tracepoint definition macros
    4.4.3 C application instrumentation summary
  4.5 Instrumenting a Java application
    4.5.1 liblttng-ust-java
    4.5.2 Java Native Interface (JNI)
    4.5.3 Other approaches
References
List of symbols/abbreviations/acronyms/initialisms
Glossary


List of figures

Figure 1: Selecting the software sources.

Figure 2: Adding a software source.

Figure 3: The NO_PUBKEY error message.

Figure 4: The authentication failure warning message.

Figure 5: Functional break-down of the LTTng packages.

Figure 6: The Synaptic window.

Figure 7: The Ubuntu Software Centre window.

Figure 8: Simplified architecture of the LTTng tracing suite.

Figure 9: Detailed architecture of the LTTng tracing suite.

Figure 10: Work flow of an LTTng tracing session.

Figure 11: Example LTTng client commands list and set-session.

Figure 12: Example LTTng client commands create, start, stop, and destroy.

Figure 13: Hierarchy of trace folders.

Figure 14: Choice of sessions root folder.

Figure 15: Example LTTng local kernel tracing session.

Figure 16: Example LTTng remote kernel tracing session.

Figure 17: EBNF grammar of trace folder paths.

Figure 18: Summary of the trace folder path structure.

Figure 19: Example LTTng snapshot-mode tracing session.

Figure 20: LTTng hierarchy.

Figure 21: Example LTTng client commands enable- and disable-channel.

Figure 22: Example LTTng client command add-context.

Figure 23: Example LTTng client commands enable-event and disable-event.

Figure 24: Example LTTng kernel event enabling/disabling commands.

Figure 25: Example LTTng user-space event enabling/disabling commands.

Figure 26: Sample babeltrace output for a per-userID trace.

Figure 27: Sample babeltrace verbose (all names and fields) output.

Figure 28: The simple.c application.

Figure 29: The instrumented simple.c application.

Figure 30: Compiling, linking and running the application and its tracepoint provider.

Figure 31: The tp.h tracepoint provider header template.

Figure 32: Examples of tracepoint calls (left) and the corresponding definitions (right).

Figure 33: More examples of tracepoint definitions.

Figure 34: The org.lttng.ust.LTTngUst tracepoint methods.

Figure 35: The simple.java Java application.

Figure 36: The tp.java tracepoint class.

Figure 37: The tp.c JNI wrapper.

Figure 38: Compiling and running the simple.java application.


List of tables

Table 1: Mapping of primitive Java data types to C data types.


Acknowledgements

Special thanks to Mario Couture and Frédéric Painchaud for encouragement and guidance.

Thanks to the LTTng team for all the help provided: Christian Babeux, Raphaël Beamonte, Yannick Brosseau, Julien Desfossez, Francis Deslauriers, Mathieu Desnoyers, Jérémie Galarneau, David Goulet, Matthew Khouzam, Simon Marchi, Alexandre Montplaisir, Philippe Proulx, Dominique Toupin…and many others. Thanks also to the people who made LTTng possible: Michel Dagenais, Karim Yaghmour and many others who worked on LTTng’s predecessor, LTT: Werner Almesberger, Greg Banks, Andrea Cisternino, Tom Cox, Rocky Craig, Jean-Hughes Deschênes, Klaas Gadeyne, Jacques Gélinas, Philippe Gerum, Theresa Halloran, Andreas Heppel, Jörg Hermann, Stuart Hughes, Dalibor Kranjčič, Vamsi Krishna, Bao-Gang Liu, Andy Lowe, Corey Minyard, Paolo Montegazza, Takuzo O'Hara, Steve Papacharalambous, Jean-Sébastien Perrier, Frank Rowand, David Watson, Bob Wisniewski, and Tom Zanussi.


1 Introduction

Tracing (of any kind) can be used in a variety of contexts. When developing an application, one may frequently use a development environment’s facilities for stepping through code, examining variables dynamically along the way, to make sure the algorithm is doing what it should, or that the input received from other parts of the system (files, network data streams, system messages, etc.) is as expected, or to understand what goes on when a reproducible bug is triggered.

When stepping is not appropriate, tracing statements may be added to the application in order to create a log of its actions, decision points and state transitions, so that examination of the log may answer one’s questions. At this point one has reached the software engineering definition of a trace given by the Institute of Electrical and Electronics Engineers (IEEE): “a record of the execution of a computer program, showing the sequence of instructions executed, the names and values of variables, or both” [1][2][3].

This definition can be broadened to become “a record of the execution of a computer system […]”, where “computer system” is understood as any computing hardware unit: typically a computer workstation, possibly multi-core, likely connected to one or more networks. A number of CPU-level techniques allow the trace to record the execution of any application (regardless of source code availability), including the operating system itself, its device drivers, etc.

LTTng is the ultimate evolution of this concept. LTTng has extremely low overhead, allowing it to perturb the traced system as little as possible (this is important when analysing the behaviour of some component under heavy load, for example). It is fully re-entrant, so no special actions need be taken regarding signals, thread switches or interrupts (this is important when analysing the behaviour of interrupt-servicing modules, for instance). Its control variables and tracing data components are wait-free, so it will not cause deadlocks; in fact, a conscious design decision was made to make LTTng lossy in situations where the choice is between trace integrity and system perturbation (this is one of the reasons the overhead is so low). The traces are extremely high resolution: the event time-stamps track individual processor cycles, and events specify on which processor they occurred in multi-core systems.

Furthermore, LTTng is extremely agile: multiple traces can be running at once, started and stopped as needed, and most of their parameters can be changed on the fly. It will soon be possible1, using LTTng, to analyse a trace as it is being captured, and to react to events or patterns of events by changing what is being traced dynamically. LTTng can also stream the trace to a remote location over a network or serial connection of some sort, further lightening the overhead imposed on the traced system.

This Quick Start Guide aims to provide step-by-step guidance in the installation and initial use of the LTTng suite. It should be enough to give you a reasonably coherent mental model of how LTTng works, how you can use it, and what it can do. Outside of the examples sprinkled throughout these pages, explanation of the LTTng component command-line options is left to the built-in help and man pages. Once you feel ready to delve into the intricacies of LTTng, you should switch to the companion publication, the LTTng Comprehensive User’s Guide.

1 This feature will be offered by LTTng 2.4, which is expected in early 2014.


1.1 Plan

The Quick Start Guide will first explain how to install the LTTng suite (Section 2). Next, the operational concepts are presented (Section 3). Finally, a few key use cases are worked through (Section 4).

The baseline is assumed to be the Ubuntu 12.04.3 LTS (Long-Term Support) Linux operating system, and the specific release of LTTng is 2.3.0.


2 Installation

If you’ve previously installed some version of LTTng other than 2.3.0, particularly any LTTng 0.x version, you should uninstall it completely to avoid any conflicts.
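One way to find leftovers from a packaged install is sketched below; the package-name patterns are assumptions (an LTTng 0.x version installed from source would instead need its own uninstall procedure), and the purge command is left as a comment so the sketch can be dry-run safely.

```shell
# Sketch (assumed package-name patterns): list any installed LTTng-related
# packages so they can be purged before installing 2.3.0.
dpkg -l 2>/dev/null | awk '/lttng|babeltrace|userspace-rcu/ {print $2}' |
while read -r pkg; do
    echo "would purge: $pkg"    # in real use: sudo apt-get purge -y "$pkg"
done
```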

Once you have a “clean” system, you would normally simply fire up Ubuntu’s Software Centre, the Synaptic Package Manager (if installed), or use an apt-get install command line. However, the pre-compiled packages present in the Ubuntu repositories are hopelessly obsolete: they currently hold version 2.0.1.

A more viable option is to add the LTTng PPA (Personal Package Archive) to Ubuntu’s list of software sources. There are minor problems associated with proceeding this way, and there is also the very real possibility that, as time since publication of this manual elapses, the set of packages and their instructions for use will forge ahead and diverge too much from this manual’s outline. The LTTng Comprehensive User’s Guide explains how to add the LTTng PPA to Ubuntu’s software sources in order to install its packages; the procedure is nearly identical to that described in Section 2.1, below.

The best source is the LTTng Git Web site, which allows you to download specific versions. However, it does mean you must compile and install the packages yourself, a process that can get a little complicated. There is also the slight complication that one of the packages (babeltrace) is hosted on a different Git server, the EfficiOS one. See the LTTng Comprehensive User’s Guide if you want to proceed this way.

As a compromise, this document comes with a CD-ROM containing a small (about 5 megabytes) software repository named drdc that was constructed from the LTTng/EfficiOS Git packages and encapsulates the installation process to make it fully compatible with the apt application and its graphical front-ends (Ubuntu’s Software Centre and Synaptic). Should the CD-ROM be missing, the software repository can be obtained by sending an e-mail request to the author at [email protected]. The rest of the installation procedure, below, assumes you’ve made a copy of the drdc folder in some convenient location.

If you do not install LTTng from the drdc software repository, it is important that you ensure your user account has tracing group membership and that the root session manager daemon is running alone, ideally as a service. Some of the examples given here won’t work properly if this isn’t done. See the LTTng Comprehensive User’s Guide for a discussion of the tracing group.
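As a rough sketch (the group and daemon names are those stated above; exact options may differ on your system), the manual set-up amounts to the following. This is system configuration requiring super-user privileges, so no mechanical test accompanies it.

```shell
# Sketch: system set-up assumed by the examples (not needed when installing
# from the drdc repository, whose package scripts do this for you).
sudo groupadd -f tracing            # create the tracing group if it is missing
sudo usermod -aG tracing "$USER"    # membership takes effect at next login
sudo lttng-sessiond --daemonize     # run the root session daemon in the background
```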

2.1 Setting up the drdc software repository

From Synaptic, choose Settings: Repositories: Other Software: Add… (from the Ubuntu Software Centre, choose Edit: Software Sources…: Other Software: Add…; see Figure 1) and specify the following APT lines (one at a time; see Figure 2):

deb file:///repositories/git.lttng.org/ubuntu precise main
deb file:///repositories/git.efficios.com/ubuntu precise main

Where repositories is the path to the copy of drdc you made on your file system. If working from the command line, you will need to edit /etc/apt/sources.list and append the APT lines at its end (in the “third-party developers” section; editing sources.list requires super-user privileges).

Figure 1: Selecting the software sources. Top: from the Ubuntu Software Centre; bottom: from the Synaptic Package Manager.
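From the command line, the append can be sketched as follows. The file:///repositories paths repeat the APT lines given above; TARGET is parameterised here so the sketch can be dry-run against a scratch file, but in real use it is /etc/apt/sources.list, edited with super-user privileges.

```shell
# Append the two drdc APT lines (TARGET defaults to a scratch file for a dry
# run; point it at /etc/apt/sources.list, with sudo, for the real edit).
TARGET="${TARGET:-/tmp/sources.list.demo}"
printf '%s\n' \
  'deb file:///repositories/git.lttng.org/ubuntu precise main' \
  'deb file:///repositories/git.efficios.com/ubuntu precise main' >> "$TARGET"
grep '^deb ' "$TARGET"    # show the source lines now present
```

After the real edit, run sudo apt-get update so the new sources are seen.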

Figure 2: Adding a software source. Top: The Other Software tab of the Software Sources. Bottom: The software source APT line dialog.

Optionally, uncheck the “(Source Code)” lines (in the Other Software tab of the Software Sources) which are added as a result of this action. The CD-ROM repository does not contain source code in the expected sense but will deploy source code as part of its installation. To have the packages from the CD-ROM appear in the list, Synaptic must be Reloaded, while the Software Centre may need to be closed and re-opened.

You will get a NO_PUBKEY GPG error when reloading Synaptic (see Figure 3); this is expected and harmless. You will also get a “NOT AUTHENTICATED” warning (see Figure 4) upon selecting any drdc package for installation and when applying your installation choices. This is again expected and harmless. The procedure to make the CD-ROM repository trusted is found in the LTTng Comprehensive User’s Guide.

2.2 Installing the packages

You can install using Synaptic (Section 2.2.1), Ubuntu’s Software Centre (Section 2.2.2), or the command line (Section 2.2.3), as you prefer.

The entire installation process takes a little while (5 to 10 minutes, depending on the system speed) because the drdc packages compile and install the LTTng suite from source code. This ensures a perfect match with your operating system configuration.

Figure 3: The NO_PUBKEY error message.

Figure 4: The authentication failure warning message.

Figure 5 illustrates the dependencies between the packages in the LTTng suite. The lttng-modules package enables kernel tracing while the lttng-ust package enables user-space tracing. The lttng-tools package completes the suite and must be installed last because it adapts to the presence or absence of lttng-modules (at run-time) and lttng-ust (at build-time) without explicitly requiring them. For example, if you initially omit lttng-ust and later install it, you will have to re-install lttng-tools in order to enable user-space tracing (the reverse is not true: installing lttng-modules after lttng-tools enables kernel tracing without needing to rebuild lttng-tools).

The installation of the topmost package, lttng-tools, creates the tracing group and joins you to that group. To ensure this works as it should, you need to log out and log back in once installation is complete.

Figure 5: Functional break-down of the LTTng packages. Dependencies are indicated by dashed arrows; lttng-tools has “weak” dependencies (dotted arrows) to lttng-modules and lttng-ust (it adapts itself to their absence or presence).

2.2.1 From Synaptic

Type “lttng” in the Quick filter box (see Figure 6). Select and mark for installation (Package menu or contextual menu) the following packages:

lttng-modules
lttng-ust
lttng-tools
babeltrace

Marking lttng-ust for installation will trigger the installation of userspace-rcu as well. You can optionally install the lttngtop and lttv packages as well.

Figure 6: The Synaptic window. The Quick filter box is at the top centre.

2.2.2 From the Ubuntu Software Centre

Type “lttng” in the search box (see Figure 7). Select and Install each package in the following sequence:

lttng-modules
lttng-ust
lttng-tools
babeltrace

Installing lttng-ust will trigger the installation of userspace-rcu as well. Optionally, you can also install the lttngtop and lttv packages.

Figure 7: The Ubuntu Software Centre window. The search box is at the top right.

2.2.3 From the command line

Run the following sequence of commands:

$ sudo apt-get install lttng-modules
$ sudo apt-get install lttng-ust
$ sudo apt-get install lttng-tools
$ sudo apt-get install babeltrace

You can optionally install lttngtop and lttv as well.

2.3 Verifying the installation

If anything goes wrong during installation, the package scripts will have Synaptic or the Ubuntu Software Centre report an error. If everything went smoothly, you should be all set. Read through Chapter 3 and then start working your way through the use cases in Chapter 4.

If you run into trouble during installation, consulting the configuration, building and installation logs may shed light on the problem (e.g., the lttng-tools package installs to /usr/src/lttng-tools-2.3.0, wherein you will find the sequence bootstrap.log, configure.log, make.log and install.log). If you run into trouble after installation, consult the LTTng Comprehensive User’s Guide. If that fails, ask a question on the LTTng development (lttng-dev) mailing list ([email protected]).
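When several of those logs are long, it can help to pull out just the failure messages. The small helper below is illustrative only (it is not part of LTTng); it assumes the /usr/src/lttng-tools-2.3.0 install location mentioned above, and simply greps the four logs for common failure keywords.

```python
"""Scan LTTng package build logs for likely failure messages.

Illustrative helper only (not part of LTTng); the default path is the
lttng-tools install location mentioned above.
"""
import re
import sys
from pathlib import Path

# The four logs are produced in this order during installation.
LOG_ORDER = ["bootstrap.log", "configure.log", "make.log", "install.log"]
FAILURE = re.compile(r"\b(error|fatal|failed)\b", re.IGNORECASE)

def scan_logs(log_dir):
    """Return {log name: [matching lines]} for logs present in log_dir."""
    hits = {}
    for name in LOG_ORDER:
        path = Path(log_dir) / name
        if not path.is_file():
            continue  # that build stage produced no log here
        matches = [line.rstrip()
                   for line in path.read_text(errors="replace").splitlines()
                   if FAILURE.search(line)]
        if matches:
            hits[name] = matches
    return hits

if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "/usr/src/lttng-tools-2.3.0"
    for log, lines in scan_logs(target).items():
        print(f"--- {log} ---")
        for line in lines:
            print(line)
```

Reading the logs in their production order (bootstrap, configure, make, install) usually points at the first stage that went wrong.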

3 Operational concepts

3.1 Architecture

In the simplest possible terms, the LTTng suite (green in Figure 8 and Figure 9) allows you to capture events occurring in user-space (light blue) and/or the kernel (salmon) and store them for later analysis in a collection of files called a trace (dark blue). The forthcoming version 2.4 will also allow live trace analysis, that is to say, an interface will allow applications to process events as they are committed to storage, without having to wait for the trace files to be closed.

Not only are the trace files potentially accessible remotely (if the traced system allows remote logins or stores the traces in a shared folder, for instance), but LTTng also supplies a component which allows the captured events to be stored remotely (see Section 3.1.6).

Figure 8: Simplified architecture of the LTTng tracing suite.

In greater detail (Figure 9), the LTTng suite breaks down into several components with different roles. The client (Section 3.1.1) serves as the user’s control interface and sends commands to the session manager daemon (Section 3.1.4), which in turn controls and coordinates the remaining components. Trace providers (Section 3.1.3, yellow in Figure 9) are instructed to generate event records, deposited in circular memory buffers (tawny). The consumer daemons (Section 3.1.5) operate in parallel, withdrawing the event records from the buffers and committing them to local storage or sending them to remote storage with the help of the relay daemon (Section 3.1.6). Finally, at some later stage a trace reader (Section 3.1.2) processes the traces.

Figure 9: Detailed architecture of the LTTng tracing suite. Red arrows represent control flow, blue arrows represent data flow. Adapted from [4].

3.1.1 Client

The LTTng client is the lttng application. This light-weight command-line tool is used to control the entire tracing process from start to finish. An API (Application Programming Interface) allows third parties (such as you, should you be inclined to do some programming) to substitute their own clients. The Eclipse Linux Tools Project [5] is an example of such an alternate client, offering an operating-system-independent front-end for LTTng that can configure and control LTTng, and visualize and analyze the traces.

When using lttng, the general syntax is:

lttng [OPTIONS] COMMAND [ARGUMENTS]

You can get built-in help by running ‘lttng --help’ or ‘man lttng’. The LTTng Comprehensive User’s Guide lists and explains the options, commands, and arguments in its annexes.


3.1.2 Babeltrace

The babeltrace application is used to convert stored traces, which use a compact binary format called the Common Trace Format (CTF) [6], to a variety of other formats, each supplied by a plug-in. The current main plug-in produces highly configurable log-like, human-readable output. The babeltrace API can also be used by other applications to read the traces for arbitrary purposes. See Section 4.3 for an overview of its use.

3.1.3 Tracepoint providers

The tracepoint providers are the pieces of code responsible for producing the main type of event, the tracepoint. Other event types can also be captured by LTTng; these other types generally pre-date LTTng and are usable independently from it, as parts of other tracing solutions. Collectively, the trace event providers are referred to as event providers or event writers.

In the kernel’s domain, a large number of tracepoints are already included in the code but lie dormant. While they remain so, their performance impact is utterly negligible (each tracepoint, when encountered, represents one, maybe two processor cycles at most). Like so many other Linux kernel features, they can be compiled out: if the kernel is built with the CONFIG_TRACEPOINTS option turned off, then there won’t be any tracepoints in the final image. Of course, in such a case tracing using tracepoints becomes impossible.

The lttng-modules package adds a number of kernel modules to your system. When these modules are loaded, the kernel’s tracepoints become usable. Their performance impact remains negligible until an actual trace is generated by LTTng; even then, LTTng is designed to have as small a performance cost as possible.

In the user-space domain, tracepoint providers are prepared in a similar fashion, as small binary objects that can be included directly in an instrumented application or supplied to it as a shared object (a dynamic library).

3.1.4 Session daemon

A daemon, in Linux terminology, is a background process which supplies a specific service to the rest of the system (the name is taken from classical Greek mythology and designates an invisible supernatural being with no particular bias towards good or evil). They are often root-owned (because they require elevated privileges), and have the common characteristic of being without standard input or output character streams (stdin, stdout, stderr). They are controlled instead through messages, signals, sockets, virtual file systems, etc.

The LTTng session daemon, lttng-sessiond, is in charge of managing the tracing sessions. It keeps track of the system’s tracepoint providers and instructs them as to which of their possible events they should actually generate event records for, and where those event records should be sent. The session daemon also controls the consumer daemons (see Section 3.1.5), instructing them as to where they should fetch event records and where they should commit those records to permanent storage.


There is normally but one LTTng session daemon, running as a root service. The LTTng Comprehensive User’s Guide explains what happens when additional (non-root) session daemons are involved.

3.1.5 Consumer daemons

The LTTng consumer daemons, lttng-consumerd, have the simple but sometimes demanding task of reading the event records generated by the event providers and committing them to permanent storage. Up to three copies may exist in parallel, for reasons of convenience; they differ slightly in terms of their privileges although their task remains the same.

The session daemon creates and manages the consumer daemons.

3.1.6 Relay daemon

The LTTng relay daemon, lttng-relayd, is a specialised consumer daemon whose task differs only in that it receives its event records over the network, from consumer daemons on traced systems.

You can start the relay daemon using this simple command: $ lttng-relayd &

You can easily set the relay daemon up as a service by following the example set by the session daemon. Copy /etc/init.d/lttng-sessiond (a symbolic link) to /etc/init.d/lttng-relayd (no changes required) and /etc/init/lttng-sessiond.conf to /etc/init/lttng-relayd.conf, then modify the latter appropriately.

3.2 Tracing sessions

While you use LTTng, you will create and destroy tracing sessions.

During its existence, a tracing session may be described as being in one of three states: uncommitted, paused, or running (see Figure 10). The running state is labelled ‘active’ by the user interface while the other two states are labelled ‘inactive’. Figure 10 shows the commands that serve to transition from state to state (create, start, stop, destroy).

There are two ‘meta-commands’ which don’t act directly on a tracing session: list and set-session. The list command (when no session name argument is specified) gives you the list of extant sessions, while the set-session command allows you to pick which session to send commands to. By default, this is the last session to have been created. Figure 11 shows an example.

Figure 10: Work flow of an LTTng tracing session.

$ lttng list
Currently no available tracing session
$ lttng create firstsession
Session firstsession created.
Traces will be written in […]
$ lttng create secondsession
Session secondsession created.
Traces will be written in […]
(Any lttng commands would be sent to secondsession)
$ lttng list
Available tracing sessions:
1) secondsession ([…]) [inactive]
2) firstsession ([…]) [inactive]
$ lttng set-session firstsession
Session set to firstsession
(Any lttng commands would be sent to firstsession)

Figure 11: Example LTTng client commands list and set-session.

The create command creates a session. The command options serve to specify where the trace will be stored and/or to put the session in snapshot mode (see Section 3.2.3). The session is created in the uncommitted state.

The destroy command destroys a session once it has outlived its usefulness. Session destruction first pauses the session if it was running, then waits for the consumer daemons to finish emptying the buffers to storage. The trace files are then closed and any memory used by the session (buffers and various control structures) is freed. Destroying an uncommitted session is very fast, and won’t write anything to storage.

The start command activates a session, telling it to start collecting events. It transitions a session from the uncommitted or paused states to the running state.

Conversely, the stop command pauses a session. Events are no longer collected while the session is paused.
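The life cycle just described can be sketched as a small state machine. The sketch below is illustrative only (it models the documented behaviour of create, start, stop, and destroy; it is not LTTng code):

```python
"""Illustrative model of an LTTng tracing session's life cycle.

A session is created uncommitted, start moves it to running ('active'),
stop pauses it ('inactive'), and destroy may be issued from any state.
"""

UNCOMMITTED, RUNNING, PAUSED, DESTROYED = "uncommitted", "running", "paused", "destroyed"

class TracingSession:
    def __init__(self, name):
        self.name = name
        self.state = UNCOMMITTED   # create → uncommitted ('inactive')

    def start(self):
        # start transitions uncommitted or paused → running
        if self.state in (UNCOMMITTED, PAUSED):
            self.state = RUNNING
        return self.state

    def stop(self):
        # stop pauses a running session; events are no longer collected
        if self.state == RUNNING:
            self.state = PAUSED
        return self.state

    def destroy(self):
        # destroying a running session implicitly stops it first
        self.state = DESTROYED
        return self.state

    @property
    def label(self):
        # the client's `lttng list` only distinguishes active/inactive
        return "active" if self.state == RUNNING else "inactive"
```

Note how the model reflects that `lttng list` collapses uncommitted and paused into a single ‘inactive’ label.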


$ lttng create firstsession
Session firstsession created.
Traces will be written in […]
$ lttng list
Available tracing sessions:
1) firstsession ([…]) [inactive]
(lttng commands are issued to configure the trace)
$ lttng start
Tracing started for session firstsession
$ lttng list
Available tracing sessions:
1) firstsession ([…]) [active]
$ lttng stop
Waiting for data availability
Tracing stopped for session firstsession
$ lttng list
Available tracing sessions:
1) firstsession ([…]) [inactive]
$ lttng destroy
Session firstsession destroyed

Figure 12: Example LTTng client commands create, start, stop, and destroy.

The remaining commands change the session configuration but not its state: add-context, enable-channel, disable-channel, enable-event, disable-event, and snapshot. As Figure 10 shows, most but not all of these commands are available in any of the three session states. There are as yet no create-channel or create-event commands: the corresponding enable- commands simply do double duty.

For a tracing session to actually capture events and generate a trace in storage, events have to occur, the session must subscribe to those events and be running, and both the event subscriptions and the channels they have been assigned to must be enabled. Channels and events are covered later, in Section 3.4 and Section 3.5. If you just create, start, stop, and destroy sessions (as in Figure 12), you will naturally generate no traces. Before delving into the details of how to configure tracing sessions, the structure of traces as they appear within storage is examined.


3.2.1 Traces

Each tracing session, once properly configured and running, may generate one or more traces in permanent storage. A trace is a collection of files all residing in a single folder, a trace folder (see Figure 13). The files include a metadata file which describes the trace itself (its host of origin, domain, channels, event types, data types and so forth), and one set of files for each active channel. Domains and channels are explained a little later (see Sections 3.3 and 3.4). The cardinality of the file sets reflects the number of cores being traced on the host. For example, a dual core host with a single channel named channel0 would produce the files channel0_0 and channel0_1. The channel0_0 file would contain the event records for those events that occurred on the first core, while the channel0_1 file would contain the event records for the second core. These channel-CPU files may be further broken up into sequentially numbered chunks if the trace is so configured (see Section 3.4.2). A channel0_0 file could, for example, be broken up into channel0_0_0, channel0_0_1, channel0_0_2, etc.

When created, the tracing session specifies a session folder which is the root of a hierarchy of folders within which the trace or traces are stored. The leaves of this hierarchy are the trace folders. A single tracing session can generate multiple traces if it traces multiple domains or sub-domains (see Section 3.4.2). Each of the traces stands alone and may be analysed by itself or in conjunction with others.

Figure 13: Hierarchy of trace folders. Unless configured otherwise, all session folders are created within the sessions root folder. Each session folder may hold one or more trace folders.

By default, session folders are rooted in your $HOME/lttng-traces folder (the sessions root folder). When storing the traces locally, you can override this and specify an arbitrary session folder location instead. When storing the traces remotely, you have the same choices except that they are made at the relay daemon level (see Figure 14). To have different hosts use different sessions root folders on the relay machine, they must connect to separate instances of the relay daemon (each listening on different ports).

Figure 14: Choice of sessions root folder. Locally stored sessions each choose their sessions root folder. The sessions root folder of remotely stored sessions is chosen by the relay daemon.

3.2.2 Session folder structure

In the sessions root folder, two types of folders may exist: session folders for local traces, and host folders for remote traces. Host folders contain in turn session folders. For example, suppose you create mysession as a local trace:

$ lttng create mysession
Session mysession created.
Traces will be written in […]
$ lttng enable-event --kernel --tracepoint --all
All kernel events are enabled in channel channel0
$ lttng start
Tracing started for session mysession
[…]
$ lttng destroy
Session mysession destroyed

Figure 15: Example LTTng local kernel tracing session.

This will create a path like this one (assuming your user name is myname):

/home/myname/lttng-traces/mysession-20131113-160000

Where the last part of the session folder name is a date and time stamp.


Now suppose you create a similar session on the machine with the network name remote, and instruct it to store the trace on the machine with the network name local:

(on the local machine:)
$ lttng-relayd &
(on the remote machine:)
$ lttng create mysession --set-url net://local
Session mysession created.
Traces will be written in net://local
$ lttng enable-event --kernel --tracepoint --all
All kernel tracepoints are enabled in channel channel0
$ lttng start
Tracing started for session mysession
[…]
$ lttng destroy
Session mysession destroyed

Figure 16: Example LTTng remote kernel tracing session.

This will create a path like this one on the local machine:

/home/myname/remote/lttng-traces/mysession-20131113-161000

Inside each session folder, two types of folders may exist: domain folders and snapshot folders. Snapshot folders in turn contain domain folders (see Section 3.2.3, below).

The contents of domain folders reflect the buffering scheme being used (see Section 3.4.2). There is only one buffering scheme option in the kernel domain, with no sub-domain structure, so the kernel domain folder is a trace folder. For example: /home/tracer/lttng-traces/session-20131113-161000/kernel/

In the user-space domain, there is some sub-domain structure, a consequence of the buffering scheme. The latter can be either per-user ID or per-process ID. In the former case, the events are assigned to separate sub-domains based on the user ID of the originating processes, further subdivided by bitness (32-bit processes are traced separately from 64-bit ones).

For instance, suppose the trace uses the per-user ID buffering scheme and captures user-space events coming from instrumented applications being run by two users, IDs respectively 1000 and 1001. Further suppose the operating system is 64-bit but that some of the instrumented applications are 32-bit. Then there will be two to four trace folders created, e.g.:

/home/tracer/lttng-traces/session-20131114-120000/ust/uid/1000/64-bit/
/home/tracer/lttng-traces/session-20131114-120000/ust/uid/1001/64-bit/
/home/tracer/lttng-traces/session-20131114-120000/ust/uid/1001/32-bit/

In the per-process ID case, each process is assigned to a separate sub-domain. This gives rise to trace folders like these:

session-20131114-121000/ust/pid/app-1234-20131114-130503/
session-20131114-121000/ust/pid/otherapp-1522-20131114-131033/
session-20131114-121000/ust/pid/app-1810-20131114-131504/


Note that no distinction needs to be made between 32- and 64-bit events in this case since each process has only one bitness.

The structure of trace folder paths (rooted at the sessions root folder) can be summed up in EBNF notation [7] like so:

trace_folder    = [ host_folder, "/" ], session_folder, "/",
                  [ snapshot_folder, "/" ], domain_folder, "/" ;
host_folder     = ? host network name2 ? ;
session_folder  = ? session name ?, "-", timestamp ;
snapshot_folder = ? snapshot name ?, "-", timestamp, "-", sequence ;
domain_folder   = "kernel" | ( "ust", "/", sub_domain ) ;
sequence        = "0" | "1" | "2" | ? and so on ? ;
sub_domain      = ( "uid", "/", user_folder ) |
                  ( "pid", "/", process_folder ) ;
user_folder     = ? userID ?, "/", bitness ;
process_folder  = ? process_name ?, "-", ? processID ?, "-", timestamp ;
bitness         = "32-bit" | "64-bit" ;
timestamp       = date, "-", time ;
date            = ? date in yyyymmdd format ? ;
time            = ? time in hhmmss format ? ;

Figure 17: EBNF grammar of trace folder paths.

The structure is also summarised by Figure 18 below.
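As a cross-check of that grammar, a trace folder path can be assembled programmatically. The helper below is an illustrative sketch only (the function name and parameters are made up for this example); it follows the EBNF of Figure 17:

```python
"""Build trace folder paths following the grammar of Figure 17 (illustrative)."""

def trace_folder(session, timestamp, domain, host=None, snapshot=None,
                 uid=None, bitness=None, pid_folder=None):
    """Assemble a trace folder path, relative to the sessions root folder.

    domain is "kernel" or "ust"; for "ust", supply either uid+bitness
    (per-user ID buffering) or pid_folder (per-process ID buffering).
    snapshot, if given, is a (name, timestamp, sequence) triple.
    """
    parts = []
    if host:
        parts.append(host)                      # host_folder (remote traces only)
    parts.append(f"{session}-{timestamp}")      # session_folder = name-timestamp
    if snapshot:
        name, ts, seq = snapshot
        parts.append(f"{name}-{ts}-{seq}")      # snapshot_folder
    if domain == "kernel":
        parts.append("kernel")                  # no sub-domain structure
    elif pid_folder:
        parts += ["ust", "pid", pid_folder]     # per-process ID scheme
    else:
        parts += ["ust", "uid", str(uid), bitness]  # per-user ID scheme
    return "/".join(parts) + "/"
```

Running it against the examples above reproduces paths such as session-20131114-120000/ust/uid/1001/32-bit/.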

3.2.3 Snapshots

A tracing session may be run in snapshot mode. This is meant to be used when one is not interested in a complete trace, but would rather let a sliding window of event records exist in memory, which is only occasionally copied to storage. For example, one could run a snapshot-mode trace until a certain application freezes, and only then copy the contents of the trace buffers to storage for analysis.

Each snapshot generates a separate trace folder. For example: session-20131113-161000/snapshot-1-20131113-164512-0/kernel/

2 This normally follows RFC (Request For Comments) 1034:1987 [8], which means it starts with a letter (a through z or A through Z) followed by at most 62 other characters (letter, digit, or hyphen).

Figure 18: Summary of the trace folder path structure.

You create a snapshot-mode session by adding the --snapshot option to the create command. You then take snapshots as required with the snapshot record command. For instance:

$ lttng create snappy --snapshot
Session snappy created.
Default snapshot output set to: […]
Snapshot mode set. Every channel enabled for that session will be
set in overwrite mode and mmap output.
(Configure session)
$ lttng start
Tracing started for session snappy
(Wait for the right moment)
$ lttng snapshot record
Snapshot recorded successfully
(Repeat as appropriate)
$ lttng destroy
Session snappy destroyed

Figure 19: Example LTTng snapshot-mode tracing session.

3.3 Domains

There are two domains in Linux: the kernel and user-space. There are actually multiple user-spaces (one per user account), which cannot see inside each other. As the super-user (the root user), you have the privilege of seeing what goes on in all user-spaces. Any running process belongs to either of these domains, and LTTng can trace all of them.

Within a tracing session, the domains are segregated: they have separate sets of channels and events. Channels in either domain can have the same short name; think of a channel’s “fully qualified name” as being made up of the domain name and the channel name.

The forthcoming version 2.4 of LTTng adds a third domain, jul (--jul or -j), a specialised user-space dedicated to the events generated by the java.util.logging Java facilities.

Most commands require that the domain of interest be specified. This is done through the kernel (--kernel or -k) and userspace (--userspace or -u) options.

Figure 20: LTTng hierarchy. The session manager is unique. The domains are system-defined and limited in number. The channels and events are a mix of system-defined and user-defined. Tracing activity is controlled at the session, channel, and event levels.

3.4 Channels

Channels are containers for events. They have arbitrary names, and you can freely enable and disable channels even while a trace is running. This affords you a way of quickly turning on and off groups of events.


If needed, LTTng will define an initial default channel named channel0 for you. You are free to define additional ones using the enable-channel command. Channels are disabled with the disable-channel command. Most commands allow the channel of interest to be specified through the channel option (--channel or -c). For example:

$ lttng create channelly Session channelly created. Traces will be written in […] $ lttng enable-channel channel0 -k Kernel channel channel0 enabled for session channelly $ lttng enable-channel channel0 -u UST channel channel0 enabled for session channelly $ lttng disable-channel channel0 -k Kernel channel channel0 disabled for session channelly […]

Figure 21: Example LTTng client commands enable- and disable-channel.

Note that the enable-channel and disable-channel commands don’t yet have an all option.

3.4.1 Context

LTTng provides a command (add-context) to attach “context” to the event records of a channel. These values may be useful during analysis of the trace, and may be thought of as “additional payload” (see Section 3.5). The context variables available depend on the domain as well as on your system configuration. In user-space, you can add the ip, pthread_id, procname, vpid and vtid context values. In the kernel domain, you can add procname, vpid, vtid, hostname, nice, pid, ppid, prio, tid, and vppid as well as a large number of perf fields (unless the kernel is configured to omit those). You use the add-context --help command to find out the specific fields available for your configuration.

If the channel option is omitted, context is added to all of the channels that have been created so far. Here is an example:

$ lttng enable-channel mch --kernel
Kernel channel mch enabled for session mysession
$ lttng add-context --kernel --channel mch --type nice --type tid
kernel context nice added to channel mch
kernel context tid added to channel mch
$ lttng add-context -k --type procname
kernel context procname added to all channels

Figure 22: Example LTTng client command add-context.

Unfortunately, the list command currently does not display what context has been added to channels, so you’ll need to keep track of that yourself. Also note that context cannot be removed once added.


3.4.2 Buffering schemes

Channels have a direct correspondence with the buffers used by LTTng to achieve tracing, and these buffers in turn have a direct correspondence with the files that are stored in the trace folders, as briefly touched upon at the beginning of Section 3.2.1. Recall that, as explained in Section 3.1 and shown by Figure 9, the event writers (producers) add event records to the buffers while the event readers (consumers) move those records to permanent storage.

In the kernel domain, each channel uses as many buffers as there are processor cores. In the user-space domain, there may be further multiplication of the number of buffers, depending on the buffering scheme (see below) and circumstances. Each buffer is itself a collection of sub-buffers organised in circularly linked fashion. You decide how each of those buffers and sub-buffers are configured when you create each channel. For instance, suppose you do this:

$ lttng create mysession
Session mysession created.
Traces will be written in […]
$ lttng enable-channel mychannel --kernel
Kernel channel mychannel enabled for session mysession

Assuming the machine has a dual core, this will create two buffers, one for each CPU core. The buffers will be committed to storage as the files mychannel_0 and mychannel_1. Once a channel has been created by being enabled ex nihilo, it cannot be modified: you can only switch it on or off by enabling or disabling it.

As mentioned in Section 3.1.5, there are at most three consumer daemons: one services the kernel domain, the other two service the user-space domain (one daemon for each bitness). Each consumer daemon services in round-robin fashion all of its domain’s channels, for all the sessions.

In the example, no channel attributes were specified, so the kernel defaults were used. This means that:

The two channel buffers are in discard mode, meaning that should the writers overwhelm the reader, incoming events will be discarded (lost) because the writers won’t pass the reader (this is known as an upstream or early event loss).

The sub-buffers are each 256 kibibytes (262,144 bytes) in size, and there are four of them3.

The output files are unlimited in size and will each be in a single chunk.

The sub-buffers become available for consumption by the reader when they fill up (and the writers move on to the next sub-buffer), and the reader checks for their availability every 200 milliseconds (200,000 microseconds).

Finally, access to the sub-buffers internally uses the splice() system call.

The default channel creation command is equivalent to this fully qualified one:

$ lttng enable-channel mychannel --kernel --discard \
    --subbuf-size 256K --num-subbuf 4 --tracefile-size 0 \
    --tracefile-count 0 --switch-timer 0 --read-timer 200000 \
    --output splice --buffers-global
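To put these defaults in perspective, a channel’s buffer memory footprint is simply sub-buffer size × sub-buffer count × number of buffers. A sketch in plain shell arithmetic (the core count of 2 assumes the dual-core example machine; this estimates buffer memory only, not LTTng’s total overhead):

```shell
subbuf_size=$((256 * 1024))   # --subbuf-size 256K, in bytes
num_subbuf=4                  # --num-subbuf 4
cores=2                       # kernel domain: one buffer per CPU core
echo "$((subbuf_size * num_subbuf * cores)) bytes"
# prints: 2097152 bytes
```

With the defaults, each kernel channel thus ties up 2 MiB of buffer memory on this machine.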

Any or all of these settings can be changed, at channel creation time, like this:

$ lttng enable-channel channel2 --kernel --overwrite \
    --subbuf-size 64K --num-subbuf 16 --tracefile-size 1G \
    --tracefile-count 5 --switch-timer 1000000 --read-timer 0 \
    --output mmap --buffers-global

This means that:

- The two channel2 buffers are in overwrite mode (also known as flight recorder mode), meaning the writers will not wait for the reader: if the reader can’t cope, the writers will pass it and overwrite the as-yet-unread sub-buffers ahead of the reader, causing the loss of the event records contained therein (this is known as a downstream or late event loss).
- The sub-buffers are each 64 kibibytes (65,536 bytes) in size, and there are sixteen of them (actually, in overwrite mode there will be seventeen sub-buffers; the extra sub-buffer serves as a “passing lane” for the writers, allowing them to write into a new copy of the sub-buffer while the reader continues reading from the older copy).
- The output files are limited to a size of one gibibyte, and there may be up to five chunks of this size (channel2_0_0, channel2_0_1 and so on until channel2_1_4). Once five gibibytes’ worth of files has been written to storage for a given CPU core, the file chunks are re-used, creating a circular buffer of files of sorts.
- The sub-buffers become available for consumption at least every second: if the events are few and far between, every second the current sub-buffer is padded and the writers pushed ahead to the next one.
- The reader does not check for sub-buffer availability but is instead signalled whenever a writer fills a sub-buffer.
- Finally, access to the sub-buffers internally uses the mmap() system call.
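The tracefile settings put a hard ceiling on this channel’s disk usage: at most tracefile-size × tracefile-count bytes per CPU core before the chunks start being re-used. A sketch in plain shell arithmetic (again assuming the dual-core example machine):

```shell
tracefile_size=$((1024 * 1024 * 1024))  # --tracefile-size 1G, in bytes
tracefile_count=5                        # --tracefile-count 5
cores=2                                  # one set of chunk files per CPU core
echo "$((tracefile_size * tracefile_count * cores)) bytes"
# prints: 10737418240 bytes
```

So channel2 can never occupy more than 10 GiB on disk, however long the trace runs.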

Note that the last command option is unchanged: it controls the buffering scheme, and there is only one buffering scheme option in the kernel domain (as already mentioned in Section 3.2.1).

Not so in the user-space domain. There, the buffering scheme can be either buffers-uid (the default) or buffers-pid. In the former case, the channel creates a buffer for each user ID which generates events (see the resulting trace folders in Section 3.2.1). In the latter case, buffers-pid, the channel creates a buffer for each process which generates events. As instrumented processes are launched, buffers and trace folders are created. As the instrumented processes terminate, the traces are finalised and the buffers deleted or re-used. This will naturally result in a great number of smaller separate traces, compared to the per-user ID buffering scheme.
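The buffering scheme is chosen when the first user-space channel is created. As a sketch (perpid is a placeholder channel name; since buffers-uid is the default, only buffers-pid ever needs to be spelled out):

```shell
$ lttng enable-channel perpid --userspace --buffers-pid
```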

Important: The first channel created in each domain determines the buffering scheme used by all channels later created in that domain. In other words, in user-space you cannot mix per-process and per-user buffering. If you need both, you must create two parallel tracing sessions.

In snapshot mode (see Section 3.2.3), the relation between writers and readers is slightly different. The writers behave as with any overwrite-mode session (no snapshot session channel may be in discard mode), but the readers remain inactive until a snapshot record command is issued. When this happens, the readers determine the current extent of the data in the buffers based on the current writers’ position (basically from just ahead of the writers all the way around to just behind them) and then attempt to read those sub-buffers in sequence. If a sub-buffer has been overwritten by the time its reader gets to it, it is skipped. Thus, if the writers are very active, the readers could lose a number of initial (older) sub-buffers.

Note that a snapshot-mode per-process ID trace will capture only the processes that are active at the time the snapshot record command is issued; the buffers of any processes which may have come and gone before are lost.

3.5 Events

An event occurs when some thread of execution hits a specific instruction. This can be an intrinsic or extrinsic event. Intrinsic events are pre-planned, and usually take the form of a tracepoint call (system calls are also intrinsically instrumented, as will be seen shortly). Source code that has had tracepoint calls added to it is said to be instrumented. Extrinsic events, by contrast, are injected or inserted into an application’s memory image, after the fact. Facilities that can set up such extrinsic events allow arbitrary code to be instrumented “without the code’s consent”, so to speak. Extrinsic events typically take the form of traps (software interrupts) that trigger execution of a trap handler. Setting a breakpoint within an IDE (Integrated Development Environment) is a very similar process.

In the trace, each event record consists of a large number of fields which aid in the analysis. The only variable parts of the event records are the context fields which you may have added to some channels, and each event’s set of “payload” fields. The latter is specific to each event. You can use the babeltrace application (see Section 4.3) to produce human-readable output that filters out the fields you’re not interested in.

Events are defined (and later re-enabled) using the enable-event command. Events are disabled with the disable-event command. You may freely define, enable and disable individual events (or groups of events in user-space) during tracing. Here are some examples of the two commands:

$ lttng create eventful
Session eventful created. Traces will be written in […]
$ lttng enable-event sched_switch --kernel
kernel event sched_switch created in channel channel0
$ lttng disable-event sched_switch -k
kernel event sched_switch disabled in channel channel0 for session eventful
$ lttng enable-event --userspace --all
All UST events are enabled in channel channel0 […]

Figure 23: Example LTTng client commands enable-event and disable-event.

Be careful when defining an event: you only get one shot at specifying any event modifiers (event type, loglevel, filter). Once an event is created, later invocations of the enable-event command serve only to re-enable it after it has been disabled. Event definitions are immutable and cannot be deleted.

If you don’t need to distribute events between multiple channels, you may omit the channel option of most commands and LTTng will assign all your events to the default channel channel0. However, if you define any other channels, you will need to specify the channel assignation of each event. The assignation of events to channels works a little differently depending on the domain, so each one will be examined in turn.

3.5.1 Kernel events

In the kernel domain, you can only assign an event once: each kernel event may be assigned to no more than one channel.

The kernel contains about 260 tracepoints as of this writing. The number actually available to you will depend on your kernel’s build configuration (for instance, if CONFIG_TRACEPOINTS is turned off, there won’t be any tracepoints at all) and loaded modules. You can obtain the list of currently available tracepoints with the list -k command.

Enabling and disabling kernel events is done with the enable-event and disable-event commands. The name(s) supplied to the commands must be complete and match exactly (remember that Linux is nearly universally case-sensitive). You can omit the name(s) only when using the all command option. Examples:

$ lttng enable-event -k sched_switch
kernel event sched_switch created in channel channel0
$ lttng enable-event -k sched_kthread_stop,sched_pi_setprio
kernel event sched_kthread_stop created in channel channel0
kernel event sched_pi_setprio created in channel channel0
$ lttng enable-event --kernel --all
All kernel events are enabled in channel channel0
$ lttng disable-event -k -a
All kernel events are disabled in channel channel0
$ lttng enable-event -k -a --tracepoint
All kernel tracepoints are enabled in channel channel0

Figure 24: Example LTTng kernel event enabling/disabling commands.

You will need to look at the kernel source code to understand in detail where each kernel event occurs and what its payload fields are, although the names are fairly self-explanatory in most cases. For example, the sched_wakeup payload fields are comm, tid, prio, success, and target_cpu: the first is a character string (the command line which launched the target thread, truncated to 15 characters), the others are integers (respectively the thread ID, the thread’s priority, the command’s success Boolean, and the ID of the target CPU).

Currently, filters and loglevels (see Section 3.5.2) are not implemented for the kernel domain. However, this is a feature which LTTng intends to eventually support.

In addition to tracepoints, LTTng supports a few more kernel event types, detailed below.

3.5.1.1 System calls

System call tracing complements the kernel tracepoints. There are about as many system calls as there are kernel tracepoints: nearly 300, although this number varies somewhat with the underlying architecture and the kernel version.

System calls, in Linux, are the gateways into the kernel. The operating system funnels them through a redirection layer, the system call dispatch table, that maps each system call to a service routine. This dispatch table is instrumented for LTTng, allowing you to trace the service routine entry and exit events. Note that many familiar C library functions are not system calls themselves: for example, malloc(), calloc() and free() obtain and release memory through the brk() system call. There is thus a sys_brk event, but no sys_malloc, sys_calloc nor sys_free events. Another important routine is the “not implemented” place-holder, which translates as sys_ni_syscall events.

Currently, technical constraints make system call tracing an all-or-nothing affair; if you wish to trace only selected system calls, you will have to filter the unwanted ones during trace analysis. Additionally, system call tracing cannot be turned off (disabled) once turned on. You can, however, achieve the same effect by assigning the system calls to a separate channel and enabling and disabling it instead.

You enable system call tracing by specifying the syscall event type in the enable-event command, like this:

$ lttng enable-event --kernel --syscall --all --channel syscallchannel
All kernel system calls are enabled in channel syscallchannel

In the resulting trace, each system call service routine entry will appear as a sys_ event (e.g. each call to brk() appears as a sys_brk event). The corresponding system call exit event (system calls are never nested) appears as an exit_syscall event, so named to avoid any confusion with the exit() service routine’s sys_exit event. System calls which the dispatch table doesn’t recognise translate as sys_unknown events.

On 64-bit systems, there is a second set of 32-bit retro-compatible system calls, accessed through a separate dispatch table, that are used by 32-bit applications. The corresponding system call event names are further prefixed by compat_ (e.g. compat_sys_brk, compat_sys_exit, compat_sys_ni_syscall, compat_sys_unknown). The exit_syscall event keeps the same name [9].

Note that when you enable all kernel events (lttng enable-event --kernel --all) without specifying the default tracepoint type, the system calls are enabled as well if they were as yet unassigned.

3.5.1.2 Kprobes and Kretprobes

LTTng integrates the pre-existing kprobe/kretprobe mechanism into its set of supported events. Kprobe is a dynamic (extrinsic) probing mechanism applicable to the kernel domain (similar services will soon be integrated into LTTng for the user-space domain). You simply designate an address within kernel space to watch, and an event will be fired whenever the kernel executes this address.

The address can be explicit or symbolic (you can use any kernel code section symbol), and you can also specify a symbol+offset address. For example:

$ lttng enable-event -k kernel_sys_open --probe sys_open+0x01
kernel event kernel_sys_open created in channel channel0

The long list of eligible kernel code symbols can be obtained by the following command:

$ cat /proc/kallsyms | grep " [Tt] "
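As a sketch of what that pipeline keeps, here it is applied to a few fabricated /proc/kallsyms-style lines (the addresses and symbol names are made up): the " [Tt] " pattern retains only global (T) and local (t) text-section, i.e. code, symbols.

```shell
printf '%s\n' \
  'ffffffff81000000 T _stext' \
  'ffffffff81020300 t sys_open' \
  'ffffffff81a00000 D loops_per_jiffy' | grep " [Tt] "
# prints the _stext and sys_open lines; the data (D) symbol is dropped
```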

The event will have a single field in its payload: ip, the instruction pointer register value (i.e. the probe’s address). It is otherwise indistinguishable from a kernel tracepoint.

You can use kretprobe instead of kprobe by using the function event option instead of the probe event option. Although this is listed and controlled (enabled/disabled) as a single event, there will be two distinct events, EVENT_NAME_entry and EVENT_NAME_return, in the trace. They occur upon function entry and return, respectively. The event payloads have an additional field, parent_ip, the address from whence the function was called.

You can probe the same address repeatedly (if you wanted the same event in multiple channels, for instance) as long as you give a different name to each event. The probes stack, meaning that the last one installed fires its event first.

3.5.2 User-space events

In the user-space domain, events can be assigned to multiple channels, unlike the kernel domain. Currently there are no event types other than tracepoints. Instrumenting an application so that it emits tracepoints is explained in Section 4.2.

A major difference between the kernel and user-space domains is that the set of tracepoints available at any given moment is truly arbitrary, as it depends on which running applications are instrumented. If you ask for the list of user-space tracepoints with the list -u command, you will get an empty list unless some instrumented applications are currently running. For this reason, the session manager does not resolve the enabling and disabling of user-space tracepoints immediately, unlike in the kernel domain.

The payload of user-space events is unknown beforehand unless you’ve taken a look at the source code of the instrumented application. LTTng makes no assumptions about this payload and lets the writer (part of the instrumented application) write what it wants in the buffers. In particular, two different applications could declare homonymous user-space events that have completely different payloads. It is even possible for a single instrumented application to dynamically switch its tracepoint provider libraries during its execution, changing the payloads of its events. This causes no difficulties to LTTng nor to babeltrace.

3.5.2.1 Enabling and disabling user-space events

Instrumented applications, when they start running, register the names of the events they may emit with the session manager. The latter then compares those names with those the user has designated as enabled; if they match, the events are enabled. Similarly, whenever you define a new user-space event or set of events of interest, the names of the currently registered events are checked, and appropriate action taken.

This way, you can configure your trace completely before starting it, and it will capture the events you want it to whenever instrumented applications run, however fleetingly.

Most of the time, you will simply enable all user-space events. This can be done in two equivalent ways (although the client’s feedback looks a little different):

$ lttng enable-event --userspace "*"
UST event * created in channel channel0
$ lttng enable-event --userspace --all
All UST events are enabled in channel channel0

LTTng maintains what can be thought of as a “checklist” of user-space event names, paired with an “enabled” or “disabled” tag. Wildcards can be used (unlike in the kernel domain), but only at the end of the event name entries. For example, this command enables all user-space events whose names begin with “tr”:

$ lttng enable-event -u "tr*"
UST event tr* created in channel channel0

To determine whether an event should be enabled, the session manager tries to match the candidate’s name against each of the enabled rules in its list (disabled rules are ignored); as soon as a match is found, the event is enabled. The disable-event command does not so much disable events as disable a rule from the list. Consider this example:

$ lttng enable-event --userspace --all
All UST events are enabled in channel channel0
$ lttng disable-event --userspace "*"
UST event * disabled in channel channel0 for session session
$ lttng enable-event -u "tr*"
UST event tr* created in channel channel0
$ lttng enable-event -u "tra*"
UST event tra* created in channel channel0
$ lttng disable-event -u "tr*"
UST event tr* disabled in channel channel0 for session session
$ lttng list session
[…]
Events:
  tra* (type: tracepoint) [enabled]
  tr* (type: tracepoint) [disabled]
  * (type: tracepoint) [disabled]

Figure 25: Example LTTng user-space event enabling/disabling commands.

Note how the disable-event -u "tr*" command was matched only with the "tr*" rule: the wildcard character was treated as a literal. All events whose names begin with "tra" will be traced.
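This rule evaluation can be sketched with plain shell globbing (a simplified model, not LTTng code; as at the end of the example above, only the "tra*" rule is enabled):

```shell
# matches RULE NAME: succeeds if NAME matches RULE (trailing * wildcard only)
matches() {
  case "$2" in
    $1) return 0 ;;   # unquoted $1 is interpreted as a glob pattern
    *)  return 1 ;;
  esac
}
for name in tracker trim; do
  if matches "tra*" "$name"; then
    echo "$name: traced"
  else
    echo "$name: not traced"
  fi
done
# prints: tracker: traced
#         trim: not traced
```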

The disable-event -u --all command disables all rules, whereas the disable-event -u "*" command disables just the "*" rule, if it exists (this distinction does not exist in the kernel domain because wildcards are not allowed there):

$ lttng enable-event --userspace "tr*"
UST event tr* created in channel channel0
$ lttng disable-event -u "*"
Error: Event *: UST event not found (channel channel0, session session)
Warning: Some command(s) went wrong

The forthcoming 2.4 version adds an exclusion option to the enable-event command. For example, this command will enable all user-space events whose names begin with "tr" except for those that match "trace":

$ lttng enable-event -u "tr*" -x "trace"

3.5.2.2 LogLevels

When they are defined during instrumentation, user-space tracepoints have a logging level (loglevel for short) assigned. This ranges from TRACE_DEBUG up to TRACE_EMERG (see the LTTng Comprehensive User’s Guide or issue the lttng enable-event --help command for the complete list). The loglevel determines the importance of the message (the Linux syslog system logging facility uses a similar set of logging levels). When defining the event you want to enable, you can specify an importance threshold or matching requirement.

The loglevel option will enable only those events that have a logging level matching the specified one or better (more important), while the loglevel-only option will enable only those events that match that logging level exactly. For example:

$ lttng enable-event -u "tr*" --loglevel TRACE_WARNING
UST event tr* created in channel channel0
$ lttng enable-event -u "tri*" --loglevel-only TRACE_INFO
UST event tri* created in channel channel0

The current feedback is rather laconic, and the lttng list session command doesn’t distinguish between loglevel and loglevel-only events. This should be fixed in an upcoming version.
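Numerically, importance runs opposite to the value: TRACE_EMERG is 0, so “matching or better” means a numeric level less than or equal to the threshold. A sketch of that comparison, using the LTTng-UST values TRACE_ERR = 3, TRACE_WARNING = 4 and TRACE_INFO = 6:

```shell
TRACE_ERR=3; TRACE_WARNING=4; TRACE_INFO=6
threshold=$TRACE_WARNING      # --loglevel TRACE_WARNING
for lvl in $TRACE_ERR $TRACE_WARNING $TRACE_INFO; do
  if [ "$lvl" -le "$threshold" ]; then
    echo "level $lvl: enabled"
  else
    echo "level $lvl: skipped"
  fi
done
# prints: level 3: enabled
#         level 4: enabled
#         level 6: skipped
```

The loglevel-only option would instead keep only the exact match, level 4.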

3.5.2.3 Filters

Finally, a filter can be attached to the event. This is a simple Boolean expression that must be satisfied (evaluate to true) in order for the event to be traced. This is different from the preceding enabling rules: with them, the session manager does a one-time match of the instrumented application’s declared event against its rules, and then imperatively enables or disables the event at its source. When using a filter, the trace provider code is always executed and it is only once the event record is ready to be written into the buffer that the trace provider can evaluate the filter condition. If it fails, the event record is simply discarded at that point without being written to the buffer. Considering the fairly high overhead of filter evaluation, it may be preferable to enable the events without a filter and let the trace analysis application filter out the unwanted event records.

Filters can test any of the event’s payload fields or context fields (see 3.4.1) using a small subset of the C operators: the comparisons (==, !=, >, >=, <, <=), the Booleans (!, &&, ||), and parentheses. Strings admit a wildcard (*) at their end only, and context fields are accessed using the “$ctx.” prefix. Here are some examples:

$ lttng enable-event -u "tru*" -f "intfield > 500 && intfield < 503"
Filter 'intfield > 500 && intfield < 503' successfully set
$ lttng enable-event -u "eve*" --filter '$ctx.procname == "sample_*"'
Filter '$ctx.procname == "sample_*"' successfully set

The first example will let through user-space events whose name begins with "tru", which have an integer payload field named intfield, and when that field’s value is either 501 or 502. All of these conditions must be met. For instance, an event named "truman" which does not have an intfield payload field would be rejected. The second example accepts events whose name begins with "eve", but only if issuing from a process whose name begins with "sample_". Note that you need not add context to a channel in order to use a context field in a filter: context is always available to filters.

4 Use cases

4.1 Tracing the kernel

The most important trace design decision is what to trace. You get the current list of kernel tracepoints through the lttng list -k command. The output starts with a two-line header and continues with event names in the following format:

" %s (loglevel: %s (%d)) (type: tracepoint)"

where the first string is the event name, the second string its logging level, and the decimal value the logging level’s numerical value (starting with zero for the highest logging level, TRACE_EMERG). The event names are not sorted. Here is a sample:

$ lttng list -k
Kernel events:
-------------
      writeback_nothread (loglevel: TRACE_EMERG (0)) (type: tracepoint)
      mm_shrink_slab_end (loglevel: TRACE_EMERG (0)) (type: tracepoint)
      timer_start (loglevel: TRACE_EMERG (0)) (type: tracepoint)
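Since the client leaves the names unsorted, piping everything past the two-line header through sort gives an alphabetical list; sketched here against the sample lines above rather than live client output:

```shell
printf '%s\n' \
  '      writeback_nothread (loglevel: TRACE_EMERG (0)) (type: tracepoint)' \
  '      mm_shrink_slab_end (loglevel: TRACE_EMERG (0)) (type: tracepoint)' \
  '      timer_start (loglevel: TRACE_EMERG (0)) (type: tracepoint)' | sort
# prints the three lines in the order: mm_shrink_slab_end, timer_start,
# writeback_nothread
```

Against a live session manager this would be lttng list -k | tail -n +3 | sort (assuming the header really is two lines, as above).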

Tracing system calls is just a matter of enabling the syscall events, preferably in a dedicated channel:

$ lttng enable-event -k --syscall --all -c syscalls
All kernel system calls are enabled in channel syscalls

Other pertinent trace events may be defined using the kprobe (--probe) and kretprobe (--function) facilities.

Adding your own kernel tracepoints to a kernel module or a custom kernel is a more advanced topic covered in the LTTng Comprehensive User’s Guide.

4.2 Tracing user-space

Tracing user-space is pretty much the same as tracing the kernel (and may be done in concert) if you are dealing with pre-instrumented applications. To design your own tracepoint providers for integration into your own applications, see Sections 4.4 and 4.5.

4.3 Using babeltrace

The babeltrace trace converter reads Common Trace Format (CTF) traces, such as those produced by LTTng. Its main use is to pretty-print CTF traces into a human-readable text output. The command line is:

$ babeltrace PATH [PATH2 …] [OPTIONS]

The trace is read from the PATH, which is typically the trace’s session folder (see 3.2.2). Multiple PATHs can be specified using separate arguments. Babeltrace does not require that the folder structure be as produced by LTTng; in fact it simply traverses the supplied PATHs, looking for folders that contain metadata files and their accompanying channel files. For instance, if you run babeltrace on your sessions root folder (~/lttng-traces), it will merge all of the traces contained therein into its output. Babeltrace’s output is always sorted in strictly increasing event timestamp order.

A typical babeltrace command line would be something like this:

$ babeltrace ~/lttng-traces/mysession-20131225-000000 \
    -w ~/lttng-traces/myprintout --no-delta \
    --fields loglevel,trace:procname --names payload,context

The default settings are usually a pretty good choice, so you can omit all options save the all-important w or output one (if you don’t specify the output option, babeltrace simply outputs to the console, which is rarely desirable). For details of the numerous babeltrace options and the significance of each part of the output lines, consult the LTTng Comprehensive User’s Guide.

4.3.1 Examples

Suppose the instrumented application uses a single tracepoint provider, sample, which produces the following five events:

noticevoid     (no payload)
infoint        intfield: an integer
dbgsyslong3    long3field: a fixed-size array of 3 longs
dbgproglongs   longsfield: a variable-size array of longs
event          message: a character string

The events have different logging levels, as will be seen later.

Further suppose that the trace captured all events in channel0 and duplicates of the debug events in channel chlong. Two such traces were run in parallel, one using the default per-user ID buffering scheme (sampler_uid), the other using the alternate per-process ID scheme (sampler_pid). Then babeltrace was used to make printouts of the traces:

$ babeltrace ~/lttng-traces/sampler_uid \
    --output ~/lttng-traces/sampler_uid_printout
$ babeltrace ~/lttng-traces/sampler_pid \
    -w ~/lttng-traces/sampler_pid_printout

The first few lines of sampler_uid_printout are:

[10:23:21.949289211] (+?.?????????) tracy sample:event: { cpu_id = 0 }, { message = "Hello World 0" }
[10:23:21.949294910] (+0.000005699) tracy sample:noticevoid: { cpu_id = 0 }, { }
[10:23:21.949295652] (+0.000000742) tracy sample:infoint: { cpu_id = 0 }, { intfield = 0 }
[10:23:21.949298080] (+0.000002428) tracy sample:dbgsyslong3: { cpu_id = 0 }, { long3field = [ [0] = 1, [1] = 10, [2] = 100 ] }
[10:23:21.949300519] (+0.000002439) tracy sample:dbgsyslong3: { cpu_id = 0 }, { long3field = [ [0] = 1, [1] = 10, [2] = 100 ] }
[10:23:21.949300889] (+0.000000370) tracy sample:dbgproglongs: { cpu_id = 0 }, { _longsfield_length = 0, longsfield = [ ] }
[10:23:21.949301174] (+0.000000285) tracy sample:dbgproglongs: { cpu_id = 0 }, { _longsfield_length = 0, longsfield = [ ] }
[…]
[10:23:21.953053322] (+0.000014517) tracy sample:dbgproglongs: { cpu_id = 0 }, { _longsfield_length = 2, longsfield = [ [0] = 100, [1] = 1000 ] }
[…]

Figure 26: Sample babeltrace output for a per-userID trace.

The contents of sampler_pid_printout are the same except that the hostname field (“tracy”) is suffixed by the process name and ID (“:sample_static:29466”):

[10:23:21.949289222] (+?.?????????) tracy:sample_static:29466 sample:event: { cpu_id = 0 }, { message = "Hello World 0" }
[10:23:21.949295110] (+0.000005899) tracy:sample_static:29466 sample:noticevoid: { cpu_id = 0 }, { }
[…]

Babeltrace can produce much more verbose and detailed output, with long lines such as this one:

timestamp = 10:23:21.949289211, delta = +0.000005699, trace = /home/user/lttng-traces/sampler/ust/uid/1000/64-bit, trace:hostname = tracy, trace:domain = ust, loglevel = TRACE_WARNING (4), name = sample:event, stream.packet.context = { timestamp_begin = 1648769605807958, timestamp_end = 1648769805477160, content_size = 1048560, packet_size = 1048576, events_discarded = 0, cpu_id = 0 }, stream.event.header = { id = ( extended : container = 31 ), v = { extended = { id = 0, timestamp = 1648769618291820 } } }, event.fields = { message = "Hello World 0" }

Figure 27: Sample babeltrace verbose (all names and fields) output.

The first line of sampler_pid_printout warrants closer examination:

[10:23:21.949289222] (+?.?????????) tracy:sample_static:29466 sample:event: { cpu_id = 0 }, { message = "Hello World 0" }

It begins with a time-stamp, expressed in hours:minutes:seconds down to the nanosecond. Next is the time elapsed since the previous event record, in seconds. Since this is the first event record of the trace, no such value can be computed, hence the question marks. The event’s point of origin is then identified as occurring on the tracy host, process name sample_static, process ID 29466. The event itself is named sample:event and was emitted on processor number 0. The payload follows, which in this case consists of a single character string field named message with a value of "Hello World 0". The LTTng Comprehensive User’s Guide explains how each type of payload field is rendered by babeltrace.
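For quick scripting over such printouts, the fixed leading fields can be split mechanically. A sketch with awk applied to that same line (the field positions assume babeltrace’s default single-line output; this is an illustration, not an official parser):

```shell
line='[10:23:21.949289222] (+?.?????????) tracy:sample_static:29466 sample:event: { cpu_id = 0 }, { message = "Hello World 0" }'
# Field 1 is the bracketed timestamp, field 3 the point of origin,
# field 4 the event name (with a trailing colon to strip).
echo "$line" | awk '{ gsub(/[][]/, "", $1); sub(/:$/, "", $4); print $1, $3, $4 }'
# prints: 10:23:21.949289222 tracy:sample_static:29466 sample:event
```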

There are a few limitations with babeltrace’s current output. In particular, there is no way to include the channel names. Also note how the timestamps of the members of the dual-traced events (the dbgsyslong3 pair and the dbgproglongs pair) differ. This is a known feature of LTTng traces: the timestamp given to each event record is actually the time at which space is reserved for it in the buffer. Because multiply-traced events are sent to different buffers (different channels and/or different tracing sessions) and LTTng isn’t instantaneous (the very act of getting a time-stamp from the system’s high-resolution timers takes time, so there is no way to get the “actual time” at which the tracepoint invocation starts to occur), the timestamps will stray from each other. This will rarely be an issue, but it is nevertheless something to keep in mind when tracing events in multiple channels and/or sessions.

4.4 Instrumenting a C application

This section will explain how you can design your own tracepoint provider for integration into your own application. The tracepoint provider will be made as a shared object that is then linked into the application, the most common case. A variety of other configurations are possible, ranging from static inclusion of the tracepoint provider into the application to dynamic loading of multiple tracepoint providers by the application. For these, consult the LTTng Comprehensive User’s Guide.

The application is not statically aware of the shared object, which allows you to run it either instrumented or not, simply by changing its invoking command line.

Each application module that needs instrumentation simply adds tracepoint() instructions at the appropriate places and includes a tracepoint provider header (e.g. tp.h) in its prologue. The tracepoint provider header is also used to generate the tracepoint provider executable, which can then be linked in various ways to the target application.


There can be an arbitrary number of tracepoint providers used within an application, but they must each have distinct provider names. Each tracepoint provider may define several tracepoint events, each with its own name and arguments. All tracepoints are called using the tracepoint macro, whose first two arguments are the tracepoint provider name and the tracepoint event name. Because fully qualified event names include a tracepoint provider part and an event part, events supplied by two different tracepoint providers may have the same event names without causing any ambiguity.

For demonstration purposes, here is a very simple application (a simplified version of /usr/src/lttng-ust-2.3.0/doc/examples/drdc/sample.c, which the drdc software repository’s lttng-ust package installs):

#include <stdio.h>  //For printf
#include <unistd.h>

short NLOOPS = 10000;

short f(short ceiling, short i) {
    return ceiling - i;
}

int main(int argc, char **argv) {
    int i = 0;
    char themessage[20];  //Can hold up to "Hello World 9999999\0"

    fprintf(stderr, "simple starting\n");
    for (i = 0; i < NLOOPS; i++) {
        sprintf(themessage, "Hello World %u", f(NLOOPS, i));
        usleep(1);
    }
    fprintf(stderr, "simple done\n");
    return 0;
}

Figure 28: The simple.c application.

This simple application simply sleeps 10,000 times for 1 microsecond each time, modifying an internal “Hello World” string every time. It can be compiled, linked and run like this:

$ gcc -c -o simple.o simple.c
$ gcc -o simple simple.o
$ ./simple
simple starting
simple done


A tracepoint will now be added to this application, in order to trace the internal string. First, the source is modified like this:

#include <stdio.h>  //For printf
#include <unistd.h>
#include "tp.h"

int NLOOPS = 10000;

int f(int ceiling, int i) {
    return ceiling - i;
}

int main(int argc, char **argv) {
    int i = 0;
    char themessage[20];  //Can hold up to "Hello World 9999999\0"

    fprintf(stderr, "sample starting\n");
    for (i = 0; i < NLOOPS; i++) {
        sprintf(themessage, "Hello World %u", f(NLOOPS, i));
        tracepoint(sample, event, themessage);
        usleep(1);
    }
    fprintf(stderr, "sample done\n");
    return 0;
}

Figure 29: The instrumented simple.c application.

The added lines are the #include "tp.h" directive and the tracepoint() call. Note that the first two tracepoint macro arguments (the tracepoint provider name and tracepoint event name) must be spelled out literally: they cannot be supplied through string variables. The remaining arguments are variables or constants, just as with regular functions. LTTng currently supports between zero and ten arguments (inclusive). Writing the tracepoint provider (tp.h) is discussed in the following sub-section (4.4.1).


Here is how compiling, linking and running the instrumented application has changed:

$ gcc -I. -fPIC -c -o libtp.o -x c tp.h \
      -D TRACEPOINT_CREATE_PROBES
$ gcc -shared -Wl,-soname,libtp.so.1 -Wl,-no-as-needed \
      -o libtp.so.1.0 -L/usr/local/lib -llttng-ust libtp.o
$ ln -sf libtp.so.1.0 libtp.so.1
$ ln -sf libtp.so.1 libtp.so

$ gcc -I. -c -o simple_instr.o simple.c \
      -D TRACEPOINT_DEFINE -D TRACEPOINT_PROBE_DYNAMIC_LINKAGE
$ gcc -o simple_instr simple_instr.o -L/usr/local/lib -ldl

$ ./simple_instr
simple starting
simple done
$ LD_PRELOAD=./libtp.so ./simple_instr
simple starting
simple done

Figure 30: Compiling, linking and running the application and its tracepoint provider.

The tracepoint provider (packaged in libtp.so) is dynamically included in simple_instr without static awareness. This lack of a declared dependency allows the application to run with or without libtp.so. If libtp.so is not preloaded, the application runs but won't be traceable. Putting libtp.so in the path won't work: the application does not actively look for it. Since the overhead introduced by inert tracepoints is negligible, this allows you to distribute the application without needing to make a separate tracepoint-less build (unless you don’t want the distributed application to be traceable at all, of course).

The -L option simply adds its argument to the list of directories to be searched to satisfy any -l options [10]. It is used here (second and sixth commands) to add /usr/local/lib, where LTTng installs most of its support libraries, because some distributions (notably Ubuntu) don’t include it in the default path.

The first four commands compile and link libtp.so and create the standard versioning symbolic links (libtp.so is assumed to be version 1.0 here).

The first gcc call compiles tp.h, defining the TRACEPOINT_CREATE_PROBES symbol (-D), and specifying position-independent code (PIC) should be generated (-fPIC). Position-independent code is a requirement of all shared object (.so) libraries. The -x c option ensures gcc treats tp.h as a C file instead of a header (.h) file. Without -x c, gcc would try to generate a compiled header (.gch) file, which wouldn’t be usable even if it succeeded.


The second gcc call packages libtp.o into the libtp.so.1.0 shared object library, naming it libtp.so.1 and linking it (-l) with liblttng-ust. Adding the as-needed linker flag (which is the default for certain distributions) would mean the libraries mentioned in the command line (in this case just liblttng-ust) would generate a NEEDED tag in the resulting object only if actually satisfying an otherwise undefined symbol reference. The way tp.h is set up, the resulting library wouldn't NEED liblttng-ust, which would break tracing (the Linux dynamic loader wouldn’t resolve the symbols). Specifying no-as-needed makes certain that the NEEDED tags are included regardless.

The last two gcc calls compile and link the instrumented application. The first difference in the compilation line is that the current directory was added to the header file search path (-I.) so that tp.h can be found. The second difference is that two symbols were defined, TRACEPOINT_DEFINE and TRACEPOINT_PROBE_DYNAMIC_LINKAGE, which instruct tp.h to set up the tracepoint calls in a manner which is compatible with a shared object library. The final gcc call, in addition to telling the linker where to find the LTTng support libraries (-L/usr/local/lib), links the application with libdl (-ldl), the dynamic loading library. This link is also a standard requirement of shared object library usage.

The ./simple_instr line runs the instrumented application without activating the tracepoints. The LD_PRELOAD=./libtp.so ./simple_instr line, finally, runs the instrumented application and activates the tracepoints, simply by virtue of making the tracepoint provider available to the Linux dynamic loader (ld.so; this is the library that prepares an executable image for execution by completing its linking).


4.4.1 Writing the tracepoint provider

A tracepoint provider header file such as tp.h is written using the following template:

#undef TRACEPOINT_PROVIDER
#define TRACEPOINT_PROVIDER sample

#undef TRACEPOINT_INCLUDE
#define TRACEPOINT_INCLUDE "./tp.h"

#if !defined(_SAMPLE_PROVIDER_H) || defined(TRACEPOINT_HEADER_MULTI_READ)
#define _SAMPLE_PROVIDER_H

#include <lttng/tracepoint.h>

#ifdef __cplusplus
extern "C" {
#endif /* __cplusplus */

TRACEPOINT_EVENT(
    sample,
    event,
    TP_ARGS(const char *, text),
    TP_FIELDS(
        ctf_string(message, text)
    )
)

TRACEPOINT_LOGLEVEL(
    sample,
    event,
    TRACE_WARNING)

#ifdef __cplusplus
}
#endif /* __cplusplus */

#include <lttng/tracepoint-event.h>

#endif /* _SAMPLE_PROVIDER_H */

Figure 31: The tp.h tracepoint provider header template.

The variable parts are the provider name (the sample and SAMPLE occurrences), the event name (event), the TP_ARGS and TP_FIELDS contents, and the logging level (TRACE_WARNING).

The “./tp.h” occurrence must refer to the tracepoint provider header file’s own name.


The “sample” and “SAMPLE” occurrences are replaced by the provider’s name. “sample” is the first argument of each “tracepoint()” macro call in the instrumented code.

The “event” occurrences are replaced by the event’s name. “event” is the second argument of each “tracepoint()” macro call.

The TP_ARGS macro is used to specify the types of arguments that can be passed to the tracepoint() macro calls.

The TP_FIELDS macro specifies how the TP_ARGS arguments are translated into payload fields.

The “TRACE_WARNING” occurrence is replaced by the event’s logging level. See the LTTng Comprehensive User’s Guide or issue the lttng enable-event --help command for the complete list of tracepoint logging levels.

Here is a printout, generated using babeltrace, of a trace of the user-space events generated by the instrumented application simple_instr:

$ babeltrace ~/lttng-traces/trace-20140114-103355/
[10:34:29.452245652] (+?.?????????) host sample:event: { cpu_id = 0 }, { message = "Hello World 10000" }
[10:34:29.452396372] (+0.000150720) host sample:event: { cpu_id = 0 }, { message = "Hello World 9999" }
[10:34:29.452484177] (+0.000087805) host sample:event: { cpu_id = 0 }, { message = "Hello World 9998" }
…

The trace was called trace and was started at 10:33:55 on 14 January 2014. For an overview of babeltrace’s use, consult Section 4.3 and the man babeltrace pages; for the details, consult the LTTng Comprehensive User’s Guide.

4.4.2 The tracepoint definition macros

The TRACEPOINT_LOGLEVEL macro may be omitted entirely, in which case the tracepoint defaults to a logging level of TRACE_DEBUG_LINE.


The additional arguments of the tracepoint call (those that appear after the provider and event names) are specified by the TP_ARGS macro as type and name pairs. Each type may be any primitive C type (char, short, long, etc.) or pointer to a primitive type (e.g. char *, short *, long *, etc.; this is also how arrays are passed in). Here are a few simple examples, showing the tracepoint() macro calls (appearing in the instrumented applications) and the corresponding tracepoint definitions (appearing in the tracepoint provider header):

tracepoint(prov, evntA);

    TRACEPOINT_EVENT(prov, evntA,
        TP_ARGS(),
        TP_FIELDS()
    )

tracepoint(prov, evntB, string);

    TRACEPOINT_EVENT(prov, evntB,
        TP_ARGS(char *, textarg),
        TP_FIELDS(
            ctf_string(message, textarg)
        )
    )

tracepoint(prov, evntC, a, b);

    TRACEPOINT_EVENT(prov, evntC,
        TP_ARGS(long int, longa, long int, longb),
        TP_FIELDS(
            ctf_integer(long int, one, longa)
            ctf_integer(long int, two, longb)
        )
    )

Figure 32: Examples of tracepoint calls, each followed by its corresponding definition.

The TP_FIELDS macro defines the payload fields. This is done using a series of space-separated ctf_* calls that each specify an event payload field name and its contents. Below are some event definitions illustrating some of the other possibilities available to you when designing tracepoints. Note in particular that the correspondence between the tracepoint’s arguments and the payload fields is rather loose.


//event1 has a 'message' payload field (event-defined)
TRACEPOINT_EVENT(sample, event1,
    TP_ARGS(),
    TP_FIELDS(ctf_string(message, "Unchanging string"))
)

//event2 has a 'message' payload field (application-defined)
TRACEPOINT_EVENT(sample, event2,
    TP_ARGS(const char *, textarg),
    TP_FIELDS(ctf_string(message, textarg))
)

//event3 has 'number' and 'xored' payload fields
TRACEPOINT_EVENT(sample, event3,
    TP_ARGS(unsigned long int, longarg),
    TP_FIELDS(
        ctf_integer(unsigned long int, number, longarg)
        ctf_integer_hex(unsigned long int, xored, longarg ^ 0xFFFFFFFFL)
    )
)

//event4 has an 'array' payload field (of variable size)
TRACEPOINT_EVENT(sample, event4,
    TP_ARGS(long int *, longs, size_t, arraysize),
    TP_FIELDS(
        ctf_sequence(long int, long_array, longs, size_t, arraysize)
    )
)

//event5 has no payload but does expose a filtering field
TRACEPOINT_EVENT(sample, event5,
    TP_ARGS(short int, value),
    TP_FIELDS(
        ctf_integer_nowrite(short int, filter_switch, value)
    )
)

Figure 33: More examples of tracepoint definitions.

The event3 event uses ctf_integer_hex, which does not change the way the payload is stored in the trace but instructs babeltrace to render the xored field in hexadecimal notation instead of the default decimal. Thus tracepoint(sample, event3, 0x02468ACFL) will generate the payload { number = 38177487, xored = 0xFDB97530 } instead of { number = 38177487, xored = 4256789808 } in the babeltrace output.

The last event, event5, illustrates the nowrite tracepoint mechanism. Instances of event5 accept a signed short integer (16-bit wide) argument in the tracepoint() call, but this value is not written into the trace. Its only purpose is to be used by an eventual filter attached to the event by a tracing session. For instance:

$ lttng enable-event -u "sample:event5" -f "filter_switch > 500"


Several other field specifications (ctf_* calls) exist in order to specify statically or dynamically sized arrays, floating-point values, or network byte ordering. See the LTTng Comprehensive User’s Guide for the complete listing of ctf_* calls.

4.4.3 C application instrumentation summary

- Write the instrumentation instructions in the original application C code, importing the tracepoint provider header as required;

- Write the tracepoint provider C header (use tp.h as a template and simply replace or add tracepoint event declarations as required);

- Compile the C tracepoint provider header into a shared object library;

- Recompile the instrumented C application;

- Run the instrumented C application while pre-loading the shared object library.

4.5 Instrumenting a Java application

With the current 2.3 version of LTTng, there are several ways of instrumenting a Java application for tracing with LTTng.

4.5.1 liblttng-ust-java

A ready-made one is provided by the lttng-ust package, which compiles and installs its liblttng-ust-java subfolder if Java is detected during its own build. This results in the /usr/local/lib/liblttng-ust-java.so shared library, which acts as a container for a particular tracepoint provider named lttng_ust_java. From Java, an org.lttng.ust.LTTngUst object (packaged in liblttng-ust-java.jar) wraps native calls to liblttng-ust-java.so.

LTTngUst is an abstract class with a handful of static methods. From your Java application, you use the org.lttng.ust package, call LTTngUst.init() to connect to the liblttng-ust-java.so native library (this is analogous to using dlopen for the same purpose) and later invoke the tracepoint methods as needed, depending on the payloads:

public static native void tracepointInt(String name, int payload);
public static native void tracepointIntInt(String name,
                                           int payload1, int payload2);
public static native void tracepointLong(String name, long payload);
public static native void tracepointLongLong(String name,
                                             long payload1, long payload2);
public static native void tracepointString(String name, String payload);

Figure 34: The org.lttng.ust.LTTngUst tracepoint methods.


The trace events generated by this library are named <type>_event, where <type> identifies the payload (e.g. int_event, int_int_event, etc.). For example, the LTTngUst.tracepointInt method generates a lttng_ust_java:int_event with a payload consisting of the fields name and int_payload.

Using liblttng-ust-java is a quick-and-dirty solution to Java tracing needs, although it leaves you no choice as to the tracepoint provider name, event names, or payload field names. If you need other payload types (such as float), you will need to augment the liblttng-ust-java code and recompile it.

4.5.2 Java Native Interface (JNI)

A more flexible approach consists in wrapping the tp.h code in an adaptor tp.c to make the libtp.so shared object usable by Java’s JNI facility. This has the advantage that the same shared object tracepoint provider may be used by C and Java applications, and leaves tp.h untouched.


First, here is a simple.java application that does the same thing as simple.c (the added tracepoint call is the tp.tracepoint(themessage) line):

package org.lttng.ust.drdc;

//import org.lttng.ust.drdc.tp;

class simple {

    public static short NLOOPS = 10000;

    public static short f(short ceiling, short i) {
        return (short) (ceiling - i);
    }

    public static void main(String[] args) {
        int i = 0;
        System.out.println("simple starting");
        while (i < NLOOPS) {
            //Prepare themessage in this module
            String themessage = String.format("Hello World %d",
                                              f(NLOOPS, (short) i));
            tp.tracepoint(themessage);
            try {
                Thread.sleep(0, 1000); // 0 ms + 1000 ns = 1 µs
            } catch (InterruptedException ie) {
                //No action
            }
            i++;
        }
        System.out.println("simple done");
    }
}

Figure 35: The simple.java Java application.


The tracepoint facility is supplied by a Java class, org.lttng.ust.drdc.tp. Because it lies in the same package as simple.java, the import line is unnecessary and has been commented out. The tp.java class uses JNI to invoke C shared library methods that are presumed to implement the tracepoints:

package org.lttng.ust.drdc;

public abstract class tp {
    private static boolean loaded = false;

    static {
        load();
    }

    public static boolean isLoaded() {
        return loaded;
    }

    /**
     * Load the tracepoint provider.
     * @return a boolean indicating whether the provider has been loaded
     */
    public static boolean load() {
        if (loaded) return true;
        try {
            //On Linux, this will look for libtp.so
            //On Windows, it’ll look for tp.dll
            System.loadLibrary("tp");
            loaded = true;
        } catch (SecurityException se) {
            System.err.println(se.toString());
        } catch (UnsatisfiedLinkError ule) {
            System.err.println(ule.toString());
        } catch (NullPointerException npe) { //Can't happen
            System.err.println(npe.toString());
        }
        return loaded;
    }

    /**
     * Safely insert a sample:event tracepoint.
     * @param text The payload.
     */
    public static void tracepoint(String text) {
        if (loaded) _tracepoint(text);
    }

    private static native void _tracepoint(String text);
}

Figure 36: The tp.java tracepoint class.


Each tracepoint event defined in tp.h will need a pair of tracepoint and _tracepoint entries in the Java wrapper. This is tedious but not hard. By guarding the tracepoint invocations with the tp.loaded member, the Java applications that use the tp Java class will behave almost exactly like the compiled C applications: if the library is in the path (the java.library.path, which plays the role of Linux’s LD_LIBRARY_PATH), tracing is possible; if it is not, tracing is not possible but the application works unhindered. (LD_LIBRARY_PATH and LD_PRELOAD work in very similar ways.)

The tp Java class is compiled and a C-compatible header is then extracted from the compiled class using the Java Development Kit’s javah utility:

$ javac org/lttng/ust/drdc/tp.java
$ javah org.lttng.ust.drdc.tp

Note the subtly different conventions: javac expects a file path, whereas javah expects a Java class name. The javac line creates the file org/lttng/ust/drdc/tp.class, while the javah line creates the org_lttng_ust_drdc_tp.h file.

The gcc compilation of tp.h is modified slightly in order to proceed from tp.c instead. The original line from Section 4.4.1 and the modified line are as follows:

$ gcc -I. -fPIC -c -o libtp.o -x c tp.h \
      -D TRACEPOINT_CREATE_PROBES

$ gcc -I. -fPIC -c -o libtp.o tp.c \
      -D TRACEPOINT_CREATE_PROBES \
      -D TRACEPOINT_DEFINE \
      -I/usr/lib/java_include_path

The additional -I option is to make sure jni.h is found (it is used by org_lttng_ust_drdc_tp.h). The actual path will vary depending on the Java package used. Note also that a -D TRACEPOINT_DEFINE was added to the tracepoint provider object compilation. This is because the tracepoint used by Java is in the shared object (since it cannot be put in the Java class itself). The augmented libtp.so remains perfectly usable by C applications: all that was done was to add entry points that JNI can use.


Here is the tp.c adaptor:

#include "org_lttng_ust_drdc_tp.h"
#include "tp.h"

JNIEXPORT void JNICALL Java_org_lttng_ust_drdc_tp__1tracepoint(
    JNIEnv *env, jclass jcls, jstring text)
{
    jboolean iscopy;
    const char *text_cstr =
        (*env)->GetStringUTFChars(env, text, &iscopy);
    tracepoint(sample, event, text_cstr);
    (*env)->ReleaseStringUTFChars(env, text, text_cstr);
}

Figure 37: The tp.c JNI wrapper.

The org_lttng_ust_drdc_tp.h included in tp.c declares the exported method that tp.java expects from the library. This method is implemented by tp.c in its closing lines. There will be a JNIEXPORT method for each TRACEPOINT_EVENT in tp.h. You can get the method signature line from org_lttng_ust_drdc_tp.h: you only need to supply parameter names and a method body. Quite often, this method body is just a tracepoint call because the fundamental numerical types are nearly the same between Java and C. Here, some extra operations were required to turn a Java String object into a C string (a null-terminated char array) and release the C string once the tracepoint is done. Dealing with primitive arrays requires similar operations such as GetIntArrayElements / ReleaseIntArrayElements. See [11] for the details.

Compiling and running the Java equivalent application is done like this:

$ javac org/lttng/ust/drdc/simple.java
$ java org.lttng.ust.drdc.simple
simple starting
java.lang.UnsatisfiedLinkError: no tp in java.library.path
simple done
$ LD_LIBRARY_PATH=. java org.lttng.ust.drdc.simple
simple starting
simple done
$ java -Djava.library.path=. org.lttng.ust.drdc.simple
simple starting
simple done

Figure 38: Compiling and running the simple.java application.


The first run of simple.java is done without supplying libtp.so, so the application gives an UnsatisfiedLinkError error message but otherwise continues executing. The other two runs show two ways that libtp.so may be supplied to simple.java.

Table 1 below lists the mapping that exists between the primitive Java and C data types [12]. The type conversions are also recorded in the JDK’s include/jni.h header.

Table 1: Mapping of primitive Java data types to C data types.

Java      C
void      void
boolean   unsigned char†
byte      signed char‡
short     signed short
int       signed long‡
long      signed long long‡
char      unsigned short
float     float
double    double

† The jni.h header defines the C unsigned char values JNI_FALSE (0) and JNI_TRUE (1).

‡ These types may be mapped differently on some machine architectures.

To summarise the JNI instrumentation process:

- Add the instrumentation instructions to the original application Java code, importing the tracepoint provider Java class as required;

- Write the tracepoint provider Java class (use tp.java as a template, replacing or adding tracepoint methods as required);

- Write the tracepoint provider C header if it does not already exist;

- Compile the tracepoint provider class (javac);

- Extract the JNI tracepoint header (javah);

- Create the C tracepoint provider code wrapper, adding implementations of each JNI tracepoint method (copy and complete the JNIEXPORT lines from the JNI tracepoint header);

- Compile the C tracepoint provider code wrapper into a shared object library;

- Recompile the instrumented Java application (javac);


- Run the instrumented Java application while specifying the java.library.path to the shared object library.

4.5.3 Other approaches

Whereas JNI allows one to write supporting libraries for Java, JNA (Java Native Access, available on Ubuntu 12.04 LTS as the libjna-java package) focuses on giving Java direct access to existing external libraries. JNA uses a small native library called the foreign function interface library (libffi) to dynamically invoke native code. Using JNA, one creates a Java interface that describes the functions and structures of the target native library, without static bindings, header files, or any compile phase. JNA could be a much better match than JNI for the relatively simple task of invoking tracepoints from Java.

The forthcoming version 2.4 of LTTng includes a new domain, jul, which can capture java.util.logging events. Using LTTng 2.4, Java instrumentation follows the standard JUL pattern and no specially-cooked tracepoint providers are required.


References

[1] IEEE 610.12-1990 (revising 729-1983 [misprinted as 792-1983]), IEEE Standard Glossary of Software Engineering Terminology, 1990, http://web.ecs.baylor.edu/faculty/grabow/Fall2013/csi3374/secure/Standards/IEEE610.12.pdf

[2] IEEE 100-2000, The Authoritative Dictionary of IEEE Standards Terms, Seventh Edition, IEEE, 2000, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4116787

[3] ISO/IEC/IEEE 24765:2010(E), Systems and software engineering — Vocabulary (Ingénierie des systèmes et du logiciel — Vocabulaire), ISO/IEC/IEEE, 2010, http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05733835

[4] Mathieu Desnoyers, LTTng 2.0 Low-Overhead Tracing Architecture diagram, 2011 http://lttng.org/sites/lttng.org/files/LTTng2_0Architecture_pa3.pdf

[5] Eclipse Linux Tools Project Web page, http://www.eclipse.org/linuxtools/projectPages/lttng/

[6] Mathieu Desnoyers, Common Trace Format (CTF) Specification (v1.8.2), 2012 http://git.efficios.com/?p=ctf.git;a=blob_plain;f=common-trace-format-specification.txt;hb=master

[7] ISO/IEC 14977:1996(E), Information technology — Syntactic metalanguage — Extended BNF, ISO/IEC, 1996, http://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf

[8] RFC 1034:1987, P. Mockapetris, Domain Names – Concepts and Facilities, November 1987, http://tools.ietf.org/html/rfc1034

[9] Mathieu Desnoyers, Re: What are sys calls starting with compat?, lttng-dev Digest, vol. 68, no. 21, 10 dec 2013 http://www.mail-archive.com/[email protected]/msg05116.html

[10] Richard M. Stallman et al., Using the GNU Compiler Collection: For gcc version 4.9.0 (pre-release) (GCC), http://gcc.gnu.org/onlinedocs/gcc.pdf

[11] Oracle, Java SE Documentation, Chapter 4: JNI Functions http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html

[12] Oracle, Java SE Documentation, Chapter 3: JNI Types and Data Structures, Primitive Types http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/types.html#wp428

[13] IEEE 729-1983, IEEE Standard Glossary of Software Engineering Terminology, IEEE, 1983 (superseded by IEEE 610.12-1990)



List of symbols/abbreviations/acronyms/initialisms

API Application Programming Interface

APT Advanced Packaging Tool

BNF Backus–Naur Form

CD-ROM Compact Disc – Read-Only Memory

CPU Central Processing Unit

CTF Common Trace Format

DND Department of National Defence

DRDC Defence Research and Development Canada

DSTKIM Director Science and Technology Knowledge and Information Management

EBNF Extended Backus–Naur Form

GNU GNU’s Not Unix!

GPG GNU Privacy Guard

HeLP-CCS Host-Level Protection of Critical Computerized Systems

ID Identification

IDE Integrated Development Environment

IEC International Electrotechnical Commission

IEEE Institute of Electrical and Electronics Engineers

ISO Originally International Organization for Standardization

JDK Java Development Kit

JNA Java Native Access

JNI Java Native Interface

JUL java.util.logging

LTS Long-Term Support

LTT Linux Trace Toolkit

LTTng Linux Trace Toolkit next generation

PIC Position-Independent Code

PPA Personal Package Archive

RDDC Recherche et développement pour la défense Canada

RFC Request For Comments

UST User-Space Tracing



Glossary

Words in italics refer to other glossary entries.

Buffering scheme

A template describing how, in a given domain, each channel should be multiplied according to some pertinent parameter, as well as the number and size of the sub-buffers making up each instance. For instance, in the user-space domain buffer multiplication can be done either per-user-ID or per-process-ID. For a given channel, the buffering scheme further specifies the number and size of the sub-buffers making up each buffer. Multiplication per CPU is implicit in all buffering schemes.

Channel

A named arbitrary grouping of event types. The channel scope is the domain; tracing can be activated and deactivated on a per-channel basis (there are other control modalities as well). Each channel has an associated buffering scheme, the leaves of which are sets of buffers, one buffer per CPU. Each of the buffers maps to a trace file (or set of files if chunked).

Consumer (daemon)

A daemon whose single task is to read tracing session buffers and write the recovered event records to local or remote storage. Reading a sub-buffer frees it for eventual re-use by a tracer (except in the snapshot case). There can be up to three consumer daemons per session daemon: one for the kernel domain, and a pair of 32- and 64-bit consumers for user-space. Each consumer daemon may service several tracing session buffers at once: all of the buffers of the active channels of each active session of its owning session daemon for its domain and bitness. Recall that each channel may have as many buffers as there are CPUs times the number of active users or processes (depending on the buffering scheme).

Domain

A top-level subdivision of the tracing parameter space. Originally, these were hierarchical protection domains, matching the operating system’s hardware-enforced privilege levels: kernel and user-space. Now each domain is simply characterised by the existence of specific restrictions on the possible tracing commands and tracing parameter values. For example, events from the kernel domain can only be assigned to a channel once, whereas events from the user-space domain can each be assigned to multiple channels; the possible buffering schemes also depend on the domain. Event and channel naming scopes match domains: a channel name in one domain denotes a different object than the same channel name in another domain, and so on.

Event (occurrence)

An occasion on which an execution thread stepped into a tracepoint call. Besides tracepoints, events can also be generated by other facilities such as syscall (which produces an event every time a system call service routine is entered and exited) and kprobe/kretprobe (which produces an event every time its interrupt is serviced).


Event field

Each of the logical units that make up an event record. Each event field has a name and a value. Some event fields are themselves records (C structures), and there may be several levels of such nesting. Non-record fields are of primitive data types (integers, floats, characters) or are simple arrays of primitives. The event’s payload, when there is one, is contained in the event.fields event field and appears last in each event record.

Event name

A unique identifier within the containing tracepoint provider’s scope that serves to specify the event type. In the user-space domain, the fully qualified event name consists of the provider name and the event name; in the kernel domain, the provider is implicit. During tracing, events can be turned on and off individually (there are other control modalities as well).

Event packet

The contents of an individual tracing sub-buffer, minus any trailing padding, consisting of contiguous event records. The event records of an event packet are in strictly increasing timestamp sequence. The event packet is the consumer daemon’s unit of data transmittal.

Event stream bundle

The set of event streams issuing from all CPUs for a given event name. The per-CPU event streams are directed to distinct buffers and eventually wend their way to distinct trace files, even though all event occurrences are handled by the same consumer daemon thread. Event enabling and disabling acts over the event stream bundle.

Event type

Each event occurrence is an instance of an event type, an abstract description of the occurrence’s expected record structure. Each event type is identified by its fully qualified event name and is handled by one or more tracepoint providers.

Instrumentation

The process of adding intrinsic probes to the source code of an executable.

Payload

The set of event fields, if any, that is peculiar to a given event type. An event type may be payload-less. The payload is user-defined (programmed when designing the tracepoint provider) and, when present, appears as the last field of the event record.

Relay daemon

A specialised consumer daemon that receives instructions and tracing data from remote systems. Its only task is to commit the tracing data it receives to its system’s local storage. It is independent of any local session daemons. There may be several relay daemons running on the same system, as long as they use distinct command and data communication ports.


Session daemon

The central session manager process. It sets up, runs and tears down tracing sessions, manages the consumer daemons, instructs tracepoint providers to activate and deactivate their tracepoints, and so on. There is at most one root session daemon, plus at most one local session daemon per user. All session daemons operate independently of each other. The local session daemons can only trace their respective local user-spaces, while the root session daemon can trace the kernel and all user-spaces.

Session folder

A folder rooting a hierarchy, the leaves of which are trace folders. A session folder matches a particular tracing session. This is, by default, reflected in the session folder’s name, which is prefixed by the session name.

Session name

An identifier given by the user to the tracing session (or supplied by default). The trace name is constructed from the session name by suffixing it with a one-second resolution timestamp. Session names may be re-used freely over time by any single session daemon, and can be used simultaneously by several session daemons as long as they do not share the same $HOME.
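The construction just described can be sketched in shell; the session name "mysession" is a hypothetical example, and the timestamp format shown (YYYYMMDD-HHMMSS) is the one-second-resolution form LTTng appends by default:

```shell
# Sketch of how a trace name is derived from a session name.
# "mysession" is a hypothetical session name.
session="mysession"
# LTTng suffixes the session name with a one-second-resolution timestamp:
trace="${session}-$(date +%Y%m%d-%H%M%S)"
echo "$trace"   # e.g. mysession-20160601-123456
```

The timestamp makes trace names unique over time even when the same session name is re-used.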

Trace

(1) [noun] A record of the execution of a computer program, showing the sequence of instructions executed, the names and values of variables, or both. […] (2) [verb] To produce a record as in (1). [1] [2] [3]

In the LTTng context, a record of the activity of a computer system over a span of time, showing the sequence of instructions executed and the names and values of pertinent variables at the time of each instruction’s execution. More narrowly, a time-ordered sequence of event records, manifested as a set of trace files and trace file chunks organised within a single trace folder.

Trace file

The sub-set of the event records defined by a common channel and CPU ID. The metadata file is a special trace file which holds the trace’s metadata: the event data representation (byte ordering, bitness, etc.), host name, domain, tracer name and version, type of clock, and so on.

Trace file chunk

A trace file holding the event records for a partial span of the trace’s entire duration. By default, a trace consists of single-chunked trace files, but it may be broken up into multiple non-overlapping chunks. Chunks are sequentially numbered, but the configuration can be that of a circular buffer, meaning the earliest chunk is not necessarily the first. Trace file chunks are self-consistent: deleting a number of chunks will create gaps in the trace’s time coverage but won’t make the remaining chunks unusable; chunking is not file-splitting.


Trace folder

A folder containing a set of trace files (including, in particular, a metadata file). By default, trace folders are organised under a user’s $HOME/lttng-traces folder so as to segregate the trace folders by origin (local or remote), session start time, snapshot time (if applicable), tracing domain and sub-domains (e.g. the user-space domain can be broken down by user ID and bitness, or by process instance ID). See Section 3.2.2 for the details.

Trace name

A folder identifier constructed from the session name by suffixing it with a one-second resolution timestamp. Trace names are supposed to be unique (to the storage device, at least).

Trace root folder

The folder that will hold the session folders. By default, this is $HOME/lttng-traces. Local traces may be stored in other, arbitrary locations (see the output option of the create command). Traces streaming in from remote hosts cannot be stored elsewhere than under the trace root folder unless a custom relay daemon is used.

Tracepoint

A type of intrinsic probe that can be dynamically activated and deactivated by the tracing session manager (session daemon). A tracepoint invocation identifies the tracepoint provider used and the event type, and passes a number of references to contextual entities as arguments. The values of the latter are generally (but not necessarily) stored in the event record, and may also be used to control whether or not the event is recorded, either intrinsically to the tracepoint’s implementation or through a session-manager-provided filter expression. LTTng also encapsulates other tracing mechanisms (kprobes, syscalls) as tracepoints.

Tracepoint provider

A tracer handling a tracepoint. Other tracers include the syscall and kprobe services; these can be managed separately from LTTng but also have encapsulations provided by LTTng.

Tracer

(software) A software tool used to trace. [13] [2]

In the LTTng context, the parts of code that generate event records whenever there is an event occurrence. Tracers operate in the context of the traced processes and threads. The session daemon merely configures the tracers; the tracers write into the tracing buffers; the consumers read from the buffers (generally, but not always, freeing them up in the process) and write to the network or to local storage.


Tracing session

The process of defining, starting, pausing, adjusting and concluding a trace. As a data object, the tracing session is a dynamic set of tracing parameters defining the event types, their groupings in channels within domains, the storage instructions and which events and channels are active at any given moment.
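The lifecycle just described can be sketched as a typical lttng(1) command sequence. The sequence is listed (rather than executed) because it requires a running session daemon; the session name "demo" and channel name "mychan" are hypothetical:

```shell
# The usual lifecycle of a tracing session, driven through the lttng
# command-line client. Listed as a string, not executed: running it
# needs a session daemon (and root, for the kernel domain).
# "demo" and "mychan" are hypothetical names.
lifecycle='lttng create demo                   # define the session
lttng enable-channel -u mychan      # add a user-space channel
lttng enable-event -u -a -c mychan  # enable all user-space events in it
lttng start                         # begin recording event occurrences
lttng stop                          # conclude recording
lttng destroy                       # tear down (trace files remain on disk)'
printf '%s\n' "$lifecycle"
```

Note that destroying a session discards the tracing parameters but leaves the recorded trace folders in place.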
