Perl at SkyCon'12

DESCRIPTION

Slides for my talk at SkyCon'12 in Limerick. Here I've squeezed four talks into one, covering a lot of ground quickly, so I've included links to more detailed presentations and other resources.

http://xkcd.com/224/

Perl

Code Profiling
Memory Profiling
Perl 5
Perl 6

Tim Bunce - SkyCon’12

Devel::NYTProf

Perl Source Code Profiler

Devel::DProf Is Broken

$ perl -we 'print "sub s$_ { sqrt(42) for 1..100 }; s$_({});\n" for 1..1000' > x.pl

$ perl -d:DProf x.pl

$ dprofpp -r
Total Elapsed Time = 0.108 Seconds
         Real Time = 0.108 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 9.26   0.010  0.010      1   0.0100 0.0100  main::s76
 9.26   0.010  0.010      1   0.0100 0.0100  main::s323
 9.26   0.010  0.010      1   0.0100 0.0100  main::s626
 9.26   0.010  0.010      1   0.0100 0.0100  main::s936
 0.00       - -0.000      1        -      -  main::s77
 0.00       - -0.000      1        -      -  main::s82

Profiling 101

The Basics

What To Measure?

              CPU Time   Real Time
Subroutines       ?          ?
Statements        ?          ?

Devel::NYTProf Does it all
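To make the CPU time versus real time distinction concrete, here is a minimal sketch using only core Perl. The helper name measure is hypothetical and this is not how NYTProf takes its measurements; it only shows that a sleeping process accumulates real time while using almost no CPU time.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Run a code ref and report both CPU seconds and wall-clock seconds.
# "measure" is a hypothetical helper written for this illustration.
sub measure {
    my ($code) = @_;
    my ($user0, $sys0) = times;   # CPU seconds used so far by this process
    my $real0 = time;             # wall-clock time, sub-second resolution
    $code->();
    my ($user1, $sys1) = times;
    my $real1 = time;
    return {
        cpu  => ($user1 - $user0) + ($sys1 - $sys0),
        real => $real1 - $real0,
    };
}

my $busy = measure(sub { my $x = 0; $x += sqrt($_) for 1 .. 500_000 });
my $idle = measure(sub { sleep 0.2 });

printf "busy loop: cpu=%.3fs real=%.3fs\n", @{$busy}{qw(cpu real)};
printf "sleeping:  cpu=%.3fs real=%.3fs\n", @{$idle}{qw(cpu real)};
```

The busy loop should show CPU and real time close together, while the sleep shows real time of about 0.2s with CPU time near zero. Exact figures depend on the machine and on the coarse resolution of times.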

Running NYTProf

perl -d:NYTProf ...
perl -MDevel::NYTProf ...

Configure profiler via the NYTPROF env var
perldoc Devel::NYTProf for the details

To profile code that’s invoked elsewhere:
PERL5OPT=-d:NYTProf
NYTPROF=file=/tmp/nytprof.out:addpid=1:...
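The NYTPROF value above is a colon-separated list of name=value settings. As a rough sketch, a wrapper script could assemble the two environment variables like this before launching the code to be profiled. The helper nytprof_env is hypothetical, and only the file and addpid settings shown on the slide are used.

```perl
use strict;
use warnings;

# Build the environment needed to profile a child perl process.
# "nytprof_env" is a hypothetical helper; the option names come from the slide.
sub nytprof_env {
    my (%opt) = @_;
    my $settings = join ':', map { "$_=$opt{$_}" } sort keys %opt;
    return (PERL5OPT => '-d:NYTProf', NYTPROF => $settings);
}

my %env = nytprof_env(file => '/tmp/nytprof.out', addpid => 1);
print "$_=$env{$_}\n" for sort keys %env;

# To launch a command with these settings (requires Devel::NYTProf installed):
#   local @ENV{keys %env} = values %env;
#   system(@command);
```

Sorting the keys just makes the output repeatable. With addpid=1 each process writes its own file, which is what the nytprof.out.793 example in the HTML reporting slide refers to.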

Reporting: KCachegrind

• KCachegrind call graph - new and cool
  - contributed by C. L. Kao.
  - requires KCachegrind

$ nytprofcg # generates nytprof.callgraph

$ kcachegrind # load the file via the gui

KCachegrind

Reporting: HTML

• HTML report
  - page per source file, annotated with times and links
  - subroutine index table with sortable columns
  - interactive Treemap of subroutine times
  - generates Graphviz dot file of call graph
  - -m (--minimal) faster generation but less detailed

$ nytprofhtml # writes HTML report in ./nytprof/...

$ nytprofhtml --file=/tmp/nytprof.out.793 --open

Summary

Links to annotated source code

Timings for perl builtins

Link to sortable table of all subs

Timings for each location calling into, or out of, the subroutine

Overall time spent in and below this sub (in + below)

Time between starting this perl statement and starting the next. So includes overhead of calls to perl subs.

Color coding based on Median Absolute Deviation relative to rest of this file
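The statistic behind the color coding, usually called the median absolute deviation, is the median of each value's distance from the median. The sketch below computes it for a made-up list of timings. The sub names and the sample data are illustrative, and how NYTProf maps the result onto colors is not shown here.

```perl
use strict;
use warnings;

# Median of a list of numbers.
sub median {
    my @sorted = sort { $a <=> $b } @_;
    my $mid = int(@sorted / 2);
    return @sorted % 2
        ? $sorted[$mid]
        : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

# Median of the absolute distances from the median.
sub median_absolute_deviation {
    my @values = @_;
    my $median = median(@values);
    return median(map { abs($_ - $median) } @values);
}

my @times = (1, 1, 2, 2, 4, 6, 9);   # made-up statement times
printf "median=%s mad=%s\n", median(@times), median_absolute_deviation(@times);
# prints: median=2 mad=1
```

Because it is built from medians, a single very slow statement barely moves the result, which makes it a reasonable yardstick for picking out the unusually slow lines in a file.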

Boxes represent subroutines. Colors only used to show packages (and aren’t pretty yet)

Hover over box to see details. Click to drill-down one level in package hierarchy

Treemap showing relative proportions of exclusive time

Calls between packages

Calls to/from/within package

Questions?

For more details see

Slides: http://www.slideshare.net/Tim.Bunce/develnytprof-v4-at-yapceu-201008-4906467
Screencast: http://blip.tv/timbunce/devel-nytprof-yapc-asia-2012-6376582

Perl Memory Use

Ouch!

$ perl some_script.pl
Out of memory!
$

$ perl some_script.pl
Killed.
$

$ perl some_script.pl
$
Someone shouts: "Hey! My process has been killed!"

$ perl some_script.pl
[...later...] "Umm, what's taking so long?"

Process Memory

$ perl -e 'system("cat /proc/$$/stat")' # $$ = pid
4752 (perl) S 4686 4752 4686 34816 4752 4202496 536 0 0 0 0 0 0 0 20 0 1 0 62673440 123121664 440 18446744073709551615 4194304 4198212 140735314078128 140735314077056 140645336670206 0 0 134 0 18446744071579305831 0 0 17 10 0 0 0 0 0 0 0 0 0 0 4752 111 111 111

$ perl -e 'system("cat /proc/$$/statm")'
30059 441 346 1 0 160 0

$ perl -e 'system("ps -p $$ -o vsz,rsz,sz,size")'
   VSZ   RSZ    SZ    SZ
120236  1764 30059   640

$ perl -e 'system("top -b -n1 -p $$")'
...
  PID USER PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
13063 tim  20  0 117m 1764 1384 S  0.0  0.1 0:00.00 perl

$ perl -e 'system("cat /proc/$$/status")'
...
VmPeak:   120236 kB
VmSize:   120236 kB  <- total (code, libs, stack, heap etc.)
VmHWM:      1760 kB
VmRSS:      1760 kB  <- how much of the total is resident in physical memory
VmData:      548 kB  <- data (heap)
VmStk:        92 kB  <- stack
VmExe:         4 kB  <- code
VmLib:      4220 kB  <- libs, including libperl.so
VmPTE:        84 kB
VmPTD:        28 kB
VmSwap:        0 kB
...

Further info on unix.stackexchange.com
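Rather than shelling out to cat, a script can read the same figures directly. The following is a minimal, Linux-only sketch that picks the Vm* lines out of /proc/<pid>/status. The sub names are hypothetical, and on systems without /proc it simply prints nothing.

```perl
use strict;
use warnings;

# Parse the "Vm*" lines of /proc/<pid>/status text into a hash of kB values.
sub parse_vm_status {
    my ($text) = @_;
    my %kb;
    while ($text =~ /^(Vm\w+):\s+(\d+)\s+kB/mg) {
        $kb{$1} = $2;
    }
    return \%kb;
}

# Read and parse the status file for a pid; returns nothing if unavailable.
sub read_vm_status {
    my ($pid) = @_;
    open my $fh, '<', "/proc/$pid/status" or return;   # no /proc, or no such pid
    local $/;                                          # slurp the whole file
    return parse_vm_status(scalar <$fh>);
}

if (my $vm = read_vm_status($$)) {
    printf "%-7s %8d kB\n", $_, $vm->{$_}
        for grep { exists $vm->{$_} } qw(VmSize VmRSS VmData);
}
```

Calling read_vm_status($$) at interesting points in a long-running program gives a cheap way to log how total size and heap size change over time.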

C Program Code        int main(...) { ... }
Read-only Data        eg “String constants”
Read-write Data       un/initialized variables

Heap

(not to scale!)

Shared Lib Code       \\
Shared Lib R/O Data   repeated for each lib
Shared Lib R/W Data   //

C Stack               (not the perl stack)
System

$ perl -e 'system("cat /proc/$$/maps")'
address perms ... pathname
00400000-00401000 r-xp ... /.../perl-5.NN.N/bin/perl
00601000-00602000 rw-p ... /.../perl-5.NN.N/bin/perl

0087f000-008c1000 rw-p ... [heap]

7f858cba1000-7f8592a32000 r--p ... /usr/lib/locale/locale-archive-rpm

7f8592c94000-7f8592e1a000 r-xp ... /lib64/libc-2.12.so
7f8592e1a000-7f859301a000 ---p ... /lib64/libc-2.12.so
7f859301a000-7f859301e000 r--p ... /lib64/libc-2.12.so
7f859301e000-7f859301f000 rw-p ... /lib64/libc-2.12.so
7f859301f000-7f8593024000 rw-p ...

...other libs...

7f8593d1b000-7f8593e7c000 r-xp ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
7f8593e7c000-7f859407c000 ---p ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
7f859407c000-7f8594085000 rw-p ... /.../lib/5.NN.N/x86_64-linux/CORE/libperl.so
7f85942a6000-7f85942a7000 rw-p ...

7fff61284000-7fff6129a000 rw-p ... [stack]

7fff613fe000-7fff61400000 r-xp ... [vdso]
ffffffffff600000-ffffffffff601000 r-xp ... [vsyscall]

$ perl -e 'system("cat /proc/$$/smaps")' # note ‘smaps’ not ‘maps’

address perms ... pathname
...

7fb00fbc1000-7fb00fd22000 r-xp ... /.../5.10.1/x86_64-linux/CORE/libperl.so
Size:            1412 kB  <- size of executable code in libperl.so
Rss:              720 kB  <- amount that's currently in physical memory
Pss:              364 kB
Shared_Clean:     712 kB
Shared_Dirty:       0 kB
Private_Clean:      8 kB
Private_Dirty:      0 kB
Referenced:       720 kB
Anonymous:          0 kB
AnonHugePages:      0 kB
Swap:               0 kB
KernelPageSize:     4 kB
MMUPageSize:        4 kB

... repeated for every segment ...

Memory Pages

✦ Process view:

✦ Single large memory space. Simple.

✦ Operating System view:

✦ Memory is divided into pages

✦ Pages are loaded to physical memory on demand

✦ Mapping can change without the process knowing

Memory is divided into pages. Page size is typically 4KB.

(Diagram: the process memory layout shown earlier, drawn as a column of pages: C Program Code, Read-only Data, Read-write Data, Heap, Shared Lib Code, Shared Lib R/O Data, Shared Lib R/W Data, C Stack, System.)

← Page ‘resident’ in physical memory
← Page not resident

RSS “Resident Set Size” is how much process memory is currently in physical memory
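The page size ties the earlier numbers together: /proc/<pid>/statm reports sizes in pages, while ps reports kB. A small sketch of the arithmetic, using the statm line shown earlier and assuming 4 kB pages (the sub name statm_to_kb is hypothetical):

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_PAGESIZE);

# Convert the first two statm fields (total and resident size, in pages)
# into kB for a given page size in bytes.
sub statm_to_kb {
    my ($statm_line, $page_bytes) = @_;
    my ($size_pages, $resident_pages) = split ' ', $statm_line;
    return map { $_ * $page_bytes / 1024 } $size_pages, $resident_pages;
}

# Figures from the statm example earlier, assuming 4 kB pages.
my ($vsz_kb, $rss_kb) = statm_to_kb('30059 441 346 1 0 160 0', 4096);
print "total=$vsz_kb kB resident=$rss_kb kB\n";
# prints: total=120236 kB resident=1764 kB

# The page size of the machine running this script (may differ from 4096).
my $page_bytes = sysconf(_SC_PAGESIZE);
print "page size here: $page_bytes bytes\n";
```

The results, 120236 kB and 1764 kB, match the VSZ and RSZ columns in the ps output, which is a handy cross-check that the different tools are reporting the same counters in different units.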

Key Point

✦ Don’t use Resident Set Size (RSS)

✦ It can shrink even while the process size grows.

✦ Heap size or Total memory size is a good indicator.

The Heap

Heap ← Your perl stuff goes here

• Heap is managed by malloc()
• Memory freed is rarely returned to the operating system
• Heap grows but rarely shrinks

Perl Data Anatomy

Head Body Data

Integer (IV)

String (PV)

Number with a string

Illustrations from illguts

Array (AV)

Hash (HV)

Glob (GV) Symbol Table (Stash)

Sub (CV)

lots of tiny chunks!

Devel::Peek

• Gives you a textual view of data

$ perl -MDevel::Peek -e '%a = (42 => "Hello World!"); Dump(\%a)'
SV = IV(0x1332fd0) at 0x1332fe0
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0x1346730
  SV = PVHV(0x1339090) at 0x1346730
    REFCNT = 2
    FLAGS = (SHAREKEYS)
    ARRAY = 0x1378750  (0:7, 1:1)
    KEYS = 1
    FILL = 1
    MAX = 7
    Elt "42" HASH = 0x73caace8
    SV = PV(0x1331090) at 0x1332de8
      REFCNT = 1
      FLAGS = (POK,pPOK)
      PV = 0x133f960 "Hello World!"\0
      CUR = 12  <= length in use
      LEN = 16  <= amount allocated
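The CUR and LEN fields can also be read programmatically. This sketch uses the core B module rather than Devel::Peek, so the values come back as numbers instead of text on stderr. The helper string_cur_len is hypothetical, and the LEN you see will vary with perl version and platform.

```perl
use strict;
use warnings;
use B ();

# Return (CUR, LEN) for a string scalar: bytes in use vs bytes allocated.
# "string_cur_len" is a hypothetical helper for this illustration.
sub string_cur_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

my $greeting = join ' ', 'Hello', 'World!';   # built at run time
my ($cur, $len) = string_cur_len(\$greeting);
print "CUR=$cur LEN=$len\n";   # CUR is 12; LEN is at least that
```

The gap between LEN and CUR is spare capacity that perl keeps so the string can grow without another allocation, one of the ways memory is traded for speed.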

Devel::Size

• Gives you a measure of the size of a data structure

$ perl -MDevel::Size=total_size -le 'print total_size( 0 )'
24

$ perl -MDevel::Size=total_size -le 'print total_size( [] )'
64

$ perl -MDevel::Size=total_size -le 'print total_size( {} )'
120

$ perl -MDevel::Size=total_size -le 'print total_size( [ 1..100 ] )'
3264

• Is very fast, and accurate for most simple data types.
• Has limitations and bugs, but is the best tool we have.

Memory Profiling

What?

✦ Track memory size over time?

✦ See where memory is allocated and freed?

✦ Experiments with Devel::NYTProf

✦ Turned out to not seem useful

Space in Hiding

✦ Perl tends to consume extra memory to save time

✦ This can lead to surprises, for example:

✦ sub foo {
      my $var = "X" x 10_000_000;
  }
  foo(); # ~20MB still used after return!

✦ sub bar {
      my $var = "X" x 10_000_000;
      bar($_[0]-1) if $_[0]; # recurse
  }
  bar(50); # ~1GB still used after return!
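One way to hand back a lexical's string buffer explicitly is undef $var. The sketch below uses the core B module to show the allocated length (LEN) dropping to zero. It is an illustration under assumptions, not a fix for everything on the slide: it covers only the variable's own buffer, not any temporary created by the x operator, and whether the freed memory returns to the operating system depends on malloc.

```perl
use strict;
use warnings;
use B ();

# Bytes currently allocated for a scalar's string buffer (its LEN).
sub buffer_len { B::svref_2object($_[0])->LEN }

my $var = "X" x 1_000_000;
my $before = buffer_len(\$var);

undef $var;                      # releases the string buffer explicitly
my $after = buffer_len(\$var);

print "allocated before undef: $before bytes, after: $after bytes\n";
```

Inside a sub like foo above, the same undef $var just before returning would release that variable's buffer instead of leaving it cached for the next call.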

My Plan

The Plan

✦ Extend Devel::Size
✦ Add a C-level callback hook
✦ Add some kind of "data path name" for the callback to use
✦ Add a function to Devel::Size to return the size of everything
✦ Stream the data to disk
✦ Write tools to visualize the data
✦ Add multi-phase scan
  1. scan symbol tables, skip where ref count > 1
  2. process the skipped items
  3. scan arenas for other values (e.g. leaks)
✦ Write tool to compare two sets of data

The Status

✓ Add a C-level callback hook
✓ Add some kind of "data path name" for the callback to use
✓ Add a function to Devel::Size to return the size of everything.
✓ Stream the data to disk
✓ Write tools to visualize the data

• Will become a separate distribution
• “Devel::SizeMeGraph”?
• Source repo available by Sunday
• Ready to demonstrate

Demonstration

Perl

A Summary of the Onions

- Perl 5 isn’t the new kid on the block

- Perl is 25 years old

- Perl 5 is 18 years old

- A mature language with a mature culture

- Perl 6 was conceived 12 years ago

- Perl 6 “Rakudo Star” is 2 years old

- Vast library of free code modules on “CPAN”

- Over 5,100 active authors (making releases)

- Over 26,000 distributions (110,000 modules)

- ~1,000 uploads per month (by ~500 authors)

- ~250 new distributions per month

- Automated smoke testing for all uploads

“CPAN is my language, Perl is my VM”

Dependency Analysis available for all Modules

http://deps.cpantesters.org/?module=Moose;perl=latest

Automated Smoke Testing

http://matrix.cpantesters.org/?dist=Moose%202.0603

Perl 5 Development

-5.10 – 2007

-5.12 – 2010

-5.14 – 2011

-5.16 – 2012

-5.17.4 latest monthly development release

-5.18 – due May 2013

Perl 5 New Features

-Refactored internals: many fixes, more speed, less memory

-New language features (state, say, //, autodie, ...)

-Language feature management

-Unicode 6.1 and many new Unicode features

-Many powerful new regex features

See http://www.slideshare.net/rjbs/whats-new-in-perl-v510-v516

http://xkcd.com/208/

Demo

Regexp::Debugger

See http://www.youtube.com/watch?v=zcSFIUiMgAs

A Culture of Testing

-2002: Perl 5.8.0 had 26,725 core tests +41,666 more for bundled libraries etc.

-2007: Perl 5.10.0 has 78,883 core tests +109,427

-2010: Perl 5.11.5 has 191,008 core tests +167,015

-2012: Perl 5.16.0 has 262,370 core tests +261,776

Another member of the Perl language family

Perl 6

Learn it once, use it many times. Learn as you go. Many acceptable levels of competence. Multiple ways to say the same thing. No shame in borrowing. Indeterminate dimensionality. Local ambiguity is okay. Punctuation by prosody and inflection. Disambiguation by number, case and word order. Topicalization. Discourse structure. Pronominalization. No theoretical axes to grind. Style not enforced except by peer pressure. Cooperative design. “Inevitable” Divergence.

Natural Language Principles in Perl

http://www.wall.org/~larry/natural.html

Timeline

2000: Perl 6 conceived (after a smashed coffee mug)

2001-2004: Initial design docs

2005: First prototype, pugs (in Haskell)

2005+ Continual radical evolution (the whirlpool)

2010: 1st Rakudo Star release

2012: 17th Rakudo Star release

Mostly implemented in Perl 6 (>60% and rising)

“If we'd done Perl 6 on a schedule, you'd have it by now. And it would be crap.”

—Larry Wall, 2002

“Do it right.” and “It's ready when it's ready.”

Freedom to explore deeply and change

“Truly radical and far-reaching improvements over the past few years.”

“We've spent a decade stealing the very best ideas from the best programming languages, and making them simple and practical for mortal developers to use.”

—Damian Conway

“feedback at many levels from multiple implementations”

“Should be called Perl 8 or Perl 9”

Multiple Implementations

Two main implementations:

Rakudo - built on Parrot Compiler Toolchain

Niecza - targeting the CLR (.NET and Mono)

Plus: STD, viv, Perlito, Pugs and others

Work on a JVM implementation is starting

All sharing a common test suite

See http://perl6.org/compilers

Some Perl 6 Features

Rich set of operators and meta-operators
Rich type system
Rich object system, including roles/traits
A full Meta Object Protocol
Multiple dispatch using types and expressive signatures
Gradual typing
Lazy lists and iterators, with controllable eagerness
Deep introspection
Native Call Interface
Representational polymorphism
Powerful matching and parsing with subclassable grammars

Series and Reduction

‣ say 1, 2, 4 ... 1024;
  1 2 4 8 16 32 64 128 256 512 1024

‣ my @fib = 1, 1, *+* ... *; # infinite
  say @fib[^10]; # 0..9
  1 1 2 3 5 8 13 21 34 55

‣ say [*] 1..10; # reduction, use any operator
  3628800

‣ sub postfix:<!>($n) { [*] 1..$n }
  say 10!
  3628800

Multiple Dispatch

‣ multi fact(0) { 1 }
  multi fact($n) { $n * fact($n - 1) }

‣ multi fib(0) { 0 }
  multi fib(1) { 1 }
  multi fib($n) { fib($n - 1) + fib($n - 2) }

‣ multi quicksort([]) { () }
  multi quicksort([$pivot, *@rest]) {
      my @before = @rest.grep(* < $pivot);
      my @after = @rest.grep(* >= $pivot);
      (quicksort(@before), $pivot, quicksort(@after))
  }

Native Call Interface

sub PQexecPrepared(
    OpaquePointer $conn,
    Str $statement_name,
    Int $n_params,
    CArray[Str] $param_values,
    CArray[int] $param_length,
    CArray[int] $param_formats,
    Int $resultFormat
) returns OpaquePointer
  is native('libpq')
  { ... }

Also supports structures and callbacks.

my @suits = < ♣ ♢ ♡ ♠ >;
my @ranks = 2..10, < J Q K A >;

# concatenate each rank with each suit
my @deck = @ranks X~ @suits;

# create hash of card to points value
my %points = @deck Z ( (2..10, 10, 10, 10, 11) X+ (0,0,0,0) );

# grab five cards from the deck
my @hand = @deck.pick(5);

# display my hand
say ~@hand;

# tell me how many points it's worth
say [+] %points{@hand};

@xyz»++ # increment all elements of @xyz

@x = @a »min« @b # @x is smallest of @a and @b

$mean = ([+] @a) / @a # calculate mean of @a

$sumsq = [+] (@x »**» 2) # sum of squares of @x

$fact = [*] 1..$n # $n factorial

for %hash.kv -> $k, $v { say "$k: $v" }

Example Module

JSON::Tiny

https://github.com/moritz/json

module JSON::Tiny;

proto to-json($) is export {*}
multi to-json(Real:D $d) { ~$d }
multi to-json(Bool:D $d) { $d ?? 'true' !! 'false'; }
multi to-json(Str:D $d) {
    '"'
    ~ $d.trans(['"', '\\', "\b", "\f", "\n", "\r", "\t"]
            => ['\"', '\\\\', '\b', '\f', '\n', '\r', '\t'])\
            .subst(/<-[\c32..\c126]>/, { ord(~$_).fmt('\u%04x') }, :g)
    ~ '"'
}
multi to-json(Positional:D $d) {
    return '[ ' ~ $d.map(&to-json).join(', ') ~ ' ]';
}
multi to-json(Hash:D $d) {
    return '{ '
        ~ $d.map({ to-json(.key) ~ ' : ' ~ to-json(.value) }).join(', ')
        ~ ' }';
}
multi to-json(Any:U $) { 'null' }
multi to-json(Any:D $s) {
    die "Can't serialize an object of type " ~ $s.WHAT.perl
}

use JSON::Tiny::Actions;
use JSON::Tiny::Grammar;

sub from-json($text) is export {
    my $a = JSON::Tiny::Actions.new();
    my $o = JSON::Tiny::Grammar.parse($text, :actions($a));
    return $o.ast;
}

grammar JSON::Tiny::Grammar;

rule TOP       { ^ [ <object> | <array> ] $ }
rule object    { '{' ~ '}' <pairlist> }
rule pairlist  { <?> <pair> * % \, }
rule pair      { <?> <string> ':' <value> }
rule array     { '[' ~ ']' <arraylist> }
rule arraylist { <?> <value>* % [ \, ] }

proto token value {*};
token value:sym<number> {
    '-'?
    [ 0 | <[1..9]> <[0..9]>* ]
    [ \. <[0..9]>+ ]?
    [ <[eE]> [\+|\-]? <[0..9]>+ ]?
}
token value:sym<true>   { <sym> };
token value:sym<false>  { <sym> };
token value:sym<null>   { <sym> };
token value:sym<object> { <object> };
token value:sym<array>  { <array> };
token value:sym<string> { <string> }

token string { \" ~ \" ( <str> | \\ <str_escape> )* }
token str { <-["\\\t\n]>+ }
token str_escape { <["\\/bfnrt]> | u <xdigit>**4 }

class JSON::Tiny::Actions;

method TOP($/) { make $/.values.[0].ast };

method object($/) { make $<pairlist>.ast.hash }
method pairlist($/) { make $<pair>>>.ast.flat }
method pair($/) { make $<string>.ast => $<value>.ast }
method array($/) { make $<arraylist>.ast }
method arraylist($/) { make [$<value>>>.ast] }
method string($/) {
    make $0.elems == 1
        ?? ($0[0].<str> || $0[0].<str_escape>).ast
        !! join '', $0.list.map({ (.<str> || .<str_escape>).ast });
}
method value:sym<number>($/) { make +$/.Str }
method value:sym<string>($/) { make $<string>.ast }
method value:sym<true>($/) { make Bool::True }
method value:sym<false>($/) { make Bool::False }
method value:sym<null>($/) { make Any }
method value:sym<object>($/) { make $<object>.ast }
method value:sym<array>($/) { make $<array>.ast }
method str($/) { make ~$/ }
method str_escape($/) {
    if $<xdigit> {
        make chr(:16($<xdigit>.join));
    } else {
        my %h = '\\' => "\\", '/' => "/", 'b' => "\b", 'n' => "\n",
                't' => "\t", 'f' => "\f", 'r' => "\r", '"' => "\"";
        make %h{~$/};
    }
}

Perl 6

Already full of awesome

Hundreds of examples on http://rosettacode.org

Developers happy to stay in stealth mode for now

Get ahead of the revolution: http://perl6.org

In Summary...

Perl

has a massive library of reusable code

has a culture of best practice and testing

has a happy welcoming growing community

has a great future in Perl 5 and Perl 6

is a great language for getting your job done

for the last 25 years, and the next 25!
