younggun-kim
Badass Alien @ District 9, SMARTSTUDY
http://pengpenghu.com
PyCon Korea Organizer http://pycon.kr @pyconkr PyCon APAC 2016 Host (2016/Aug/13-15)
How I Think My Code Runs
Movie - The Good The Bad The Weird, 2008
How My Code Really Runs
The Killers : All These Things That I’ve Done M/V https://youtu.be/sZTpLvsYYHw
Objective
1. Understanding How a Computer Works
2. How to Use a Profiler
But why?
Say thousands of people use your code every day. If you make it run 1 second faster, you save the human race over 4 days of wasted time per year.
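The arithmetic behind that claim can be checked with a short sketch. The figures are assumptions chosen to match the slide: 1,000 users, each running the code once a day, and 1 second saved per run. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_saved_per_year(users, seconds_saved_per_run, runs_per_user_per_day=1):
    """Total time saved across all users in one year, expressed in days."""
    total_seconds = users * seconds_saved_per_run * runs_per_user_per_day * 365
    return total_seconds / SECONDS_PER_DAY


# Assumed scenario: 1,000 users, one run per day, 1 second saved per run.
print(round(days_saved_per_year(1000, 1), 2))
```

With more users or more runs per day, the saving grows proportionally.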
See How a Computer Works, and How Fast the Computer and Its Peripherals Are
I/O >> 4D Wall >> Memory
Morse Code        ≈ 21 bps
Modem (2400)      ≈ 2400 bps
CDMA (2G)         ≈ 153 kbit/s
HSPA (3G, DL)     ≈ 13.98 Mbit/s
LTE*              ≈ 100 Mbit/s
USB 2.0           ≈ 480 Mbit/s
802.11n           ≈ 600 Mbit/s
USB 3.0           ≈ 3 Gbit/s
SATA 3.0          ≈ 6 Gbit/s
Thunderbolt 2     ≈ 20 Gbit/s
DDR2 1066MHz      ≈ 64 Gbit/s
DDR3 1600MHz      ≈ 102.4 Gbit/s
https://en.wikipedia.org/wiki/List_of_device_bit_rates
Yes! Memory is blazing fast! (Really?)
DDR3 1600MHz              ≈ 12.8 GB/s
FSB 400 (old Xeon)        ≈ 12.8 GB/s
PCI Express 3.0 (x16)     ≈ 16 GB/s
QuickPath Interconnect    ≈ 38.4 GB/s
HyperTransport 3.1        ≈ 51.2 GB/s
L3 Cache (i7-4790X)       ≈ 170 GB/s
L2 Cache (i7-4790X)       ≈ 308 GB/s
Nope!
Computer Knows Only 0 and 1
00100000001000100000000101011110
Like This
001000 00001 00010 0000000101011110
opcode | addr 1 | addr 2 | value
MIPS32 Add Immediate instruction (ADDI)
addi $r1, $r2, 350
$r1 = $r2 + 350
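As a rough illustration, the 32-bit word above can be split into its fields with shifts and masks. This sketch assumes the standard MIPS32 I-type layout (6-bit opcode, two 5-bit register fields usually named rs and rt, 16-bit immediate). The function name is hypothetical.

```python
# The instruction word shown on the slide.
WORD = 0b00100000001000100000000101011110


def decode_i_type(word):
    """Split a 32-bit MIPS I-type word into its four fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "rs": (word >> 21) & 0x1F,      # first 5-bit register field
        "rt": (word >> 16) & 0x1F,      # second 5-bit register field
        # Low 16 bits; real hardware sign-extends this value, which the
        # sketch skips because the value here is positive.
        "immediate": word & 0xFFFF,
    }


fields = decode_i_type(WORD)
print(fields)
```

The opcode field decodes to 8 (binary 001000), the ADDI opcode, and the immediate field decodes to 350.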
A Computer Executes These Instructions on a Per-Clock Basis
Clock (Hz)
1Hz
L1 Cache Access           3s
L2 Cache Access           9s
L3 Cache Access           43s
RAM Access                6m
SSD I/O                   2-6 days
HDD I/O                   1-12 months
Internet: Tokyo to SF     12 years
Run IPython (0.6s)        63 years
Reboot (5m)               32,000 years!!
Disassemble Python Code To CPython Bytecode To Support Analysis
dis module
https://docs.python.org/3/library/dis.html https://github.com/python/cpython/blob/master/Include/opcode.h
line # of source
op addr / instruction annotations
param
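A minimal sketch of the dis module in use, with a hypothetical function `add_350` as the subject. The printed listing shows the columns annotated above: source line number, instruction offset, instruction name, parameter, and an annotation of what the parameter refers to. Exact opcode names vary between CPython versions.

```python
import dis


def add_350(x):
    """Hypothetical function to disassemble."""
    return x + 350


# Print the human-readable bytecode listing.
dis.dis(add_350)

# The same instructions are available as objects for programmatic inspection.
instructions = list(dis.get_instructions(add_350))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```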
An Empty List Creation
[] vs list()
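One way to compare the two forms is to look at their bytecode and then time them. This is a sketch; the helper name `opnames` is hypothetical. The literal compiles to a single BUILD_LIST instruction, while `list()` has to look up the name and perform a call, so the literal is typically faster. Timings depend on the machine and the CPython version.

```python
import dis
import timeit


def opnames(expression):
    """Return the bytecode instruction names for a Python expression."""
    code = compile(expression, "<demo>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


literal_ops = opnames("[]")
call_ops = opnames("list()")
print("[]     ->", literal_ops)
print("list() ->", call_ops)

# Time one million evaluations of each form.
print("[]    :", timeit.timeit("[]", number=1_000_000))
print("list():", timeit.timeit("list()", number=1_000_000))
```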
Dictionary
{} vs dict()
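The same comparison applies to dictionaries. The sketch below only measures time, using the best of several repeats to reduce noise. The helper name `best_time` and the repeat counts are arbitrary choices for illustration.

```python
import timeit


def best_time(stmt, number=200_000, repeat=5):
    """Best-of-N wall time for running `stmt` `number` times."""
    return min(timeit.repeat(stmt, number=number, repeat=repeat))


literal = best_time("{}")
call = best_time("dict()")
print(f"{{}}: {literal:.4f}s, dict(): {call:.4f}s, ratio: {call / literal:.1f}x")
```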
Find an element in a list
using for-loop vs in
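A hedged sketch of the comparison: both functions below give the same answer, but the `in` operator runs its search loop inside the interpreter's C code, so it usually wins. The function names, list size, and repeat count are assumptions for illustration.

```python
import timeit


def contains_loop(items, target):
    """Membership test written as an explicit for-loop."""
    for item in items:
        if item == target:
            return True
    return False


def contains_in(items, target):
    """Membership test using the `in` operator."""
    return target in items


data = list(range(10_000))
missing = -1  # worst case: every element is compared

loop_time = timeit.timeit(lambda: contains_loop(data, missing), number=200)
in_time = timeit.timeit(lambda: contains_in(data, missing), number=200)
print(f"for-loop: {loop_time:.4f}s, in: {in_time:.4f}s")
```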
A tool for dynamic program analysis that measures the space or time complexity of a program.
Profilers
• cProfile (profile)
• hotshot
• line_profiler
• memory_profiler
• yappi
• profiling
• pyinstrument
• plop
• pprofile
cProfile
• Built-in profiling tool
• Hooks into the VM in CPython
• Introduces a bit of overhead
https://docs.python.org/3.5/library/profile.html
cProfile
python -m cProfile python_code.py
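Besides the command line, cProfile can also be driven from inside a program. The sketch below profiles a hypothetical `slow_sum` function and prints the top entries sorted by cumulative time; the workload is made up for illustration.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares computed in a Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report into a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # show only the top 5 entries
report = stream.getvalue()
print(report)
```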
line_profiler
• Profiles on a line-by-line basis
• Uses a decorator to mark the chosen function (@profile)
• Introduces greater overhead
https://github.com/rkern/line_profiler
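A minimal sketch of a script prepared for line_profiler. When run through `kernprof -l`, a `profile` decorator is injected into builtins. The fallback below is a common idiom that keeps the script runnable with plain `python` as well. The function `count_even` is a hypothetical example.

```python
import builtins

# `kernprof -l` injects `profile` into builtins. When the script is run
# directly, fall back to a no-op decorator so it still works.
if not hasattr(builtins, "profile"):
    def profile(func):
        return func


@profile
def count_even(numbers):
    """Hypothetical function marked for line-by-line profiling."""
    count = 0
    for number in numbers:
        if number % 2 == 0:
            count += 1
    return count


print(count_even(range(1_000)))
```

Saved as, say, `example.py` (a hypothetical name), it would be profiled with `kernprof -l -v example.py`.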
profiling
• Interactive Python profiler inspired by the Unity3D Profiler
• Keeps the call stack
• Live profiling
• Only supports Linux
https://github.com/what-studio/profiling
https://github.com/sublee/pyconkr2015-profiling-resources/blob/master/continuous.gif
Use profiler with real code
Korean fried chicken is served as a whole chicken, not in pieces.
So it is quite hard to determine how many chickens are enough for N people.
The problem can be solved easily using Fibonacci numbers.
1 1 2 3 5 8 13 21 34 …
When the number of people is the Nth Fibonacci number, the (N-1)th Fibonacci number of chickens is perfect.
Awesome idea! But how do you get enough chicken if the number of people is not a Fibonacci number?
Apply Zeckendorf’s theorem, which is about representing integers as sums of Fibonacci numbers.
https://en.wikipedia.org/wiki/Zeckendorf's_theorem
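The idea can be sketched as follows. This is not the talk's fibonachicken.py, which is not shown here, but a hypothetical `chickens_for` function. It assumes a greedy Zeckendorf decomposition, where it repeatedly takes the largest Fibonacci number that still fits, and then orders the preceding Fibonacci number of chickens for each term.

```python
def chickens_for(people):
    """Return (zeckendorf_parts, chickens) for a positive number of people.

    Greedy decomposition: repeatedly take the largest Fibonacci number that
    fits. For each part F(k), order F(k-1) chickens.
    """
    if people < 1:
        raise ValueError("people must be a positive integer")

    # Fibonacci numbers 1, 1, 2, 3, 5, ... up to `people`.
    fibs = [1, 1]
    while fibs[-1] + fibs[-2] <= people:
        fibs.append(fibs[-1] + fibs[-2])

    parts = []
    chickens = 0
    remaining = people
    # Walk from the largest Fibonacci number down. Index 0 is skipped so
    # that every part has a predecessor at fibs[i - 1].
    for i in range(len(fibs) - 1, 0, -1):
        if fibs[i] <= remaining:
            parts.append(fibs[i])
            chickens += fibs[i - 1]
            remaining -= fibs[i]
    return parts, chickens


print(chickens_for(10))
```

For 10 people the decomposition is 8 + 2, which gives 5 + 1 = 6 chickens.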
KEEP CALM AND USE THE PROFILER
cProfile
python -m cProfile fibonachicken.py
line_profiler
kernprof -l -v fibonachicken.py
Both fib() and is_fibonacci() are bottlenecks. We should replace them with better implementations.
Hypothesis #1
Improving fib() could result in better performance
Binet’s Formula
https://en.wikipedia.org/wiki/Jacques_Philippe_Marie_Binet
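A hedged sketch of what a closed-form fib() might look like. Binet's formula gives F(n) = (φ^n - ψ^n) / √5 with φ = (1 + √5) / 2 and ψ = (1 - √5) / 2. Because |ψ^n / √5| is always below 1/2, rounding φ^n / √5 gives the same integer. The function names are hypothetical. The floating-point version is only reliable for moderately small n (roughly n below 70), so the iterative version is included as a reference.

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2  # golden ratio


def fib_binet(n):
    """n-th Fibonacci number via Binet's formula (floating point)."""
    return round(PHI ** n / SQRT5)


def fib_iterative(n):
    """Reference implementation: F(0)=0, F(1)=1, F(2)=1, ..."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print([fib_binet(n) for n in range(1, 10)])
```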
cProfile
Hypothesis #2
Can we improve is_fibonacci() not to use fib() at all?
n is a Fibonacci number if and only if 5n² + 4 or 5n² - 4 is a perfect square
Gessel’s Formula
http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fibFormula.html
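The test above translates almost directly into code. This is a sketch rather than the talk's actual is_fibonacci(); it assumes Python 3.8+ for `math.isqrt`, which keeps the square check exact for large integers. The helper `is_perfect_square` is a hypothetical name.

```python
import math


def is_perfect_square(x):
    """True if x is a non-negative integer with an integer square root."""
    if x < 0:
        return False
    root = math.isqrt(x)
    return root * root == x


def is_fibonacci(n):
    """Gessel's test: 5n^2 + 4 or 5n^2 - 4 is a perfect square."""
    if n < 0:
        return False
    return is_perfect_square(5 * n * n + 4) or is_perfect_square(5 * n * n - 4)


print([n for n in range(1, 100) if is_fibonacci(n)])
```

No call to fib() is needed, so the cost per check no longer depends on generating the sequence.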
cProfile
Summary
Consider the efficiency of your code, along with the peripherals and the circumstances around it
Form a hypothesis and confirm it (using good profilers)
Q&A
Thanks!