Experiments on the Effectiveness of an Automatic Insertion of Memory Reuses into ML-like Programs

Oukseh Lee (Hanyang University)
Kwangkeun Yi (Seoul National University)


Question

Our SAS 2003 paper* presented an algorithm that replaces allocations with memory reuse (destructive update), along with some promising yet preliminary experimental numbers.

When, and by how much, is it cost-effective in space and time? We want to know before deploying it inside our nML compiler.

* Oukseh Lee, Hongseok Yang, and Kwangkeun Yi. Inserting Safe Memory Reuse Commands into ML-like Programs. In Proceedings of the Annual International Static Analysis Symposium, volume 2694 of Lecture Notes in Computer Science, pp. 171-188, San Diego, California, June 2003.

Brief Overview of Our Algorithm

Example: insert

[Figure: the input list l = 1, 2, 3, 4, 6, nil; the call insert 5 l; and the result list.]

fun insert i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert i t in h::z

fun insert i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert i t in free l; h::z

Example: insert

[Figure: the input list l = 1, 2, 3, 4, 6, nil; the call insert 5 l; and the result list.]

fun insert b i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert b i t in free l when b; h::z
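The effect of the flag can be tried out on a toy heap. The Python sketch below is an illustration only, not the nML implementation: the `Heap` class, its free list, and the helper names `from_list` and `to_list` are assumptions made for this example. It models cons cells as two-element lists and lets `free` hand a cell back so that the next `cons` can recycle it.

```python
class Heap:
    """Toy heap (hypothetical): counts fresh allocations and recycles freed cells."""

    def __init__(self):
        self.free_list = []  # cells released by free(), available for reuse
        self.fresh = 0       # cells that had to be newly allocated
        self.reused = 0      # allocations served from the free list

    def cons(self, head, tail):
        if self.free_list:
            cell = self.free_list.pop()
            self.reused += 1
        else:
            cell = [None, None]
            self.fresh += 1
        cell[0], cell[1] = head, tail
        return cell

    def free(self, cell):
        self.free_list.append(cell)


def from_list(heap, items):
    """Build a cons list (None plays the role of nil)."""
    lst = None
    for x in reversed(items):
        lst = heap.cons(x, lst)
    return lst


def to_list(lst):
    """Read a cons list back into a Python list."""
    out = []
    while lst is not None:
        out.append(lst[0])
        lst = lst[1]
    return out


def insert(heap, b, i, l):
    """Mirror of the flagged insert: 'free l when b' before rebuilding the cell."""
    if l is None:
        return heap.cons(i, None)
    h, t = l[0], l[1]
    if i < h:
        return heap.cons(i, l)
    z = insert(heap, b, i, t)
    if b:
        heap.free(l)  # l is not part of z, so its cell can be recycled
    return heap.cons(h, z)


heap = Heap()
l = from_list(heap, [1, 2, 3, 4, 6])
before = heap.fresh
result = insert(heap, True, 5, l)
print(to_list(result), heap.fresh - before, heap.reused)
# prints [1, 2, 3, 4, 5, 6] 1 4
```

With b=true, inserting 5 allocates one fresh cell (for 5) and recycles the four cells in front of 6. With b=false, the same call allocates five fresh cells and leaves the input list intact, which is what a caller that still uses l needs.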

Analysis

fun insert i l =
  case l of
    [] => i::[]
  | h::t =>
      if i<h then
        i::l
      else
        let z = insert i t in h::z

Annotations:

  l : L      t : L.tl      z : Z
  i::[] : X1
  i::l  : X2 ∪ L
  h::z  : X4 ∪ Z
  Z ⊆ X3 ∪ L.tl

  result                       usage
  X1 ∪ X2 ∪ L ∪ X4 ∪ Z         L.hd ∪ L.tl
  ⊆ X ∪ L                      ⊆ L

  X = X1 ∪ X2 ∪ X3 ∪ X4
  L = L.hd ∪ L.tl

Transformation [1/3]

fun insert i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert i t in h::z

fun insert b i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert i t in h::z

When b=true, the transformed insert function deallocates the cons cells of the input list l, excluding those of the result list.

Transformation [2/3]

When is it safe to free the tail cells t not in the result z (L.tl\Z)?

must not be freed     when       area       overlap?   necessary condition
the input list l      b=false    L          yes        b=true
the result list                  X4 ∪ Z     no         none

fun insert b i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert b i t in h::z

Transformation [3/3]

When is it safe to free the head cell (L.hd)?

must not be freed                          when       area        overlap?   necessary condition
the input list l                           b=false    L           yes        b=true
the cons cells freed during insert b i t   b=true     L.tl \ Z    no         none
the result list                                       X4 ∪ Z      no         none

fun insert b i l =
  case l of
    [] => i::[]
  | h::t => if i<h then i::l
            else let z = insert b i t in free l when b; h::z

Experiments

Analysis & Transformation Cost

program      lines   cost (s)
sieve           29      0.001
merge           40      0.001
qsort           41      0.001
queens          44      0.003
msort           73      0.003
professor      193      0.012
mirage         245      0.015
life           366      0.017
k-eval         645      0.220
kb             808      0.095
nucleic       3019      0.488

[Figure: analysis & transformation cost (logarithmic scale) against program size (logarithmic scale); slope = 1.46.]

1,500~29,000 lines/sec
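The slope on a log-log plot is the exponent of a power-law fit, so a slope above 1 means the cost grows faster than linearly in program size. The Python sketch below shows one way such a slope can be estimated, using an ordinary least-squares fit on the base-10 logarithms of the tabulated values. The fitting method and the function name `loglog_slope` are assumptions for this example; the slides do not say how the slope was obtained, and because the tabulated costs are rounded, the fitted value need not reproduce the slide's figure exactly.

```python
import math

# (program, lines, cost in seconds), copied from the cost table
COST = [
    ("sieve", 29, 0.001), ("merge", 40, 0.001), ("qsort", 41, 0.001),
    ("queens", 44, 0.003), ("msort", 73, 0.003), ("professor", 193, 0.012),
    ("mirage", 245, 0.015), ("life", 366, 0.017), ("k-eval", 645, 0.220),
    ("kb", 808, 0.095), ("nucleic", 3019, 0.488),
]


def loglog_slope(points):
    """Least-squares slope of log10(cost) against log10(lines)."""
    xs = [math.log10(lines) for _, lines, _ in points]
    ys = [math.log10(cost) for _, _, cost in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


slope = loglog_slope(COST)
print(f"fitted log-log slope: {slope:.2f}")
```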

Reuse Ratio

program      total allocation   reuse            ratio
             (kilo words) B     (kilo words) C   C/B
sieve                 18694            15760     84.3%
merge                 11719             5860     50.0%
qsort                 95450            89664     93.9%
queens               122505             5206      4.2%
msort                 45455            40573     89.3%
professor            822589           344794     41.9%
mirage               101972            86054     84.4%
life                  48305             5125     10.6%
k-eval               132234            41710     31.5%
kb                    57705             1948      3.4%
nucleic               31487             5307     16.9%

3.4%~93.9% of allocations are avoided.

Low reuse ratio is due to much sharing.
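The ratio column is plain arithmetic on the two preceding columns. As a small check, the Python sketch below recomputes C/B from the tabulated kilo-word counts; the names `ALLOC` and `reuse_ratio` are made up for this example.

```python
# program: (total allocation B, reused C), both in kilo words, from the table
ALLOC = {
    "sieve": (18694, 15760), "merge": (11719, 5860), "qsort": (95450, 89664),
    "queens": (122505, 5206), "msort": (45455, 40573),
    "professor": (822589, 344794), "mirage": (101972, 86054),
    "life": (48305, 5125), "k-eval": (132234, 41710), "kb": (57705, 1948),
    "nucleic": (31487, 5307),
}


def reuse_ratio(total, reused):
    """Share of allocated words served by reuse, in percent."""
    return 100.0 * reused / total


ratios = {name: round(reuse_ratio(b, c), 1) for name, (b, c) in ALLOC.items()}
lowest = min(ratios, key=ratios.get)
highest = max(ratios, key=ratios.get)
print(lowest, ratios[lowest], highest, ratios[highest])
# prints kb 3.4 qsort 93.9
```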

Memory Peak Reduction

program      reuse    peak        peak (reuse)   reduction
             ratio    (words) D   (words) E      (D-E)/D
sieve        84.3%         690          300      56.5%
merge        50.0%        1197          606      49.4%
qsort        93.9%        1189          334      71.9%
queens        4.2%         255          255       0.0%
msort        89.3%         714          321      55.0%
professor    41.9%        1394         1281       8.1%
mirage       84.4%        1398         1361       2.6%
life         10.6%        2346         1746      25.6%
k-eval       31.5%        1044          944       9.6%
kb            3.4%       27125        26501       2.3%
nucleic      16.9%      103677        89352      13.8%

0.0%~71.9% peak reduction

much reuse = much peak reduction

[Figure: memory peak reduction against memory reuse ratio; labeled points: mirage (84.4%, 2.6%), life (10.6%, 25.6%), professor (41.9%, 8.1%).]
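The tabulated reductions are consistent with measuring the drop relative to the original peak, that is (D-E)/D. The Python sketch below recomputes them from the two peak columns; the names `PEAK` and `peak_reduction` are made up for this example.

```python
# program: (peak D, peak with reuse E), both in words, from the table
PEAK = {
    "sieve": (690, 300), "merge": (1197, 606), "qsort": (1189, 334),
    "queens": (255, 255), "msort": (714, 321), "professor": (1394, 1281),
    "mirage": (1398, 1361), "life": (2346, 1746), "k-eval": (1044, 944),
    "kb": (27125, 26501), "nucleic": (103677, 89352),
}


def peak_reduction(peak, peak_reuse):
    """Reduction of the memory peak relative to the original peak, in percent."""
    return 100.0 * (peak - peak_reuse) / peak


reductions = {name: round(peak_reduction(d, e), 1) for name, (d, e) in PEAK.items()}
for name, value in reductions.items():
    print(f"{name:<10} {value:5.1f}%")
```

For sieve this gives (690-300)/690 = 56.5%, matching the table, whereas dividing by E would give 130%.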

Difference in Live Cells

[Figure: difference in live cells, one panel per program; each panel is labeled with the reuse ratio and the memory peak reduction.]

program   reuse ratio   peak reduction
sieve     84.3%         56.5%
merge     50.0%         49.4%
qsort     93.9%         71.9%
msort     89.3%         55.0%

Difference in Live Cells

[Figure: difference in live cells, one panel per program; each panel is labeled with the reuse ratio and the memory peak reduction.]

program   reuse ratio   peak reduction
queens     4.2%          0.0%
kb         3.4%          2.3%
nucleic   16.9%         13.8%
k-eval    31.5%          9.6%

Difference in Live Cells

[Figure: difference in live cells, one panel per program; each panel is labeled with the reuse ratio and the memory peak reduction.]

program     reuse ratio   peak reduction
life        10.6%         25.6%
mirage      84.4%          2.6%
professor   41.9%          8.1%

GC Time & Runtime Changes

program    reuse    runtime  GC time           GC time (reuse)     runtime (flags)     runtime (reuse)
           ratio    A        B       B/A       C       (B-C)/B     D       (A-D)/A     E       (A-E)/A

(Intel Pentium4 3.0GHz, Linux RedHat 9.0)
sieve      84.3%    0.40     0.178   44.2%     0.087   51.0%       0.41    -2.6%       0.41    -0.7%
merge      50.0%    0.62     0.470   76.0%     0.243   48.3%       0.68    -9.8%       0.47    24.0%
qsort      93.9%    2.08     1.312   63.2%     0.124   90.5%       2.16    -4.1%       1.26    39.1%
queens     4.2%     1.58     0.822   52.2%     0.812   1.3%        1.68    -6.8%       1.65    -4.7%
msort      89.3%    0.95     0.572   59.9%     0.140   75.6%       0.98    -2.9%       0.75    21.6%
professor  41.9%    2.99     0.215   7.2%      0.134   37.8%       3.27    -9.3%       3.16    -5.5%
mirage     84.4%    1.06     0.060   5.6%      0.011   82.0%       1.12    -5.1%       1.09    -2.6%
life       10.6%    3.44     0.050   1.4%      0.051   -2.8%       3.64    -6.0%       3.57    -3.8%
k-eval     31.5%    1.01     0.019   1.9%      0.015   21.0%       1.04    -3.2%       1.04    -2.9%
kb         3.4%     0.80     0.255   31.7%     0.255   -0.3%       0.83    -3.6%       0.85    -5.8%
nucleic    16.9%    0.44     0.230   52.1%     0.147   36.0%       0.43    2.5%        0.41    7.2%

(Sun UltraSparc 400MHz, Solaris 2.7)
sieve      84.3%    4.14     1.464   35.4%     0.740   49.5%       4.39    -6.1%       4.04    2.4%
merge      50.0%    4.47     3.492   78.2%     1.835   47.5%       5.02    -12.4%      3.13    30.0%
qsort      93.9%    15.87    9.073   57.2%     0.901   90.1%       16.71   -5.3%       11.40   28.2%
queens     4.2%     13.23    6.132   46.4%     6.557   -6.9%       14.34   -8.4%       14.20   -7.3%
msort      89.3%    7.47     4.126   55.3%     0.999   75.8%       7.74    -3.7%       5.92    20.7%
professor  41.9%    34.35    1.465   4.3%      0.969   33.8%       36.32   -5.7%       32.71   4.8%
mirage     84.4%    9.79     0.409   4.2%      0.073   82.1%       10.29   -5.1%       9.79    0.1%
life       10.6%    32.56    0.370   1.1%      0.365   1.5%        33.50   -2.9%       32.84   -0.9%
k-eval     31.5%    9.34     0.120   1.3%      0.091   24.5%       9.46    -1.3%       9.29    0.6%
kb         3.4%     5.76     1.420   24.7%     1.509   -6.2%       6.28    -9.1%       6.15    -6.7%
nucleic    16.9%    2.57     1.188   46.3%     0.770   35.2%       2.58    -0.7%       2.34    8.8%

-6.9%~90.5% GC-time reduction

-7.3%~39.1% runtime reduction

in Objective Caml system

GC Time & Runtime Changes


High reuse ratio & big GC portion: runtime speedup

program    reuse ratio   GC portion B/A (Intel / Sun)   runtime reduction (A-E)/A (Intel / Sun)
merge      50.0%         76.0% / 78.2%                  24.0% / 30.0%
qsort      93.9%         63.2% / 57.2%                  39.1% / 28.2%
msort      89.3%         59.9% / 55.3%                  21.6% / 20.7%
nucleic    16.9%         52.1% / 46.3%                  7.2% / 8.8%

GC Time & Runtime Changes


High reuse ratio & big GC portion: runtime speedup

Low reuse ratio: flags overhead

program    reuse ratio   runtime (flags) change (A-D)/A (Intel / Sun)   runtime (reuse) change (A-E)/A (Intel / Sun)
queens     4.2%          -6.8% / -8.4%                                  -4.7% / -7.3%
kb         3.4%          -3.6% / -9.1%                                  -5.8% / -6.7%

GC Time & Runtime Changes


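High reuse ratio & big GC portion: runtime speedup

Low reuse ratio: flags overhead

Small GC portion: almost no effect

program     GC portion B/A (Intel / Sun)   runtime reduction (A-E)/A (Intel / Sun)
professor   7.2% / 4.3%                    -5.5% / 4.8%
mirage      5.6% / 4.2%                    -2.6% / 0.1%
life        1.4% / 1.1%                    -3.8% / -0.9%
k-eval      1.9% / 1.3%                    -2.9% / 0.6%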

GC-time & Runtime Changes

much reuse = much GC-time reduction

[Figure: GC time reduction against memory reuse ratio.]

much reuse & big GC-time portion = much runtime reduction

[Figure: runtime reduction against GC portion x memory reuse ratio.]
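The second plot uses the product of GC portion and memory reuse ratio as its horizontal axis. The Python sketch below tabulates that product for the Intel measurements next to the measured runtime reduction, so the two can be compared by eye. The dictionary `INTEL` and the function `predictor` are names made up for this example, and expressing the product in percent is a presentation choice, not something the slides prescribe.

```python
# program: (reuse ratio %, GC portion B/A %, runtime reduction (A-E)/A %),
# Intel Pentium4 rows of the GC time & runtime table
INTEL = {
    "sieve": (84.3, 44.2, -0.7), "merge": (50.0, 76.0, 24.0),
    "qsort": (93.9, 63.2, 39.1), "queens": (4.2, 52.2, -4.7),
    "msort": (89.3, 59.9, 21.6), "professor": (41.9, 7.2, -5.5),
    "mirage": (84.4, 5.6, -2.6), "life": (10.6, 1.4, -3.8),
    "k-eval": (31.5, 1.9, -2.9), "kb": (3.4, 31.7, -5.8),
    "nucleic": (16.9, 52.1, 7.2),
}


def predictor(reuse, gc_portion):
    """GC portion x memory reuse ratio, expressed in percent."""
    return reuse * gc_portion / 100.0


# Sort programs from the largest product to the smallest.
ranking = sorted(
    INTEL.items(),
    key=lambda item: predictor(item[1][0], item[1][1]),
    reverse=True,
)

for name, (reuse, gc_portion, runtime_reduction) in ranking:
    product = predictor(reuse, gc_portion)
    print(f"{name:<10} product {product:5.1f}%   runtime reduction {runtime_reduction:5.1f}%")
```

On these numbers qsort has the largest product and life the smallest. The product is only a rough indicator: sieve has a sizable product yet shows no runtime gain on this machine.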

Conclusion

program -> transformation -> result program -> performance

not much sharing => high reuse ratio => memory peak reduction & GC time speedup

high reuse ratio + big GC-time portion => runtime speedup