jarvis-beckwith
3
SMP design goals
• Low Overhead for scalar operation
• Source Compatible
• Warnings for unsafe usage
• Tools for smp enhanced programming
4
SMP: Goal Realization
• Low Overhead
• 9.0 (non-smp) virtually the same as 8.2
• SMP has <10% hit over non-smp (sometimes faster)
• Other lisps have 15%-100% hit over 8.2
• (YMMV)
5
SMP: Goal Realization
• Source Compatible: Success
• Exceptions:
• hash-tables
• threads
• 4 new smp runtime-lisp files (compare 28 for new port)
6
SMP: Goal realization
• Warnings for unsafe usage
• smp-macros module (8.1 patch, integral in 8.2)
• not aggressive in looking for problems
• setf races
• potential deadlocks
• alists (but plists/getf are protected by convention)
7
SMP: goal realization
• Tools for enhanced SMP programming
• synchronizing operators and locks
• Separate GC
8
SMP: implementation
• Deprecation of SMP-unsafe Macros
• New macros
• Bindability and settability of specials
• synchronization operators
• GC
9
SMP: implementation
• Deprecated smp-unsafe macros
• without-interrupts, without-scheduling
• excl::fast / excl::atomically
• excl::*warn-smp-usage*
10
SMP: Implementation
• New Macros
• fast-and-clean: replaces (excl::fast (excl::atomically ...))
• with-pinned-objects: replaces (excl::fast (excl::atomically ...))
• with-delayed-interrupts: replaces without-{interrupts,scheduling}
• defvar-nonbindable
11
SMP: Implementation
• Bindability and Settability
• bindable, settable: defvar/defparameter
• not bindable, settable: defvar-nonbindable
• bindable, not settable: excl::defvar-nonsettable
• not bindable, not settable: defconstant
12
SMP: implementation: old 8.1 wide binding
[Diagram; boxes and fields as labeled on the slide]
• symbol: header, value, hash, function, name, plist, sv-vector
• symbol locatives: header, size, global value, thread 1 locative, thread 2 locative, thread 3 locative, thread 4 locative
• two value cells, each: header, size, value, waste
• tag/type annotations: tag=type=#xb; tag=2/type=#x70 (twice)
13
SMP: implementation: new wide binding
[Diagram; boxes and fields as labeled on the slide]
• symbol: header, value, hash, function, name, plist, sv-vector, lock-index
• symbol locatives: header, size, symbol, thread 1 locative, thread 2 locative, thread 3 locative, thread 4 locative
• two cells, each: header, value, symbol, function
• tag/type annotations: tag=type=#xb; tag=#xb/type=#x8b; tag=2/type=#x85
14
smp: implementation
• Synchronization operators
• push-atomic/pop-atomic
• incf-atomic/decf-atomic
• update-atomic
• atomic-conditional-setf (implementor for above)
• atomic-conditional-setq (special operator)
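The slide says the atomic operators are all built on one conditional store (atomic-conditional-setf). The sketch below shows that pattern in C11 as an analogue only; it is not Allegro's implementation. The names `incf_atomic`, `push_atomic`, and `struct node` are hypothetical, and `atomic_compare_exchange_weak` stands in for the conditional store.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Analogue of incf-atomic: retry the conditional store until no other
   thread has changed the cell between our read and our write. */
static long incf_atomic(_Atomic long *place, long delta)
{
    long old = atomic_load(place);
    while (!atomic_compare_exchange_weak(place, &old, old + delta)) {
        /* On failure, old has been refreshed with the current value. */
    }
    return old + delta;
}

/* Analogue of push-atomic: link the new node in front of the head we saw,
   and publish it only if the head is still that same node. */
static void push_atomic(struct node *_Atomic *head, struct node *n)
{
    n->next = atomic_load(head);
    while (!atomic_compare_exchange_weak(head, &n->next, n)) {
        /* On failure, n->next has been refreshed with the current head. */
    }
}
```

Both functions have the same shape: read, compute the new value, store conditionally, retry on failure. That is presumably why one implementor operator can serve all of the higher-level forms.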
15
smp: implementation
• Lower level smp operators
• get-atomic-modify-expansion
• excl::atomic-modify-form (may change!)
• excl::*force-csw-opcodes* (may change!)
• excl::defsetf-conditional (may change!)
16
SMP: implementation
• excl::defsetf-conditional built-in operator conversions
• excl::.inv-structure-ref
• excl::.inv-svref
• excl::.inv-car, excl::.inv-cdr
• excl::.inv-symbol-plist
• excl::.inv-global-symbol-value
17
SMP: implementation
• Lowest level operators
• gc-setf-protect-atomic
• ll :cas low-level instruction form
18
SMP: implementation
• with-locked-object
• good on any lockable object
• lighter weight than process-lock
• use with care to avoid deadlocks
• special versions for structs and streams
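The slides do not say how with-locked-object is implemented. As a rough analogue, the C11 sketch below embeds a one-word lock in the object itself, which is one way a per-object lock can cost less than a separate lock object. `struct counter_obj`, `object_lock`, `object_unlock`, and `locked_increment` are all hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical object carrying its own one-word lock (illustration only). */
struct counter_obj {
    atomic_flag lock;
    long count;
};

static void object_lock(struct counter_obj *o)
{
    /* Spin until this thread is the one that flips the flag from clear to set. */
    while (atomic_flag_test_and_set_explicit(&o->lock, memory_order_acquire)) {
    }
}

static void object_unlock(struct counter_obj *o)
{
    atomic_flag_clear_explicit(&o->lock, memory_order_release);
}

/* The body between lock and unlock plays the role of the
   with-locked-object body. */
static long locked_increment(struct counter_obj *o)
{
    object_lock(o);
    long result = ++o->count;
    object_unlock(o);
    return result;
}
```

The deadlock caution applies to any lock of this kind: two threads that lock the same pair of objects in opposite orders can each wait forever on the other, so a fixed locking order is the usual discipline.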
19
SMP: implementation
• Other higher-level synchronizing tools
• sharable-locks
• barriers
• queues (uses process-lock; with-locked-object is lighter weight)
• condition-variables
20
SMP Implementation: GC
• Runs in separate (non-Lisp) thread
• Able to provide per-thread object allocation
• Currently implemented:
• conses: everywhere
• floats: on x86-64
• Synchronizes with all threads
• Still written in C
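Per-thread allocation is commonly done with a thread-local bump allocator. The sketch below shows that general technique, not Allegro's actual allocator. `struct cons`, `AREA_CELLS`, and `alloc_cons` are hypothetical, and the fixed-size area is an assumption made to keep the example small.

```c
#include <stddef.h>

/* Hypothetical cons cell and per-thread allocation area (illustration only). */
struct cons {
    void *car;
    void *cdr;
};

#define AREA_CELLS 1024

/* Each thread gets its own copy of the area and of the free index. */
static _Thread_local struct cons area[AREA_CELLS];
static _Thread_local size_t next_free = 0;

/* Allocate from the calling thread's own area. No lock or atomic operation
   is needed because no other thread touches this area or index. */
static struct cons *alloc_cons(void *car, void *cdr)
{
    if (next_free == AREA_CELLS)
        return NULL; /* a real system would obtain a fresh area here */
    struct cons *c = &area[next_free++];
    c->car = car;
    c->cdr = cdr;
    return c;
}
```

The allocation path has no synchronization at all, which is what makes per-thread areas attractive for a very frequent operation such as consing.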
21
GC states
• 0: Lisping: thread is running Lisp code; GC can’t happen
• 1: Foreign: thread is running foreign code; GC can happen
• 2: Blocking GC: thread is running Lisp code; GC wants it to pause
• 3: Blocked by GC: thread is trying to get from foreign to Lisp; GC is running
• 4: Beside GC: thread is running foreign code and GC is happening
22
GC state diagram
[Figure: state-transition diagram over the five GC states]
• States: Lisping (0), Foreign (1), Blocking GC (2), Blocked by GC (3), Beside GC (4)
• Edge labels: thread goes lisping; thread goes foreign; GC Starting; GC Starting (signal thread: GC waits); GC Done; GC Done (post thread); thread goes foreign (post gc); thread goes lisping (thread waits); thread invokes GC (thread waits)
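The five per-thread GC states can be read as a small state machine. The sketch below encodes the state numbers from the slide. Its transition table is an assumption inferred from the state descriptions, not taken from the runtime source, and it leaves out the "thread invokes GC" edges. The names `gc_state`, `gc_event`, `gc_next_state`, and `gc_may_run` are hypothetical.

```c
/* State numbers as given on the "GC states" slide. */
enum gc_state {
    LISPING = 0,       /* running Lisp code; GC can't happen        */
    FOREIGN = 1,       /* running foreign code; GC can happen       */
    BLOCKING_GC = 2,   /* running Lisp code; GC wants it to pause   */
    BLOCKED_BY_GC = 3, /* wants to re-enter Lisp; GC is running     */
    BESIDE_GC = 4      /* running foreign code while GC is running  */
};

enum gc_event {
    THREAD_GOES_FOREIGN,
    THREAD_GOES_LISPING,
    GC_STARTING,
    GC_DONE
};

/* Assumed transition table; pairs not listed leave the state unchanged. */
static enum gc_state gc_next_state(enum gc_state s, enum gc_event e)
{
    switch (s) {
    case LISPING:
        if (e == THREAD_GOES_FOREIGN) return FOREIGN;
        if (e == GC_STARTING) return BLOCKING_GC;
        break;
    case FOREIGN:
        if (e == THREAD_GOES_LISPING) return LISPING;
        if (e == GC_STARTING) return BESIDE_GC;
        break;
    case BLOCKING_GC:
        if (e == THREAD_GOES_FOREIGN) return BESIDE_GC;
        break;
    case BLOCKED_BY_GC:
        if (e == GC_DONE) return LISPING;
        break;
    case BESIDE_GC:
        if (e == THREAD_GOES_LISPING) return BLOCKED_BY_GC;
        if (e == GC_DONE) return FOREIGN;
        break;
    }
    return s;
}

/* A thread in one of these states does not hold up the collector. */
static int gc_may_run(enum gc_state s)
{
    return s == FOREIGN || s == BLOCKED_BY_GC || s == BESIDE_GC;
}
```

Under this reading, a collection can proceed only once every thread is in a state for which `gc_may_run` is true, which matches the slide's statement that the GC synchronizes with all threads.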
23
Fasl Reader
• Rewritten in Lisp runtime code
• 50% faster than C version
• fewer transitions to/from C
• Started in 8.2; used only to load source debug info
• Used exclusively in 9.0
24
Fasl coding example
C:
    case ff_complex: /* tos = imag, tos-1 = real */
      if (Building_BOTH) {
        LispVal comp = new_lisp_obj(TYPEcomplex, 0, 0); /* complex object is new */
        *(nat *)((nat)comp + c_imag_adj/PtrScale) = (nat)f_pop();
        *(nat *)((nat)comp + c_real_adj/PtrScale) = (nat)f_pop();
        f_push(comp);
      }
      break;

Lisp:
    (#.ff_complex
     (when (building-both)
       (let* ((imag (fasl-pop thread))
              (real (fasl-pop thread))
              (complex (q-qint-call sys::make-complex real imag)))
         (fasl-push complex thread))))
26
Debugger: reimplementation of API
• db:next-newer-frame hard to implement
• Always moving!
• either cached frames become invalid or have to start over each time
• :zoom on overflowed stack takes much too long
27
Frame descriptors
• Frame-descriptors are now larger
• newer, older, chain slots
• validated via interlock with their real frames.
• frame descriptor has argcount shadow slot
28
Frames
[Diagram; boxes and fields as labeled on the slide]
• Stack frame: (link), return address, function, argcount, ...
• Frame descriptor: header, class, fp, newer, older, chain, (argcount), others ...