Upload
phungkien
View
222
Download
0
Embed Size (px)
Citation preview
c© 2012– Andrei Alexandrescu. 1 / 51
Systems Programming & Beyond using C++ and D
Hello, WorlD!
Andrei Alexandrescu
Prepared for LASER Summer School 2012
D is. . .
c© 2012– Andrei Alexandrescu. 3 / 51
• Efficient when it can
• Convenient when it should
• Safe when it must
Basics
c© 2012– Andrei Alexandrescu. 4 / 51
• Statically typed
Use deduction wherever possible
• Sininimp-Mulinint object-oriented style
• Gray-box modularity:
◦ Python-inspired modules for “classic” entities
◦ White-box parameterized types and functions
• Heterogeneous translation of parameterized code
Hello, World!
c© 2012– Andrei Alexandrescu. 5 / 51
#!/usr/bin/rdmd
import std.stdio;
void main()
{
writeln("Hello, world!");
}
• “Meh”worthy
• However:
◦ Simple
◦ Correct
◦ Scriptable
c© 2012– Andrei Alexandrescu. 6 / 51
bool binarySearch(R, T)(R input, T value)
if (isRandomAccessRange!R
&& is(ElementType!R == T)) {
while (!input.empty) {
auto i = input.length / 2;
auto mid = input[i];
if (mid > value) input = input[0 .. i];
else if (mid < value) {
input = input[i + 1 .. $];
}
else return true;
}
return false;
}
Memory Safety
c© 2012– Andrei Alexandrescu. 8 / 51
• All data is read with the type it was written
• Eliminates unpleasant class of errors
• Makes bugs reproducible
• Preserves the integrity of the type system
Challenge
c© 2012– Andrei Alexandrescu. 9 / 51
• System languages must:
◦ Forge pointers (linkers/loaders)
◦ Magically change type of memory (allocators/GCs)
◦ Recycle memory “early”
◦ Circumvent type system
Attack
c© 2012– Andrei Alexandrescu. 10 / 51
• All of D can’t be safe
• Define a safe subset
• Safe subset usable for large portions or entire programs
• Tagged with @safe attribute
@safe double cube(double x) {
return x * x * x;
}
• Unsafe functions tagged with @system
@system void free(void*);
But wait
c© 2012– Andrei Alexandrescu. 11 / 51
• Important case: safe functions with
unknown/unprovable implementation
◦ fopen, malloc, isAlpha
◦ Small buffer optimization
◦ Tagged union implementation
• Define attribute @trusted
Rules
c© 2012– Andrei Alexandrescu. 12 / 51
• @safe functions cannot:
◦ call @system functions
◦ escape addresses of locals
◦ forge pointers
◦ cast pointers around
◦ do pointer arithmetic
• N.B. Not all rules completely implemented
◦ Some on the conservative side
◦ A few details in flux
Resulting setup
c© 2012– Andrei Alexandrescu. 13 / 51
• Low notational overhead: entire modules or portions
thereof can be annotated
• Application divided in:
◦ Safe part, uses GC, easy to write and debug
◦ Trusted part, encapsulated unsafe manipulation,
proven by means of code review
◦ System part, non-encapsulated unsafe
manipulation, only when solidly justified
Thesis
c© 2012– Andrei Alexandrescu. 15 / 51
• Writing entire programs in pure style difficult
• Writing fragments of programs in pure style easy, useful
• + Easier to verify useful properties, debug
• + Better code generation
• − Challenge: interfacing pure and impure code
Functional Factorial (yawn)
c© 2012– Andrei Alexandrescu. 16 / 51
ulong factorial(uint n) {
return n <= 1 ? 1 : n * factorial(n - 1);
}
• It’s PSPACE!
• Somebody should do hard time for this
However, it’s pure
c© 2012– Andrei Alexandrescu. 17 / 51
pure ulong factorial(uint n) {
return n <= 1 ? 1 : n * factorial(n - 1);
}
• Pure is good
Functional Factorial, Fixed
c© 2012– Andrei Alexandrescu. 18 / 51
pure ulong factorial(uint n) {
ulong crutch(uint n, ulong result) {
return n <= 1
? result
: crutch(n - 1, n * result);
}
return crutch(n, 1);
}
• Threads state through as parameters
• You know what? I don’t care for it
Honest Factorial
c© 2012– Andrei Alexandrescu. 19 / 51
ulong factorial(uint n) {
ulong result = 1;
for (; n > 1; --n) {
result *= n;
}
return result;
}
• But no longer pure!
• Well allow me to retort
Pure is as pure does
c© 2012– Andrei Alexandrescu. 21 / 51
• “Pure functions always return the same result for the
same arguments”
• No reading and writing of global variables
◦ Global constants okay!
• No calling of impure functions
• Who said anything about local, transient state inside the
function?
Transitive State
c© 2012– Andrei Alexandrescu. 22 / 51
pure void reverse(T)(T[] a) {
foreach (i; 0 .. data.length / 2) {
swap(data[i], data[$ - i - 1]);
}
}
• Possibility: disallow
• More useful: relaxed rule
• Operate with transitive closure of state reachable
through parameter
• Not functional pure, but an interesting superset
• No need for another annotation, it’s all in the signature!
(BTW, my int is purer than your int)
c© 2012– Andrei Alexandrescu. 23 / 51
pure BigInt factorial(uint n) {
BigInt result = 1;
for (; n > 1; --n) {
result *= n;
}
return result;
}
• Works, but not in released version
• Not all stdlib “purified” yet
Aftermath
c© 2012– Andrei Alexandrescu. 24 / 51
• If parameters reach mutable state:
◦ Relaxed pure—no globals, no I/O, no impure calls
• If parameters can’t reach mutable state:
◦ “Haskell-grade” observed purity
◦ Yet imperative implementation possible
◦ As long as it’s local only
Generic Programming
c© 2012– Andrei Alexandrescu. 26 / 51
• “Write once, instantiate anywhere”
◦ Specialized implementations welcome
• Prefers static type information and static expansion
(macro style)
• Fosters strong mathematical underpinnings
◦ Minimizing assumptions common theme in math
• Tenuous relationship with binary interfaces
• Starts with algorithms, not interfaces or objects
• “Premature encapsulation is the root of some
derangement”
Warmup: the “dream min”
c© 2012– Andrei Alexandrescu. 27 / 51
• Should work at efficiency comparable to hand-written
code
• Avoid nonsensical calls min("hello", 42) etc.
• Work for all ordered types and conversions
• Decline incompatible types, without prejudice
First prototype
c© 2012– Andrei Alexandrescu. 28 / 51
auto min(L, R)(L lhs, R rhs) {
return rhs < lhs ? rhs : lhs;
}
auto a = min(x, y);
• This function is as efficient as any specialized,
handwritten version (true genericity)
• Deduction for argument types and result type
Reject nonsensical calls
c© 2012– Andrei Alexandrescu. 29 / 51
• min("hello", 42) client code error
auto min(L, R)(L lhs, R rhs)
if (typeof(rhs < lhs) == bool) {
return rhs < lhs ? rhs : lhs;
}
• Rejects calls with 0 or 1 arguments
• Allows other overloads to take over, e.g. min element
over a collection
Naming names
c© 2012– Andrei Alexandrescu. 30 / 51
template ordered(L, R) {
enum ordered =
is(typeof(L.init < R.init) == bool);
}
auto min(L, R)(L lhs, R rhs)
if (ordered!(L, R)) {
return rhs < lhs ? rhs : lhs;
}
• The “eponymous template trick”
How about min over many elements?
c© 2012– Andrei Alexandrescu. 31 / 51
auto min(R)(R range)
if (isInputRange!R &&
is(typeof(range.front < range.front) == bool))
{
auto result = range.front;
range.popFront();
foreach (e; range) {
if (e < result) result = e;
}
return result;
}
auto m = [ 1, 5, 2, 0, 7, 9 ].min();
• Works over anything that can be iterated
How about argmin now?
c© 2012– Andrei Alexandrescu. 32 / 51
auto argmin(alias fun, R)(R r)
if (isInputRange!R &&
is(typeof(fun(r.front) < fun(r.front)) == bool))
{
auto result = r.front;
auto cache = fun(result);
r.popFront();
foreach (e; r) {
auto cand = fun(e);
if (cand > cache) continue;
result = e;
cache = cand;
}
return result;
}
argmin
c© 2012– Andrei Alexandrescu. 33 / 51
auto s = ["abc", "a", "xz"];
auto m = s.argmin!((x) => x.length);
• Works on anything iterable and any predicate
• Predicate is passed by alias
• No loss of efficiency
Generative programming
c© 2012– Andrei Alexandrescu. 35 / 51
• In brief: code that generates code
• Generic programming often requires algorithm
specialization
• Specification often present in a DSL
Embedded DSLs
c© 2012– Andrei Alexandrescu. 37 / 51
• Formatted printing?
• Regular expressions?
• EBNF?
Embedded DSLs
c© 2012– Andrei Alexandrescu. 37 / 51
• Formatted printing?
• Regular expressions?
• EBNF?
• PEG?
Embedded DSLs
c© 2012– Andrei Alexandrescu. 37 / 51
• Formatted printing?
• Regular expressions?
• EBNF?
• PEG?
• SQL?
Embedded DSLs
c© 2012– Andrei Alexandrescu. 37 / 51
• Formatted printing?
• Regular expressions?
• EBNF?
• PEG?
• SQL?
• . . . Pasta for everyone!
Embedded DSLs
c© 2012– Andrei Alexandrescu. 38 / 51
Here: use with native grammar
Process during compilation
Generate D code accordingly
Compile-Time Evaluation
c© 2012– Andrei Alexandrescu. 39 / 51
• A large subset of D available for compile-time evaluation
ulong factorial(uint n) {
ulong result = 1;
for (; n > 1; --n) result *= n;
return result;
}
...
auto f1 = factorial(10); // run-time
static f2 = factorial(10); // compile-time
Code injection with mixin
c© 2012– Andrei Alexandrescu. 40 / 51
mixin("writeln(\"hello, world\");");
mixin(generateSomeCode());
• Not as glamorous as AST manipulation but darn
effective
• Easy to understand and debug
• Now we have compile-time evaluation AND mixin. . .
Example: bitfields in library
c© 2012– Andrei Alexandrescu. 42 / 51
struct A {
int a;
mixin(bitfields!(
uint, "x", 2,
int, "y", 3,
uint, "z", 2,
bool, "flag", 1));
}
A obj;
obj.x = 2;
obj.z = obj.x;
Parser
c© 2012– Andrei Alexandrescu. 43 / 51
import pegged.grammar; // by Philippe Sigaud
mixin(grammar("
Expr < Factor AddExpr*
AddExpr < (’+’/’-’) Factor
Factor < Primary MulExpr*
MulExpr < (’*’/’/’) Primary
Primary < Parens / Number / Variable
/ ’-’ Primary
Parens < ’(’ Expr ’)’
Number <~ [0-9]+
Variable <- Identifier
"));
Usage
c© 2012– Andrei Alexandrescu. 44 / 51
// Parsing at compile-time:
static parseTree1 = Expr.parse(
"1 + 2 - (3*x-5)*6");
pragma(msg, parseTree1.capture);
// Parsing at run-time:
auto parseTree2 = Expr.parse(readln());
writeln(parseTree2.capture);
Compile-time regular expressions
c© 2012– Andrei Alexandrescu. 48 / 51
• GSoC project by Dmitry Olshansky: FReD
• Fully UTF capable, no special casing for ASCII
• Two modes sharing the same backend:
auto r1 = regex("^.*/([^/]+)/?$");
static r2 = ctRegex!("^.*/([^/]+)/?$");
• Run-time version uses intrinsics in a few places
• Static version generates specialized automaton, then
compiles it