Duke Systems
CPS 210: Unix and All That
Jeff Chase, Duke University
http://www.cs.duke.edu/~chase/cps210
Unix: A lasting achievement?
“Perhaps the most important achievement of Unix is to demonstrate that a powerful operating system for interactive use need not be expensive…it can run on hardware costing as little as $40,000.”
The UNIX Time-Sharing System* D. M. Ritchie and K. Thompson
1974
DEC PDP-11/24
http://histoire.info.online.fr/pdp11.html
Let’s pause a moment to reflect...
From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, 2006
[Figure: core rate (SPECint) over time. Note log scale.]
Today Unix runs embedded in devices costing < $100.
[RT74]: historical hardware details
• [Ritchie/Thompson74] is the classic reference on Unix.
• In 1974, the advances we take for granted were in the future.
• They had to prove it on the hardware they had at the time.
• Many specific implementation choices have changed.
– 14-character file names
– assembly language → C
– 7 protection bits on files
– i-numbers and i-list
– 512-byte blocks
– ppt is “paper tape”???
– vowel embargo
Some lessons of history
• At the time it was created, Unix was the “simplest multi-user OS people could imagine.”
– It’s in the name: Unix vs. Multics
• Simple abstractions can deliver a lot of power.
– Many people have been inspired by the power of Unix.
• The community spent four decades making Unix complex again... but the essence is unchanged.
• Unix is a simple context to study core issues for classical OS design. “It’s in there.”
• Unix variants continue to be in wide use.
• They serve as a foundation for advances.
Simple?
• users
• files
• processes
• pipes
– which “look like” files
Users and files persist across reboots. They have symbolic names (you choose them) and internal IDs (the system chooses them).
Processes and pipes exist within a running system, and they are transient: they disappear on a crash or reboot. They have internal IDs.
Unix supports dynamic create/destroy of these objects. It manages the various name spaces. It has system calls to access these objects. It checks permissions.
Unix: some key concepts
• Names and namespaces
– directories and pathnames
– name tree and subtree grafting (mount)
– root directory and current directory
– path prefix list
– resolution
– links (aliases) and reference counting
• Access control by tags and labels
– inheritance of tags and labels
• Context manipulation
– fork vs. exec
Files: hierarchical name space
root directory
mount point
user home directory
external media volume or network storage
applications etc.
“Everything is a file”
[Figure: Venn diagram. Within the universal set, “Files” includes regular files, special files, and directories.]
File I/O
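#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

char buf[BUFSIZE];
int fd;
ssize_t n;

/* The parentheses around the assignment matter: compare fd, not open()'s result, to -1. */
if ((fd = open("../zot", O_TRUNC | O_RDWR)) == -1) {
    perror("open failed");
    exit(1);
}
/* Copy stdin (fd 0) to the file; write only the n bytes that read returned. */
while ((n = read(0, buf, BUFSIZE)) > 0) {
    if (write(fd, buf, n) != n) {
        perror("write failed");
        exit(1);
    }
}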
Open files are named within the process by an integer file descriptor.
Pathnames may be relative to process current directory.
Process passes status back to parent on exit, to report success/failure.
Process does not specify current file offset: the system remembers it.
Standard descriptors (0, 1, 2) for input, output, error messages (stdin, stdout, stderr).
Program
Context(Domain)
Thread
“Components in context”
For our purposes, an operating system is a platform that supports protection and isolation: every component runs within a context. Program, context and thread are OS abstractions.
execute
A context defines an isolated sandbox for a running program, so that it can use only the data and resources that the OS grants it.
Program
Running a program
“Unix Classic” simplifications
Context == process == (1 VAS + 1 thread + ...)
Each process runs exactly one program/component instance (at a time).
IPC channels are pipes.
All I/O is based on a simple common abstraction: file / stream.
[Figure: contents of a program: data, code, constants, initialized data, imports/exports, symbols, types/interfaces.]
The theater analogy
Threads
Address space
Program
script
context (stage)
[lpcox]
Running a program is like performing a play.
Processes and the kernel
Programs run as independent processes.
Protected system calls... and upcalls (e.g., signals).
Protected OS kernel mediates access to shared resources.
Threads enter the kernel for OS services.
Each process has a private virtual address space and one thread.
The kernel is a separate component/context with enforced modularity.
The kernel syscall interface supports processes, files, pipes, and signals.
Enforced modularity
pipe (or other channel)
An important theme from Monday’s class
By putting each component instance in a separate context, we can enforce modularity boundaries among components. Each component runs in a sandbox: components can interact only through pipes. Neither can access the internals of the other.
[Figure: layers of the Unix programming environment. Hardware and Kernel sit beneath programs such as sh, who, a.out, date, wc, grep, ed, vi, ld, as, comp, cpp, nroff, and cc, surrounded by other application programs.]
Unix defines uniform, modular ways to combine programs to build up more complex functionality.
Unix programming environment
stdin, stdout
Standard Unix programs read a byte stream from standard input (fd==0).
They write their output to standard output (fd==1).
That style makes it easy to combine simple programs using pipes or files.
If the parent sets it up, the program doesn’t even have to know.
Stdin or stdout might be bound to a file, pipe, device, or network socket.
Unix fork/exec/exit/wait syscalls
fork parent fork child
wait exit
int pid = fork();
Create a new process that is a clone of its parent.
exec*("program" [, argvp, envp]);
Overlay the calling process with a new program, and transfer control to it.
exit(status);
Exit with status, destroying the process. Note: this is not the only way for a process to exit!
int pid = wait*(&status);
Wait for exit (or other status change) of a child, and “reap” its exit status. Note: child may have exited before parent calls wait!
exec
initialize child context
Unix: users and their namespaces
• A Unix system has a set of user accounts.
– identities, principals
– often correspond to real users, but not always
• Each account has a username.
– a human-readable character string: “chase”
– also called a symbolic name
• Each account has a userID.
– a number for internal use
• These namespaces are flat.
• The system keeps a bidirectional map:
– f(username) = userID or
Principles of Computer System Design Saltzer & Kaashoek 2009
Protection Systems 101
Reference monitor
Example: Unix kernel
Isolation boundary
Labels and access control
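[Figure: Alice and Bob each log in. For each user, login does fork, setuid(“alice”) or setuid(“bob”), exec to start a shell, and the shell does fork/exec to start a tool. Alice’s tool (uid=“alice”) calls creat(“foo”), write, close, so file foo has owner=“alice”. Bob’s tool (uid=“bob”) calls open(“foo”), read.]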
Every file and every process is labeled/tagged with a user ID.
A process inherits its userID from its parent process.
A file inherits its owner userID from its creating process.
A privileged process may set its user ID.
Alice Bob
Labels and access control
[Figure: the same scenario. Alice’s tool (uid=“alice”) calls creat(“foo”), write, close, so file foo has owner=“alice”. Bob’s tool (uid=“bob”) calls open(“foo”), read.]
Should processes running with Bob’s userID be permitted to open file foo?
Alice Bob
Every system defines rules for assigning security labels to subjects (e.g., Bob’s process) and objects (e.g., file foo).
Every system defines rules to compare the security labels to authorize attempted accesses.
Post-note
• We talked about access policy in vanilla Unix.
• The owner of a Unix file may tag it with additional status specifying access rights for subjects.
– Access types = {read, write, execute} [3 bits]
– Subject types = {owner, group, other/anyone} [3 bits]
– If the file is executed, should the system setuid the process to the userID of the file’s owner? [1 bit]
– 10 bits total: (3x3)+1. Usually given in octal: e.g., “777” means 9 bits set: anyone can r/w/x the file, but no setuid.
– It is a very simple form of an access control list (ACL). Later systems like AFS have richer ACLs.
• Unix provides a syscall and shell command for owner to set the permission bits on each file (inode).
• “Group” was added later and is a little more complicated: a user may belong to multiple groups.
Init and Descendants
Kernel “handcrafts” initial process to run “init” program.
Other processes descend from init, and also run as root, including user login guards.
Login invokes a setuid system call to run user shell in a child process after user authenticates.
Children of user shell inherit the user’s identity (uid).
Processes: A Closer Look
[Figure: process = virtual address space + thread (with stack) + process descriptor (PCB). The PCB holds user ID, process ID, parent PID, sibling links, children, and resources.]
Each process has a thread bound to the VAS. The thread has a stack addressable through the VAS.
The kernel can suspend/restart the thread wherever and whenever it wants.
The OS maintains some state for each process in the kernel’s internal data structures: a file descriptor table, links to maintain the process tree, and a place to store the exit status.
The address space is a private name space for a set of memory segments used by the process.
The kernel must initialize the process memory for the program to run.
VAS example (32-bit)
[Figure: address space from 0x0 to 0x7fffffff with segments labeled Text (code), Static data, Dynamic data (heap/BSS), Stack, and Reserved.]
• An addressable array of bytes…
• Containing every instruction the process thread can execute…
• And every piece of data those instructions can read/write…
– i.e., read/write == load/store
• Partitioned into logical segments with distinct purpose and use.
• Every memory reference by a thread is interpreted in its VAS context.
– Resolve to a location in machine memory
• A given address in different VAS may resolve to different locations.
64 bytes: 3 ways
[Figure: the same bytes starting at p, viewed three ways: char p[] (char *p), int p[] (int* p), and char* p[] (char** p). Offset labels run from 0x0 to 0x1f.]
Pointers (addresses) are 8 bytes on a 64-bit machine.
Alignment
[Figure: the same three views of the bytes at p: char p[] (char *p), int p[] (int* p), and char* p[] (char** p), with some positions marked X.]
The machine requires that an n-byte value is aligned on an n-byte boundary, where n = 2^i.
Heap allocation
Allocated heap blocks for structs or objects. Align!
A contiguous chunk of memory obtained from OS kernel. E.g., with Unix sbrk() system call.
A runtime library obtains the block and manages it as a “heap” for use by the programming language environment, to store dynamic objects.
E.g., with Unix malloc and free library calls.
Alternative: block maps
[Figure: a map that points to scattered storage slots.]
The storage in a heap block is contiguous in the VAS. C and other PL environments require this.
That complicates the heap manager because the heap blocks may be different sizes.
Idea: use a level of indirection through a map to assemble a storage object from “scraps” of storage in different locations.
The “scraps” can be fixed-size slots: that makes allocation easy because they are interchangeable.
Example: page tables that implement a VAS.
Variable Partitioning
Variable partitioning is the strategy of parking differently sized cars along a street with no marked parking space dividers.
Wasted space: external fragmentation
[Figure: three differently sized blocks, labeled 1, 2, and 3, with gaps between them.]
What’s in an Object File or Executable?
int j = 327;
char* s = "hello\n";
char sbuf[512];

int p() {
    int k = 0;
    j = write(1, s, 6);
    return(j);
}
[Figure: layout of an object file or executable]
header: “magic number” indicates type of image; section table is an array of (offset, len, startVA)
program sections:
– text: program instructions (p)
– idata: immutable data (constants) (“hello\n”)
– wdata: writable global/static data (j, s)
symbol table: j, s, p, sbuf
relocation records
The symbol table and relocation records are used by the linker; they may be removed after the final link step and strip.
A Peek Inside a Running Program
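[Figure: a CPU with registers (R0 ... Rn, PC, SP) beside “memory”: an address space (virtual or physical) running from 0 to high. The address space holds your program (code), the common runtime (code library), your data, the heap, and the stack. Values x and y appear both in registers and in memory.]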
Process Creation in Unix
int pid;
int status = 0;

if (pid = fork()) {
    /* parent (a real program would also check for pid == -1, which means fork failed) */
    …..
    pid = wait(&status);
} else {
    /* child */
    …..
    exit(status);
}
Parent uses wait to sleep until the child exits; wait returns child pid and status.
Wait variants allow wait on a specific child, or notification of stops and other signals.
The fork syscall returns twice: it returns a zero to the child and the child process ID (pid) to the parent.
The Shell
• Users may select from a range of interpreter programs available
– or even write their own (to add to the confusion)
– csh, sh, ksh, tcsh, bash: choose your flavor…
• Shells execute commands composed of program filenames, args, and I/O redirection symbols.
– Shells can run files of commands (scripts) for more complex tasks, e.g., by redirecting shell’s stdin.
– Shell’s behavior is guided by environment variables.
– E.g., $PATH
Using the shell
• Commands: ls, cat, and all that
• Current directory: cd and pwd
• Arguments: echo
• Signals: ctrl-c
• Job control, foreground, and background: &, ctrl-z, bg, fg
• Environment variables: printenv and setenv
• Most commands are programs: which, $PATH, and /bin
• Shells are commands: sh, csh, ksh, tcsh, bash
• Pipes and redirection: ls | grep a
• Files and I/O: open, read, write, lseek, close
• stdin, stdout, stderr
• Users and groups: whoami, sudo, groups