CS 519: Operating System Theory

Instructor: Liviu Iftode (iftode@cs)
TA: Nader Boushehrinejadmoradi (naderb@cs)

Fall 2011

Logistics

Location and schedule: CoRE A, Thursdays from noon to 3 pm

Instructor: Liviu Iftode
  Office: CoRE 311
  Office hours: Thursdays, 10-11 am

TA: Nader Boushehrinejadmoradi
  Office: Hill 353
  Office hours: Wednesdays, 2-4 pm

More information:
  http://www.cs.rutgers.edu/~iftode/cs519-2011.html
  The page for the course on http://sakai.rutgers.edu

Course Overview

Goals:
  Understand how an operating system works
  Learn how OS concepts are implemented in a real operating system
  Get an introduction to systems programming
  Learn about performance evaluation
  Learn about current trends in OS research

Course Structure

Structure:
  Each major area:
    Review basic material
    Discuss the implementation in xv6
    Read, present, and discuss interesting papers
  Programming assignments and project

Suggested Studying Approach

Read the assigned chapter and/or code before the lecture and try to understand it
Start homework and project right away: systems programming is not easy and takes a lot of time!
Ask questions during the lecture
Use the mailing list for discussion. Do not be afraid to answer a question posted by a colleague even if you are not sure; this is a way to validate your understanding of the material.

Course Topics

Processes, threads and synchronization
Memory management and virtual memory
CPU scheduling
File systems and I/O management
Distributed systems
New trends in OS research

Textbooks - Required

Stallings. Operating Systems: Internals and Design Principles, Prentice-Hall.
  Any recent version will do
Papers available on the Web

Textbooks - Recommended

Andrew Tanenbaum. Distributed Operating Systems, Prentice-Hall.

MIT xv6 OS

Teaching OS developed by Russ Cox, Frans Kaashoek, and Robert Morris at MIT

UNIX V6 ported to Intel x86 machines

Download source code and lecture materials from the xv6 home page at MIT

Course Requirements

Prerequisites
  Undergraduate OS and computer architecture courses
  Good programming skills in C and UNIX

What to expect
  Several programming assignments with write-ups
  Challenging project (a distributed shared memory protocol)
  Midterm and final exams
  Substantial reading
  Read, understand, and extend/modify xv6 code
  Paper presentations

Homework

Goals
  Learn to design, implement, and evaluate a significant OS-level software system
  Improve systems programming skills: virtual memory, threads, synchronization, sockets, etc.

Structure
  All assignments are individual
  Software must execute correctly
  Performance evaluation
  Written report for each assignment

Grading

Midterm: 25%
Final: 25%
Programming assignments and in-class presentations: 25%
Project: 25%

Today

What is an Operating System?
  Stallings 2.1-2.4
Architecture refresher
…

What is an operating system?

A software layer between the hardware and the application programs/users, which provides a virtual machine interface: easy to use (hides complexity) and safe (prevents and handles errors)

Acts as a resource manager that allows programs/users to share the hardware resources in a protected way: fair and efficient

[Figure: three layers: hardware, operating system, application (user)]

How does an OS work?

Receives requests from the application: system calls
Satisfies the requests: may issue commands to hardware
Handles hardware interrupts: may upcall the application
OS complexity: synchronous calls + asynchronous events

[Figure: the OS sits between the application (user) and the hardware. System calls and upcalls cross the application/OS boundary; commands and interrupts cross the OS/hardware boundary. The OS is split into a hardware-independent part and a hardware-dependent part.]

Mechanism and Policy

Mechanisms: data structures and operations that implement an abstraction (e.g. the buffer cache)

Policies: the procedures that guide the selection of a certain course of action from among alternatives (e.g. the replacement policy for the buffer cache)

Traditional OS is rigid: mechanism together with policy

[Figure: three layers: hardware, operating system (mechanism + policy), application (user)]

Mechanism-Policy Split

Single policy often not the best for all cases
Separate mechanisms from policies:
  OS provides the mechanism + some policy
  Applications may contribute to the policy
Flexibility + efficiency require new OS structures and/or new OS interfaces
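
The split can be sketched with the buffer cache example from the previous slide. This is only an illustration, not how xv6 or any real kernel does it: `BufferCache`, `lru`, and `mru` are hypothetical names. The cache (mechanism) stores blocks and evicts when full; which block to evict (policy) is a function supplied by the caller.

```python
from collections import OrderedDict


class BufferCache:
    """Mechanism: a fixed-size block cache. The eviction choice is delegated."""

    def __init__(self, capacity, choose_victim):
        self.capacity = capacity
        self.choose_victim = choose_victim  # policy supplied by the caller
        self.blocks = OrderedDict()         # block number -> data, least recent first

    def read(self, block_no, load_from_disk):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)  # record the access
            return self.blocks[block_no]
        if len(self.blocks) >= self.capacity:
            victim = self.choose_victim(list(self.blocks))
            del self.blocks[victim]
        self.blocks[block_no] = load_from_disk(block_no)
        return self.blocks[block_no]


def lru(blocks_by_recency):
    """Policy: evict the least recently used block."""
    return blocks_by_recency[0]


def mru(blocks_by_recency):
    """Policy: evict the most recently used block."""
    return blocks_by_recency[-1]


def fake_disk(block_no):
    # Stand-in for a disk read, for illustration only.
    return f"data{block_no}"


lru_cache = BufferCache(2, lru)
mru_cache = BufferCache(2, mru)
for block in (1, 2, 1, 3):
    lru_cache.read(block, fake_disk)
    mru_cache.read(block, fake_disk)
```

The same access sequence leaves different blocks cached under the two policies, while the cache code itself is unchanged.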

OS Mechanisms and Policies

System Abstraction: Processes

A process is a system abstraction: illusion of being the only job in the system

[Figure: three layers: user (application), operating system (process), hardware (computer). Between user and OS: create, kill processes, inter-process communication. Between OS and hardware: multiplex resources.]

Processes: Mechanism and Policy

Mechanism:
  Creation, destruction, suspension, context switch, signaling, IPC, etc.

Policy:
  Minor policy questions:
    Who can create/destroy/suspend processes?
    How many active processes can each user have?
  Major policy question that we will concentrate on:
    How to share system resources between multiple processes?
    Typically broken into a number of orthogonal policies for individual resources such as CPU, memory, and disk.

Processor Abstraction: Threads

A thread is a processor abstraction: illusion of having 1 processor per execution context

[Figure: three layers: application (execution context), operating system (thread), hardware (processor). Between application and OS: create, kill, synch. Between OS and hardware: context switch.]

Process vs. thread: a process is the unit of resource ownership, while a thread is the unit of instruction execution.

Threads: Mechanism and Policy

Mechanism:
  Creation, destruction, suspension, context switch, signaling, synchronization, etc.

Policy:
  How to share the CPU between threads from different processes?
  How to share the CPU between threads from the same process?

Threads

Traditional approach: OS uses a single policy (or at most a fixed set of policies) to schedule all threads in the system. Assume two classes of jobs: interactive and batch.

More sophisticated approaches: application-controlled scheduling, reservation-based scheduling, etc

Memory Abstraction: Virtual memory

Virtual memory is a memory abstraction: illusion of large contiguous memory, often more memory than physically available

[Figure: three layers: application (address space), operating system (virtual memory), hardware (physical memory). Between application and OS: virtual addresses. Between OS and hardware: physical addresses.]

Virtual Memory: Mechanism

Mechanism:
  Virtual-to-physical memory mapping, page-fault, etc.

[Figure: processes p1 and p2, each with its own virtual address space, connected through v-to-p memory mappings to a single physical memory]
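
A minimal sketch of the mapping mechanism, under stated assumptions: the 4096-byte page size, the dictionary page tables, and the names `translate` and `PageFault` are all hypothetical and chosen for illustration. Each process has its own table, so the same virtual address can land on different physical addresses, and an unmapped page raises a page fault.

```python
PAGE_SIZE = 4096  # assumed page size, not given in the slides


class PageFault(Exception):
    """Raised when a virtual page has no physical frame mapped."""


def translate(page_table, vaddr):
    """Map a virtual address to a physical one using one process's page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)  # virtual page number, offset in page
    if vpn not in page_table:
        raise PageFault(vpn)
    return page_table[vpn] * PAGE_SIZE + offset


# Hypothetical per-process tables: virtual page number -> physical frame number.
p1 = {0: 7, 1: 3}
p2 = {0: 2}

p1_phys = translate(p1, 4100)   # page 1, offset 4, mapped to frame 3
p2_phys = translate(p2, 100)    # page 0, offset 100, mapped to frame 2
```

On a real machine the fault would transfer control to the OS, which applies its policy (allocate a frame, fetch the page from disk, or kill the process).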

Virtual Memory: Policy

Policy:
  How to multiplex a virtual memory that is larger than the physical memory onto what is available?
  How to share physical memory between multiple processes?

Virtual Memory

Traditional approach: OS provides a sufficiently large virtual address space for each running application, does memory allocation and replacement, and may ensure protection

More sophisticated approaches: external memory management, huge (64-bit) address space, global virtual address space

Storage Abstraction: File System

A file system is a storage abstraction: illusion of structured storage space

[Figure: three layers: application/user (copy file1 file2), operating system (files, directories), hardware (disk). Between application and OS: naming, protection, operations on files. Between OS and hardware: operations on disk blocks.]

File System

Mechanism:
  File creation, deletion, read, write, file-block-to-disk-block mapping, file buffer cache, etc.

Policy:
  Sharing vs. protection?
  Which block to allocate for new data?
  File buffer cache management?

File System

Traditional approach: OS does disk block allocation and caching (buffer cache), disk operation scheduling, and management of the buffer cache

More sophisticated approaches: application-controlled buffer cache replacement, log-based allocation (makes writes fast)

Communication Abstraction: Messaging

Message passing is a communication abstraction: illusion of reliable (sometimes ordered) message transport

[Figure: three layers: application (sockets), operating system (TCP/IP protocols), hardware (network interface). Between application and OS: naming, messages. Between OS and hardware: network packets.]

Message Passing

Mechanism:
  Send, receive, buffering, retransmission, etc.

Policy:
  Congestion control and routing
  Multiplexing multiple connections onto a single NIC

Message Passing

Traditional approach: OS provides naming schemes, reliable transport of messages, packet routing to destination

More sophisticated approaches: user-level protocols, zero-copy protocols, active messages, memory-mapped communication

Character & Block Devices

The device interface gives the illusion that devices support the same API: character stream and block access

[Figure: three layers: application/user (read character from device), operating system (character & block API), hardware (keyboard, mouse, etc.). Between application and OS: naming, protection, read, write. Between OS and hardware: hardware-specific PIO, interrupt handling, or DMA.]

Devices

Mechanisms
  Open, close, read, write, ioctl, etc.
  Buffering

Policies
  Protection
  Sharing?
  Scheduling?

UNIX

Source: Silberschatz, Galvin, and Gagne 2005

Major Issues in OS Design

Programming API: what should the VM look like?
Resource management: how should the hardware resources be multiplexed among multiple users?
Sharing: how should resources be shared among multiple users?
Protection: how to protect users from each other? How to protect programs from each other? How to protect the OS from applications and users?
Communication: how can applications exchange information?
Structure: how to organize the OS?
Concurrency: how do we deal with the concurrency that is inherent in OSes?

Major Issues in OS Design

Performance: how to make it all run fast?

Reliability: how do we keep the OS from crashing?

Persistence: how can we make data last beyond program execution?

Accounting: how do we keep track of resource usage?

Distribution: how do we make it easier to use multiple computers in conjunction?

Scaling: how do we keep the OS efficient and reliable as the offered load increases (more users, more processes, more processors)?

Architecture Refresher

von Neumann Machine

The first computers (late 40's) were calculators
The advance was the idea of storing the instructions (coded as numbers) along with the data in the same memory

Conceptual Model

[Figure: a CPU with arithmetic units (+, -, *, /) connected to memory, drawn as a "big byte array". Each memory cell has an address (0, 1, 2, ... 9) and contents.]

Operating System Perspective

A computer is a piece of hardware that runs the fetch-decode-execute loop
Next slides: walk through a very simple computer to illustrate
  Machine organization
    What the pieces are and how they fit together
  The basic fetch-decode-execute loop
    How higher-level constructs are translated into machine instructions
At its core, the OS builds what looks like a more sophisticated machine on top of this basic hardware

Fetch-Decode-Execute

Computer as a large, general-purpose calculator
  Want to program it for multiple functions

All von Neumann computers follow the same loop:
  Fetch the next instruction from memory
  Decode the instruction to figure out what to do
  Execute the instruction and store the result

Instructions are simple. Examples:
  Increment the value of a memory cell by 1
  Add the contents of memory cells X and Y and store in Z
  Multiply contents of memory cells A and B and store in B

Instruction Encoding

How to represent instructions as numbers?

  operator | operand | operand | destination
  8 bits   | 8 bits  | 8 bits  | 8 bits

Operators: + is 1, - is 2, * is 3, / is 4

Example Encoding

Add cell 28 to cell 63 and place result in cell 100:

  operator | source operand | source operand | destination
  + (1)    | Cell 28        | Cell 63        | Cell 100
  8 bits   | 8 bits         | 8 bits         | 8 bits

Operators: + is 1, - is 2, * is 3, / is 4

Instruction as a number in:
  Decimal: 1:28:63:100
  Binary: 00000001:00011100:00111111:01100100
  Hexadecimal: 01:1C:3F:64
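
The encoding above can be checked with a few lines of Python. This is a sketch of the slide's toy format only; the function name `encode` is an illustrative choice.

```python
OPCODES = {"+": 1, "-": 2, "*": 3, "/": 4}  # operator codes from the slide


def encode(op, src1, src2, dest):
    """Pack one instruction into four 8-bit fields: operator, two sources, destination."""
    fields = [OPCODES[op], src1, src2, dest]
    if any(not 0 <= field <= 255 for field in fields):
        raise ValueError("each field must fit in 8 bits")
    return bytes(fields)


word = encode("+", 28, 63, 100)
decimal = ":".join(str(b) for b in word)
binary = ":".join(f"{b:08b}" for b in word)
hexadecimal = ":".join(f"{b:02X}" for b in word)
print(decimal, binary, hexadecimal, sep="\n")
```

With 8-bit fields, an instruction can only name cells 0-255, which is one reason real instruction sets need other addressing schemes.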

The Program Counter

Where is the “next instruction”?
A special memory cell in the CPU called the “program counter” (the PC) points to it
  Special-purpose memory in the CPU and devices is called a register

Naïve fetch cycle: increment the PC by the instruction length (4) after each execute
  Assumes all instructions are the same length
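
The naive cycle can be sketched as a tiny simulator for the 4-field instruction format shown earlier. It rests on assumptions not in the slides: memory is a Python list with one number per cell, division is integer division, and the machine simply stops after a given number of steps because the toy instruction set has no halt instruction.

```python
import operator

# Operator codes from the encoding slide; integer division is an assumption.
OPERATIONS = {1: operator.add, 2: operator.sub, 3: operator.mul, 4: operator.floordiv}
INSTRUCTION_LENGTH = 4


def run(memory, pc, steps):
    """Run `steps` instructions starting at address `pc`; return the final PC."""
    for _ in range(steps):
        op, src1, src2, dest = memory[pc:pc + INSTRUCTION_LENGTH]  # fetch
        operation = OPERATIONS[op]                                  # decode
        memory[dest] = operation(memory[src1], memory[src2])        # execute
        pc += INSTRUCTION_LENGTH                                    # naive PC update
    return pc


memory = [0] * 128
memory[0:4] = [1, 28, 63, 100]    # instruction 0: cell 100 = cell 28 + cell 63
memory[4:8] = [3, 100, 100, 101]  # instruction 1: cell 101 = cell 100 * cell 100
memory[28], memory[63] = 5, 7
final_pc = run(memory, pc=0, steps=2)
```

Code and data share the same list here, which is the von Neumann idea in miniature: nothing stops an instruction from overwriting another instruction.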

Conceptual Model

[Figure: the CPU holds the arithmetic units (+, -, *, /) and the program counter, shown with value 4. Memory cells 0-9 hold the instructions, four cells each (operator, operand 1, operand 2, destination): instruction 0 at memory address 0, instruction 1 at memory address 4.]

Memory Indirection

How do we access array elements efficiently if all we can do is name a cell?
Modify the operand to allow for fetching an operand "through" a memory location
  E.g.: LOAD [5], 2 means fetch the contents of the cell whose address is in cell 5 and put it into cell 2
  So, if cell 5 had the number 100, we would place the contents of cell 100 into cell 2

This is called indirection
  Fetch the contents of the cell “pointed to” by the cell named in the operand

Use an operand bit to signify if an indirection is desired
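
A small sketch of the LOAD example, assuming memory is a Python list and modeling the operand bit as a boolean flag. The `load` function and its `indirect` parameter are hypothetical names.

```python
def load(memory, src, dest, indirect=False):
    """Copy a cell into `dest`; with `indirect`, treat cell `src` as a pointer."""
    address = memory[src] if indirect else src
    memory[dest] = memory[address]


memory = [0] * 128
memory[5] = 100    # cell 5 holds an address
memory[100] = 42   # the cell that address refers to

load(memory, 5, 2, indirect=True)  # LOAD [5], 2: cell 2 gets the contents of cell 100
load(memory, 5, 3)                 # LOAD 5, 3: cell 3 gets the contents of cell 5
```

Stepping the value in cell 5 walks through consecutive cells, which is how array elements are reached.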

Conditionals and Looping

Primitive “computers” only followed linear instructions
Breakthrough in early computing was the addition of conditionals and branching
  Instructions that modify the Program Counter

Conditional instructions
  If the content of this cell is [positive, not zero, etc.], execute the instruction or not

Branch instructions
  If the content of this cell is [zero, non-zero, etc.], set the PC to this location
  jump is an unconditional branch

Example: While Loop

  while (counter > 0) {
    sum = sum + Y[counter];
    counter--;
  };

Variables to memory cells:
  counter is cell 1
  sum is cell 2
  index is cell 3
  Y[0] = cell 4, Y[1] = cell 5, …

  100 LOOP: BZ 1,END     // Branch to address of END if cell 1 is 0.
  104       ADD 2,[3],2  // Add cell 2 and the value of the cell pointed to by cell 3, then place the result in cell 2
  108       DEC 3        // Decrement cell 3 by 1
  112       DEC 1        // Decrement cell 1 by 1
  116       JUMP LOOP    // Start executing from the address of LOOP
  120 END:  <next code block>

Columns: memory cell address, assembly label, assembly "mnemonic", English
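
To see that the assembly matches the C loop, the five instructions can be stepped through by hand in Python. This is only a sketch: each instruction is hard-coded by its address rather than decoded, and it assumes cell 3 (index) starts out holding the address of Y[counter], which the slide implies but does not state.

```python
def run_loop(memory):
    """Step through the loop at addresses 100-116 until END (120) is reached."""
    pc = 100
    while pc != 120:
        if pc == 100:                  # LOOP: BZ 1,END
            pc = 120 if memory[1] == 0 else 104
        elif pc == 104:                # ADD 2,[3],2 (indirect through cell 3)
            memory[2] = memory[2] + memory[memory[3]]
            pc = 108
        elif pc == 108:                # DEC 3
            memory[3] -= 1
            pc = 112
        elif pc == 112:                # DEC 1
            memory[1] -= 1
            pc = 116
        elif pc == 116:                # JUMP LOOP
            pc = 100
    return memory[2]


memory = [0] * 16
memory[4:8] = [10, 20, 30, 40]  # Y[0..3] in cells 4..7
memory[1] = 3                   # counter
memory[2] = 0                   # sum
memory[3] = 4 + memory[1]       # index: address of Y[counter] (assumed setup)
total = run_loop(memory)
```

As in the C code, the loop adds Y[3], Y[2], and Y[1] and never touches Y[0].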

Registers

Architecture rule: large memories are slow, small ones are fast
  But everyone wants more memory!

Solution: put a small amount of memory in the CPU for faster operation
  Most programs work on only small chunks of memory in a given time period. This is called locality.
  So, if we cache the contents of a small number of memory cells in the CPU memory, we might be able to execute many instructions before having to access memory

Small memory in CPU is named separately in the instructions from the “main memory”
  Small memory in CPU = registers
  Large memory = main memory

Register Machine Model

[Figure: the CPU holds the program counter (shown with value 8), arithmetic units (+, -, *, /), logic units (<, >, !=), and registers 0, 1, and 2 (shown holding 24, 100, and 18). Memory cells 0-9 sit outside the CPU.]

Registers (cont)

Most CPUs have 16-32 “general-purpose” registers
  All look the “same”: any combination of operators, operands, and destinations is possible

Operands and destination can be in:
  Registers only (Sparc, PowerPC, Mips, Alpha)
  Registers & 1 memory operand (Intel x86 and clones)
  Any combination of registers and memory (Vax)

Only memory operations possible in "register-only" machines are load from and store to memory
Operations are 100-1000 times faster when operands are in registers than when they are in memory
Registers save instruction space too
  Only need to address 16-32 registers, not GB of memory

Typical Instructions

Add the contents of register 2 and register 3 and place result in register 5
  ADD r2,r3,r5

Add 100 to the PC if register 2 is not zero (relative branch)
  BNZ r2,100

Load the contents of memory location whose address is in register 5 into register 6
  LDI r5,r6

Memory Hierarchy

[Figure: cpu, cache, main memory, and disks in sequence. Word transfers between cpu and cache, block transfers between cache and main memory, page transfers between main memory and disks.]

Moving down the hierarchy:
  decrease cost per bit
  decrease frequency of access
  increase capacity
  increase access time
  increase size of transfer unit

Memory Access Costs

Processor                              Level  Size   Assoc   Block size  Access time
Intel Pentium IV Extreme Edition       L1     8KB    4-way   64B         2 cycles
(3.2 GHz, 32 bits)                     L2     512KB  8-way   64B         19 cycles
                                       L3     2MB    8-way   64B         43 cycles
                                       Mem                               206 cycles
AMD Athlon 64 FX-53                    L1     128KB  2-way   64B         3 cycles
(2.4 GHz, 64 bits, on-chip mem cntl)   L2     1MB    16-way  64B         13 cycles
                                       Mem                               125 cycles

Processors introduced in 2003

Memory Access Costs

Processor                    Level      Size   Assoc   Block size  Access time
Intel Core 2 Quad Q9450      L1         128KB  8-way   64B         3 cycles
(2.66 GHz, 64 bits)          shared L2  6MB    24-way  64B         15 cycles
                             Mem                                   229 cycles
Quad-core AMD Opteron 2360   L1         128KB  2-way   64B         3 cycles
(2.5 GHz, 64 bits)           L2         512KB  16-way  64B         7 cycles
                             shared L3  2MB    32-way  64B         19 cycles
                             Mem                                   356 cycles

Processors introduced in 2008
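
Cycle counts are easier to compare across processors once converted to time. The sketch below assumes the access times in the two tables are counted in core clock cycles at the listed frequency, so that time in ns = cycles / frequency in GHz; the function name `cycles_to_ns` is an illustrative choice.

```python
def cycles_to_ns(cycles, clock_ghz):
    """Convert a latency in clock cycles to nanoseconds (1 GHz = 1 cycle per ns)."""
    return cycles / clock_ghz


# processor: (clock in GHz, main-memory access time in cycles), from the tables
MEMORY_ACCESS = {
    "Intel Pentium IV Extreme Edition": (3.2, 206),
    "AMD Athlon 64 FX-53": (2.4, 125),
    "Intel Core 2 Quad Q9450": (2.66, 229),
    "Quad-core AMD Opteron 2360": (2.5, 356),
}

for name, (ghz, cycles) in MEMORY_ACCESS.items():
    print(f"{name}: {cycles_to_ns(cycles, ghz):.1f} ns to main memory")
```

Under that assumption, main memory stays tens to hundreds of nanoseconds away in both generations, while an L1 hit costs about a nanosecond.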

Hardware Caches

Motivated by the mismatch between processor and memory speed

Closer to the processor than the main memory
Smaller and faster than the main memory
Act as “attraction memory”: contain the values of main memory locations that were recently accessed (temporal locality)

Transfer between caches and main memory is performed in units called cache blocks/lines

Caches also contain the values of memory locations that are close to locations that were recently accessed (spatial locality)

Cache Architecture

[Figure: CPU, L1, L2, and memory in sequence, with a cache drawn as 2 ways, 6 sets; labels mark a cache line and the associativity.]

Miss types: capacity miss, conflict miss, cold miss

Cache line: ~32-128
Associativity: ~2-32

Cache Design Issues

Cache size and cache block size
Mapping: physical/virtual caches, associativity
Replacement algorithm: direct or LRU
Write policy: write through/write back

[Figure: cpu, cache, and main memory in sequence. Word transfers between cpu and cache, block transfers between cache and main memory.]
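
How size, block size, and associativity interact can be sketched with the usual set-associative placement rule, in which consecutive memory blocks go to consecutive sets. The slides do not spell out this rule, so treat the modulo mapping and the names `num_sets` and `locate` as assumptions. The example uses the 8KB, 4-way, 64B-block L1 from the Pentium IV table.

```python
def num_sets(cache_bytes, ways, line_size):
    """Number of sets in a set-associative cache."""
    return cache_bytes // (ways * line_size)


def locate(address, line_size, sets):
    """Split an address into (set index, tag, offset within the line)."""
    block = address // line_size
    return block % sets, block // sets, address % line_size


sets = num_sets(8 * 1024, ways=4, line_size=64)
first = locate(0x1234, 64, sets)
# An address exactly sets * line_size bytes away maps to the same set.
second = locate(0x1234 + sets * 64, 64, sets)
```

Addresses that share a set index but differ in tag compete for the same few ways; once more such blocks are live than there are ways, a conflict miss follows.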

Abstracting the Machine

Bare hardware provides a computation device
How to share this expensive piece of equipment between multiple users?
  Sign up during certain hours?
  Give program to an operator?
    They run it and give you the results
  Software to give the illusion of having it all to yourself while actually sharing it with others!

This software is the Operating System
Need hardware support to “virtualize” machine

Architecture Features for the OS

Next, we'll look at the mechanisms the hardware designers add to allow OS designers to abstract the basic machine in software
  Processor modes
  Exceptions
  Traps
  Interrupts

These require modifications to the basic fetch-decode-execute cycle in hardware

Processor Modes

OS code is stored in memory … von Neumann model, remember?
  What if a user program modifies OS code or data?

Introduce modes of operation
  Instructions can be executed in user mode or system mode
  A special register holds which mode the CPU is in
  Certain instructions can only be executed when in system mode
  Likewise, certain memory locations can only be written when in system mode
    Only OS code is executed in system mode
    Only OS can modify its memory
  The mode register can only be modified in system mode

Simple Protection Scheme

Addresses < 100 are reserved for OS use
Mode register provided
  zero = SYS = CPU is executing the OS (in system mode)
  one = USR = CPU is executing in user mode

Hardware does this check:
  On every fetch, if the mode bit is USR and the address is less than 100, then do not execute the instruction
  When accessing operands, if the mode bit is USR and the operand address is less than 100, do not execute the instruction
  Mode register can only be set if mode is SYS

Simple Protection Model

[Figure: memory cells 0-99 belong to the OS and cells 100 and up (100-106 shown) belong to the user. The CPU holds the program counter (shown with value 8), arithmetic units (+, -, *, /), logic units (<, >, !=), registers 0-31, and the mode register (shown with value 0).]

Fetch-decode-execute Revised

Fetch:
  if ((PC < 100) && (mode register == USR)) then
    Error! User tried to access the OS
  else
    fetch the instruction at the PC
Decode:
  if ((destination register == mode) && (mode register == USR)) then
    Error! User tried to set the mode register
  < more decoding >
Execute:
  if ((an operand < 100) && (mode register == USR)) then
    Error! User tried to access the OS
  else
    execute the instruction

Exceptions

What happens when a user program tries to access memory holding the operating system code or data? Answer: exceptions

An exception occurs when the CPU encounters an instruction that cannot be executed

Modify fetch-decode-execute loop to jump to a known location in the OS when an exception happens

Different errors jump to different places in the OS (are "vectored" in OS speak)

Fetch-decode-execute with Exceptions

Fetch:
  if ((PC < 100) && (mode register == USR)) then
    set the PC = 60
    set the mode = SYS
  fetch the instruction at the PC
Decode:
  if ((destination register == mode) && (mode register == USR)) then
    set the PC = 64
    set the mode = SYS
    goto fetch
  < more decoding >
Execute:
  < check the operands for a violation >

60 is the well-known entry point for a memory violation
64 is the well-known entry point for a mode register violation

Access Violations

Notice both instruction fetch from memory and data access must be checked
  Execute phase must check both operands
  Execute phase must check again when performing an indirect load

This is a very primitive memory protection scheme. We'll cover more complex virtual memory mechanisms and policies later in the course


Recovering from Exceptions

The OS can figure out what caused the exception from the entry point
But how can it figure out where in the user program the problem was?
Solution: add another register, the PC'
  When an exception occurs, save the current PC to PC' before loading the PC with a new value
OS can examine the PC' and perform some recovery action
  Stop user program and print an error message: error at address PC'
  Run a debugger


Fetch-decode-execute with Exceptions & Recovery

Fetch:
    if ((PC < 100) && (mode register == USR)) then
        set the PC' = PC
        set the PC = 60
        set the mode = SYS
Decode:
    if ((destination register == mode) && (mode register == USR)) then
        set the PC' = PC
        set the PC = 64
        set the mode = SYS
        goto fetch
    < more decoding >
Execute:
    …
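The fetch-stage check can be sketched as a toy Python model. This is only an illustration: the class `ToyCPU`, its `fetch` method and the constant names are assumptions made for the sketch, while the boundary at address 100 and the entry point 60 come from the slides.

```python
USR, SYS = "USR", "SYS"
OS_LIMIT = 100             # addresses below 100 belong to the OS (from the slides)
MEM_VIOLATION_ENTRY = 60   # entry point for a memory violation (from the slides)


class ToyCPU:
    """Hypothetical model of the fetch-stage protection check."""

    def __init__(self, pc, mode=USR):
        self.pc = pc
        self.saved_pc = None   # plays the role of PC'
        self.mode = mode

    def fetch(self, memory):
        # A user-mode fetch from OS memory raises a memory exception:
        # remember where we were, vector into the OS, switch to system mode.
        if self.pc < OS_LIMIT and self.mode == USR:
            self.saved_pc = self.pc
            self.pc = MEM_VIOLATION_ENTRY
            self.mode = SYS
        return memory[self.pc]


# Memory is modelled as a dict from address to a label for the instruction.
memory = {addr: f"instr@{addr}" for addr in range(200)}
cpu = ToyCPU(pc=42)            # a user program has jumped into OS memory
instruction = cpu.fetch(memory)
print(instruction, cpu.saved_pc, cpu.mode)
```

After the faulting fetch, the OS handler runs at address 60 in system mode and can read the saved PC to report where the user program went wrong.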


Traps

Now, we know what happens when a user program illegally tries to access OS code or data

How does a user program legitimately access OS services? Solution: Trap instruction

A trap is a special instruction that forces the PC to a known address and sets the mode to system mode

Unlike exceptions, traps carry some arguments to the OS

Foundation of the system call


Fetch-decode-execute with Traps

Fetch:
    if ((PC < 100) && (mode register == USR)) then
        < memory exception >
Decode:
    if (instruction is a trap) then
        set the PC' = PC
        set the PC = 68
        set the mode = SYS
        goto fetch
    if ((destination register == mode) && (mode register == USR)) then
        < mode exception >
Execute:
    …


Traps

How does the OS know which service the user program wants to invoke on a trap?
User program passes the OS a number that encodes which OS service is desired
This example machine could include the trap ID in the instruction itself:
  [ Trap opcode | Trap service ID ]
Most real CPUs have a convention for passing the trap ID in a set of registers
  E.g. the user program sets register 0 with the trap ID, then executes the trap instruction
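The register convention can be sketched as a dispatch table. The service numbers, the handler names and the use of register 1 for an argument are all assumptions made for this illustration; the slides only state that register 0 carries the trap ID.

```python
# Hypothetical service IDs; the slides do not define a numbering.
SYS_GETPID = 0
SYS_WRITE = 1


def sys_getpid(regs):
    return 1234                 # placeholder process id for the sketch


def sys_write(regs):
    # Assumed convention: register 1 holds the text to write.
    return len(regs[1])


TRAP_TABLE = {SYS_GETPID: sys_getpid, SYS_WRITE: sys_write}


def trap_handler(regs):
    """Code reached through the trap entry point: dispatch on register 0."""
    service = TRAP_TABLE.get(regs[0])
    if service is None:
        return -1               # unknown service ID: report an error
    return service(regs)


regs = [0] * 32                 # registers 0-31, as in the example machine
regs[0], regs[1] = SYS_WRITE, "hello"
result = trap_handler(regs)
print(result)
```

A table lookup keeps the single trap entry point small: adding a service means adding one table entry rather than another hardware entry point.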


Returning from a Trap

How to "get back" to user mode and the user's code after a trap?

Set the mode register = USR, then set the PC?
  But once the mode bit is set to user, the next fetch (still from OS code) raises an exception!
Set the PC, then set the mode bit?
  Jumps to "user-land" while still in kernel mode
Most machines have a "return from exception" (RTE) instruction
  A single hardware instruction:
    Sets the PC to PC'
    Sets the mode bit to user mode
  Traps and exceptions use the same mechanism (RTE)


Fetch-decode-execute with Traps

Fetch:
    if ((PC < 100) && (mode register == USR)) then
        < memory exception >
Decode:
    if (instruction is RTE) then
        set the PC = PC'
        set the mode = USR
        goto fetch
    if ((destination register == mode) && (mode register == USR)) then
        < mode exception >
Execute:
    …


Interrupts

How can we force the CPU back into system mode if the user program is off computing something?
Solution: Interrupts
An interrupt is an external event that causes the CPU to jump to a known address
Link an interrupt to a periodic clock
Modify fetch-decode-execute loop to check an external line set periodically by the clock


Simple Interrupt Model

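[Figure: a clock drives the CPU's interrupt line, and the CPU drives the clock's reset line. The CPU contains arithmetic units (+, -, *, /), logic units (<, >, !=), registers 0-31, a mode register, a program counter and PC'. Memory is divided into OS and user regions.]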


The Clock

The clock starts counting to 10 milliseconds
When 10 milliseconds elapse, the clock sets the interrupt line "high" (e.g. sets it to logic 1)
When the CPU toggles the reset line, the clock sets the interrupt line low and starts counting to 10 milliseconds again


Fetch-decode-execute with Interrupts

Fetch:
    if (clock interrupt line == 1) then
        set the PC' = PC
        set the PC = 72
        set the mode = SYS
        goto fetch
    if ((PC < 100) && (mode register == USR)) then
        < memory exception >
    fetch next instruction
Decode:
    if (instruction is a trap) then
        < trap exception >
    if ((destination register == mode) && (mode register == USR)) then
        < mode exception >
    < more decoding >
Execute:
    …
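A small simulation may help show how the clock lets the OS regain control. It is a sketch under simplifying assumptions: the clock counts executed instructions rather than milliseconds, and the names `Clock` and `run` are invented for the example.

```python
class Clock:
    """Raises the interrupt line every `period` ticks until it is reset."""

    def __init__(self, period):
        self.period = period
        self.count = 0
        self.interrupt_line = 0

    def tick(self):
        if self.interrupt_line == 0:
            self.count += 1
            if self.count == self.period:
                self.interrupt_line = 1     # set the line "high"

    def reset(self):
        self.interrupt_line = 0             # line goes low, counting restarts
        self.count = 0


def run(user_instructions, clock):
    """Count how often the OS regains control while a user program runs."""
    os_entries = 0
    executed = 0
    while executed < user_instructions:
        if clock.interrupt_line == 1:       # checked at the start of fetch
            os_entries += 1                 # the OS runs at entry point 72 ...
            clock.reset()                   # ... and toggles the reset line
            continue
        executed += 1                       # one user instruction completes
        clock.tick()
    return os_entries


entries = run(user_instructions=35, clock=Clock(period=10))
print(entries)
```

Even though the user program never traps, the OS gets control after every tenth instruction in this model.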


Entry Points

What are the "entry points" for our little example machine?
  60: memory access violation
  64: mode register violation
  68: user-initiated trap
  72: clock interrupt
Each entry point is typically a jump to some code block in the OS
All real OSes have a set of entry points for exceptions, traps and interrupts
  Sometimes they are combined and software has to figure out what happened.


Saving and Restoring Context

Recall the processor state: PC, PC', R0-R31, mode register
When an entry to the OS happens, we want to start executing the correct routine then return to the user program such that it can continue executing normally
  Can't just start using the registers in the OS!
Solution: save/restore the user context
  Use the OS memory to save all the CPU state
  Before returning to user, reload all the registers and then execute a return from exception instruction
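The save/restore step can be sketched as follows. The dictionary layout of the CPU state and the function names are assumptions made for illustration; real kernels do this in assembly at the entry point.

```python
def enter_os(cpu, handler):
    """Save the user context, run an OS routine, then restore the context."""
    saved_regs = list(cpu["regs"])      # copy of R0-R31, kept in OS memory
    saved_pc = cpu["pc_prime"]          # PC' holds the user's resume address
    handler(cpu)                        # the OS may clobber any register
    cpu["regs"] = saved_regs            # reload all the registers ...
    cpu["pc"] = saved_pc                # ... then "return from exception":
    cpu["mode"] = "USR"                 # PC = PC', mode = user


def noisy_handler(cpu):
    # Stand-in for OS work that freely uses the registers.
    cpu["regs"] = [99] * 32


cpu = {"regs": [0] * 32, "pc": 72, "pc_prime": 120, "mode": "SYS"}
cpu["regs"][5] = 7                      # a value the user program relies on
enter_os(cpu, noisy_handler)
print(cpu["regs"][5], cpu["pc"], cpu["mode"])
```

From the user program's point of view, the interruption is invisible: its registers and its next instruction are exactly as they were.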


Input and Output

How can humans get at the data?
How to load programs?
What happens if I turn the machine off?
Can I send the data to another machine?
Solution: add devices to perform these tasks
  Keyboards, mice, graphics
  Disk drives
  Network cards


A Simple I/O device

Network card has 2 registers:
  A store into the “transmit” register sends the byte over the wire
    Transmit is often written as TX (e.g. TX register)
  A load from the “receive” register reads the last byte that was read from the wire
    Receive is often written as RX
How does the CPU access these registers?
Solution: map them into the memory space
  An instruction that accesses memory cell 98 really accesses the transmit register instead of memory
  An instruction that accesses memory cell 99 really accesses the receive register
  These registers are said to be memory-mapped
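One way to picture memory mapping is a bus that decides, per address, whether a load or store goes to RAM or to the device. The classes `Bus` and `NetworkCard` are hypothetical, as is the behaviour of reading the transmit register or writing the receive register; only the addresses 98 and 99 come from the slides.

```python
TX_ADDR, RX_ADDR = 98, 99       # addresses from the slides


class NetworkCard:
    def __init__(self):
        self.wire = []          # bytes sent over the "wire"
        self.rx = 0             # last byte read from the wire


class Bus:
    """Routes loads and stores either to RAM or to the device registers."""

    def __init__(self, card, size=256):
        self.ram = [0] * size
        self.card = card

    def store(self, addr, value):
        if addr == TX_ADDR:
            self.card.wire.append(value & 0xFF)   # send one byte
        elif addr == RX_ADDR:
            pass    # assumption: stores to the receive register are ignored
        else:
            self.ram[addr] = value

    def load(self, addr):
        if addr == RX_ADDR:
            return self.card.rx
        if addr == TX_ADDR:
            return 0    # assumption: the transmit register reads as 0
        return self.ram[addr]


card = NetworkCard()
bus = Bus(card)
bus.store(TX_ADDR, 0x41)        # an ordinary store transmits a byte
card.rx = 0x42                  # pretend a byte arrived from the wire
received = bus.load(RX_ADDR)    # an ordinary load reads it
print(card.wire, received)
```

Because 98 and 99 are below 100, the existing memory-protection check already keeps user programs away from the device.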


Basic Network I/O

[Figure: the simple interrupt model (clock, CPU, memory, interrupt line, reset line) extended with a network card whose transmit and receive registers are mapped to memory addresses 98 and 99.]


Why Memory-Mapped Registers?

"Stealing" memory space for device registers has 2 functions:
  Allows protected access --- only the OS can access the device.
    User programs must trap into the OS to access I/O devices because of the normal protection mechanisms in the processor
    Why do we want to prevent direct access to devices by user programs?
  OS can control devices and move data to/from devices using regular load and store instructions
    No changes to the instruction set are required
    This is called programmed I/O


Status Registers

How does the OS know if a new byte has arrived?
How does the OS know when the last byte has been transmitted? (so it can send another one)
Solution: status registers
  A status register holds the state of the last I/O operation
  Our network card has 1 status register
    To transmit, the OS writes a byte into the TX register and sets bit 0 of the status register to 1. When the card has successfully transmitted the byte, it sets bit 0 of the status register back to 0.
    When the card receives a byte, it puts the byte in the RX register and sets bit 1 of the status register to 1. After the OS reads this data, it sets bit 1 of the status register back to 0.


Polled I/O

To Transmit:
    while (status register bit 0 == 1);      // wait for card to be ready
    TX register = data;
    status reg = status reg | 0x1;           // tell card to TX (set bit 0 to 1)

Naïve Receive:
    while (status register bit 1 != 1);      // wait for data to arrive
    data = RX register;
    status reg = status reg & ~0x2;          // tell card got data (clear bit 1 only)

Can't stall OS waiting to receive!
Solution: poll after the clock ticks
    if (status register bit 1 == 1) {
        data = RX register;
        status reg = status reg & ~0x2;      // clear bit 1 only
    }
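The status-register handshake can be exercised against a simulated card. Everything about the `Card` class (its `step` and `deliver` methods, the `sent` list) is an assumption standing in for hardware; the bit assignments follow the slides.

```python
TX_BUSY = 0x1    # bit 0: a byte is waiting to be transmitted
RX_READY = 0x2   # bit 1: a received byte is waiting in the RX register


class Card:
    """Hypothetical device model with TX, RX and status registers."""

    def __init__(self):
        self.status = 0
        self.tx = 0
        self.rx = 0
        self.sent = []

    def step(self):
        # Device side: finish a pending transmit and clear bit 0.
        if self.status & TX_BUSY:
            self.sent.append(self.tx)
            self.status &= ~TX_BUSY

    def deliver(self, byte):
        # Device side: a byte arrived from the wire.
        self.rx = byte
        self.status |= RX_READY


def transmit(card, byte):
    while card.status & TX_BUSY:    # wait for card to be ready
        card.step()                 # stands in for the hardware making progress
    card.tx = byte
    card.status |= TX_BUSY          # tell card to transmit


def poll_receive(card):
    """Non-blocking receive, meant to be called after each clock tick."""
    if card.status & RX_READY:
        data = card.rx
        card.status &= ~RX_READY    # tell card the data was consumed
        return data
    return None


card = Card()
transmit(card, 1)
transmit(card, 2)                   # waits until byte 1 has gone out
card.step()
nothing_yet = poll_receive(card)
card.deliver(9)
got = poll_receive(card)
print(card.sent, nothing_yet, got)
```

Clearing only bit 1 on receive leaves a pending transmit (bit 0) untouched, which matters once both directions are in use at the same time.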


Interrupt-driven I/O

Polling can waste many CPU cycles
  On transmit, CPU slows to the speed of the device
  Can't block on receive, so tie polling to clock, but wasted work if no RX data
Solution: use interrupts
  When network has data to receive, signal an interrupt
  When data is done transmitting, signal an interrupt


Polling vs. Interrupts

Why poll at all?
Interrupts have high overhead:
  Stop processor
  Figure out what caused interrupt
  Save user state
  Process request
Key factor is frequency of I/O vs. interrupt overhead


Direct Memory Access (DMA)

Problem with programmed I/O: CPU must load/store all the data into device registers.
The data is probably in memory anyway!
Solution: more hardware to allow the device to read and write memory just like the CPU
  Base + bound or base + count registers in the device
  Set base and count registers
  Set the start transmit register
  I/O device reads memory from base
  Interrupts when done
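The DMA sequence can be sketched with a device object that holds a reference to main memory. The class `DMADevice` and its attribute names are illustrative assumptions; the steps (set base and count, start, device reads memory, interrupt when done) follow the slide.

```python
class DMADevice:
    """Hypothetical DMA-capable device with base and count registers."""

    def __init__(self, memory):
        self.memory = memory            # the device can read main memory directly
        self.base = 0
        self.count = 0
        self.wire = []
        self.interrupt_pending = False

    def start_transmit(self):
        # The device copies `count` bytes starting at `base` on its own;
        # the CPU is free to do other work while this happens.
        self.wire.extend(self.memory[self.base:self.base + self.count])
        self.interrupt_pending = True   # signal an interrupt when done


memory = bytearray(b"....hello world....")
dev = DMADevice(memory)
dev.base, dev.count = 4, 11             # OS sets the base and count registers
dev.start_transmit()                    # OS sets the start transmit register
print(bytes(dev.wire), dev.interrupt_pending)
```

The CPU touches only the two small registers and the start flag, regardless of how many bytes are transferred.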


PIO vs. DMA

Overhead is lower for PIO than for DMA
  PIO is a check against the status register, then send or receive
  DMA must set up the base, count, check status, take an interrupt
DMA is more efficient at moving data
  PIO ties up the CPU for the entire length of the transfer
Size of the transfer becomes the key factor in when to use PIO vs. DMA
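A back-of-the-envelope cost model makes the trade-off concrete. All cycle counts below are made-up numbers chosen for the example, not measurements, and the model assumes the CPU cost of DMA does not depend on the transfer size.

```python
def pio_cycles(n_bytes, per_byte=20):
    # PIO: the CPU pays a status check plus a load/store for every byte.
    return n_bytes * per_byte


def dma_cycles(n_bytes, setup=400, interrupt=1000):
    # DMA: the CPU pays a fixed setup cost and one completion interrupt.
    return setup + interrupt


def break_even_size(per_byte=20, setup=400, interrupt=1000):
    """Smallest transfer size for which DMA costs the CPU less than PIO."""
    n = 1
    while pio_cycles(n, per_byte) <= dma_cycles(n, setup, interrupt):
        n += 1
    return n


threshold = break_even_size()
print(threshold)
```

With these assumed costs, transfers shorter than the threshold are cheaper with PIO and longer ones are cheaper with DMA; different hardware would shift the threshold but not the shape of the argument.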


Typical I/O Devices

Disk drives:
  Present the CPU with a linear array of fixed-sized blocks that are persistent across power cycles
Network cards:
  Allow the CPU to send and receive discrete units of data (packets) across a wire, fiber or radio
  Packet sizes of 64 bytes to 8 KB are typical
Graphics adapters:
  Present the CPU with a memory that is turned into pixels on a screen


Recap: the I/O design space

Polling vs. interrupts
  How does the device notify the processor that an event happened?
    Polling: device is passive, CPU must read/write a register
    Interrupt: device signals CPU via an interrupt
Programmed I/O vs. DMA
  How does the device send and receive data?
    Programmed I/O: CPU must use load/store into the device
    DMA: device reads and writes memory


Practical: How to boot?

How does a machine start running the operating system in the first place?
The process of starting the OS is called booting
  Sequence of hardware + software events form the boot protocol
Boot protocol in modern machines is a 3-stage process
  CPU starts executing from a fixed address
  Firmware loads the boot loader
  Boot loader loads the OS


Boot Protocol

(1) CPU is hard-wired to start executing from a known address in memory
  This memory address is typically mapped to solid-state persistent memory (e.g., ROM, EPROM, Flash)
(2) Persistent memory contains the “boot” code
  This kind of software is called firmware
  On x86, the starting address corresponds to the BIOS (basic input-output system) boot entry point
  This code reads 1 block from the disk drive. This block is loaded and then executed. This program is the boot loader.


Boot Protocol (cont)

(3) The boot loader can then load the rest of the operating system from disk. Note that at this point the OS still is not running
  The boot loader can know about multiple operating systems
  The boot loader can know about multiple versions of the OS


Why Have A Boot Protocol?

Why not just store the OS into persistent memory?
Separate the OS from the hardware
  Multiple OSes or different versions of the OS
Want to boot from different devices
  E.g. security via a network boot
OS is pretty big (tens of MBs). Rather not have it as firmware


Basic Computer Architecture

Single-CPU-chip computer:
• Single threaded
• Multithreaded
• Multi/many-core

[Figure: a CPU (core 1, core 2, … core n) and memory on a memory bus; disk and network interface on an I/O bus]


Caching Inside A 4-Core CPU

[Figure: four cores inside one CPU, each with a private L1 cache (coherence!), all sharing one L2 cache]


Multi-CPU-Chip Multiprocessors

[Figure: two CPUs, each with a cache (the last level of hardware caching), share a memory bus with memory; disk and network interface on an I/O bus]

Simple scheme (SMP): more than one CPU on the same bus
Memory is shared among CPUs -- cache coherence between LLCs
Bus contention increases -- does not scale
Alternative (non-bus) system interconnect -- complex and expensive
SMPs naturally support single-image operating systems


Cache-Coherent Shared-Memory: UMA

[Figure: four cores with snooping caches sharing a single memory]


CC-NUMA Multiprocessors

[Figure: two nodes, each with a CPU, cache, memory controller, memory, memory bus, I/O bus and disk, connected by a network]

• Non-uniform access to different memories
• Hardware allows remote memory accesses and maintains cache coherence
• Scalable interconnect -- more scalable than bus-based UMA systems
• Also naturally supports single-image operating systems
• Complex hardware coherence protocols


Multicomputers

Network of computers: “share-nothing” -- cheap
Distributed resources: difficult to program
  Message passing
  Distributed file system
Challenge: build efficient global abstraction in software

[Figure: two complete computers (CPU, cache, memory, memory bus, I/O bus, disk, network interface) connected only through a network]


OS Issues in Different Architectures


UMA Multiprocessors

[Figure: two CPUs, each with a cache, share a memory bus with memory; disk and network interface on an I/O bus]


UMA Multiprocessors: OS Issues

Processes
  How to divide processors among multiple processes?
    Time sharing vs. space sharing
Threads
  Synchronization mechanisms based on shared memory
  How to schedule threads of a single process on its allocated processors?
    Affinity scheduling?


CC-NUMA Multiprocessors

Hardware allows remote memory accesses and maintains cache coherence through protocol

[Figure: two nodes, each with a CPU, cache, memory controller, memory, memory bus, I/O bus and disk, connected by a network]


CC-NUMA Multiprocessors: OS Issues

Memory locality!!
  Remote memory access up to an order of magnitude more expensive than local access
Thread migration vs. page migration
Page replication
Affinity scheduling


Multicomputers

[Figure: two complete computers (CPU, cache, memory, memory bus, I/O bus, disk, network interface) connected only through a network]

Share-nothing


Multicomputers: OS Issues

Scheduling
  Node allocation? (CPU and memory allocated together)
  Process migration?
Software distributed shared-memory (Soft DSM)
Distributed file systems
Low-latency reliable communication


OS Structure


Traditional OS Structure

Monolithic/layered systems
  one/N layers all executed in “kernel-mode”
  good performance but rigid

[Figure: user processes issue system calls into the OS kernel (file system, memory system), which runs on the hardware]


Micro-kernel OS

Client-server model, IPC between clients and servers
The micro-kernel provides protected communication
OS functions implemented as user-level servers
Flexible but efficiency is the problem
Easy to extend for distributed systems

[Figure: a client process, a file server and a memory server run in user mode and communicate via IPC through the micro-kernel, which runs on the hardware]


Extensible OS kernel

User processes can load customized OS services into the kernel

Good performance and flexibility but protection and scalability become problems

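[Figure: user-mode processes run on top of an extensible kernel; one uses the default memory service, another uses its own memory service ("my memory service") loaded into the kernel; the kernel runs on the hardware]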


Exokernels

Kernel provides only a very low-level interface to the hardware
Idea is to allow an application to manage its resources (kernel ensures that resource is free and application has right to access it)
OS functionality implemented as user-level libraries to simplify programming

[Figure: user-level applications linked with OS libraries ask the exokernel to allocate resources; the exokernel runs on the hardware]


Some History …


Brief OS History

In the beginning, there really wasn’t an OS
  Program binaries were loaded using switches
  Interface included blinking lights (cool!)
Then came batch systems
  OS was implemented to transfer control from one job to the next
  OS was always resident in memory
    Resident monitor
  Operator provided machine/OS with a stream of programs with delimiters
    Typically, input device was a card reader, so delimiters were known as control cards


Spooling

CPUs were much faster than card readers and printers

Disks were invented – disks were much faster than card readers and printers

So, what do we do? Pipelining … what else?
  Read job 1 from cards to disk. Run job 1 while reading job 2 from cards to disk; save output of job 1 to disk. Print output of job 1 while running job 2 while reading job 3 from cards to disk. And so on …

This is known as spooling: Simultaneous Peripheral Operation On-Line

Can use multiple card readers and printers to keep up with CPU if needed

Improves both system throughput and response time


Multiprogramming

CPUs were still idle whenever the executing program needed to interact with a peripheral device
  E.g., reading more data from tape
Multiprogrammed batch systems were invented
  Load multiple programs onto disk at the same time (later into memory)

Switch from one job to another when the first job performs an I/O operation

Overlap I/O of one job with computation of another job

Peripherals have to be asynchronous

Have to know when I/O operation is done: interrupt vs. polling

Increase system throughput, possibly at the expense of response time


Time-Sharing

As you can imagine, batching was a big pain
  You submit a job, you twiddle your thumbs for a while, you get the output, see a bug, try to figure out what went wrong, resubmit the job, etc.
  Even running production programs was difficult in this environment
Technology got better: can now have terminals and support interactive interfaces
How to share a machine (remember machines were expensive back then) between multiple people and still maintain interactive user interface?
Time-sharing
  Connect multiple terminals to a single machine
  Multiplex machine between multiple users
  Machine has to be fast enough to give illusion that each user has own machine
  Multics was the first large time-sharing system – mid-1960’s


Parallel OS

Some applications comprise tasks that can be executed simultaneously
  Weather prediction, scientific simulations, recalculation of a spreadsheet

Can speed up execution by running these tasks in parallel on many processors

Need OS, compiler, and/or language support for dividing programs into multiple parallel activities

Need OS support for fast communication and synchronization

Many different parallel architectures

Main goal is performance


Real-Time OS

Some applications have time deadlines by when they have to complete certain tasks

Hard real-time system
  Medical imaging systems, industrial control systems, etc.
  Catastrophic failure if system misses a deadline
    What happens if collision avoidance software on an oil tanker does not detect another ship before the “turning or braking” distance of the tanker?
  Challenge lies in how to meet deadlines with minimal resource waste
Soft real-time system
  Multimedia applications
  May be annoying but is not catastrophic if a few deadlines are missed
  Challenge 1: how to meet most deadlines with minimal resource waste
  Challenge 2: how to load-shed if the system becomes overloaded


Distributed OS

Clustering
  Use multiple small machines to handle large service demands
    Cheaper than using one large machine
    Better potential for reliability, incremental scalability, and absolute scalability
Wide-area distributed systems
  Allow use of geographically distributed resources
    E.g., use of a local PC to access web services
    Don’t have to carry needed information with us
Need OS support for communication and sharing of distributed resources
  E.g., network file systems

Want performance (although speedup is not metric of interest here), high reliability, and use of diverse resources


Embedded OS

Pervasive computing
  Right now, cell phones and PDAs
  Future, computational elements everywhere
Characteristics
  Constrained resources: slow CPU, small memories, no disk, etc.
  What’s new about this? Isn’t this just like the old computers?
    Well no, because we want to execute more powerful programs than before
  How can we execute more powerful programs if our hardware is similar to old hardware?
    Use many, many of them
    Augment with services running on powerful machines
OS support for power management, mobility, resource discovery, etc.


Virtual Machines and Hypervisors

Popular in the 60’s and 70’s, vanished in the 80’s and 90’s

Idea: Partition a physical machine into a number of virtual machines
  Each virtual machine behaves as a separate computer
  Can support heterogeneous operating systems (called guest OSes)
  Provides performance isolation and fault isolation
  Facilitates virtual machine migration
  Facilitates server consolidation
Hypervisor or Virtual Machine Monitor
  Underlying software that manages multiple virtual machines


Virtual Machines and Hypervisors

Source: Silberschatz, Galvin, and Gagne 2005


Virtual Machines: Another Architecture

Source: Silberschatz, Galvin, and Gagne 2005


Backup Slides


The UNIX Time-sharing System

Features
  Time-sharing system
  Hierarchical file system
  System command language (shell)
  File-based device-independent I/O
Versions 1 & 2
  No multi-programming
  Ran on PDP-7,9,11


More History

Version 4
  Ran on PDP-11 (hardware costing < $40k)
  Took less than 2 man-years to code
  ~50KB code size (kernel)
  Written in C


File System

Ordinary files (uninterpreted)
Directories
  File of files
  Organized as a rooted tree
  Pathnames (relative and absolute)
  Contains links to parent, itself
  Multiple links to files can exist
    Link - hard (different name for the same file; modifications seen under both names, but erasing one does not affect the other) or symbolic (pointer to a file; erasing the file leaves pointers hanging)
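The difference between the two link types can be observed with Python's `os.link` and `os.symlink` on a Unix-like system. The file names and the helper `demo_links` are made up for this sketch, which assumes a file system that supports both link types.

```python
import os
import tempfile


def demo_links():
    """Create a file, a hard link and a symbolic link, then delete the file."""
    results = {}
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "file.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")

        with open(original, "w") as f:
            f.write("v1")
        os.link(original, hard)         # second name for the same file
        os.symlink(original, soft)      # pointer to the path of the file

        with open(hard, "a") as f:      # modify through the hard link
            f.write("+v2")
        with open(original) as f:
            results["seen_via_original"] = f.read()
        results["same_inode"] = (
            os.stat(original).st_ino == os.stat(hard).st_ino
        )

        os.remove(original)             # erase the original name
        with open(hard) as f:
            results["hard_after_remove"] = f.read()
        # The symbolic link still exists but now points at nothing.
        results["soft_dangling"] = (
            os.path.islink(soft) and not os.path.exists(soft)
        )
    return results


link_results = demo_links()
print(link_results)
```

The hard link keeps the data alive after the original name is removed, while the symbolic link is left hanging.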


File System (contd)

Special files
  Each I/O device associated with a special file
  To provide uniform naming and protection model
  Uniform I/O


Removable File Systems

Tree-structured file hierarchies

Mounted on existing space by using mount

No links between different file systems


Protection

User id uid marked on files

Ten protection bits
  nine - rwx permissions for user, group & other
  setuid bit is used to change user id

Super-user has special uid exempt from constraints on access
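The protection bits can be decoded with Python's `stat` module. The helper name `describe` and the two example modes are assumptions for illustration; note that present-day systems carry a few more mode bits (setgid, sticky) than the ten described on the slide.

```python
import stat


def describe(mode):
    """Return the ls-style permission string and whether setuid is set."""
    return stat.filemode(mode), bool(mode & stat.S_ISUID)


# An ordinary file: owner may read and write, group and others may only read.
plain = describe(stat.S_IFREG | 0o644)

# A setuid program: it runs with the user id of the file's owner.
setuid_prog = describe(stat.S_IFREG | 0o4755)

print(plain, setuid_prog)
```

In the permission string, the setuid bit shows up as an `s` in place of the owner's `x`.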


Uniform I/O Model

Basic system calls
  open, close, creat, read, write, seek
Streams of bytes, no records
No locks visible to the user
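These calls survive almost unchanged in today's Unix-like systems, and Python's `os` module exposes thin wrappers around them. The sketch below uses a temporary file; the helper name `roundtrip` and the file name are invented, and `os.lseek` plays the role of `seek`.

```python
import os
import tempfile


def roundtrip():
    """Write a byte stream, seek into it and read part of it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)   # creat + open
        try:
            os.write(fd, b"hello world")                    # write
            os.lseek(fd, 6, os.SEEK_SET)                    # seek to offset 6
            tail = os.read(fd, 5)                           # read 5 bytes
        finally:
            os.close(fd)                                    # close
    return tail


tail = roundtrip()
print(tail)
```

The file is just a stream of bytes: the kernel imposes no record structure, so any offset is a valid place to read from.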


File System Implementation

I-node
  contains a short description of one file
  direct, single-indirect and double-indirect pointers to disk blocks
I-list
  table of i-nodes, indexed by i-number
  pathname scanning to determine i-number
Allows simple and efficient fsck
Buffered data
Different disk write policies for data and metadata
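The pointer scheme bounds the largest file an i-node can describe. The calculation below is a sketch with hypothetical parameters (10 direct pointers, one single-indirect and one double-indirect pointer, 512-byte blocks, 4-byte block pointers); these are not the values of any particular UNIX version.

```python
def max_file_blocks(direct, block_size, pointer_size):
    """Blocks reachable from one i-node with one single- and one
    double-indirect pointer (an assumption of this sketch)."""
    per_block = block_size // pointer_size   # pointers that fit in one block
    single = per_block                       # one indirect block of pointers
    double = per_block * per_block           # a block of indirect blocks
    return direct + single + double


blocks = max_file_blocks(direct=10, block_size=512, pointer_size=4)
max_bytes = blocks * 512
print(blocks, max_bytes)
```

With these assumed numbers, almost all of the capacity comes from the double-indirect pointer, while small files are served entirely by the direct pointers without extra disk reads.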


Processes

Process
  Memory image, register values, status of open files etc.
  Memory image consists of text, data, and stack segments
To create new processes
  pid = fork()
  process splits into two independently executing processes (parent and child)
Pipes used for communication between related processes
exec(file, arg1, ..., argn) used to start another application
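The fork-and-pipe pattern can be tried on a Unix-like system through Python's `os` module. The helper name `ask_child` and the message are made up; `exec` is left out to keep the sketch short, so the child simply writes to the pipe and exits.

```python
import os


def ask_child():
    """Fork a child that sends one message to its parent through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()                     # process splits into parent and child
    if pid == 0:
        # Child: fork() returned 0. Write to the pipe and exit immediately.
        os.close(read_fd)
        os.write(write_fd, b"hello from child")
        os.close(write_fd)
        os._exit(0)
    # Parent: fork() returned the child's pid.
    os.close(write_fd)
    message = os.read(read_fd, 1024)    # a short message arrives in one read
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)      # collect the child's exit status
    return message, os.WEXITSTATUS(status)


message, exit_code = ask_child()
print(message, exit_code)
```

The pipe works because the child inherits the parent's open file descriptors at the moment of the fork, which is why pipes connect related processes.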


The Shell

Command-line interpreter
  cmd arg1 arg2 ... argn
i/o redirection
  <, >
filters & pipes
  ls | more
job control
  cmd &
simplified shell =>