148
Computer Organization: A Programmer's Perspective Bits, Bytes, Nibbles, Words and Strings

Computer Organization: A Programmer's Perspective

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Computer Organization: A Programmer's Perspective

Computer Organization:A Programmer's Perspective

Bits, Bytes, Nibbles, Words and Strings

Page 2: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 2

Topics

Why bits? Why 0/1? Basic terms: Bits, Bytes, Nibbles, Words Representing information as bits

Characters and strings Instructions (more on this when we look at assembly) Numbers

Bit-level manipulations Boolean algebra Expressing in C

Page 3: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 3

Everything is bits

Each bit is 0 or 1 By encoding/interpreting sets of bits in various ways

Computers represent numbers, sets, strings, etc…

Computers manipulate representations (instructions)

Why bits? Electronic implementation is easy Easy to store with bistable elements

Reliably transmitted on noisy and inaccurate wires

0.0V

0.2V

0.9V

1.1V

0 1 0

Page 4: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 4

Binary Representations Number representation (base 2)

Represent 1521310 as 111011011011012

Represent 1.5213 X 104 as 1.11011011011012 X 213

Represent 1.2010 as 1.0011001100110011[0011]…2

Instruction representation In AMD64 machine code

RETQ command: C3 (hex) = 110000112

MOV $0, %EAX: b8 (hex) = 101110002 [00000000 … ]

Page 5: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 5

Terms

Bit: Single binary digit, 0 or 1 Byte: 8 bits.

Smallest unit of memory used in modern computers Nibble (English: small bite): 4 bits

2 nibbles = 1 byte Word: 8-64 bits (1 to 8 bytes)

Depends on machine!

Page 6: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 6

Encoding Byte Values

Byte = 8 bits, nibble = 4 bits Binary 000000002 to 111111112

Decimal: 010 to 25510

Hexadecimal 0016 to FF16

Two nibblesBase 16 number representationUse characters ‘0’ to ‘9’ and ‘A’ to ‘F’Write FA1D37B16 in C as 0xFA1D37B

» Or 0xfa1d37b

Octal: 08 to 3778

Base 8, Not often usedWritten in C as '0256' (0 is zero)

0 0 00001 1 00012 2 00103 3 00114 4 01005 5 01016 6 01107 7 01118 8 10009 9 1001A 10 1010B 11 1011C 12 1100D 13 1101E 14 1110F 15 1111

HexDecim

al

Binary

Page 7: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 7

Bit level operations

Page 8: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 8

Boolean Algebra (George Boole, 19th century)

Developed by George Boole in 19th Century Algebraic representation of logic

Encode “True” as 1 and “False” as 0

And A&B = 1 when both A=1 and

B=1

Or A|B = 1 when either A=1 or

B=1

Not ~A = 1 when

A=0

Exclusive-Or (Xor) A^B = 1 when either A=1 or B=1, but not

both

Page 9: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 9

General Boolean Algebras

Operate on Bit Vectors Operations applied bitwise

All of the Properties of Boolean Algebra Apply

01101001& 01010101 01000001

01101001| 01010101 01111101

01101001^ 01010101 00111100

~ 01010101 10101010 01000001 01111101 00111100 10101010

Page 10: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 10

Example: Representing & Manipulating Sets

Representation Width w bit vector represents subsets of {0, …, w–1}

aj = 1 if j A∈

01101001 { 0, 3, 5, 6 }

76543210

01010101 { 0, 2, 4, 6 }

76543210

Operations & Intersection 01000001 { 0, 6 }

| Union 01111101 { 0, 2, 3, 4, 5, 6 }

^ Symmetric difference 00111100 { 2, 3, 4, 5 }

~ Complement 10101010 { 1, 3, 5, 7 }

Page 11: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 11

Bit-Level Operations in C

Operations &, |, ~, ^ Available in C Apply to any “integral” data type

long, int, short, char, unsigned

View arguments as bit vectors

Arguments applied bit-wise

Examples (Char data type) ~0x41 ➙ 0xBE

~010000012 ➙ 101111102

~0x00 ➙ 0xFF ~000000002 ➙ 111111112

0x69 & 0x55 ➙ 0x41 011010012 & 010101012 ➙ 010000012

0x69 | 0x55 ➙ 0x7D 011010012 | 010101012 ➙ 011111012

Page 12: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 12

Contrast: Logic Operations in C

Contrast to Logical Operators &&, ||, !

View 0 as “False”

Anything nonzero as “True”

Always return 0 or 1

Early termination

Examples (char data type) !0x41 ➙ 0x00 !0x00 ➙ 0x01 !!0x41 ➙ 0x01

0x69 && 0x55 ➙ 0x01 0x69 || 0x55 ➙ 0x01

p && *p (avoids null pointer access)

Page 13: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 13

Shift Operations

Left Shift: x << y

Shift bit-vector x left y positions (throw extras on left)

Fill with 0’s on right

Right Shift: x >> y

Shift bit-vector x right y positions (throw extras on right)

Logical shift: Fill with 0’s on left

Arithmetic shift: Replicate most significant bit on left

Undefined Behavior

Arithmetic or logical right shift? Up to compiler!

Often arithm. for signed, logical otherwise

Shift amount < 0 or ≥ word size

01100010Argument x

00010000<< 3

00011000Log. >> 2

00011000Arith. >> 2

10100010Argument x

00010000<< 3

00101000Log. >> 2

11101000Arith. >> 2

0001000000010000

0001100000011000

0001100000011000

00010000

00101000

11101000

00010000

00101000

11101000

Page 14: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 14

Cool Stuff with Xor

void funnyswap(int *x, int *y){ *x = *x ^ *y; /* #1 */ *y = *x ^ *y; /* #2 */ *x = *x ^ *y; /* #3 */}

Bitwise Xor is form of addition

With extra property that every value is its own additive inverse A ^ A = 0

BABegin

BA^B1

(A^B)^B = AA^B2

A(A^B)^A = B3

ABEnd

*y*x

Page 15: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 15

Some basic representations

C Data Type Typical 32-bit Typical 64-bit x86-64

char 1 1 1

short 2 2 2

int 4 4 4

long 4 8 8

float 4 4 4

double 8 8 8

long double − − 10/16

pointer 4 8 8

Page 16: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 16

Encoding IntegersEncoding Integers

short int x = 15213; short int y = -15213;

C short 2 bytes long

Sign Bit For 2’s complement, most significant bit indicates sign

0 for nonnegative 1 for negative

B2T (X ) xw1 2w1 xi 2

i

i0

w2

B2U(X ) xi 2i

i0

w1

Unsigned Two’s Complement

SignBit

Decimal Hex Binaryx 15213 3B 6D 00111011 01101101y -15213 C4 93 11000100 10010011

Page 17: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 17

Encoding Example (Cont.)Encoding Example (Cont.) x = 15213: 00111011 01101101 y = -15213: 11000100 10010011

Weight 15213 -152131 1 1 1 12 0 0 1 24 1 4 0 08 1 8 0 0

16 0 0 1 1632 1 32 0 064 1 64 0 0

128 0 0 1 128256 1 256 0 0512 1 512 0 0

1024 0 0 1 10242048 1 2048 0 04096 1 4096 0 08192 1 8192 0 0

16384 0 0 1 16384-32768 0 0 1 -32768

Sum 15213 -15213

Page 18: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 18

Numeric RangesNumeric Ranges

Unsigned Values UMin = 0

000…0 UMax = 2w – 1

111…1

Two’s Complement Values TMin = –2w–1

100…0 TMax = 2w–1 – 1

011…1

Other Values Minus 1

111…1

Decimal Hex BinaryUMax 65535 FF FF 11111111 11111111TMax 32767 7F FF 01111111 11111111TMin -32768 80 00 10000000 00000000-1 -1 FF FF 11111111 111111110 0 00 00 00000000 00000000

Values for W = 16

Page 19: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 19

Values for Different Word Sizes

Observations |TMin | =

TMax + 1 Asymmetric range

UMax = 2 * TMax + 1

C Programming #include <limits.h>

K&R App. B11

Declares constants, e.g., ULONG_MAX

LONG_MAX

LONG_MIN

Values platform-specific

W8 16 32 64

UMax 255 65,535 4,294,967,295 18,446,744,073,709,551,615TMax 127 32,767 2,147,483,647 9,223,372,036,854,775,807TMin -128 -32,768 -2,147,483,648 -9,223,372,036,854,775,808

Page 20: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 20

Unsigned & Signed Numeric ValuesEquivalence

Same encodings for nonnegative values

Uniqueness Every bit pattern represents

unique integer value Each representable integer has

unique bit encoding

Can Invert Mappings U2B(x) = B2U-1(x) T2B(x) = B2T-1(x)

Bit pattern for two’s comp integer

X B2T(X)B2U(X)0000 00001 10010 20011 30100 40101 50110 60111 7

–88

–79

–610

–511

–412

–313

–214

–115

1000

1001

1010

1011

1100

1101

1110

1111

0

1

2

3

4

5

6

7

Page 21: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 21

short int x = 15213; unsigned short int ux = (unsigned short) x; short int y = -15213; unsigned short int uy = (unsigned short) y;

Casting Signed to UnsignedCasting Signed to Unsigned

C Allows Conversions from Signed to Unsigned

Resulting Value No change in bit representation Nonnegative values unchanged

ux = 15213 Negative values change into (large) positive values

uy = 50323

Page 22: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 22

Relation Between Signed & UnsignedRelation Between Signed & Unsigned

uy = y + 2 * 32768= y + 65536

Weight -15213 503231 1 1 1 12 1 2 1 24 0 0 0 08 0 0 0 0

16 1 16 1 1632 0 0 0 064 0 0 0 0

128 1 128 1 128256 0 0 0 0512 0 0 0 0

1024 1 1024 1 10242048 0 0 0 04096 0 0 0 08192 0 0 0 0

16384 1 16384 1 1638432768 1 -32768 1 32768

Sum -15213 50323

Page 23: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 23

0

TMax

TMin

–1–2

0

UMaxUMax – 1

TMaxTMax + 1

2’s Complement Range

UnsignedRange

Conversion Visualized 2’s Comp. Unsigned

Ordering Inversion

Negative Big Positive

Page 24: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 24

Signed vs. Unsigned in CSigned vs. Unsigned in CConstants

By default are considered to be signed integers Unsigned if have “U” as suffix

0U, 4294967259U

Casting Explicit casting between signed & unsigned (U2T and T2U)

int tx, ty;unsigned ux, uy;tx = (int) ux;uy = (unsigned) ty;

Implicit casting also occurs via assignments, procedure callstx = ux;uy = ty;

Page 25: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 25

0 0U == unsigned

-1 0 < signed

-1 0U > unsigned

2147483647 -2147483648 > signed

2147483647U -2147483648 < unsigned

-1 -2 > signed

(unsigned) -1 -2 > unsigned

2147483647 2147483648U < unsigned

2147483647 (int) 2147483648U > signed

Casting SurprisesCasting SurprisesExpression Evaluation

If mix unsigned and signed in single expression, signed values implicitly cast to unsigned

Including comparison operations <, >, ==, <=, >=Examples for W = 32

Constant1 Constant2 Relation Evaluation0 0U-1 0-1 0U2147483647 -2147483648 2147483647U -2147483648 -1 -2 (unsigned)-1 -2 2147483647 2147483648U 2147483647 (int)2147483648U

Page 26: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 26

Remember!

Bit pattern does not change Bit pattern gets re-interpreted==> Unexpected effects

When mixed signed, unsigned => cast to unsigned

Page 27: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 27

Sign ExtensionSign Extension

Task: Given w-bit signed integer x Convert it to w+k-bit integer with same value

Rule: Make k copies of sign bit: X = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0

k copies of MSB• • •X

X • • • • • •

• • •

w

wk

Page 28: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 28

Justification For Sign ExtensionJustification For Sign ExtensionProve Correctness by Induction on k

Induction Step: extending by single bit maintains value

Key observation: –2w–1 = –2w +2w–1

Look at weight of upper bits: X –2w–1 xw–1 X –2w xw–1 + 2w–1 xw–1 = –2w–1 xw–1

- • • •X

X - + • • •

w+1

w

Page 29: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 29

Sign Extension ExampleSign Extension Example

Converting from smaller to larger integer data type C automatically performs sign extension

short int x = 15213; int ix = (int) x; short int y = -15213; int iy = (int) y;

Decimal Hex Binary

x 15213 3B 6D 00111011 01101101ix 15213 00 00 3B 6D 00000000 00000000 00111011 01101101y -15213 C4 93 11000100 10010011iy -15213 FF FF C4 93 11111111 11111111 11000100 10010011

Page 30: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 30

Visualizing (Mathematical) Integer Addition

Integer Addition 4-bit integers u, v

Compute true sum Add4(u , v)

Values increase linearly with u and v

Forms planar surface

Add4(u , v)

u

v0 2

46 8

1012

14

0

2

4

6

8

10

1214

0

4

8

12

16

20

24

28

32

Integer Addition

Page 31: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 31

Unsigned Addition

Standard Addition Function Ignores carry output

Implements Modular Arithmetics = UAddw(u , v) = u + v mod 2w

• • •

• • •

u

v+

• • •u + v

• • •

True Sum: w+1 bits

Operands: w bits

Discard Carry: w bits UAddw(u , v)

Page 32: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 32

Visualizing Unsigned Addition

0

2w

2w+1

UAdd4(u , v)

u

v

True Sum

Modular Sum

Overflow

Overflow

02

46

810

1214

0

2

4

6

8

10

1214

0

2

4

6

8

10

12

14

16

Wraps Around If true sum ≥ 2w

At most once

Page 33: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 33

Two’s Complement Addition

TAdd and UAdd have Identical Bit-Level Behavior Signed vs. unsigned addition in C:

int s, t, u, v;s = (int) ((unsigned) u + (unsigned) v);

t = u + v

Will give s == t

• • •

• • •

u

v+

• • •u + v

• • •

True Sum: w+1 bits

Operands: w bits

Discard Carry: w bits TAddw(u , v)

Page 34: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 34

TAdd Overflow

–2w –1

–2w

0

2w –1–1

2w–1

True Sum

TAdd Result

1 000…0

1 011…1

0 000…0

0 100…0

0 111…1

100…0

000…0

011…1

PosOver

NegOver

Functionality True sum requires w+1

bits

Drop off MSB

Treat remaining bits as 2’s comp. integer

Page 35: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 35

Visualizing 2’s Complement Addition

Values 4-bit two’s comp.

Range from -8 to +7

Wraps Around If sum 2w–1

Becomes negative

At most once

If sum < –2w–1

Becomes positive

At most once

TAdd4(u , v)

u

v

PosOver

NegOver

-8 -6 -4-2 0

24

6

-8

-6

-4

-2

0

2

46

-8

-6

-4

-2

0

2

4

6

8

Page 36: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 36

Multiplication

Goal: Computing Product of w-bit numbers x, y Either signed or unsigned

But, exact results can be bigger than w bits Unsigned: up to 2w bits

Result range: 0 ≤ x * y ≤ (2w – 1) 2 = 22w – 2w+1 + 1

Two’s complement min (negative): Up to 2w-1 bits

Result range: x * y ≥ (–2w–1)*(2w–1–1) = –22w–2 + 2w–1

Two’s complement max (positive): Up to 2w bits, but only for (TMinw)2

Result range: x * y ≤ (–2w–1) 2 = 22w–2

So, maintaining exact results… would need to keep expanding word size with each product computed

is done in software, if needed

e.g., by “arbitrary precision” arithmetic packages

Page 37: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 37

Unsigned Multiplication in C

Standard Multiplication Function Ignores high order w bits

Implements Modular ArithmeticUMultw(u , v) = u · v mod 2w

• • •

• • •

u

v*

• • •u · v

• • •

True Product: 2*w bits

Operands: w bits

Discard w bits: w bitsUMultw(u , v)

• • •

Page 38: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 38

Signed Multiplication in C

Standard Multiplication Function Ignores high order w bits

Some of which are different for signed vs. unsigned multiplication

Lower bits are the same

• • •

• • •

u

v*

• • •u · v

• • •

True Product: 2*w bits

Operands: w bits

Discard w bits: w bitsTMultw(u , v)

• • •

Page 39: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 39

Power-of-2 Multiply with Shift

Operation u << k gives u * 2k

Both signed and unsigned

Examples u << 3 == u * 8 (u << 5) – (u << 3) == u * 24 Most machines shift and add faster than multiply

Compiler generates this code automatically

• • •

0 0 1 0 0 0•••

u

2k*u · 2kTrue Product: w+k bits

Operands: w bits

Discard k bits: w bits UMultw(u , 2k)

•••

k

• • • 0 0 0•••

TMultw(u , 2k)0 0 0••••••

Page 40: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 40

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Cast cnt to unsigned

(repeatedly)

Page 41: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 41

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Easy to make mistakesfor (i = cnt-2; i >= 0; i--) a[i] += a[i+1];

What’s the bug?

Page 42: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 42

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Easy to make mistakesfor (i = cnt-2; i >= 0; i--) a[i] += a[i+1];

Loop foreveri never < 0

Page 43: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 43

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Should befor (i = cnt-2; i <cnt; i--) a[i] += a[i+1];

Why does this work?

Page 44: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 44

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Should befor (i = cnt-2; i <cnt; i--) a[i] += a[i+1];

Do Use For: Multiprecision or modular arithmetic Need extra bits' range, right up to limit of word size Represent sets

Page 45: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 45

C PuzzleC Puzzle Assume machine with 32 bit word size, two’s comp. integers TMin makes a good counterexample in many cases

x < 0 ((x*2) < 0)

ux >= 0

(x & 7) == 7 (x<<30) < 0

ux > -1

x > y -x < -y

x * x >= 0

x > 0 && y > 0 x + y > 0

x >= 0 -x <= 0

x <= 0 -x >= 0

int x = foo();

int y = bar();

unsigned ux = x;

unsigned uy = y;

Page 46: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 46

C Puzzle AnswersC Puzzle Answers

x < 0 ((x*2) < 0) False: TMin

ux >= 0 True: 0 = UMin

(x & 7) == 7 (x<<30) < 0 True: x1 = 1

ux > -1 False: 0

x > y -x < -y False: -1, TMin

x * x >= 0 False: x=65535

x > 0 && y > 0 x + y > 0 False: TMax, TMax

x >= 0 -x <= 0 True: –TMax < 0

x <= 0 -x >= 0 False: TMin

Assume machine with 32 bit word size, two’s comp. integers TMin makes a good counterexample in many cases

Page 47: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 47

Security Example: Impact of (un)signed

Similar to code found in FreeBSD’s implementation of getpeername There are legions of smart people trying to find vulnerabilities in

programs

/* Kernel memory region holding user-accessible data */#define KSIZE 1024char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */int copy_from_kernel(void *user_dest, int maxlen) { /* Byte count len is minimum of buffer size and maxlen */ int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len); return len;}

Page 48: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 48

Typical Usage

/* Kernel memory region holding user-accessible data */#define KSIZE 1024char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */int copy_from_kernel(void *user_dest, int maxlen) { /* Byte count len is minimum of buffer size and maxlen */ int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len); return len;}

#define MSIZE 528

void getstuff() { char mybuf[MSIZE]; copy_from_kernel(mybuf, MSIZE); printf(“%s\n”, mybuf);}

Page 49: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 49

Malicious Usage

/* Kernel memory region holding user-accessible data */#define KSIZE 1024char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */int copy_from_kernel(void *user_dest, int maxlen) { /* Byte count len is minimum of buffer size and maxlen */ int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len); return len;}

#define MSIZE 528

void getstuff() { char mybuf[MSIZE]; copy_from_kernel(mybuf, -MSIZE); . . .}

/* Declaration of library function memcpy */void *memcpy(void *dest, void *src, size_t n);

Page 50: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 50

Code Security Example #2

SUN XDR library Widely used library for transferring data between machines

void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size);

ele_src

malloc(ele_cnt * ele_size)

Page 51: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 51

XDR Code

void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size) { /* * Allocate buffer for ele_cnt objects, each of ele_size bytes * and copy from locations designated by ele_src */ void *result = malloc(ele_cnt * ele_size); if (result == NULL)

/* malloc failed */return NULL;

void *next = result; int i; for (i = 0; i < ele_cnt; i++) { /* Copy object i to destination */ memcpy(next, ele_src[i], ele_size);

/* Move pointer to next memory region */next += ele_size;

} return result;}

Page 52: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 52

XDR Vulnerability

What if: ele_cnt = 220 + 1

ele_size = 4096 = 212

Allocation on 32 bits? What happens to the ‘next’ pointer?

How can I make this function secure?

malloc(ele_cnt * ele_size)

Page 53: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 53

Page 54: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 54

Machine-Level Code Representation

Encode Program as Sequence of Instructions Arithmetic, logical, math operations Read or write memory Conditional branches, jumps

Different machines, different instructions Most code not binary compatible Alpha’s, Sun’s, ARM (tablets) use fix-length instructions

Reduced Instruction Set Computer (RISC) PC’s use variable length instructions

Complex Instruction Set Computer (CISC)

Page 55: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 55

Representing Instructionsint sum(int x, int y){ return x+y;}

Different CPUs use totally different instructions and encodings

00

00

30

42

Alpha sum

01

80

FA

6B

E0

08

81

C3

Sun sum

90

02

00

09

For this example, Alpha & Sun use two 4-byte instructions Use differing numbers of instructions

in other cases

PC uses 7 instructions with lengths 1, 2, and 3 bytes Same for NT and for Linux

E5

8B

55

89

PC sum

45

0C

03

45

08

89

EC

5D

C3

Page 56: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 56

Machines have Words

Imprecise definitions Nominal size of integer-valued data Sometimes size of address, memory bus width

Current desktop machines are 64 bits (8 bytes) Potentially address 1.8 X 1019 bytes 32-bit machines phasing out (but in phones, tablets)

Low-end use 8- or 16-bit words Machines support multiple data formats

Fractions or multiples of word size Always integral number of bytes

Page 57: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 57

Byte-Oriented Memory Organization Programs Refer to Virtual Addresses

Conceptually very large array of bytes Implemented with hierarchy of different memory types

SRAM, DRAM, disk In Unix and Windows NT, address space private to “process”

Program can clobber its own data, but not that of others You will see this again, in much more detail

Compiler + Run-Time System Control Allocation Where different program objects should be stored Multiple mechanisms: static, stack, and heap In any case, all allocation within single virtual address space You will see this again, in much more detail

Page 58: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 58

Word-Oriented Memory Organization

Addresses Specify Byte Locations Address of first byte in word Addresses of successive

words differ by 4 (32-bit) or 8 (64-bit)

0000

0001

0002

0003

0004

0005

0006

0007

0008

0009

0010

0011

32-bitWords

Bytes Addr.

0012

0013

0014

0015

64-bitWords

Addr =??

Addr =??

Addr =??

Addr =??

Addr =??

Addr =??

0000

0004

0008

0012

0000

0008

Page 59: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 59

Byte Ordering

How should bytes within multi-byte word be ordered in memory?

Conventions Sun’s, Mac’s are “Big Endian” machines

Least significant byte has highest address Alphas, PC’s are “Little Endian” machines

Least significant byte has lowest address

Page 60: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 60

Byte Ordering Example

Big Endian Least significant byte has highest address

Little Endian Least significant byte has lowest address

Example Variable x has 4-byte representation 0x01234567 Address given by &x is 0x100

0x100 0x101 0x102 0x103

01 23 45 67

0x100 0x101 0x102 0x103

67 45 23 01

Big Endian

Little Endian

01 23 45 67

67 45 23 01

Page 61: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 61

Reading Byte-Reversed Listings

Disassembly Text representation of binary machine code Generated by program that reads the machine code

Example Fragment

Address Instruction Code Assembly Rendition 8048365: 5b pop %ebx 8048366: 81 c3 ab 12 00 00 add $0x12ab,%ebx 804836c: 83 bb 28 00 00 00 00 cmpl $0x0,0x28(%ebx)

Deciphering Numbers Value: 0x12ab

Pad to 4 bytes: 0x000012ab

Split into bytes: 00 00 12 ab

Reverse: ab 12 00 00

Page 62: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 62

Examining Data Representations

Code to Print Byte Representation of Data Casting pointer to unsigned char * creates byte array

typedef unsigned char *pointer;

void show_bytes(pointer start, int len){ int i; for (i = 0; i < len; i++) printf("0x%p\t0x%.2x\n", start+i, start[i]); printf("\n");}

Printf directives:%p: Print pointer%x: Print Hexadecimal

Page 63: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 63

show_bytes Execution Example

int a = 15213;

printf("int a = 15213;\n");

show_bytes((pointer) &a, sizeof(int));

Result (Linux):

int a = 15213;

0x11ffffcb8 0x6d

0x11ffffcb9 0x3b

0x11ffffcba 0x00

0x11ffffcbb 0x00

Page 64: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 64

Representing Integersint A = 15213;int B = -15213;long int C = 15213;

Decimal: 15213

Binary: 0011 1011 0110 1101

Hex: 3 B 6 D

6D

3B

00

00

Alpha A

3B

6D

00

00

Sun A

93

C4

FF

FF

Alpha B

C4

93

FF

FF

Sun B

Two’s complement representation

00

00

00

00

6D

3B

00

00

Alpha C

3B

6D

00

00

Sun C

6D

3B

00

00

ia32 C

Page 65: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 65

Representing Pointersint B = -15213;int *P = &B;

Alpha Address

Hex: ... 0 1 F F F F F C A 0

Binary: 0000 0001 1111 1111 1111 1111 1111 1100 1010 0000

01

00

00

00

A0

FC

FF

FF

Alpha P

Sun Address

Hex: E F F F F B 2 C Binary: 1110 1111 1111 1111 1111 1011 0010 1100

Different compilers & machines assign different locations to objects

FB

2C

EF

FF

Sun P

FF

BF

D4

F8

Linux PLinux Address

Hex: B F F F F 8 D 4 Binary: 1011 1111 1111 1111 1111 1000 1101 0100

Page 66: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 66

Characters and Strings

Page 67: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 67

Representing a character Single byte sufficient for small codes

In C: char c; But interpretation of value depends on chosen code Fixed-length code:

Baudot, ITA2, USTTY: 5 bits BCD (binary coded decimal, used in early computers): 6 bits ASCII – defines 128 characters (7 bits) ISO 8859, EBCDIC: 8 bits

Variable-length codes UTF-8: 8-32 bits UTF-16: 16-32 bits

Wide chars: chars more than 8 bits.

Page 68: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 68

Example: ASCII

Page 69: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 69

Representing Strings (in C) Represented by array of 1-byte characters

Each character encoded in ASCII formatStandard 7-bit encoding of character setCharacter “0” has code 0x30

» Digit i has code 0x30+i Other encodings exist (e.g., for Hebrew) String are null-terminated

Final character = 0x00Array length must be strlen()+1.

Alpha S Sun S

32

31

31

35

33

00

32

31

31

35

33

00

char S[6] = "15213";

Page 70: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 70

Representing Strings (in C) Represented by array of 1-byte characters

Each character encoded in ASCII formatStandard 7-bit encoding of character setCharacter “0” has code 0x30

» Digit i has code 0x30+i Other encodings exist (e.g., for Hebrew) String are null-terminated

Final character = 0x00Array length must be strlen()+1.

Text files generally platform independent But: Different conventions of line termination character(s)! But: Different default encodings (depend on locale)

Alpha S Sun S

32

31

31

35

33

00

32

31

31

35

33

00

char S[6] = "15213";

Page 71: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 71

Representing Strings (Cont'd) Strings in Pascal (a.k.a P-Strings)

Represented by array of characters+1 First cell holds length, no need for null-termination Length known in O(1)!

Think about strcat(), strlen(), strcpy(), ... Size of string limited by size of array cell value

p-String

35

32

5

31

31

33

Page 72: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 72

Representing Strings (Cont'd) Strings in Pascal (a.k.a P-Strings)

Represented by array of characters+1 First cell holds length, no need for null-termination Length known in O(1)!

Think about strcat(), strlen(), strcpy(), ... Size of string limited by size of array cell value

c-strings and p-strings can be used with wide chars, but: big/little endian now matters (each cell multiple bytes) No longer built-in, instead platform/library specific

p-String

35

32

5

31

31

33

Page 73: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 73

Main Points

Boolean Algebra is Mathematical Basis Basic form encodes “false” as 0, “true” as 1 General form like bit-level operations in C Good for representing & manipulating sets

It’s All About Bits & Bytes Numbers, text, programs: They are all collections of bits!

Different Machines/Languages: Different Conventions Representations Word size Byte ordering

Page 74: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 74

שאלות?

Page 75: Computer Organization: A Programmer's Perspective

Computer Organization:A Programmer's Perspective

Bits, Bytes, Nibbles, Words and Strings

Page 76: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 2

Topics Why bits? Why 0/1? Basic terms: Bits, Bytes, Nibbles, Words Representing information as bits

Characters and strings Instructions (more on this when we look at assembly) Numbers

Bit-level manipulations Boolean algebra Expressing in C

Page 77: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 3

Everything is bits

Each bit is 0 or 1 By encoding/interpreting sets of bits in various ways

Computers represent numbers, sets, strings, etc…

Computers manipulate representations (instructions)

Why bits? Electronic implementation is easy Easy to store with bistable elements

Reliably transmitted on noisy and inaccurate wires

0.0V

0.2V

0.9V

1.1V

0 1 0

Page 78: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 4

Binary Representations Number representation (base 2)

Represent 1521310 as 111011011011012

Represent 1.5213 X 104 as 1.11011011011012 X 213

Represent 1.2010 as 1.0011001100110011[0011]…2

Instruction representation In AMD64 machine code

RETQ command: C3 (hex) = 110000112

MOV $0, %EAX: b8 (hex) = 101110002 [00000000 … ]

Page 79: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 5

Terms

Bit: Single binary digit, 0 or 1 Byte: 8 bits.

Smallest unit of memory used in modern computers Nibble (English: small bite): 4 bits

2 nibbles = 1 byte Word: 8-64 bits (1 to 8 bytes)

Depends on machine!

Page 80: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 6

Encoding Byte Values

Byte = 8 bits, nibble = 4 bits Binary 000000002 to 111111112

Decimal: 010 to 25510

Hexadecimal 0016 to FF16

Two nibblesBase 16 number representationUse characters ‘0’ to ‘9’ and ‘A’ to ‘F’Write FA1D37B16 in C as 0xFA1D37B

» Or 0xfa1d37b

Octal: 08 to 3778

Base 8, Not often usedWritten in C as '0256' (0 is zero)

0 0 00001 1 00012 2 00103 3 00114 4 01005 5 01016 6 01107 7 01118 8 10009 9 1001A 10 1010B 11 1011C 12 1100D 13 1101E 14 1110F 15 1111

HexDecim

al

Binary

Page 81: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 7

Bit level operations

Page 82: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 8

Boolean Algebra (George Boole, 19th century)

Developed by George Boole in 19th Century Algebraic representation of logic

Encode “True” as 1 and “False” as 0

And A&B = 1 when both A=1 and

B=1

Or A|B = 1 when either A=1 or

B=1

Not ~A = 1 when

A=0

Exclusive-Or (Xor) A^B = 1 when either A=1 or B=1, but not

both

Page 83: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 9

General Boolean Algebras

Operate on Bit Vectors Operations applied bitwise

All of the Properties of Boolean Algebra Apply

01101001& 01010101 01000001

01101001| 01010101 01111101

01101001^ 01010101 00111100

~ 01010101 10101010 01000001 01111101 00111100 10101010

Page 84: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 10

Example: Representing & Manipulating Sets

Representation Width w bit vector represents subsets of {0, …, w–1}

aj = 1 if j A∈

01101001 { 0, 3, 5, 6 }

76543210

01010101 { 0, 2, 4, 6 }

76543210

Operations & Intersection 01000001 { 0, 6 }

| Union 01111101 { 0, 2, 3, 4, 5, 6 }

^ Symmetric difference 00111100 { 2, 3, 4, 5 }

~ Complement 10101010 { 1, 3, 5, 7 }

Page 85: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 11

Bit-Level Operations in C

Operations &, |, ~, ^ Available in C Apply to any “integral” data type

long, int, short, char, unsigned

View arguments as bit vectors

Arguments applied bit-wise

Examples (Char data type) ~0x41 ➙ 0xBE

~010000012 ➙ 101111102

~0x00 ➙ 0xFF ~000000002 ➙ 111111112

0x69 & 0x55 ➙ 0x41 011010012 & 010101012 ➙ 010000012

0x69 | 0x55 ➙ 0x7D 011010012 | 010101012 ➙ 011111012

Page 86: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 12

Contrast: Logic Operations in C

Contrast to Logical Operators &&, ||, !

View 0 as “False”

Anything nonzero as “True”

Always return 0 or 1

Early termination

Examples (char data type) !0x41 ➙ 0x00 !0x00 ➙ 0x01 !!0x41 ➙ 0x01

0x69 && 0x55 ➙ 0x01 0x69 || 0x55 ➙ 0x01

p && *p (avoids null pointer access)

Page 87: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 13

Shift Operations

Left Shift: x << y Shift bit-vector x left y positions (throw extras on left)

Fill with 0’s on right

Right Shift: x >> y Shift bit-vector x right y positions (throw extras on right)

Logical shift: Fill with 0’s on left

Arithmetic shift: Replicate most significant bit on left

Undefined Behavior

Arithmetic or logical right shift? Up to compiler!

Often arithm. for signed, logical otherwise

Shift amount < 0 or ≥ word size

01100010Argument x

00010000<< 3

00011000Log. >> 2

00011000Arith. >> 2

10100010Argument x

00010000<< 3

00101000Log. >> 2

11101000Arith. >> 2

0001000000010000

0001100000011000

0001100000011000

00010000

00101000

11101000

00010000

00101000

11101000

Page 88: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 14

Cool Stuff with Xor

void funnyswap(int *x, int *y){ *x = *x ^ *y; /* #1 */ *y = *x ^ *y; /* #2 */ *x = *x ^ *y; /* #3 */}

Bitwise Xor is form of addition

With extra property that every value is its own additive inverse A ^ A = 0

BABegin

BA^B1

(A^B)^B = AA^B2

A(A^B)^A = B3

ABEnd

*y*x

Page 89: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 15

Some basic representations

C Data Type Typical 32-bit Typical 64-bit x86-64

char 1 1 1

short 2 2 2

int 4 4 4

long 4 8 8

float 4 4 4

double 8 8 8

long double − − 10/16

pointer 4 8 8

Page 90: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 16

Encoding IntegersEncoding Integers

short int x = 15213; short int y = -15213;

C short 2 bytes long

Sign Bit For 2’s complement, most significant bit indicates sign

0 for nonnegative 1 for negative

B2T (X ) xw1 2w1 xi 2

i

i0

w2

B2U(X ) xi 2i

i0

w1

Unsigned Two’s Complement

SignBit

Decimal Hex Binaryx 15213 3B 6D 00111011 01101101y -15213 C4 93 11000100 10010011

Page 91: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 17

Encoding Example (Cont.)Encoding Example (Cont.) x = 15213: 00111011 01101101 y = -15213: 11000100 10010011

Weight 15213 -152131 1 1 1 12 0 0 1 24 1 4 0 08 1 8 0 0

16 0 0 1 1632 1 32 0 064 1 64 0 0

128 0 0 1 128256 1 256 0 0512 1 512 0 0

1024 0 0 1 10242048 1 2048 0 04096 1 4096 0 08192 1 8192 0 0

16384 0 0 1 16384-32768 0 0 1 -32768

Sum 15213 -15213

Page 92: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 18

Numeric RangesNumeric Ranges

Unsigned Values UMin = 0

000…0 UMax = 2w – 1

111…1

Two’s Complement Values TMin = –2w–1

100…0 TMax = 2w–1 – 1

011…1

Other Values Minus 1

111…1

Decimal Hex BinaryUMax 65535 FF FF 11111111 11111111TMax 32767 7F FF 01111111 11111111TMin -32768 80 00 10000000 00000000-1 -1 FF FF 11111111 111111110 0 00 00 00000000 00000000

Values for W = 16

Page 93: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 19

Values for Different Word Sizes

Observations |TMin | =

TMax + 1 Asymmetric range

UMax = 2 * TMax + 1

C Programming #include <limits.h>

K&R App. B11

Declares constants, e.g., ULONG_MAX LONG_MAX LONG_MIN

Values platform-specific

W8 16 32 64

UMax 255 65,535 4,294,967,295 18,446,744,073,709,551,615TMax 127 32,767 2,147,483,647 9,223,372,036,854,775,807TMin -128 -32,768 -2,147,483,648 -9,223,372,036,854,775,808

Page 94: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 20

Unsigned & Signed Numeric ValuesEquivalence

Same encodings for nonnegative values

Uniqueness Every bit pattern represents

unique integer value Each representable integer has

unique bit encoding

Can Invert Mappings U2B(x) = B2U-1(x) T2B(x) = B2T-1(x)

Bit pattern for two’s comp integer

X B2T(X)B2U(X)0000 00001 10010 20011 30100 40101 50110 60111 7

–88

–79

–610

–511

–412

–313

–214

–115

1000

1001

1010

1011

1100

1101

1110

1111

0

1

2

3

4

5

6

7

Page 95: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 21

short int x = 15213; unsigned short int ux = (unsigned short) x; short int y = -15213; unsigned short int uy = (unsigned short) y;

Casting Signed to UnsignedCasting Signed to Unsigned

C Allows Conversions from Signed to Unsigned

Resulting Value No change in bit representation Nonnegative values unchanged

ux = 15213 Negative values change into (large) positive values

uy = 50323

Page 96: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 22

Relation Between Signed & UnsignedRelation Between Signed & Unsigned

uy = y + 2 * 32768= y + 65536

Weight -15213 503231 1 1 1 12 1 2 1 24 0 0 0 08 0 0 0 0

16 1 16 1 1632 0 0 0 064 0 0 0 0

128 1 128 1 128256 0 0 0 0512 0 0 0 0

1024 1 1024 1 10242048 0 0 0 04096 0 0 0 08192 0 0 0 0

16384 1 16384 1 1638432768 1 -32768 1 32768

Sum -15213 50323

Page 97: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 23

0

TMax

TMin

–1–2

0

UMaxUMax – 1

TMaxTMax + 1

2’s Complement Range

UnsignedRange

Conversion Visualized 2’s Comp. Unsigned

Ordering Inversion

Negative Big Positive

Page 98: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 24

Signed vs. Unsigned in CSigned vs. Unsigned in CConstants

By default are considered to be signed integers Unsigned if have “U” as suffix

0U, 4294967259U

Casting Explicit casting between signed & unsigned (U2T and T2U)

int tx, ty;unsigned ux, uy;tx = (int) ux;uy = (unsigned) ty;

Implicit casting also occurs via assignments, procedure callstx = ux;uy = ty;

Page 99: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 25

0 0U == unsigned

-1 0 < signed

-1 0U > unsigned

2147483647 -2147483648 > signed

2147483647U -2147483648 < unsigned

-1 -2 > signed

(unsigned) -1 -2 > unsigned

2147483647 2147483648U < unsigned

2147483647 (int) 2147483648U > signed

Casting SurprisesCasting SurprisesExpression Evaluation

If mix unsigned and signed in single expression, signed values implicitly cast to unsigned

Including comparison operations <, >, ==, <=, >=Examples for W = 32

Constant1 Constant2 Relation Evaluation0 0U-1 0

-1 0U2147483647 -2147483648 2147483647U -2147483648

-1 -2 (unsigned)-1 -2

2147483647 2147483648U 2147483647 (int)2147483648U

Page 100: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 26

Remember!

Bit pattern does not change Bit pattern gets re-interpreted==> Unexpected effects

When mixed signed, unsigned => cast to unsigned

Page 101: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 27

Sign ExtensionSign Extension

Task: Given w-bit signed integer x Convert it to w+k-bit integer with same value

Rule: Make k copies of sign bit: X = xw–1 ,…, xw–1 , xw–1 , xw–2 ,…, x0

k copies of MSB• • •X

X • • • • • •

• • •

w

wk

Page 102: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 28

Justification For Sign ExtensionJustification For Sign ExtensionProve Correctness by Induction on k

Induction Step: extending by single bit maintains value

Key observation: –2w–1 = –2w +2w–1

Look at weight of upper bits: X –2w–1 xw–1 X –2w xw–1 + 2w–1 xw–1 = –2w–1 xw–1

- • • •X

X - + • • •

w+1

w

Page 103: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 29

Sign Extension ExampleSign Extension Example

Converting from smaller to larger integer data type C automatically performs sign extension

short int x = 15213; int ix = (int) x; short int y = -15213; int iy = (int) y;

Decimal Hex Binary

x 15213 3B 6D 00111011 01101101ix 15213 00 00 3B 6D 00000000 00000000 00111011 01101101y -15213 C4 93 11000100 10010011iy -15213 FF FF C4 93 11111111 11111111 11000100 10010011

Page 104: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 30

Visualizing (Mathematical) Integer Addition

Integer Addition 4-bit integers u, v

Compute true sum Add4(u , v)

Values increase linearly with u and v

Forms planar surface

Add4(u , v)

u

v0

2 46

810

1214

0

2

4

6

8

10

1214

0

4

8

12

16

20

24

28

32

Integer Addition

Page 105: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 31

Unsigned Addition

Standard Addition Function Ignores carry output

Implements Modular Arithmetics = UAddw(u , v) = u + v mod 2w

• • •

• • •

u

v+

• • •u + v

• • •

True Sum: w+1 bits

Operands: w bits

Discard Carry: w bits UAddw(u , v)

Page 106: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 32

Visualizing Unsigned Addition

0

2w

2w+1

UAdd4(u , v)

u

v

True Sum

Modular Sum

Overflow

Overflow

0 24

6 810

1214

0

2

4

6

8

10

1214

0

2

4

6

8

10

12

14

16

Wraps Around If true sum ≥ 2w

At most once

Page 107: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 33

Two’s Complement Addition

TAdd and UAdd have Identical Bit-Level Behavior Signed vs. unsigned addition in C:

int s, t, u, v;s = (int) ((unsigned) u + (unsigned) v);

t = u + v

Will give s == t

• • •

• • •

u

v+

• • •u + v

• • •

True Sum: w+1 bits

Operands: w bits

Discard Carry: w bits TAddw(u , v)

Page 108: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 34

TAdd Overflow

–2w –1

–2w

0

2w –1–1

2w–1

True Sum

TAdd Result

1 000…0

1 011…1

0 000…0

0 100…0

0 111…1

100…0

000…0

011…1

PosOver

NegOver

Functionality True sum requires w+1

bits

Drop off MSB

Treat remaining bits as 2’s comp. integer

Page 109: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 35

Visualizing 2’s Complement Addition

Values 4-bit two’s comp.

Range from -8 to +7

Wraps Around If sum 2w–1

Becomes negative

At most once

If sum < –2w–1

Becomes positive

At most once

TAdd4(u , v)

u

v

PosOver

NegOver

-8 -6 -4-2 0

24

6

-8

-6

-4

-2

0

2

46

-8

-6

-4

-2

0

2

4

6

8

Page 110: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 36

Multiplication

Goal: Computing Product of w-bit numbers x, y Either signed or unsigned

But, exact results can be bigger than w bits Unsigned: up to 2w bits

Result range: 0 ≤ x * y ≤ (2w – 1) 2 = 22w – 2w+1 + 1

Two’s complement min (negative): Up to 2w-1 bits

Result range: x * y ≥ (–2w–1)*(2w–1–1) = –22w–2 + 2w–1

Two’s complement max (positive): Up to 2w bits, but only for (TMinw)2

Result range: x * y ≤ (–2w–1) 2 = 22w–2

So, maintaining exact results… would need to keep expanding word size with each product computed

is done in software, if needed

e.g., by “arbitrary precision” arithmetic packages

Page 111: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 37

Unsigned Multiplication in C

Standard Multiplication Function Ignores high order w bits

Implements Modular ArithmeticUMultw(u , v) = u · v mod 2w

• • •

• • •

u

v*

• • •u · v

• • •

True Product: 2*w bits

Operands: w bits

Discard w bits: w bitsUMultw(u , v)

• • •

Page 112: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 38

Signed Multiplication in C

Standard Multiplication Function Ignores high order w bits

Some of which are different for signed vs. unsigned multiplication

Lower bits are the same

• • •

• • •

u

v*

• • •u · v

• • •

True Product: 2*w bits

Operands: w bits

Discard w bits: w bitsTMultw(u , v)

• • •

Page 113: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 39

Power-of-2 Multiply with Shift

Operation u << k gives u * 2k

Both signed and unsigned

Examples u << 3 == u * 8 (u << 5) – (u << 3) == u * 24 Most machines shift and add faster than multiply

Compiler generates this code automatically

• • •

0 0 1 0 0 0•••

u

2k*u · 2kTrue Product: w+k bits

Operands: w bits

Discard k bits: w bits UMultw(u , 2k)

•••

k

• • • 0 0 0•••

TMultw(u , 2k)0 0 0••••••

Page 114: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 40

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Cast cnt to unsigned

(repeatedly)

First code bad because may cause repeated casting of cnt to unsigned with every passage of loop

Second code will loop forever. When unsigned i=0, do i-- which is a big number

Page 115: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 41

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Easy to make mistakesfor (i = cnt-2; i >= 0; i--) a[i] += a[i+1];

What’s the bug?

First code bad because may cause repeated casting of cnt to unsigned with every passage of loop

Second code will loop forever. When unsigned i=0, do i-- which is a big number

Page 116: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 42

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Easy to make mistakesfor (i = cnt-2; i >= 0; i--) a[i] += a[i+1];

Loop foreveri never < 0

First code bad because may cause repeated casting of cnt to unsigned with every passage of loop

Second code will loop forever. When unsigned i=0, do i-- which is a big number

Page 117: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 43

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Should befor (i = cnt-2; i <cnt; i--) a[i] += a[i+1];

Why does this work?

First code bad because may cause repeated casting of cnt to unsigned with every passage of loop

Second code will loop forever. When unsigned i=0, do i-- which is a big number

Page 118: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 44

Why Should I Use Unsigned?Why Should I Use Unsigned?Don’t Use Just Because Number Nonzero

C compilers on some machines generate less efficient codeunsigned i; int cnt;for (i = 1; i < cnt; i++) a[i] += a[i-1];

Should befor (i = cnt-2; i <cnt; i--) a[i] += a[i+1];

Do Use For: Multiprecision or modular arithmetic Need extra bits' range, right up to limit of word size Represent sets

First code bad because may cause repeated casting of cnt to unsigned with every passage of loop

Second code will loop forever. When unsigned i=0, do i-- which is a big number

Page 119: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 45

C PuzzleC Puzzle Assume machine with 32 bit word size, two’s comp. integers TMin makes a good counterexample in many cases

x < 0 ((x*2) < 0)

ux >= 0

(x & 7) == 7 (x<<30) < 0

ux > -1

x > y -x < -y

x * x >= 0

x > 0 && y > 0 x + y > 0

x >= 0 -x <= 0

x <= 0 -x >= 0

int x = foo();

int y = bar();

unsigned ux = x;

unsigned uy = y;

Page 120: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 46

C Puzzle AnswersC Puzzle Answers

x < 0 ((x*2) < 0) False: TMin

ux >= 0 True: 0 = UMin

(x & 7) == 7 (x<<30) < 0 True: x1 = 1

ux > -1 False: 0

x > y -x < -y False: -1, TMin

x * x >= 0 False: x=65535

x > 0 && y > 0 x + y > 0 False: TMax, TMax

x >= 0 -x <= 0 True: –TMax < 0

x <= 0 -x >= 0 False: TMin

Assume machine with 32 bit word size, two’s comp. integers TMin makes a good counterexample in many cases

Page 121: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 47

Security Example: Impact of (un)signed

Similar to code found in FreeBSD’s implementation of getpeername There are legions of smart people trying to find vulnerabilities in

programs

/* Kernel memory region holding user-accessible data */#define KSIZE 1024char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */int copy_from_kernel(void *user_dest, int maxlen) { /* Byte count len is minimum of buffer size and maxlen */ int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len); return len;}

Page 122: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 48

Typical Usage

/* Kernel memory region holding user-accessible data */#define KSIZE 1024char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */int copy_from_kernel(void *user_dest, int maxlen) { /* Byte count len is minimum of buffer size and maxlen */ int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len); return len;}

#define MSIZE 528

void getstuff() { char mybuf[MSIZE]; copy_from_kernel(mybuf, MSIZE); printf(“%s\n”, mybuf);}

Page 123: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 49

Malicious Usage

/* Kernel memory region holding user-accessible data */#define KSIZE 1024char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */int copy_from_kernel(void *user_dest, int maxlen) { /* Byte count len is minimum of buffer size and maxlen */ int len = KSIZE < maxlen ? KSIZE : maxlen; memcpy(user_dest, kbuf, len); return len;}

#define MSIZE 528

void getstuff() { char mybuf[MSIZE]; copy_from_kernel(mybuf, -MSIZE); . . .}

/* Declaration of library function memcpy */void *memcpy(void *dest, void *src, size_t n);

Page 124: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 50

Code Security Example #2

SUN XDR library Widely used library for transferring data between machines

void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size);

ele_src

malloc(ele_cnt * ele_size)

Page 125: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 51

XDR Code

void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size) { /* * Allocate buffer for ele_cnt objects, each of ele_size bytes * and copy from locations designated by ele_src */ void *result = malloc(ele_cnt * ele_size); if (result == NULL)

/* malloc failed */return NULL;

void *next = result; int i; for (i = 0; i < ele_cnt; i++) { /* Copy object i to destination */ memcpy(next, ele_src[i], ele_size);

/* Move pointer to next memory region */next += ele_size;

} return result;}

Page 126: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 52

XDR Vulnerability

What if: ele_cnt = 220 + 1

ele_size = 4096 = 212

Allocation on 32 bits? What happens to the ‘next’ pointer?

How can I make this function secure?

malloc(ele_cnt * ele_size)

Page 127: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 53

Page 128: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 54

Machine-Level Code Representation

Encode Program as Sequence of Instructions Arithmetic, logical, math operations Read or write memory Conditional branches, jumps

Different machines, different instructions Most code not binary compatible Alpha’s, Sun’s, ARM (tablets) use fix-length instructions

Reduced Instruction Set Computer (RISC) PC’s use variable length instructions

Complex Instruction Set Computer (CISC)

Page 129: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 55

Representing Instructionsint sum(int x, int y){ return x+y;}

Different CPUs use totally different instructions and encodings

00

00

30

42

Alpha sum

01

80

FA

6B

E0

08

81

C3

Sun sum

90

02

00

09

For this example, Alpha & Sun use two 4-byte instructions Use differing numbers of instructions

in other cases

PC uses 7 instructions with lengths 1, 2, and 3 bytes Same for NT and for Linux

E5

8B

55

89

PC sum

45

0C

03

45

08

89

EC

5D

C3

Page 130: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 56

Machines have Words

Imprecise definitions Nominal size of integer-valued data Sometimes size of address, memory bus width

Current desktop machines are 64 bits (8 bytes) Potentially address 1.8 X 1019 bytes 32-bit machines phasing out (but in phones, tablets)

Low-end use 8- or 16-bit words Machines support multiple data formats

Fractions or multiples of word size Always integral number of bytes

Page 131: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 57

Byte-Oriented Memory Organization Programs Refer to Virtual Addresses

Conceptually very large array of bytes Implemented with hierarchy of different memory types

SRAM, DRAM, disk In Unix and Windows NT, address space private to “process”

Program can clobber its own data, but not that of others You will see this again, in much more detail

Compiler + Run-Time System Control Allocation Where different program objects should be stored Multiple mechanisms: static, stack, and heap In any case, all allocation within single virtual address space You will see this again, in much more detail

Page 132: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 58

Word-Oriented Memory Organization

Addresses Specify Byte Locations Address of first byte in word Addresses of successive

words differ by 4 (32-bit) or 8 (64-bit)

0000

0001

0002

0003

0004

0005

0006

0007

0008

0009

0010

0011

32-bitWords

Bytes Addr.

0012

0013

0014

0015

64-bitWords

Addr =??

Addr =??

Addr =??

Addr =??

Addr =??

Addr =??

0000

0004

0008

0012

0000

0008

Page 133: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 59

Byte Ordering

How should bytes within multi-byte word be ordered in memory?

Conventions Sun’s, Mac’s are “Big Endian” machines

Least significant byte has highest address Alphas, PC’s are “Little Endian” machines

Least significant byte has lowest address

Page 134: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 60

Byte Ordering Example

Big Endian Least significant byte has highest address

Little Endian Least significant byte has lowest address

Example Variable x has 4-byte representation 0x01234567 Address given by &x is 0x100

0x100 0x101 0x102 0x103

01 23 45 67

0x100 0x101 0x102 0x103

67 45 23 01

Big Endian

Little Endian

01 23 45 67

67 45 23 01

Page 135: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 61

Reading Byte-Reversed Listings

Disassembly Text representation of binary machine code Generated by program that reads the machine code

Example Fragment

Address Instruction Code Assembly Rendition 8048365: 5b pop %ebx 8048366: 81 c3 ab 12 00 00 add $0x12ab,%ebx 804836c: 83 bb 28 00 00 00 00 cmpl $0x0,0x28(%ebx)

Deciphering Numbers Value: 0x12ab

Pad to 4 bytes: 0x000012ab

Split into bytes: 00 00 12 ab

Reverse: ab 12 00 00

Page 136: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 62

Examining Data Representations

Code to Print Byte Representation of Data Casting pointer to unsigned char * creates byte array

typedef unsigned char *pointer;

void show_bytes(pointer start, int len){ int i; for (i = 0; i < len; i++) printf("0x%p\t0x%.2x\n", start+i, start[i]); printf("\n");}

Printf directives:%p: Print pointer%x: Print Hexadecimal

Page 137: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 63

show_bytes Execution Example

int a = 15213;

printf("int a = 15213;\n");

show_bytes((pointer) &a, sizeof(int));

Result (Linux):

int a = 15213;

0x11ffffcb8 0x6d

0x11ffffcb9 0x3b

0x11ffffcba 0x00

0x11ffffcbb 0x00

Page 138: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 64

Representing Integersint A = 15213;int B = -15213;long int C = 15213;

Decimal: 15213

Binary: 0011 1011 0110 1101

Hex: 3 B 6 D

6D

3B

00

00

Alpha A

3B

6D

00

00

Sun A

93

C4

FF

FF

Alpha B

C4

93

FF

FF

Sun B

Two’s complement representation

00

00

00

00

6D

3B

00

00

Alpha C

3B

6D

00

00

Sun C

6D

3B

00

00

ia32 C

Page 139: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 65

Representing Pointersint B = -15213;int *P = &B;

Alpha Address

Hex: ... 0 1 F F F F F C A 0

Binary: 0000 0001 1111 1111 1111 1111 1111 1100 1010 0000

01

00

00

00

A0

FC

FF

FF

Alpha P

Sun Address

Hex: E F F F F B 2 C Binary: 1110 1111 1111 1111 1111 1011 0010 1100

Different compilers & machines assign different locations to objects

FB

2C

EF

FF

Sun P

FF

BF

D4

F8

Linux PLinux Address

Hex: B F F F F 8 D 4 Binary: 1011 1111 1111 1111 1111 1000 1101 0100

Page 140: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 66

Characters and Strings

Page 141: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 67

Representing a character Single byte sufficient for small codes

In C: char c; But interpretation of value depends on chosen code Fixed-length code:

Baudot, ITA2, USTTY: 5 bits BCD (binary coded decimal, used in early computers): 6 bits ASCII – defines 128 characters (7 bits) ISO 8859, EBCDIC: 8 bits

Variable-length codes UTF-8: 8-32 bits UTF-16: 16-32 bits

Wide chars: chars more than 8 bits.

Page 142: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 68

Example: ASCII

Page 143: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 69

Representing Strings (in C) Represented by array of 1-byte characters

Each character encoded in ASCII formatStandard 7-bit encoding of character setCharacter “0” has code 0x30

» Digit i has code 0x30+i Other encodings exist (e.g., for Hebrew) String are null-terminated

Final character = 0x00Array length must be strlen()+1.

Alpha S Sun S

32

31

31

35

33

00

32

31

31

35

33

00

char S[6] = "15213";

Page 144: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 70

Representing Strings (in C) Represented by array of 1-byte characters

Each character encoded in ASCII formatStandard 7-bit encoding of character setCharacter “0” has code 0x30

» Digit i has code 0x30+i Other encodings exist (e.g., for Hebrew) String are null-terminated

Final character = 0x00Array length must be strlen()+1.

Text files generally platform independent But: Different conventions of line termination character(s)! But: Different default encodings (depend on locale)

Alpha S Sun S

32

31

31

35

33

00

32

31

31

35

33

00

char S[6] = "15213";

Page 145: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 71

Representing Strings (Cont'd) Strings in Pascal (a.k.a P-Strings)

Represented by array of characters+1 First cell holds length, no need for null-termination Length known in O(1)!

Think about strcat(), strlen(), strcpy(), ... Size of string limited by size of array cell value

p-String

35

32

5

31

31

33

Page 146: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 72

Representing Strings (Cont'd) Strings in Pascal (a.k.a P-Strings)

Represented by array of characters+1 First cell holds length, no need for null-termination Length known in O(1)!

Think about strcat(), strlen(), strcpy(), ... Size of string limited by size of array cell value

c-strings and p-strings can be used with wide chars, but: big/little endian now matters (each cell multiple bytes) No longer built-in, instead platform/library specific

p-String

35

32

5

31

31

33

Page 147: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 73

Main Points

Boolean Algebra is Mathematical Basis Basic form encodes “false” as 0, “true” as 1 General form like bit-level operations in C Good for representing & manipulating sets

It’s All About Bits & Bytes Numbers, text, programs: They are all collections of bits!

Different Machines/Languages: Different Conventions Representations Word size Byte ordering

Page 148: Computer Organization: A Programmer's Perspective

Computer Organization: A Programmer's Perspective Based on class notes by Bryant and O'Hallaron 74

שאלות?