Data Representation and Architecture Modelling Revision
Slide 2
Binary system
1. Conversion
   1. Convert decimal to binary
   2. Convert binary to decimal and hexadecimal
2. Integer representation
   1. Unsigned notation
   2. Signed notation
   3. Excess notation
   4. Two's complement
3. Advantages of using two's complement
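As a quick refresher, the conversions listed above can be checked in Python (a sketch; the 8-bit width is an arbitrary choice for illustration):

```python
def to_binary(n, bits=8):
    """Convert a non-negative decimal integer to a binary string."""
    return format(n, f"0{bits}b")

def twos_complement(n, bits=8):
    """Two's-complement encoding of a (possibly negative) integer."""
    return format(n & ((1 << bits) - 1), f"0{bits}b")

print(to_binary(13))         # 00001101 (decimal to binary)
print(int("1101", 2))        # 13 (binary to decimal)
print(hex(int("1101", 2)))   # 0xd (binary to hexadecimal)
print(twos_complement(-13))  # 11110011
```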
Slide 3
Floating point representation
1. What decimal floating point number is represented by the following 32 bits (single precision format)? Show your workings.
   1 10000111 000 1010 0000 0000 0000 0000 (sign | exponent | mantissa)
2. What is the range of negative numbers in this representation?
3. Define negative overflow and negative underflow in this representation.
Slide 4
Solution
1. Method: the sign bit is 1, so the number is negative.
   Biased exponent = 10000111 = 128 + 4 + 2 + 1 = 135; real exponent = 135 - 127 = 8.
   Normalized mantissa = 000 1010 0000 0000 0000 0000, so the real mantissa = 1.000101.
   Final value = -(1.000101)_2 x 2^8 = -(100010100)_2 = -(256 + 16 + 4) = -276.
2. Negative range: -(2 - 2^-23) x 2^127 to -2^-127.
3. Negative overflow and underflow:
   Negative overflow: value less than -(2 - 2^-23) x 2^127.
   Negative underflow: -2^-127 < value < 0.
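The working can be cross-checked with Python's struct module: the bit pattern 1 10000111 00010100000000000000000 is 0xC38A0000, and decoding it as an IEEE 754 single-precision value gives -276.0:

```python
import struct

bits = 0b1_10000111_00010100000000000000000  # sign | exponent | mantissa

# Decode the full 32-bit pattern as a big-endian single-precision float.
value = struct.unpack(">f", bits.to_bytes(4, "big"))[0]

sign = bits >> 31                  # 1 -> negative
biased_exp = (bits >> 23) & 0xFF   # 135
real_exp = biased_exp - 127        # 8
print(sign, biased_exp, real_exp, value)  # 1 135 8 -276.0
```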
Slide 5
CPU
CPU registers: PC, IR, AC, MAR, MBR
System bus: data bus, address bus, and control bus
Pipelining: role of pipelining; pipelining hazards (control hazards, data hazards, and structural hazards)
What is the disadvantage of using a very long (many-stage) pipeline?
Slide 6
Exercise
Suppose you have designed a processor implementation whose five pipeline stages take the following amounts of time: IF (instruction fetch) = 20 ns, ID (instruction decode) = 10 ns, EX (execution) = 20 ns, MEM (memory operation) = 35 ns and WB (write back) = 10 ns.
(a) What is the minimum clock period for which your processor functions properly?
(b) What should be redesigned first to improve this processor's performance?
(c) Assume this processor is redesigned with 50 pipeline stages. Is it true to say that the new processor is 10 times faster than the previous design with 5 pipeline stages?
Slide 7
Solution
(a) The minimum clock period is the time of the longest stage: the MEM stage, which takes 35 ns.
(b) The MEM stage should be redesigned first, to reduce the clock period.
(c) Probably not. A longer pipeline can be faster due to its higher clock rate, but it is unlikely that the clock rate becomes 10 times faster, because of uneven pipeline stage times and pipeline-register overheads. Furthermore, longer pipelines tend to make data and control hazards require longer stalls. A higher clock-rate processor is also likely to be more power-hungry, roughly in proportion to the increase in clock speed.
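The answer to (a) can be sketched numerically: the minimum clock period is the maximum stage time, and the pipelined per-instruction latency is one period per stage (pipeline-register overheads are ignored in this sketch):

```python
stages = {"IF": 20, "ID": 10, "EX": 20, "MEM": 35, "WB": 10}  # times in ns

clock_period = max(stages.values())   # the clock is limited by the slowest stage
bottleneck = max(stages, key=stages.get)

print(clock_period)                   # 35 ns
print(bottleneck)                     # MEM -> redesign this stage first
print(clock_period * len(stages))     # 175 ns latency for one instruction
```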
Slide 8
Question 2 An instruction requires four stages to execute:
stage 1 (instruction fetch) requires 30 ns, stage 2 (instruction
decode) = 9 ns, stage 3 (instruction execute) = 20 ns and stage 4
(store results) = 10 ns. An instruction must proceed through the
stages in sequence. 1) What is the minimum asynchronous time for
any single instruction to complete? 2) We want to set this up as a
pipelined operation. How many stages should we have and at what
rate should we clock the pipeline?
Slide 9
Hints
1) The minimum time is the time it takes to execute all 4 stages of an instruction in sequence.
2) We have 4 natural stages given and no information on how we might be able to further subdivide them, so we use 4 stages in our pipeline. Clock rate? Either use the longest stage, or use a time that closely matches the shortest stage but divides integrally into the other stages. DISCUSS EACH CASE.
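A sketch of the two hints: the asynchronous (unpipelined) time is the sum of the stage times, and a candidate pipeline clock is either the longest stage (30 ns) or a shorter period (here 10 ns, an assumed choice) that divides evenly into the longer stages:

```python
stages = [30, 9, 20, 10]  # ns: fetch, decode, execute, store

async_time = sum(stages)
print(async_time)         # 69 ns for one instruction, unpipelined

# Case 1: clock at the longest stage -> 4-stage pipeline at 30 ns per cycle.
print(max(stages))        # 30

# Case 2 (assumption): a 10 ns clock; the 30 ns and 20 ns stages are
# subdivided into 3 and 2 sub-stages respectively -> 7 stages total.
print(30 // 10 + 1 + 20 // 10 + 10 // 10)  # 7 stages at 10 ns each
```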
Slide 10
Question 3 The pipeline for these instructions runs with a 100
MHz clock with the following stages: instruction fetch = 2 clocks,
instruction decode = 1 clock, fetch operands = 1 clock, execute = 2
clocks, and store result = 1 clock.
Slide 11
Hints for Question 3
1) The longest stage takes two cycles, hence we complete one instruction every 2 cycles. What is the rate then?
2) The operand fetch unit must wait until the prior instruction stores its result before it can retrieve one of its operands (e.g. operand fetch for instruction 2 must wait until store result for instruction 1 completes). As a result, things begin backing up in the pipeline, and we produce one instruction output only every 4 cycles.
Slide 12
No dependencies
Execute one instruction every 2 cycles. Clock rate?
Slide 13
Dependency
From the table we still begin fetching instructions every two cycles. However, the operand fetch for instruction 2 must wait until store result for instruction 1 completes (a wait of another 2 cycles). Hence, the rate?
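The rates asked for in slides 12 and 13 follow from the 100 MHz clock (a sketch; MIPS here means millions of instructions per second):

```python
clock_mhz = 100  # 100 MHz clock, i.e. 100 million cycles per second

# No dependencies: one instruction completes every 2 cycles
# (the longest stages take 2 cycles each).
print(clock_mhz / 2)  # 50.0 MIPS

# With the operand-fetch dependency, output backs up to one
# instruction every 4 cycles.
print(clock_mhz / 4)  # 25.0 MIPS
```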
Slide 14
Memories
CPU registers
Cache memory
Main memory (electronic memory)
Magnetic memory (hard drive)
Optical memory
Magnetic tape

Why is cache memory needed?
The CPU is slowed down by the main memory.
When a program references a memory location, it is likely to reference that same memory location again soon.
A memory location that is near a recently referenced location is more likely to be referenced than a memory location that is far away.
Slide 17
Cache memory
Resides between the CPU and the main memory
Operates at a speed near to that of the CPU
Data is exchanged between the CPU and main memory through the cache memory
Cache memory uses locality principles to enhance computer performance: the temporal locality principle and the spatial locality principle.
Slide 18
Temporal locality principle When a program references a memory
location, it is likely to reference that same memory location again
soon. Cache memory keeps a record of recently used data.
Slide 19
Spatial locality principle A memory location that is near a
recently referenced location is more likely to be referenced than a
memory location that is far away. Cache memory copies not only the recently referenced memory location but also its nearby locations.
Associative Mapped Cache
Any main memory block can be mapped into any cache slot. To keep track of which of the 2^27 possible blocks is in each slot, a 27-bit tag field is added to each slot.
Slide 22
Associative Mapped Cache
A valid bit is needed to indicate whether or not the slot holds a line that belongs to the program being executed. A dirty bit keeps track of whether or not a line has been modified while it is in the cache.
Slide 23
Associative Mapped Cache The mapping from main memory blocks to
cache slots is performed by partitioning an address into fields.
For each slot, if the valid bit is 1, then the tag field of the
referenced address is compared with the tag field of the slot.
Slide 24
Associative Mapped Cache
Consider how an access to the memory location (A035F014)_16 is mapped to the cache. If the addressed word is in the cache, it will be found in word (14)_16 of a slot that has a tag of (501AF80)_16, which is made up of the 27 most significant bits of the address.
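The field split above can be reproduced with shifts and masks (a sketch, assuming a 32-bit address, a 5-bit word field and a 27-bit tag):

```python
addr = 0xA035F014

word = addr & 0x1F  # low 5 bits select the word within the line
tag = addr >> 5     # remaining 27 bits form the tag

print(hex(word))    # 0x14
print(hex(tag))     # 0x501af80
```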
Slide 25
Associative Mapped Cache Advantages Any main memory block can
be placed into any cache slot. Regardless of how irregular the data
and program references are, if a slot is available for the block,
it can be stored in the cache.
Slide 26
Associative Mapped Cache
Disadvantages: Considerable hardware overhead is needed for cache bookkeeping. There must be a mechanism for searching the tag memory in parallel.
Slide 27
Direct-Mapped Cache
Each cache slot corresponds to a specific set of main memory blocks. In our example we have 2^27 memory blocks and 2^14 cache slots. A total of 2^27 / 2^14 = 2^13 main memory blocks can be mapped onto each cache slot.
Slide 28
Direct-Mapped Cache The 32-bit main memory address is
partitioned into a 13-bit tag field, followed by a 14-bit slot
field, followed by a five-bit word field.
Slide 29
Direct-Mapped Cache
When a reference is made to a main memory address, the slot field identifies in which of the 2^14 slots the block will be found. If the valid bit is 1, then the tag field of the referenced address is compared with the tag field of the slot.
Slide 30
Direct-Mapped Cache
Consider how an access to memory location (A035F014)_16 is mapped to the cache. If the addressed word is in the cache, it will be found in word (14)_16 of slot (2F80)_16, which will have a tag of (1406)_16.
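The direct-mapped split (13-bit tag, 14-bit slot, 5-bit word) can be checked the same way:

```python
addr = 0xA035F014

word = addr & 0x1F           # low 5 bits
slot = (addr >> 5) & 0x3FFF  # next 14 bits
tag = addr >> 19             # top 13 bits

print(hex(word), hex(slot), hex(tag))  # 0x14 0x2f80 0x1406
```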
Slide 31
Direct-Mapped Cache
Advantages: Simple and inexpensive. The tag memory is much smaller than in an associative mapped cache. No need for an associative search, since the slot field is used to direct the comparison to a single slot.
Slide 32
Direct-Mapped Cache
Disadvantages: Fixed location for a given memory block. If a program repeatedly accesses 2 blocks that map to the same slot, cache misses are very high.
Slide 33
Set-Associative Mapped Cache
Combines the simplicity of direct mapping with the flexibility of associative mapping. For this example, two slots make up a set. Since there are 2^14 slots in the cache, there are 2^14 / 2 = 2^13 sets.
Slide 34
Set-Associative Mapped Cache When an address is mapped to a
set, the direct mapping scheme is used, and then associative
mapping is used within a set.
Slide 35
Set-Associative Mapped Cache
The format for an address has 13 bits in the set field, which identifies the set in which the addressed word will be found, five bits for the word field, and a 14-bit tag field.
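For the set-associative format (14-bit tag, 13-bit set, 5-bit word) the same address splits as follows; the set and tag values in the comments are computed here, not quoted from the slides:

```python
addr = 0xA035F014

word = addr & 0x1F             # low 5 bits
set_idx = (addr >> 5) & 0x1FFF # next 13 bits select the set
tag = addr >> 18               # top 14 bits

print(hex(word), hex(set_idx), hex(tag))  # 0x14 0xf80 0x280d
```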
Slide 36
Typical exam question
Explain the difference between a direct-mapped cache and an associative mapped cache. Explain how cache memory uses the temporal and spatial locality principles to enhance computer performance.
Slide 37
Web languages (HTML, XML, XHTML)
Differences between these languages
Disadvantages of using HTML
How does XHTML solve these problems?
Advantages of CSS
Difference between HTML selectors, class selectors and ID selectors
Slide 38
HTML selector:
  h1 { background-color: green; color: red; font-weight: bold; }
Class selector:
  .section { color: red; font-weight: bold; }
ID selector:
  #section { color: red; font-weight: bold; }
An ID selector applies styles to an element in the same way as a class selector. The main difference between an ID selector and a class selector is that an ID can be used only once on each page, whereas a class can be used many times.
Slide 39
Computer networks
Network classes and default masks
TCP/IP model (Internet model): the role of each layer; examples of protocols at each layer and their role
TCP vs UDP
How are error control and flow control achieved? Which layer is responsible for this?
Subnetting: role of subnetting; subnet address; host address; broadcast address; range of addresses in a subnet
Slide 40
Exercise
Given a host configuration with the IP address 192.168.10.33 and the subnet mask 255.255.255.248: What is the subnet address? What is the host address? What is the broadcast address? What is the number of possible hosts and the range of host addresses in this subnet?
Slide 41
Solution
Subnet address: 192.168.10.32
Host address: 0.0.0.1
Broadcast address: 192.168.10.39
The number of bits for the host is 3, so the number of hosts allowed in this subnet is 2^3 - 2 = 6. The range of host addresses is 192.168.10.33 - 192.168.10.38.
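The solution can be verified with Python's standard ipaddress module (a sketch using the address and mask from the solution):

```python
import ipaddress

# Combine the host address and subnet mask into an interface object.
iface = ipaddress.ip_interface("192.168.10.33/255.255.255.248")
net = iface.network

print(net.network_address)    # 192.168.10.32 (subnet address)
print(net.broadcast_address)  # 192.168.10.39
hosts = list(net.hosts())     # usable host addresses in the subnet
print(len(hosts))             # 6
print(hosts[0], hosts[-1])    # 192.168.10.33 192.168.10.38
```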
Slide 42
Exam
Duration: 1:30 hours; 3 questions, 30 minutes each
Time: May
Preparation: past exam papers; revise all the questions given in the two assignments; consult the revision slides; concentrate on the preparation list; attempt the mock exam on my website.
Next week: mock exam.