(Fall 2011 - Asanovic and Patterson):
1) Explain energy and power. How is energy dissipated in a modern
microprocessor? What techniques could an architect use to reduce
energy consumption in a microprocessor?
2) Explain the different types of memory used in modern server and
handheld computers. What are their cost/bit, densities, access
latencies, and bandwidths? How would you take advantage of different
memory types in a modern memory system?
3) Describe the operation of an out-of-order superscalar processor
based on a unified physical register file (e.g., MIPS R10K or Alpha
21264). Describe how register renaming can be performed in parallel
for a group of sequential instructions. What is the minimal number of
physical registers needed? How does instruction scheduling logic cope
with variable latency of cache accesses?
4) Describe the principal types of parallelism exploited in computer
systems. Describe representative architectures for each type of
(Fall 2010 - Kubi & Patterson):
"Q1: Flash Memory
Q2: Modern processor
Q3: CMOS dependability
Q4: Personal Mobile Device"
(Fall 2008 - Asanovic & Wawrzynek):
"1. Memory Hierarchy
a) For a typical modern general purpose processor sketch
and describe in detail the memory hierarchy.
b) How would you enhance/modify the above to accommodate
several processors sharing a cache-coherent memory?
c) Would your solution scale to hundreds of processors?
If not, what would you change to accommodate the scaling?
2. Processor Microarchitecture
a) Sketch and describe the major stages of a modern
out-of-order processor pipeline and how the processor works.
[BP/IF, Dec/RegRename, Ex, Completion, Commit]
b) Devise and write assembly code for an example program
where register renaming helps performance. Show an example
where it doesn't help.
c) Devise and write assembly code for an example program
where branch prediction helps and where it doesn't.
d) If you had to choose only one of register renaming and
branch prediction over the other, which one would you
choose and why?
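As a possible starting point for part 2b, here is a minimal Python sketch of register renaming with a map table. It is an illustration only, under stated assumptions: instructions are hypothetical (dest, src1, src2) tuples of architectural register numbers, every destination gets a fresh physical register, and there is no free list or register reclamation as a real design would need.

```python
def rename(instrs, num_arch_regs=4):
    """Rename (dest, src1, src2) tuples of architectural register numbers.

    Assumption: an unbounded supply of physical registers and no reclamation.
    """
    map_table = {r: r for r in range(num_arch_regs)}  # arch reg -> phys reg
    next_phys = num_arch_regs
    renamed = []
    for dest, src1, src2 in instrs:
        # Read source mappings before updating the destination mapping.
        p1, p2 = map_table[src1], map_table[src2]
        map_table[dest] = next_phys  # fresh physical register for each write
        renamed.append((next_phys, p1, p2))
        next_phys += 1
    return renamed


program = [
    (1, 2, 3),  # r1 = r2 op r3
    (0, 1, 1),  # r0 = r1 op r1  (true dependence on r1, renaming cannot remove it)
    (1, 2, 2),  # r1 = r2 op r2  (WAW with instr 0, WAR with instr 1)
]
renamed = rename(program)
# renamed == [(4, 2, 3), (5, 4, 4), (6, 2, 2)]
```

After renaming, the third instruction writes a different physical register from the first, so it no longer has to wait for the earlier instructions; the true dependence between the first two remains.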
3. Power and Energy
a) What is the relationship between power and energy? Why are
these important considerations in digital systems?
b) Describe (in formula form) the components of power consumption in
c) Take the design of a floating point unit for instance. For a
fixed required throughput, what could you do to lower its
d) What ultimately limits the effectiveness of the techniques from c)?
4. Parallel Processing
In 2005 there was a historic shift in the industry, with all microprocessor companies announcing
that their future products would be chip-scale multiprocessors
(CMPs). Why did this happen?
[No more Vt scaling (leakage problem dominates), diminishing returns on
ILP extraction, memory latency (&BW?) problems]
5. Looking ahead
Consider as a baseline architecture, a collection of energy
efficient RISC cores. Assuming that technology stops scaling, what
can be done to further reduce energy consumption for a given
[accelerators, vectors, ...]"
(Spring 2004 - Wawrzynek & Patterson):
"1) Power and Energy in Microprocessors
a) Give us a metric for expressing energy efficiency of a microprocessor for a
(would expect MIPS/watt or joules/instruction, ...)
b) If I gave you a microprocessor and that workload, how would you measure its
average energy efficiency?
(needs to understand P=IV and think about using a current meter.
Then measure time and number of instructions, ..., )
c) What are the factors that affect/determine power consumption in this experiment
and how could you influence each factor?
(P = C V^2 f; C depends on process and wire lengths, V on process and user setting, ...)
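The arithmetic behind the hints in parts (b) and (c) can be sketched in Python as follows. This is only an illustration: the voltage, current, run time, and instruction count are made-up numbers, the function names are hypothetical, and average power is assumed constant over the run.

```python
def energy_efficiency(volts, amps, seconds, instructions):
    """Return (joules per instruction, MIPS per watt) from bench measurements."""
    power_w = volts * amps            # P = I * V, from a current meter on the supply
    energy_j = power_w * seconds      # E = P * t, assuming constant average power
    joules_per_instr = energy_j / instructions
    mips = instructions / seconds / 1e6
    return joules_per_instr, mips / power_w


def dynamic_power(cap_farads, volts, freq_hz):
    """Switching power P = C * V^2 * f (activity factor folded into C)."""
    return cap_farads * volts ** 2 * freq_hz


# Hypothetical measurement: 1.2 V supply, 10 A average, 2 s run, 4e9 instructions.
jpi, mips_per_watt = energy_efficiency(volts=1.2, amps=10.0, seconds=2.0,
                                       instructions=4e9)

# Halving the supply voltage at fixed C and f cuts switching power to one quarter.
ratio = dynamic_power(1e-9, 0.5, 1e9) / dynamic_power(1e-9, 1.0, 1e9)
```

With these made-up inputs the run draws 12 W for 2 s, so joules per instruction and MIPS per watt are reciprocal views of the same measurement (up to the factor of 1e6).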
2) Microprocessor Limitations
a) What do you think are some of the technological limitations to increasing microprocessor
performance in the next 3 to 7 years?
b) How do recent 80x86 microprocessor designs match up to those limitations?
c) How would you expect such limitations to change the microarchitectures in this time
3) Networks of processors
Suppose you were engineering a multiprocessor on a chip (a homogeneous array of simple
processing units with local memory each). Think about what structure you would use to
connect the processors together.
a) What network topologies would you consider and why?
(Should be able to describe meshes, trees, busses, xbar, etc.)
What factors would you consider in choosing one versus another?
What other factors would come into the design?
(need to consider area cost, control and routing complexity, packet-switched or
circuit-switched, cross-section bandwidth, scaling)
How would you go about determining the best design for this network?"
(Fall 2003 - Culler & Kubiatowicz):
"Q1: For the first question, we were looking for a concise definition of
precise interrupts, followed by clear explanations of mechanisms for
achieving precise interrupts in both 5-stage and out-of-order pipelines.
Q2: The second question focused on evaluating design tradeoffs. It
centered on energy efficiency.
Q3: The third question focused on instruction set design. The specific
context was communication on a chip-based multiprocessor. We asked
for a list of possible forms of communication and then to discuss how
to extend the ISA for each.
Q4: For the final question, we were looking to explore the design space for
a NAS (network attached storage) system."
(Fall 2002 - Culler & Patterson):
"Q1: Vector processing
Q2: Power and energy
Q3: Errors in computers
Q4: Branch prediction"
(Spring 2002 - Kubiatowicz & Wawrzynek):
Q2: Power consumption
Q3: Networked multiprocessors
(Fall 2001 - Kubiatowicz & Patterson):
"Q1: modern processor design
Q2: trace caches
Q3: VLSI scaling
Q4: high-transaction rate server"
(Spring 2001 - Wawrzynek & Kubiatowicz):
"Q1: What are precise interrupts? Why are they useful? Discuss how to
implement precise interrupts in a modern processor (superscalar,
out-of-order) of your choice. What is branch prediction? Why is it
useful? How does it fit into your example processor? What other
things do people try to predict?
Q2: Draw the floorplan for a processor (real or imaginary but realistic).
Include the major functional blocks and their approximate sizes.
Imagine you now have a merged DRAM/logic process. Assuming an
identical organization and memory hierarchy, how would this affect the
floorplan? How would the floorplan differ if you could change the
organization or memory hierarchy?
Q3: What is memory coherency in multiprocessors? Discuss coherency in a
snoopy bus model. Describe a series of reads and writes in this
model. What is consistency? What is sequential consistency and how
can it be modelled? What are the limitations of snoopy-based coherency
models? What are the solutions? Describe a series of reads and
writes in that model.
Q4: We want to create a network router from a standard PC. (For our
purposes, a router will input a packet, examine the packet, make a
decision about where to route the packet based on stored state, and
send the packet out that port.) Draw a diagram of this system and
trace a packet through this system. How do we determine the number of
ports that a single PC can support? What is the bottleneck? How do
you justify this? Estimate various execution times, bandwidths, and
(Fall 2000 - Wawrzynek & Patterson):
"Q1: Identify the critical paths in microprocessor design. What is the
effect of carry logic on adder delay? Discuss fast adder techniques.
Q2: Discuss issues related to disk drives, trends, today's spec, and
future specs. Reason about read times for an entire disk and how this
relates to RAIDs and other disk arrays.
Q3: Discuss the basic issues involved with system implementation
Q4: Identify some out-of-order (OOO) execution processors. Select one
and describe its operations."
(Spring 2000 - Wawrzynek & Kubiatowicz):
"Q1: Discuss the distinctions between RISC and CISC; compare and contrast
the advantages of one over the other. Discuss whether the RISC/CISC
distinction is still valid today.
Q2: Detail the overheads of message communication. How
would you optimize communication?
Q3: Account for the large difference in energy observed between solving
a problem with custom hardware vs. a general purpose processor. List
some of the metrics one might use to decide between a custom ASIC, an
FPGA, or a general purpose processor.
Q4: What is the cache-coherence problem? What are the definitions of sequential
consistency? Discuss the details of snoopy protocols and
the typical bus bandwidth. Estimate the maximum number of processors
that would fit on a bus."
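One way to approach the estimate in Q4 is a back-of-the-envelope saturation bound: divide the bus bandwidth by the traffic each processor generates. The Python sketch below illustrates that arithmetic only. The bandwidth, miss rate, and line size are assumed numbers, the function name is hypothetical, and coherence traffic and arbitration overhead are ignored.

```python
def max_processors_on_bus(bus_bytes_per_s, misses_per_s_per_cpu, line_bytes):
    """Upper bound on processors before miss traffic alone saturates the bus."""
    bytes_per_s_per_cpu = misses_per_s_per_cpu * line_bytes
    return int(bus_bytes_per_s // bytes_per_s_per_cpu)


# Assumed numbers: 1 GB/s bus, 1e6 misses/s per processor, 64-byte cache lines.
n = max_processors_on_bus(bus_bytes_per_s=1e9,
                          misses_per_s_per_cpu=1e6,
                          line_bytes=64)
# n == 15
```

Because the bound ignores invalidations, write-backs, and contention, a practical design would likely support fewer processors than this figure.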
(Fall 1999 - Wawrzynek & Kubiatowicz):
"Q1: Describe precise interrupts. Why are they useful? Describe exactly how
you would implement them on a 5-stage pipeline. Draw a rough block diagram of
an out-of-order execution processor. Describe how you would implement precise
interrupts with respect to that block diagram. What happens to the
precise interrupt scheme when there are branch mispredictions?
Q2: Given a very simple pipelined processor (no cache, no floating-point unit,
single issue, everything in order), estimate the
number of transistors it will use. Given the number of transistors you
just came up with, can you implement it using a state-of-the-art FPGA?
What is the rough capacity of the FPGA? What are some pros and cons of
implementing such a processor on an FPGA if it is possible? Why would
someone ever want to do such a thing in the real world?
Q3: Suppose you want to implement a multiprocessor over a conventional
network (like Ethernet). Describe all the mechanisms you need to pass a message
from one user to another user. Estimate the time it takes for the
message to go through each part of the process. Given the time you just
estimated, is it fast enough to meet the needs of a
multiprocessor? How can you improve the speed? How can you protect
users from interfering with each other if they have control over the network
Q4: Define energy efficiency. Given that you have a 4-way super-scalar
processor, a program, and the data input, describe how (including
setup) you can measure the energy efficiency of this processor. What
would be the issue if you wanted to compare this energy efficiency number
with other processors? If you were the chief architect, what kind of
processor would you design to optimize energy efficiency?"
(Spring 1999 - Patterson & Kubiatowicz):
"Q1: What are the causes of cache misses (three C's)? Describe how to
remove these. Under what circumstances would they be advantageous?
Q2: What is the definition of precise interrupts and the implementation
of precise interrupts in a five-stage pipeline? Discuss modern
Q3: Discuss some of the issues behind the replacement of buses with
routers. What are some advantages of router-like architectures?
Q4: Discuss computational paint."
(Fall 1998 - Patterson & Kubiatowicz):
"Q1: What is the definition of precise interrupts and the implementation
of precise interrupts in a five-stage pipeline? Discuss modern
Q2: What are the consequences of increased wire delay? How would you
evaluate ambiguous design alternatives?
Q3: Discuss the basic message communication path. Why would a separate
network be needed to avoid protocol overhead?
Q4: (A database design problem was given.) Calculate the numbers of
disks and CPUs for 100% utilization as well as bus organization. What
does queueing theory suggest?"
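As a hint at what queueing theory says about running a resource at 100% utilization, here is a minimal Python sketch. It assumes an M/M/1 queue (Poisson arrivals, exponential service times, one server), which the original question does not specify; the function name and the numbers are hypothetical.

```python
def mm1_response_time(service_time_s, arrival_rate_per_s):
    """Mean response time of an M/M/1 queue: T = S / (1 - utilization)."""
    utilization = arrival_rate_per_s * service_time_s
    if utilization >= 1.0:
        return float("inf")  # the queue grows without bound at or above 100%
    return service_time_s / (1.0 - utilization)


# Assumed disk with a 10 ms service time.
t_half = mm1_response_time(service_time_s=0.01, arrival_rate_per_s=50)   # 50% busy
t_high = mm1_response_time(service_time_s=0.01, arrival_rate_per_s=90)   # 90% busy
t_over = mm1_response_time(service_time_s=0.01, arrival_rate_per_s=200)  # overloaded
```

Under this model, response time roughly doubles at 50% utilization, grows about tenfold at 90%, and is unbounded as utilization approaches 100%, which suggests sizing disks and CPUs for less than full utilization.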