(Spring 2013 - Abbeel* & Bartlett):
(Spring 2012 Jordan and Abbeel):
(Fall 2010 - Klein & Malik): "Language: -- Express the ASR problem as an HMM. What are the features and state space? -- Imagine you have a sequence of words and a PCFG in CNF. Define the quantity computed by the CKY algorithm and give the recurrence used by the algorithm. -- How would you jointly do the ASR and parsing? Explain how to merge the two algorithms. Vision: -- Define texture. -- Explain how to segment an image by texture. What is the key reason simple methods like convolving with basic filters don't work? What is the solution? How does segmentation work? -- Explain how to classify a region given reference examples of various textures. What would the features be, etc? Graphical Models / Search: -- [Example GM, questions about what is conditionally independent of what] -- State the Bayes ball (or d-separation) algorithm as a state-space search problem. What are the states, successor function, and so on? -- Give an admissible heuristic for telling whether a source and target variable are conditionally independent given evidence. Reinforcement Learning: -- State Blackjack as an MDP. What are the states, rewards, etc? -- What is a value? How do you compute the values in a known MDP. -- Imagine you did not know the transition function for Blackjack (or didn't want to compute it). Describe how to determine values of a policy. -- What is q-learning? What is the difference between on- and off-policy learning? What is the purpose of an epsilon-greedy policy?"
(Spring 2009 - Klein & Jordan): "1) What is a CSP? Define formally what a constraint is, etc. What is arc consistency and why would we want it? Give an algorithm for establishing it. 2) What is PCA? What is kernel PCA? Why would we use it? What are the inputs and outputs? Give the algorithm. 3) What is the EM algorithm? What is it for? What does it do, formally? How does it work for a bivariate Gaussian with missing components? What sufficient statistics? 4) Describe how a statistical machine translation system works. What data does it require? How are the models learned? How does decoding work? 5) Describe how depth can be recovered from an image. What about without stereo or motion?"
(Spring 2007 - Bartlett & Jordan): "Candidates were asked to answer the following four questions covering reinforcement learning, vision, language and graphical models: 1. What is an MDP? Explain how to formulate the game of Tetris as an MDP. Describe an algorithm for finding the optimal policy for this game. Suggest an efficient approximate approach that might be more effective in practice. 2. What is the image segmentation and why might one want to segment an image? Describe an image segmentation algorithm and explain its advantages and disadvantages. 3. What is part-of-speech tagging? Suggest two approaches to the part-of-speech tagging problem. 4. What is the junction tree algorithm? Given a parameterized grapical model, and given observations of a subset of nodes of the model, describe a general approach to maximum likelihood parameter estimation for the model. What role might the junction tree algorithm play in this procedure?"
(Fall 2006 - Klein & Malik): "General AI: -- What is an MDP? Define it formally. -- What are the Bellman equations -- What is the value iteration algorithm? -- What is Q-learning? Learning: -- Define linear regression as an optimization problem. -- What is a kernel and why are they used? -- Can you kernelize linear regression? How? Language: -- Describe a chain of processing from a wave file to first-order logic. -- How does ASR work? What is an HMM in general? What is the Viterbi algorithm? -- How does parsing work? What is an efficient algorithm for probabilistic parsing? -- How might semantic translation to first-order logic work? Vision: -- Define the object recognition problem. What makes it hard? -- How might you segment and classify a zip code from a picture of an envelope? -- What is edge detection, and how does it work?"
(Fall 2005 - Klein & Jordan): "General AI: 1. Robot motion planning a. Draw a configuration space for a 2D robot arm with rotational shoulder and elbow joints. Is the mapping from configurations to workspace positions of the actuator invertible? b. If the shoulder joint is suspended from the ceiling, how does the free space change? c. Assume now there's a floor with a hole in it that the arm move obstacles into. Why is the free space no longer convex? 2. Search a. How could I use state-space search to plan paths for the robot discussed before, assuming there is no uncertainty in the robots movements and the obstacles are fully known? What are the drawbacks to the method you proposed? b. How could I use A* search? What would be an admissible heuristic? Statistical Learning Theory: 1. Kernels a. What is Fisher discriminant analysis (aka linear discriminant analysis)? b. Can it be kernelized? What does that mean? Why do you know that the parameter vector will be a linear combination of transformed data points? 2. Parameter estimation a. Show how logistic regression can be motivated from a generative perspective. In particular, what assumptions on the class-conditional densities do you need. a. Given a specific generative model, describe two ways to estimate parameters for the resulting logistic regression, using maximum likelihood in both cases. Will they give the same results? 3. Graphical models a. What is the treewidth of a graph? What does it have to do with the complexity of inference? b. What is a junction tree? How is it different from an arbitrary tree of cliques? Language and Speech 1. PCFGs a. What is a PCFG? How could one be used in a speech recognition? b. Sketch an algorithm for finding the best parse or the sum of all parses for a sentence. What are the time and space complexities of your algorithm and why? 2. Semantic Translation a. Write a reasonable translation of "Tom likes Amy" into FOPL. How could you derive this expression compositionally from a parse tree. b. How about "Everyone likes Amy"? "Everyone likes someone"? c. How might we represent tense and aspect in this process? Vision 1. Shape detection a. What is a Hough transform? How can it be used for segmentation? 2. Depth a. What are some techniques for finding depth in images?"
(Spring 2005 - Russell & Wilensky): "1. Vision Define object recognition and explain why it's hard. How might learning help? Why might it be important for higher-level visual hypotheses to inform lower-level processes such as edge detection? 2. KR What are the principal differences between propositional and first-order formalisms? Can you write the rules of chess in propositional logic? Are graphical models propositional? Can we make them first-order? 3. Planning How might actions be described in first-order logic? What are the benefits and drawbacks to such a representation? What are some alternatives? Are there domain-independent heuristics for planning? 4. Language, speech What are some important types of ambiguity that occur in natural language use? Pick one of these and explain some approaches to addressing it. What are some examples of semantic ambiguity other than lexical semantic ambiguity? What sorts of ambiguity arise in speech? How do speech systems resolve those ambiguities? 5. Robotics How can a robot get its hand from here to there? What are some of the problems that one would encounter in designing a robot perform some task in a new environment, such as finding a telephone in an unfamiliar house? Solutions? 6. Learning Suppose you wanted to get a machine to play Tetris. How might you formulate and solve this as a learning problem? Ditto for battleships"
(Fall 2004 - Feldman & Klein): "Language: 1. What are some levels of structure in natural languages, and what formal models are used for them? 2. Why might we want to model syntax as something other than a CFG? Name some options and reasons. Search: 1. What is state-space search? Describe a general heuristic approach to search. 2. What is A* search, what is admissibility, and why is it important? What would happen if we used an inadmissible heuristic? 3. Contrast state-space search and dynamic programming. Can they be combined? Learning: 1. Describe your favorite classifier. What objective function does it maximize, and what are its properties? What is its mechanism for preventing overfitting? 2. When might your classifier be inappropriate? What are its weaknesses? What would you use instead, and why? Vision: 1. For 2D images, suggest a domain where automatic visual recognition is successful. What representations and techniques are used? 2. For 3D, do the same."
(Spring 2004 - Jordan & Russell): "1. What is a planning problem and how could one solve such problems using a satisfiability algorithm? 2. What is boosting and how does it work? What formal properties of boosting can be shown? 3. Consider the problem of moving a robot arm from one place to another. How can such problems be formulated and solved? 4. What is the part-of-speech tagging problem and how can it be solved? 5. Explain the EM algorithm for hidden Markov models. Is forward-backward an elimination algorithm?"
(Fall 2003 - Feldman & Wilensky): "1. Vision Motion is a major problem for visual systems. Describe how Kalman filters can be used in tracking moving objects. How would you handle the problem of data association? To what extent is the problem of tracking people solved? 2. Language What is the problem of reference resolution? Can it be divided into subproblems that may be subject to different constraints? Does the need to address this problem have representational consequences? What are some possible algorithms to address this problem? 3. Speech recognition What are hidden Markov models, and why and how are they used in speech recognition? How are HMMs trained? What other components are needed in a speech recognition system? A current speech recognition system, given two utterances, transcribed them as follows: 1. “One carrot cake, one carat diamond, one karat gold ring” 2. “My cake is made with carrots. Diamonds are weighed in carrots. Gold is weighing the in carrots.” The sound variously transcribed as “carrot”, “carat” and “karat” is identical. How might one account for this variation in performance? 4. Logic and planning Assuming a STRIPS-like representation, how might we represent a simple operator, e.g., "Go" (i.e., the robot goes from one location to another). Such a representation is generally preferred to a "pure" logical formulation. Why? Compare state-based and plan-based approaches to problem solving. Suppose that our robot could pick something up and take it with it. How might we reason now about the consequences of actions? How is the planning process complicated? 5. Search Propose an approach to solving cryptarithmetic problems. Within this approach, what are some different strategies one can exploit? 6. Learning Tell us about your favorite learning algorithm. Is there a class of problems for which it is not appropriate?"
(Spring 2003 - Feldman & Russell): "1. Vision What is segmentation and what role does it play in the overall vision problem? [Explain with reference to a specific image.] How is segmentation done? What is the connection between segmentation and edge detection? How is edge detection done? 2. Language What is unification grammar? What basic advantage does it offer over standard context free grammars. Give an example of a unification grammar rule that incorporates number agreement. What about semantics? What are some limitations of unification grammar? 3. Speech recognition What is a language model and how is it used in speech? Describe the probability models used in statistical speech recognition. Describe the EM algorithm and how it is used to train these models. 4. Logic and planning Describe an efficient SAT algorithm. Is your algorithm complete? Explain how to use a SAT algorithm to solve a STRIPS planning problem such as the following: Go(x,y): Preconditions: At(x) Effects: ~At(x), At(y) Initial state: At(A) Goal state: At(B) 5. Search and robotics Describe an algorithm for finding the shortest path to a known goal location in a deterministic grid world with obstacles. What if the map of the grid world is not available? How can a real robot build a map of a real environment? 6. Learning Explain decision tree learning Explain how to combine this with boosting Explain the problems involved in learning Bayesian networks from data, distinguishing between Bayesian and maximum-likelihood approaches."
(Spring 2002 - Jordan & Russell): "See (http://www.cs.berkeley.edu/~daf/AI-prelim/AI-prelim.html) for the reading requirements."(Fall 2001 - Feldman & Wilensky): "1) What are computational methods for going from 2D images to models of the 3D world? 2) Tell us about your favorite learning algorithm. 3) Discuss state-space and plan-space planning techniques. 4) Compare Neural Networks and Bayesian Networks. 5) Discuss the feasibility of Context Free Grammars for Natural Language processing."
(Fall 1999 - Jordan & Russell): "Q1: search as a solution to robot path planning a. admissible search algorithm b. heuristics c. what if you can't assume polygonal obstacles (configuration space)? Q2: belief networks a. are diagnostic or causal queries easier? b. can we remove a node from the network for a given query? c. complexity of variable elimination for diagnostic/causal Q3: mdps: what are two ways of solving them / pros-cons? a. what if they are not observable? b. what is the size of the search space (unobservable)? c. why should we assume the markov property holds in belief space? Q4: mdps vs. dbns a. complexity of inference (forward algorithm) b. what if the model is a mixture of gaussians? complexity of forward chaining then? what if the gausisian nodes have discrete parents? what then? Q5: mdp vs. reinforcement learning a. what are the differences? b. what are some strategies for reinforcement learning?"
(Fall 1998 - Malik & Wilensky): "Q1: Draw a belief network with two causal nodes A and B, and one symptom node C. What are the independence relationships in the network? Number of parameters in the CPT? Write a joint probability. What will happen if we reverse some arrows (can we still capture the same independence relationships)? Q2: Draw a belief network for "smoking", "lung cancer", "coughing", "positive x-ray", and "black lung". How do you do diagnostic query? What do you do when poly tree doesn't work here? Suppose we use clustering. Is there any systematic way of deciding which nodes to merge? Q3: Construct a neural network for the example in the last question. Compare the neural nets with belief nets, both domain independent and domain specific properties. Why do people use sum-of-square error instead of sum-of-absolute error in neural net training? Q4: What is non-linear planning? Why is that better than linear planning? Illustrate resolving threats with promotion and demotion. What other problems do planning have in real world situations? How do we cope with uncertainties? Can you incorporate probability into POP? Any other ways of using probability? Do you get a plan out of a Markov decision process? How do you find such policy? Q5: What is version space? Compare version space to neural nets, advantages and disadvantages."