C79 Midterm Solution

This solution contains detailed explanation and comments. You were not expected to write nearly as much on your exam!

Q1.
1) One answer: The two events are independent, so: P(coin lands heads AND die shows 3 or more spots on top) = P(coin lands heads) * P(die shows 3 or more spots on top) = (1/2)*(4/6) = 1/3.
Alternative answer: There are 12 possible outcomes, H1, T1, H2, T2, ..., H6, T6 (where the letter H or T represents how the coin came up and 1..6 represents what we got on the die). There are 4 outcomes that satisfy the requirements: H3, H4, H5, H6. Since all outcomes are equally likely, the probability is 4/12 = 1/3.
2) One answer: There are 12 possible outcomes, as mentioned above. Of these, there are 7 outcomes that satisfy all the requirements: H3, H4, H5, H6, T4, T5, T6. Since all outcomes are equally likely, the probability is 7/12.
Alternative answer: There are 12 possible outcomes in total. Among these, if we have tails on the coin, then the die can be 4, 5, or 6, whereas if we have heads on the coin, then the die can be 3, 4, 5, or 6. This is a total of 7 possible outcomes. So the answer is P(number of heads + die spots >= 4) = 7/12.

Q2.
1) The animals at each farm are all fed the same feed, so if one animal is infected, it's likely that other cattle from the same farm are infected too.
2) Some animals were born after the feed ban, which would greatly reduce their chances of having BSE.
Alternative answer: Younger cattle have a lower probability of BSE than older cattle.

Q3. It would be helpful to know P(does not use drugs | tests positive).
Alternative answer: You should seek to learn P(test positive | uses drugs), P(test positive | does not use drugs), and P(uses drugs) (the proportion of people in the summer internship who use drugs). From these you can compute the conditional probability that a random intern who tested positive did actually use drugs.

Q4. The grad student is correct.
The policy was meant to decrease the mortality rate in the whole district overall; her data shows this did not happen, so it is not reasonable to call this a significant success. Yes, both their data could be accurate. This could happen if home delivery has a significantly higher mortality rate, and if the fraction of babies delivered at home has increased over time (for instance, maybe the policy changes made more mothers decide to deliver at home); then even if both home deliveries and hospital deliveries each got safer, this could be offset by the increased number of home deliveries.

Alternative answer: The grad student is correct, since the overall mortality rate did not decline. Yes, both their data could be accurate; here is an example of how it could happen. Suppose before the policy changes we had 100 infants, of whom 10 died (overall rate 10/100). Let's say 2 out of 10 died at home and 8 out of 90 died in hospitals. Suppose after the policy changes, there were 50 home deliveries, and 8 out of 50 died (a lower mortality rate for home births than before; 16% now vs 20% before), and 50 hospital deliveries, where 2 out of those 50 died (a lower mortality rate for hospital deliveries than before; 4% now vs about 9% before). Then for both groups the rate decreased, but the overall rate is still 10/100.

Alternative answer: This is like the example where soldiers in the army (during a war!) had a lower death rate than NYC residents. The death rate for soldiers of any given age was higher than the death rate of NYC residents of the same age, but the army was skewed towards young people, who have a lower death rate than older people. Here too the mortality rate today might be the same as (or even higher than!) the mortality rate a decade ago, even if the mortality rate for babies of each type (home-delivered or hospital-delivered) is lower than it was a decade ago, if today the location of births is skewed more towards the more dangerous location.
Thus both their data might be accurate, but the grad student's ultimate interpretation seems right.

Q5. The respondents on the German survey will show higher (self-reported) happiness than on the Kahneman survey. The responses from the Kahneman survey will suffer from anchoring: the number of dates will be an anchor that affects their answer on happiness. Because most people are not likely to have 3 to 5 dates in a month, their answer for happiness might be lowered due to anchoring. Probably most people in the population will be generally happy, and this will be reported accurately in the German survey, showing high happiness numbers in the German study, but lower numbers (due to anchoring) in the Kahneman study. Alternatively, if we assume that most people are generally not happy but date a lot, then the anchoring could work in the reverse direction (you did not need to mention this, but if you did, you got credit as well).
Alternative answer: The results from the Kahneman survey will have a high correlation between the questions since one is objective and the other is subjective. The answer for the first might carry over to the second. This heuristic is called substitution. The correlation will not exist for the German experiment.
Alternative answer: The Kahneman survey will show lower (self-reported) happiness. Most people are happy, and this will be reported accurately in the German survey. However, the Kahneman study will first prime people to think about the number of dates they had; if many people wish they had more dates, thinking about how many dates they've had might make them feel unhappy, influencing their response to the happiness question.
Comment: It seems likely that the results for the number of dates will be the same, since people should know that number exactly.

Q6. Democratic states have a higher percentage of people supporting Obama than Republican states.
People are ignoring the actual bill itself and instead using a heuristic based on who is supporting the bill to help make their decision.
Alternative answer: This is an example of the availability heuristic at work: people who don't know anything about the bill are using their view of the parties to help make their decision.
Alternative answer: This could be viewed as an example of attribute substitution. The target attribute (knowledge about the merits of the bill) is unavailable, but a substitute (knowledge about the party supporting the bill) is readily available and is used by System 1 instead.

Q7. No, because the relationship at the community level does not carry over to the individual level. This is the ecological fallacy.
Alternative answer: No, maybe Obama won 60% of the vote of people in that county by winning 80% of the vote of the poorest half of people in the county and only 40% of the vote of the wealthiest half of people in the county.

Q8. There are 31 majors, so there are C(31,3) = 31*30*29/6 ways ("31 choose 3" ways) to choose 3 majors if we don't care about their types. Some of those are bad: C(8,3) = 8*7*6/6 ways to choose 3 physical science majors, C(10,3) = 10*9*8/6 ways to choose 3 life science majors, C(13,3) = 13*12*11/6 ways to choose 3 social science majors. Subtract off the bad ways, and we're left with 31*30*29/6 - 8*7*6/6 - 10*9*8/6 - 13*12*11/6 = 4033 acceptable combinations.
Alternative answer: They could all be in different sciences, or 2 in the same science and one in a different science. This gives: 8*10*13+8*7*10/2+8*7*13/2+10*9*8/2+10*9*13/2+13*12*8/2+13*12*10/2 = 4033

Q9.
1) The graph is trying to show that the number of Americans receiving federal welfare has greatly increased since 2009.
2) The y-axis does not start at 0, so the actual increase is much smaller than it appears on the graph.
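
Checking the arithmetic: The counting in Q1 and Q8 and the numeric example in Q4 can be double-checked by brute force. The Python sketch below is only an illustration, not something you were expected to write on the exam. The variable and function names are made up for this check, and for Q8 it assumes (as the solution above implies) that a choice of 3 majors is unacceptable exactly when all three are in the same science.

```python
from fractions import Fraction
from itertools import combinations, product

# Q1: the 12 equally likely (coin, die) outcomes, e.g. ("H", 3).
outcomes = list(product("HT", range(1, 7)))

# Part 1: coin lands heads AND die shows 3 or more spots.
hits_1 = [(c, d) for c, d in outcomes if c == "H" and d >= 3]
p_heads_and_3plus = Fraction(len(hits_1), len(outcomes))

# Part 2: (number of heads) + (die spots) >= 4.
hits_2 = [(c, d) for c, d in outcomes if (c == "H") + d >= 4]
p_sum_at_least_4 = Fraction(len(hits_2), len(outcomes))

# Q4: (deaths, births) per location, taken from the numeric example.
before = {"home": (2, 10), "hospital": (8, 90)}
after = {"home": (8, 50), "hospital": (2, 50)}

def group_rate(groups, place):
    """Mortality rate for one delivery location."""
    deaths, births = groups[place]
    return Fraction(deaths, births)

def overall_rate(groups):
    """Mortality rate with both locations pooled together."""
    deaths = sum(d for d, _ in groups.values())
    births = sum(b for _, b in groups.values())
    return Fraction(deaths, births)

# Q8: 8 physical, 10 life, and 13 social science majors (labels are
# illustrative). Count the 3-major choices that are not all one type.
majors = ["physical"] * 8 + ["life"] * 10 + ["social"] * 13
acceptable = sum(1 for trio in combinations(majors, 3) if len(set(trio)) > 1)

print(p_heads_and_3plus, p_sum_at_least_4)        # 1/3 7/12
print(overall_rate(before), overall_rate(after))  # 1/10 1/10
print(acceptable)                                 # 4033
```

In the Q4 example, group_rate drops for both "home" (1/5 to 4/25) and "hospital" (4/45 to 1/25) while overall_rate stays at 1/10, which is the point of the example.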