C79 Midterm Solution

This solution contains detailed explanation and comments.
You were not expected to write nearly as much on your exam!

Q1. 
1) 
One answer: The two events are independent so: 
P(coin lands heads AND die shows 3 or more spots on top)
= P(coin lands heads) P(die shows 3 or more spots on top)
= (1/2)*(4/6) = 1/3

Alternate answer: There are 12 possible outcomes, H1, T1,
H2, T2, ..., H6, T6 (where the letter H or T represents how
the coin came up and 1..6 represents what we got on the die).
There are 4 outcomes that satisfy the requirements:
H3, H4, H5, H6.  Since all outcomes are equally likely,
the probability is 4/12 = 1/3.
 
2) 
One answer: There are 12 possible outcomes, as mentioned
above.  Of these, there are 7 outcomes that satisfy all the
requirements: H3, H4, H5, H6, T4, T5, T6.  Since all outcomes
are equally likely, the probability is 7/12.

Alternative answer: There are 12 equally likely outcomes in total.
Among these, if we have tails on the coin, then the die can be 4, 5,
or 6, whereas if we have heads on the coin, then the die can be 3, 4, 5,
or 6.  This is a total of 7 favorable outcomes. So the answer is
P(number of heads + die spots >= 4) = 7/12.
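
Both parts of Q1 can be checked by listing the 12 outcomes by
computer. The sketch below is only a sanity check, not something
expected on the exam; the variable names are our own, and it assumes
(as the answers above do) that a head counts as 1 and a tail as 0 in
part 2.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (coin, die) outcomes: H1, H2, ..., T6.
outcomes = list(product("HT", range(1, 7)))

# Part 1: coin lands heads AND die shows 3 or more spots.
part1 = [(c, d) for c, d in outcomes if c == "H" and d >= 3]
p_part1 = Fraction(len(part1), len(outcomes))

# Part 2: (number of heads) + (die spots) >= 4.
# (c == "H") is True/False, which Python treats as 1/0 in arithmetic.
part2 = [(c, d) for c, d in outcomes if (c == "H") + d >= 4]
p_part2 = Fraction(len(part2), len(outcomes))

print(p_part1)  # 1/3
print(p_part2)  # 7/12
```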


Q2.
1) The animals at each farm are all fed the same feed, so if one animal
is infected, it's likely that other cattle from the same farm are
infected too.

2) Some animals were born after the feed ban, which would greatly reduce
their chances of having BSE.

Alternative answer: Younger cattle have a lower probability of BSE
than older cattle.


Q3.
It would be helpful to know P(does not use drugs | tests positive).

Alternative answer: You should seek to learn P(tests positive | uses
drugs), P(tests positive | does not use drugs), and P(uses drugs) (the
proportion of people in the summer internship who use drugs).  From
these you can compute, by Bayes' rule, the conditional probability that
a random intern who tested positive actually used drugs.
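
To illustrate how the three quantities combine, here is a small
sketch of the Bayes' rule computation. The function name and the
numbers fed into it are hypothetical, chosen only for illustration;
the exam did not supply any of these values.

```python
def p_user_given_positive(p_pos_given_user, p_pos_given_nonuser, p_user):
    """P(uses drugs | tests positive), computed with Bayes' rule."""
    true_pos = p_pos_given_user * p_user            # P(positive AND user)
    false_pos = p_pos_given_nonuser * (1 - p_user)  # P(positive AND non-user)
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs (NOT from the exam): the test catches 95% of
# users, wrongly flags 5% of non-users, and 5% of interns use drugs.
p = p_user_given_positive(0.95, 0.05, 0.05)
print(round(p, 3))      # 0.5
print(round(1 - p, 3))  # P(does not use drugs | tests positive)
```

With these made-up inputs, half of the interns who test positive are
not drug users, even though the test looks accurate.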


Q4.
The grad student is correct.  The policy was meant to decrease the
mortality rate in the whole district overall; her data shows this did not
happen, so it is not reasonable to call this a significant success.  Yes,
both their data could be accurate.  This could happen if home delivery
has a significantly higher mortality rate, and if the fraction of babies
delivered at home has increased over time (for instance, maybe the policy
changes made more mothers decide to deliver at home); then even if both
home deliveries and hospital deliveries each got safer, this could be
offset by the increased number of home deliveries.

Alternative answer: The grad student is correct, since the overall
mortality rate did not decline.  Yes, both their data could be accurate;
here is an example of how it could happen.  Suppose before the policy
changes we had 100 infants where 10 died (overall rate 10/100). Let's
say 2 out of 10 died at home and 8 out of 90 died in hospitals. Suppose
after the policy changes, there were 50 home deliveries, and 8 out of
50 died (a lower mortality rate for home births than before; 16% now vs
20% before), and 50 hospital deliveries, where 2 out of those 50 died
(a lower mortality rate for hospital deliveries than before; 4% now vs
9% before).  Then for both groups the rate decreased but the overall
rate is still 10/100.
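
The arithmetic in this example can be verified with a few lines of
Python. This is just a check of the numbers above; the dictionary
layout and function names are our own.

```python
from fractions import Fraction

# (deaths, births) by delivery location, using the numbers above.
before = {"home": (2, 10), "hospital": (8, 90)}
after = {"home": (8, 50), "hospital": (2, 50)}

def rate(deaths, births):
    """Mortality rate for one group, as an exact fraction."""
    return Fraction(deaths, births)

def overall(groups):
    """Mortality rate over all births, pooling the groups."""
    deaths = sum(d for d, _ in groups.values())
    births = sum(b for _, b in groups.values())
    return Fraction(deaths, births)

for place in before:
    # Each location got safer...
    print(place, float(rate(*before[place])), "->", float(rate(*after[place])))

# ...yet the overall rate did not change.
print(overall(before), overall(after))  # 1/10 1/10
```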

Alternative answer: This is like the example where soldiers in
the army (during a war!) had a lower death rate than NYC residents.
The death rate for soldiers of any given age was higher than the
death rate of NYC residents of the same age, but the army was
skewed towards young people, who have a lower death rate than older
people.  Here too the mortality rate today might be the same as
(or even higher than!) the mortality rate a decade ago, even if
the mortality rate for babies of each type (home-delivered or
hospital-delivered) is lower than it was a decade ago, if today
the location of births is skewed more towards the more dangerous
location.  Thus both their data might be accurate, but the grad
student's ultimate interpretation seems right.


Q5. 
The respondents on the German survey will show higher (self-reported)
happiness than on the Kahneman survey.  The responses from the Kahneman
survey will suffer from anchoring: the number of dates will be an anchor
that affects their answer on happiness. Because most people are not
likely to have 3 to 5 dates in a month, their answer for happiness might
be lowered due to anchoring. Probably most people in the population will
be generally happy, and this will be reported accurately in the German
survey, showing high happiness numbers in the German study, but lower
numbers (due to anchoring) in the Kahneman study.

Alternatively, if we assume that most people are generally not happy but
date a lot, then the anchoring could work in the reverse direction (you
did not need to mention this, but if you did, you received credit as well).

Alternative answer: The results from the Kahneman survey will show a
high correlation between the two questions, since one is objective and
the other is subjective: the answer to the first might carry over to
the second.  This heuristic is called substitution. The correlation
will not exist for the German experiment.

Alternative answer: The Kahneman survey will show lower (self-reported)
happiness. Most people are happy, and this will be reported accurately in
the German survey. However, the Kahneman study will first prime people
to think about the number of dates they had; if many people wish they
had more dates, thinking about how many dates they've had might make
them feel unhappy, influencing their response to the happiness question.

Comment: It seems likely that the results for the number of dates
will be the same, since people should know that number exactly.

Q6.
Democratic states have a higher percentage of people supporting Obama than
Republican states. People are ignoring the actual bill itself and instead
using a heuristic based on who supports the bill to make their decision.

Alternative answer: This is an example of the availability heuristic at
work: people who don't know anything about the bill are using their view
of the parties to help make their decision.

Alternative answer: This could be viewed as an example of attribute
substitution.  The target attribute (knowledge about the merits of
the bill) is unavailable, but a substitute (knowledge about the party
supporting the bill) is readily available and is used by System 1 instead.

Q7. 
No, because a relationship at the community level does not necessarily
carry over to the individual level. This is the ecological fallacy.

Alternative answer: No, maybe Obama won 60% of the vote of people in
that county by winning 80% of the vote of the poorest half of people
in the county and only 40% of the vote of the wealthiest half of
people in the county (0.5*80% + 0.5*40% = 60%).

Q8. 
There are 31 majors, so there are C(31,3) = 31*30*29/6 ways ("31
choose 3" ways) to choose 3 majors if we don't care about their types.
Some of those are bad: C(8,3) = 8*7*6/6 ways to choose 3 physical science
majors, C(10,3) = 10*9*8/6 ways to choose 3 life science majors, C(13,3) =
13*12*11/6 ways to choose 3 social science majors.  Subtract off the bad
ways, and we're left with 31*30*29/6 - 8*7*6/6 - 10*9*8/6 - 13*12*11/6
= 4495 - 56 - 120 - 286 = 4033 acceptable combinations.

Alternative answer: The three majors could all be in different sciences,
or 2 in the same science and 1 in a different science. This gives:

8*10*13+8*7*10/2+8*7*13/2+10*9*8/2+10*9*13/2+13*12*8/2+13*12*10/2
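
The two counting approaches for Q8 should give the same number. The
sketch below checks that, and also counts the acceptable triples by
brute force. It is only a verification aid; the dictionary keys and
variable names are our own labels for the three groups of majors.

```python
from itertools import combinations
from math import comb, prod

# Number of majors in each science (from the problem).
sizes = {"physical": 8, "life": 10, "social": 13}
total = sum(sizes.values())  # 31

# Approach 1: all triples minus the triples from a single science.
by_complement = comb(total, 3) - sum(comb(n, 3) for n in sizes.values())

# Approach 2: one from each science, plus two from one science
# and one from a different science.
all_different = prod(sizes.values())
two_plus_one = sum(
    comb(sizes[a], 2) * sizes[b]
    for a in sizes for b in sizes if a != b
)
direct = all_different + two_plus_one

# Brute force: enumerate every triple of majors and keep those
# that are not all from the same science.
majors = [(field, i) for field, n in sizes.items() for i in range(n)]
brute = sum(
    1 for trio in combinations(majors, 3)
    if len({field for field, _ in trio}) > 1
)

print(by_complement, direct, brute)  # 4033 4033 4033
```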


Q9.
1) The graph is trying to show that the number of Americans receiving
federal welfare has greatly increased since 2009.

2) The y-axis does not start at 0 so the actual increase is much less
than what appears on the graph.