4.7 Sampling & Confidence

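Suppose that, according to your calculations, the following is true about your polling:

\(\Pr[\,|P - p| \leq 0.04\,] \geq 0.95\)

Here \(P\) is the random variable giving the fraction of your \(n\) respondents who say they will vote for the President, and \(p\) is the true fraction among all voters.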

You run the poll, count how many respondents said they will vote for the President, divide by \(n\), and find 0.53. You call the President, and... what do you say?

(1) Mr. President, \(p = 0.53\).

(2) Mr. President, with probability at least 95%, \(p\) is within 0.04 of 0.53.

(3) Mr. President, either \(p\) is within 0.04 of 0.53 or something very strange (5-in-100) has happened.

(4) Mr. President, we can be 95% confident that \(p\) is within 0.04 of 0.53.

You cannot say (1): the only way to know the exact value of the constant \(p\) is to ask all 250,000,000 voters.

You cannot say (2) either: \(p\) is a constant, which either is or is not within 0.04 of 0.53. If it is, then the probability that it is equals 1, which is at least 0.95, and so (2) is true. If it is not, then that probability is 0, which is smaller than 0.95, and so (2) is false. Since you do not know which case holds, you cannot assert (2).

You can say (3). To see why, start with the statement

either \(|0.53 - p| \leq 0.04\) or \(|0.53 - p| > 0.04\),

which is obviously true. Now read it as follows: either \(p\) is within 0.04 of 0.53, or it is not, in which case my random variable \(P\) took a value from a set that is hit at most 5 times in 100. So, clearly, either \(p\) is within 0.04 of 0.53 or something strange has happened.

You can say (4): by phrasing (2) in terms of "confidence" rather than probability, you correctly indicate that you are talking about the probable behavior of your methodology for estimating \(p\), not about the actual value of \(p\).
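The distinction between the behavior of the method and the value of \(p\) can be illustrated with a small simulation. The sketch below fixes a hypothetical true fraction and a hypothetical sample size (neither is given in the text; both are assumptions for illustration), repeats the poll many times, and reports how often the estimate lands within 0.04 of \(p\). The names `run_poll` and `coverage` are made up for this example.

```python
import random

def run_poll(p, n, rng):
    """Return the sample fraction from one poll of n independent voters,
    each of whom says "yes" with probability p."""
    yes = sum(1 for _ in range(n) if rng.random() < p)
    return yes / n

def coverage(p, n, margin, trials, seed=0):
    """Fraction of repeated polls whose estimate lands within margin of p."""
    rng = random.Random(seed)
    tol = 1e-12  # guards against floating-point error at the boundary
    hits = sum(
        1 for _ in range(trials)
        if abs(run_poll(p, n, rng) - p) <= margin + tol
    )
    return hits / trials

TRUE_P = 0.5      # assumption: in a real poll this constant is unknown
N_VOTERS = 1000   # assumption: sample size chosen for illustration
MARGIN = 0.04

rate = coverage(TRUE_P, N_VOTERS, MARGIN, trials=2000)
print(f"fraction of polls within {MARGIN} of p: {rate:.3f}")
```

Under these assumptions the printed fraction should be above 0.95. Note what is random here: \(p\) never changes from trial to trial, only the poll result does. The "95%" describes how often the procedure succeeds across repetitions, which is what statement (4) expresses, and not a probability attached to \(p\) itself.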

Random Sampling