It's a quadrennial tradition for partisans and poll-watchers to complain about the number of Democrats, Republicans and independents who are included in each survey.
Recently - perhaps because Mitt Romney still narrowly trails President Obama in most state and national surveys - we have seen a bit more of this from conservatives. They will sometimes allege that these polls are "oversampling" Democrats, including too many of them in their surveys, and perhaps biasing their results toward Mr. Obama because of this.
Liberals are capable of making these charges too, however. In 2004, when most polls showed a narrow lead for George W. Bush, some said that polls were "oversampling" Republicans. Did their charges pan out? Actually, they look silly in retrospect. Mr. Bush carried the election by 2.4 percentage points, just slightly larger than the 1.5 percentage point lead he had in the polling average heading into Election Day.
There are, nevertheless, elements of truth in these critiques. It is certainly the case that some polling firms consistently show more favorable results for Democrats or Republicans than the consensus of polling firms. We call these "house effects," and our forecast model adjusts for them; if a polling firm is consistently 2 points more Democratic-leaning than the consensus, we strip most of that right back out.
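The house-effect adjustment described above can be sketched in a few lines. This is a simplified illustration, not the actual forecast model's code: the function name, the damping factor, and the example numbers are all hypothetical, chosen only to show the idea of stripping most (but not all) of a firm's persistent lean out of its reported margin.

```python
# Hypothetical sketch of a "house effect" adjustment.
# house_effect: the firm's average lean relative to the consensus of other
# pollsters, in percentage points (positive = Democratic-leaning).
def adjust_for_house_effect(margin, house_effect, damping=0.75):
    """Remove most of a firm's persistent lean from its reported margin.

    margin: Dem-minus-Rep margin (points) as reported by the firm.
    damping: fraction of the house effect stripped out; less than 1.0
             reflects uncertainty in the house-effect estimate itself.
    """
    return margin - damping * house_effect

# A firm that runs 2 points more Democratic-leaning than the consensus
# reports Obama +4; the adjustment pulls most of that lean back out.
adjusted = adjust_for_house_effect(4.0, 2.0)
print(adjusted)  # 2.5 with the assumed 0.75 damping factor
```

The damping factor here is an assumption: stripping out 100 percent of an estimated house effect would treat a noisy estimate as if it were known exactly.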
It is also the case that there have been some years when even the pollster consensus was biased toward or against one party. In 2000, Mr. Bush led in most surveys by about 3 percentage points on the eve of the election, but Al Gore won the popular vote instead. In 1980, Jimmy Carter trailed Ronald Reagan only by the slimmest margin in the last round of surveys - but Mr. Reagan romped to a resounding 9.7-point victory.
It is easier, of course, to identify these cases after the fact. Beforehand, the best you can usually do is to acknowledge that there is some possibility of their occurrence. Even in the waning days of an election, when we have surveys from dozens of polling firms that collected tens of thousands of interviews between them, their biases will not necessarily cancel out, and the error in the surveys may considerably exceed that from sampling error alone.
Still, I think the charges of "oversampling" mostly miss the point. Let me make 13 relatively brief but interrelated points that explain my philosophy on this issue, and where I see the theoretical and empirical evidence as guiding the debate.
1. Be careful if you see the term "oversampled." It is probably being used incorrectly. In blogs, the term "oversampled" has come to be a shorthand for a poll that includes "too many" Democrats or Republicans. But that's not quite the way that pollsters use the term.
Instead, an "oversample" is a deliberate effort to include more of a certain population in a survey to permit more robust analysis of a particular demographic subgroup.
For example, say that a polling firm wants to study the views of Latino voters in more detail at the same time that it is conducting a national survey. Its initial survey of 1,000 adults may include about 150 Hispanics - about their share of the United States population - which is not really enough to analyze with much accuracy because of the high margin of error associated with a 150-person subsample. So the polling firm would take an "oversample" until it got a total of 450 Hispanics on the phone, creating a respectable sample size. Then it might be able to report, say, how Hispanic voters' preferences would be affected by the presence of Senator Marco Rubio on the Republican ticket.
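The margin-of-error claim above can be made concrete with the standard back-of-the-envelope formula for a proportion near 50 percent, which assumes simple random sampling (real surveys with design effects would have somewhat larger errors):

```python
import math

# 95% margin of error for an estimated proportion, assuming simple
# random sampling: z * sqrt(p * (1 - p) / n).
def margin_of_error(n, p=0.5, z=1.96):
    return z * math.sqrt(p * (1 - p) / n)

# A 150-person subsample vs. a 450-person oversample, in percentage points:
print(round(100 * margin_of_error(150), 1))  # 8.0 points
print(round(100 * margin_of_error(450), 1))  # 4.6 points
```

Tripling the subsample cuts the margin of error by a factor of sqrt(3), from about plus-or-minus 8 points to about plus-or-minus 4.6.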
Knowing that it has interviewed too many Hispanics, the polling firm would then down-weight the Hispanic voters when it rolled them back into its national survey and reported the results from all United States adults. In this example, it would reduce the weight associated with each Hispanic voter by two-thirds, since it interviewed 450 when there should be 150 based on their share of the U.S. population. This technique permits the pollster something of the best of both worlds: it can conduct a more robust analysis of hard-to-poll demographic subgroups without skewing the overall sample.
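The down-weighting arithmetic is simple enough to show directly. This is a minimal sketch using the counts from the example above (450 interviewed where the population share implies 150); the function name is hypothetical:

```python
# Weight applied to each oversampled respondent so that the group counts
# for its true population share when rolled back into the full sample.
def oversample_weight(n_interviewed, n_expected):
    return n_expected / n_interviewed

w = oversample_weight(450, 150)
print(round(w, 3))  # 0.333: each response's weight is cut by two-thirds
```

With this weight applied, the 450 Hispanic interviews contribute the equivalent of 150 respondents to the national topline, matching their share of the adult population.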
Sometimes, these oversamples can also occur based on party identification. For instance, if a pollster wants to poll both a primary election and the general election in a given state, it may take an "oversample" of whichever party is holding a competitive primary at the time. But then it discounts those interviews when it reports the preferences of general election voters.
2. Be even more careful when you see terms like "skewed" or "biased."
3. Party identification is not a hard-and-fast characteristic, as other demographic characteristics are.
4. Partisan identification measures are affected by sampling error.
5. Partisan identification is not the same thing as partisan registration.
6. There are many different ways of measuring and asking about party identification.
7. Polls of registered voters, or all adults, typically show a more favorable party identification spread for Democrats.
8. If you are going to scrutinize polls based on their partisan identification, do so equally.
9. Weighting by party identification puts the cart before the horse.
10. There is no absolute standard to measure party identification - only other polls.
11. Taking a poll average - especially with adjustments for "house effects" - is usually a more elegant solution to the problem.
12. Pay relatively more attention to party identification when you have fewer polls.
13. There has not been any long-term bias in the polling average toward Democratic or Republican candidates.