Polls are of some use in predicting election results but need to be reviewed carefully to be understood.
The article is based on my limited understanding as a layman, so please take it for whatever it may be worth.
Public opinion polling is based on a sample of the relevant population within a given survey area, with a margin of error that indicates how far a random sample of that size can be expected to stray from the population as a whole. Polls attempting to predict how many will vote for one presidential candidate versus another use party affiliation as one means of making the sample resemble the electorate. To the extent that polling samples fail to reflect accurately the party affiliations of those who actually vote, the results are flawed.
Whether a poll respondent is likely to vote for the candidate he prefers (usually his party’s candidate) is, in part, a function of the respondent’s enthusiasm for him. Enthusiasm against the opposing candidate can also produce votes against him and perhaps for a marginally preferred candidate. Charlie Martin at PJ Media Tatler, who knows a lot more about polling and statistical methodology than I ever will, suggested on November 2nd,
One of the things that turned the election for Obama in 2008 is that while a lot of people were excited about Obama — from the people who thought he really was the Messiah to middle of the roaders who just thought it’d be cool if we elected a black man — a lot fewer were excited about McCain. Lots of hard-core conservatives thought he was a RINO; a fair number of people believed the hatchet job on Sarah Palin; a bunch of otherwise sensible people got gamed on the financial crisis. The result was that from about the first of October on, the crowds were surging for Obama; early voting was way in Democrats’ favor.
This year, the actual data is much the other way. (There were some polls, of course, that early on claimed Obama was getting amazing proportions of the early vote in, say, Ohio, but they turned out to be what mathematicians call “nonsensical,” with sample sizes of, like, 50.)
I’m going to propose an idea: in the absence of other evidence, I think we should assume that people who vote on Election Day will be much like the people who vote before Election Day. This election is different from previous ones in one significant way — we’ve made it ridiculously easy to vote early or absentee (unless, of course, you’re in the military overseas, but that actually is a fairly small population). You don’t need to be extra enthusiastic to vote early.
If I’m right, then Republican turnout will turn out (heh) to be quite a bit more enthusiastic, and therefore more likely to vote, than Democratic.
If the population is exactly evenly divided, and 53 Republicans show up to vote for every 45 Democrats, what is the actual result?
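The closing question in the passage quoted above can be worked through directly. This is a minimal sketch, assuming (as the quotation does) that every voter who shows up backs his own party's candidate and that there are no third-party votes; the function name is mine, not Mr. Martin's.

```python
def two_party_shares(rep_turnout, dem_turnout):
    """Return (Republican share, Democratic share) of the votes actually cast."""
    total = rep_turnout + dem_turnout
    return rep_turnout / total, dem_turnout / total


# 53 Republicans show up for every 45 Democrats, as in the quotation.
rep, dem = two_party_shares(53, 45)
print(f"Republican {rep:.1%}, Democratic {dem:.1%}, margin {rep - dem:.1%}")
```

On those assumptions an evenly divided population produces roughly a 54 to 46 result, a margin of about eight points.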
Land line or Cell phone
Increasingly, people rely on cell phones to the exclusion of land line phones. According to this article at The Democratic Strategist,
In the last half of 2011, 32 percent of adults were cell-phone only according the Center for Disease Control that is the official source on these issues; 16 percent were cell phone mostly. But the proportion cell-phone only has jumped about 2.5 points every six months since 2008 – and is probably near 37 percent now. And pay attention to these numbers for the 2011 adult population:
More than 40 percent of Hispanic adults are cell phone only (43 percent).
A disproportionate 37 percent of African Americans are cell only.
Not surprisingly, almost half of those 18 to 24 years are cell only (49 percent), but an astonishing 60 percent of those 25 to 29 years old only use cell phones.
But it does not stop there: of those 30 to 34 years, 51 percent are cell only.
You have to ask, what America are the current polls sampling if they are overwhelmingly dependent on conventional samples or automated calling with no cell phones? Democracy Corps reached 30 percent by cell; 35 percent were cell only or cell mostly, but only 15 percent are cell only, well short of where we should be.
According to Gallup,
As we began this election tracking program on Oct.1, our methodologists also recommended modifying and updating several procedures. We increased the proportion of cell phones in our tracking to 50%, meaning that we now complete interviews with 50% cell phones and 50% landlines each night. This marks a shift from our Gallup Daily tracking, which has previously been 40% cell phones. This means that our weights to various phone targets in the sample can be smaller, given that the actual percentage of cell phones and cell-phone-only respondents in the sample is higher. We have instituted some slight changes in our weighting procedures, including a weight for the density of the population area in which the respondent lives. Although all Gallup surveys are weighted consistently to census targets on demographic parameters, we believe that these improvements provide a more consistent match with weight targets. The complete statement of survey methods is included at the end of each article we publish at Gallup.com. (Emphasis added)
If Gallup completes “interviews with 50% cell phones and 50% landlines each night” as claimed, cell phone users may well be overrepresented in its raw results. The Democratic Strategist says that the cell-only share among Hispanics and Blacks is higher than for others but is still lower than 50%, with some upward variation based on age. If so, Gallup seems likely to complete interviews with disproportionate numbers of Hispanics and Blacks, generally considered to be likely Obama voters. Gallup’s statement says that it weights its samples to phone targets and census demographic targets, which is presumably meant to compensate for this problem; whether it does so adequately I cannot tell.
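Gallup's statement refers to weighting the sample to phone targets. As far as I understand it, the idea can be sketched as follows. The 37 percent cell-only target is the Democratic Strategist estimate quoted earlier; the 50 percent sample share (which treats every cell phone interview as a cell-only respondent and so overstates the problem) and the support figures are made up for illustration, and the function name is hypothetical. This is not Gallup's actual procedure.

```python
def reweight_to_targets(sample_share, target_share, support):
    """Weight each group so it counts at its target share, not its sample share.

    Returns (weights, unweighted support, weighted support) for one candidate.
    """
    weights = {g: target_share[g] / sample_share[g] for g in sample_share}
    unweighted = sum(sample_share[g] * support[g] for g in sample_share)
    weighted = sum(sample_share[g] * weights[g] * support[g] for g in sample_share)
    return weights, unweighted, weighted


sample_share = {"cell_only": 0.50, "other": 0.50}  # hypothetical completed interviews
target_share = {"cell_only": 0.37, "other": 0.63}  # 37% estimate quoted earlier
support = {"cell_only": 0.60, "other": 0.45}       # invented support for candidate A

weights, unweighted, weighted = reweight_to_targets(sample_share, target_share, support)
print(weights)
print(f"unweighted {unweighted:.1%}, weighted {weighted:.1%}")
```

With these invented numbers the overrepresented group is weighted down and candidate A's share falls by about two points. The correction is only as good as the targets and as the assumption that the people reached in each group resemble the people not reached.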
There are various other problems involved in polling respondents who rely on cell phones. Although the recipient of a land line call does not have to pay for answering it, the recipient of a cell phone call often does. Cell phone users therefore seem less likely than land line users to answer calls from unfamiliar numbers, and those who do answer seem less willing to spend their money to complete interviews. To meet Gallup’s 50% – 50% balance in completed interviews, it therefore seems likely that more cell phone than land line numbers have to be called. To what extent does this affect the results? Overall,
the response rate for telephone surveys (percent of households sampled that yielded an interview) [was found] this year to be just 9%, down from 36% in 1997 and 25% in 2000. (Emphasis added.)
However, the polling companies contend that their results are just fine anyway. There may be different views on that after the election results are known and compared with survey results.
Margin of error
Margins of error are calculated mainly from the sample size, on the assumption that the sample was drawn at random from the population surveyed; the size of that population matters little unless the sample is a large fraction of it. As the sample grows, it becomes more like a census (of everybody in the survey area) and the margin of error decreases. Although doubtless more accurate, a true census of the entire relevant population would be prohibitively expensive and time consuming; even a true census taken in advance of voting could not factor in such unknowns as voting fraud, the weather and the like. The compositional problems mentioned above (party affiliation, enthusiasm and phone type) do not appear to be reflected in the stated margin of error (nor would they arise in a true census), and there seems to be no reliable method of accounting for them.
Polls frequently have margins of error of between three and four points. Assuming a four point margin of error, a polling result of 50% for candidate A and 50% for candidate B means that the real split could be 46% for candidate A and 54% for candidate B, but is more likely some less substantial departure from 50% – 50%, perhaps 48% for candidate A and 52% for candidate B. These possibilities are often ignored, leading to incorrect perceptions of poll results.
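For reference, the textbook margin of error for a simple random sample depends on the sample size and the reported share, and is conventionally stated at a 95% confidence level. This is a minimal sketch, assuming simple random sampling with no weighting or design effects (real polls are more complicated); the sample sizes are illustrative and not taken from any particular poll.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Margin of error for a reported share from a simple random sample.

    z=1.96 corresponds to the conventional 95% confidence level.
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)


for n in (600, 1000, 2400):
    points = 100 * margin_of_error(0.5, n)
    print(f"sample of {n}: about {points:.1f} points")
```

Samples of 600, 1,000 and 2,400 give margins of roughly four, three and two points. Halving the margin of error takes about four times as many respondents.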
Often, the results from multiple polls are averaged in attempts to produce more credible results than the individual polls alone seem to provide. The current Real Clear Politics averages are provided here, showing a significant recent swing toward Governor Romney but a 0.4 point edge for President Obama. That average does not appear to take into account the quite different margins of error of the individual polls (from 2.0 to 4.4), nor do I know how it could without substantial difficulty. I got the same result (0.3636, which rounds to 0.4) from the raw averages with no compensation for margins of error.
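One common way to let a poll's margin of error affect an average is to weight each poll by the inverse of its variance, so that tighter polls count for more. This is a minimal sketch, assuming the polls are independent and unbiased; the poll figures below are invented for illustration and are not the Real Clear Politics numbers, and I do not know whether RCP does anything of the kind.

```python
def simple_average(polls):
    """Plain mean of the leads; every poll counts equally."""
    return sum(lead for lead, _ in polls) / len(polls)


def inverse_variance_average(polls):
    """Mean of the leads with each poll weighted by 1 / (margin of error squared)."""
    weights = [1.0 / (moe ** 2) for _, moe in polls]
    weighted_sum = sum(w * lead for w, (lead, _) in zip(weights, polls))
    return weighted_sum / sum(weights)


# (lead for candidate A in points, margin of error in points) -- invented values
example_polls = [(1.0, 2.0), (-1.0, 4.0)]

print("simple average:  ", simple_average(example_polls))
print("weighted average:", round(inverse_variance_average(example_polls), 2))
```

In this invented case the plain average is a tie, while the weighted average leans toward the poll with the smaller margin of error. When the margins of error are similar, as they often are, the two averages differ very little.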
To the extent that individual poll results do not reflect the realities that will eventually surface in the election, averaging can increase rather than decrease the errors in the averaged values. If ten people are asked to look at Governor Christie and guess his weight, the chances are slimmer than he is that any guess will be exactly right. The result from averaging the ten guesses may be more nearly accurate than any one guess, or it may be less. The more garbage goes into the calculation, the greater the likelihood that the result will also be garbage. Here (I think) is another way of expressing this:
[I]t’s fairly straight forward actually. Each of the poll results is really an approximately normally distributed random deviate with mean at the percentage of the vote given. Assume wlg that the margins of error are all the same; it’ll turn out that in terms of order statistics it’ll all end up in the constant anyway. The mean of the sum of normal random deviates is the sum of the means, the variance is sum of the variances. By the assumption, that means the variance for the sum is n*σ^2 (stupid WordPress won’t let me have a superscript.) Thus the standard deviation is √n * σ and thus we see that the standard deviation of the sum of these independent normal random deviates varies with the square root of the number of random deviates summed. Margin of error is basically just the 2σ band above and below the mean. In other words, the margin of error grows approximately as O(√n).
So, when RCP averages their ten polls with a margin of error of roughly 3 percent, the margin of error of the average is more like 9.4 percent. (Emphasis added.)
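On that reckoning, Governor Romney could win by 9.0% or President Obama could win by 9.8%, or the result could fall anywhere in between. That’s a possible spread of about nineteen points. Strictly, though, the √n growth described in the quotation applies to the sum of the poll results; an average divides that sum by n, so under the same assumptions (independent polls with equal margins of error) the margin of error of a ten-poll average would be roughly 3/√10, or about 0.95 points. Averaging does nothing, however, about errors the polls have in common, such as a shared wrong assumption about turnout. It’s an exercise at least as simple as finding a square root using Roman numerals and seems to be roughly as worthwhile an effort.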
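The quoted argument is about the sum of the poll results, while an average is that sum divided by the number of polls. A small simulation can show how the two behave. It is a sketch under the quotation's own assumptions (independent, normally distributed errors with equal margins of error, and the margin of error taken as a 2σ band); the function name, trial count and seed are mine.

```python
import math
import random
import statistics


def sum_and_mean_spread(n_polls, sigma, trials=20000, seed=0):
    """Simulate independent poll errors; return (std dev of sum, std dev of mean)."""
    rng = random.Random(seed)
    sums, means = [], []
    for _ in range(trials):
        errors = [rng.gauss(0.0, sigma) for _ in range(n_polls)]
        total = sum(errors)
        sums.append(total)
        means.append(total / n_polls)
    return statistics.pstdev(sums), statistics.pstdev(means)


sigma = 1.5  # a 3 point margin of error treated as a 2-sigma band, as in the quote
sd_sum, sd_mean = sum_and_mean_spread(10, sigma)

print("2-sigma band of the sum:    ", round(2 * sd_sum, 2))
print("2-sigma band of the average:", round(2 * sd_mean, 2))
print("theory, sum and average:    ",
      round(2 * sigma * math.sqrt(10), 2), round(2 * sigma / math.sqrt(10), 2))
```

Under those assumptions the band for the sum comes out near 9.5 points and the band for the average near 0.95 points. The simulation says nothing about errors shared by all the polls, such as a common wrong guess about turnout, which averaging does not reduce.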
Most of us looking at polls probably evaluate them based more on how closely they reflect the results we want than on the “technicalities” noted above. Due to the partially subjective components of polls and often overlooked but important considerations of margin of error and averaging difficulties, there is ample room for subjective analysis and wishful thinking.
Charlie Martin, writing at PJ Media Tatler, today predicted that Governor Romney would win sufficient popular votes for a 341 to 197 electoral vote victory over President Obama. Note the limited extent to which his projection refers to polling results.
All, or nearly all, states in which the poll on Romney/Obama is even within the margin of error will go for Romney. My reasoning is this:
- All these samples seem to me to still imply a greater D turnout than R. Nearly all actual reported early voting has been heavily Republican, plus we have the Washington Post polling on defectors. So it seems very probable that the actual turnout will be favoring Republicans, possibly heavily.
- The “Sandy” debacle isn’t making Obama’s administration look good. This may not mean more votes for Romney (although it could), but it may well depress Democrat voters on Tuesday.
- Romney/Ryan are drawing tens of thousands, Obama/Biden thousands or hundreds. Again, this indicates greater enthusiasm and suggests greater Republican turnout
On November 2nd, Michael Barone predicted
Romney 315, Obama 223. That sounds high for Romney. But he could drop Pennsylvania and Wisconsin and still win the election.
Today, George Will predicted
Cokie Roberts, Ronald Brownstein, Matthew Dowd, and Donna Brazile predicted electoral college wins of varying magnitude for President Obama.
Going on little more than my own gut feeling, relative apparent enthusiasm and the apparent closing of the polling gap, I think Governor Romney has an excellent chance of winning, perhaps not by a landslide but by a more than sufficient margin.
First published at Dan Miller’s Blog.