See my interview with Edison Research’s lead pollster Joe Lenski for background to this article if desired. Besides Mr. Lenski, for this and the articles that follow in the series, I have spoken with four additional U.S. elections and polling experts by phone or email: Professor of Political Science J. Celeste Lay of Tulane University, Professor of Political Science Amy Fried of the University of Maine, Professor of Political Science Lonna Atkeson of the University of New Mexico, and Antonio Gonzalez of the Southwest Voter Registration Education Project.
What if early exit polling in the United States is actually quite reliable, except when it involves George W. Bush or Hillary Clinton?
Last night in West Virginia the first full exit poll projected a 19.3% Bernie Sanders win. The first full exit poll is released to CNN and others in the National Election Pool by Edison Research just as polls close for a given state. CNN posts the results within five or ten minutes of poll closing. With 96.8% reporting this morning, Sanders is up 15.4% in West Virginia for a 3.9% difference in Hillary Clinton’s favor. Not bad. In and of itself, this would cause no concern. It is definitely within the margin of error.
For our purposes, the 3.9% miss does raise some concern as outlined in our nine criteria in Part 1 for when a state may show moderate potential for substantial election fraud. The problem is that the majority of West Virginia jurisdictions use voting machines or tabulators more than ten years old and the exit polling miss is more than 3.5% in Clinton’s favor. Beginning with South Carolina, where the first full exit poll missed by 11.5%, 22 of 25 primaries have seen Clinton outperform her exit polling expectations. The average is a 5.1% exit poll bias in Clinton’s favor.
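The West Virginia arithmetic above can be written out as a short sketch. The 19.3, 15.4 and 3.5 figures come from the text; the function name and the sign convention (margins are Sanders minus Clinton) are illustrative assumptions, not anything Edison publishes.

```python
def exit_poll_miss(exit_margin, result_margin):
    """Return the miss on the margin, in percentage points.

    Both margins are Sanders minus Clinton, so a positive return value
    means Clinton did better in the count than in the exit poll.
    """
    return round(exit_margin - result_margin, 1)


THRESHOLD = 3.5  # points; the cutoff cited in the text

wv_miss = exit_poll_miss(19.3, 15.4)
print(f"West Virginia miss: {wv_miss} points toward Clinton")
print("Exceeds the 3.5-point threshold:", wv_miss > THRESHOLD)
```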
For the numbers I can find (nearly all of them) on the GOP side for the same states, the overall bias is virtually nil, with most results getting the margin between first and second place in each contest right within a percentage or two. In 17 of the 25 contests on the Dem side, the exit polling miss on the marginal difference was 3.5% or more; this has happened just four times on the GOP side. On the GOP side, the misses of 3.5% or more were distributed across candidates. On the Dem side, 16 of 17 were in Clinton’s favor. For 9 of the 25 contests, the polling miss was 7.0% or greater, all in Clinton’s favor. This happened just once for Republicans (Texas).
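One way to gauge how lopsided these counts are is a simple sign test: if exit poll misses were pure noise, each miss would be as likely to favor one candidate as the other. The sketch below assumes the contests are independent and equally likely to miss in either direction, which is a simplification, and uses only the counts quoted above (22 of 25 contests overall, 16 of 17 among the larger misses).

```python
from math import comb


def sign_test_tail(k, n):
    """Chance that at least k of n misses favor one named candidate,
    if each miss were equally likely to fall on either side."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


p_all = sign_test_tail(22, 25)    # 22 of 25 contests favored Clinton
p_large = sign_test_tail(16, 17)  # 16 of 17 misses of 3.5+ points favored Clinton

print(f"22 or more of 25 on one side: {p_all:.6f}")
print(f"16 or more of 17 on one side: {p_large:.6f}")
```

A small probability here says only that the misses do not behave like coin flips; it does not say which of the explanations discussed in this series is responsible.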
In general, if exit polling were scientifically accurate, it would be right overall within the margin of error about nineteen times out of twenty. As Edison pollster Joe Lenski suggested, I would have liked to use Roper Center exit poll data to test the theory that, absent a race including Clinton or W., the early exit polls are generally accurate at a 95% interval. Unfortunately, these things are top secret. While dozens of universities theoretically have access to them via the Roper Center, you actually cannot just see them by making your way to the nearest university that has access. Most other polling results at the Roper Center are immediately available with subscription, but you have to get special Roper Center approval for designated-in-advance exit poll research including agreement from a top university official that you will stick strictly by the terms of your proposed research agenda.
My Roper Center special application was not approved in time for this article.
The results I can get hold of from hunting around various places suggest this theory is generally true. Yes, exit polling missed by seven points for the Scott Walker recall (at or just over the margin of error … which you can’t know unless you have secret access!), by about ten points for the margin between Cruz and Trump in Texas, and by ten points for McCain versus Huckabee in Virginia in 2008. But these seem to be the expected exception in “nineteen out of twenty” times. They are randomized rather than all favoring one candidate. As argued above, things have been quite accurate on the Republican side this cycle, within the margin of error 24 of 25 times. Where there is deviation within the margin of error by more than 3.5% (just a handful of times), it has been randomized for which candidate it affected. Sometimes Trump. Sometimes Cruz. One time Rubio. Provided none of the margins of error are more than 8.0% (Massachusetts), Edison has missed the margin of error 36% of the time on the Democratic side (9 of 25 contests), all to Clinton’s benefit.
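The "nineteen times out of twenty" standard can be turned into a rough expectation. A minimal sketch, assuming each contest independently has a 5% chance of landing outside the margin of error (an idealization), compares that with the 9 of 25 Democratic contests counted above.

```python
from math import comb


def binom_tail(k, n, p):
    """Probability of k or more 'outside the margin' results in n contests."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_contests = 25
p_outside = 0.05  # one time in twenty

expected_misses = n_contests * p_outside
tail = binom_tail(9, n_contests, p_outside)

print(f"Expected misses in {n_contests} contests: {expected_misses:.2f}")
print(f"Chance of 9 or more misses: {tail:.2e}")
```

The same function applied to the Republican side, `binom_tail(1, 25, 0.05)`, gives the chance of at least one miss in 25 contests, which is the figure to compare with the single Texas miss.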
On the one hand, you have long-time researchers like Richard Charnin who are quite certain that exit polling misses are hard and fast evidence of election fraud. Charnin’s work was the ultimate source of Tim Robbins’ initial contentious tweet after the New York Primary. Robbins removed the tweet but has since written a full-length defense of the idea in it for Huffington Post. Charnin’s work, and Robbins, have now been publicly mocked by Joshua Holland and many others. I do not agree with every move Charnin makes, but he does have multiple relevant degrees in statistics and has accurately recorded the first full exits versus final results in every instance I’ve checked. Charnin calculates margin of error using a well-accepted method, but Mr. Lenski, in my interview with him, disputed that way of figuring the margin of error for exit polls. Since I do not have secret clearance to access what Edison’s calculated margin of error is for various polls, I have decided to take 7%, or the marginal miss in the Scott Walker recall, as the barometer.
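For readers who want to see what a margin-of-error calculation on a lead looks like, here is a minimal sketch of the textbook simple-random-sample formula. It is an assumption that this resembles the "well-accepted method" mentioned above; the text does not spell out Charnin's formula or Edison's. The sample size, the vote shares and the `design_effect` knob (a generic way to inflate variance for clustered samples) are all hypothetical.

```python
from math import sqrt


def margin_moe(p1, p2, n, z=1.96, design_effect=1.0):
    """Approximate 95% margin of error, in points, on the lead p1 - p2.

    Uses the multinomial variance of a difference of two shares:
    Var(p1 - p2) = (p1 + p2 - (p1 - p2)**2) / n.
    design_effect is a hypothetical multiplier on that variance.
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return 100 * z * sqrt(design_effect * variance)


# Hypothetical poll: 55% to 45% among 1,000 respondents.
simple = margin_moe(0.55, 0.45, 1000)
clustered = margin_moe(0.55, 0.45, 1000, design_effect=2.0)

print(f"Simple random sample: +/- {simple:.1f} points on the margin")
print(f"With design effect 2: +/- {clustered:.1f} points on the margin")
```

The point of the second call is only that the margin of error depends heavily on assumptions an outside observer cannot verify, which is why a fixed 7-point barometer is used here instead.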
On the other hand, you have people who do not think election fraud explanations are worth entertaining. The exit polls are just bad, some will say. Others will advance any number of reasons for exit polling misses – any reason, that is, that does not include fraud. Professor Amy Fried responded to my questions along these lines with the following:
Exit polls simply are not intended for this purpose in the U.S. The questionnaires are too long and there is A LOT of selection bias in who will stop to answer all of the questions. This and the fact that the early and absentee votes have favored Clinton (due to those voters’ age differences and the Clinton campaign’s greater competence in getting voters to vote before election day) likely account for much of the difference between exit poll results and vote totals. Plus on election day different people vote early versus late and so exit polls are adjusted over time. They are designed to give demographic detail plus more information about why people voted as they did, what issues are important and such.
Professor J. Celeste Lay agreed, pointing to youth voting and early or absentee voting as key factors that could be resulting in exit polling errors. She argued, “Most of this discussion is driven by Sanders supporters who are disappointed he is not winning and want to claim he has more support in the Democratic Party than he actually does.” She added, “Until proven otherwise, I’ll go with the numerous studies demonstrating the infinitesimal amount of voter fraud in U.S. elections.”
While Professor Lonna Atkeson was also generally skeptical of the possibility of fraud, the big misses by exit polling do give her some pause: “Exit polls are one tool we use to discover election fraud.”
If Professors Fried and Lay are right, we are left with “The Cult of Sore Losers” explanation. I will be drawing on Professor Atkeson’s work on voter perceptions of whether their vote is properly counted and perceived systemic fraud in future articles for this series. For now, my aim is to take an exhaustive look at both fraudulent and non-fraudulent possibilities rather than dismissing one or the other of them out of hand. Which possibility or possibilities best explain the known facts, not only about exit polling in general, but about why exits get some races right for Hillary Clinton versus Bernie Sanders while getting many other races very wrong?
In short order, the non-fraudulent theories involve (1) the Bradley Effect (for why Clinton did so well versus exit polling in 2008); (2) voter enthusiasm for a particular candidate influencing who talks to exit polling workers; (3) general exit polling sampling mistakes; (4) missing out on early voters; (5) suggestions that early exit polls are unweighted or (6) include a much smaller sample size; and especially (7) oversampling of young voters and/or (8) closed primaries leading voters who have cast provisional or affidavit ballots to skew the exit polls.
Do any of these theories on their own or in combination with others help explain the misses? This is where the Alabama Test I first discussed in part one comes in. The exit poll released just as state polls were closing in Alabama on Super Tuesday turned out to be wrong by fourteen points, double the number that caused Harry Enten to write about the Wisconsin Scott Walker recall miss.
Some of these theories can be dismissed quite quickly:
The Bradley Effect (1) suggested that voters were reluctant to tell exit pollsters they had voted against the black guy in 2008. The black vote has swung precipitously to Clinton in 2016, as is well known, and Bernie doesn’t fit the model for other obvious reasons.
Voter enthusiasm for a candidate (2) would suggest Bernie Sanders supporters are more likely to talk to exit polling workers, but would suggest the same for Trump supporters. When combined with the youth theory, this theory does gain some traction, as Lenski’s answers suggest. We will take that theory more seriously under (7).
General exit polling mistakes (3), especially with respect to whether pollsters have the right voter turnout model in mind when planning where to poll, may well be a factor contest to contest, but without a specific suggestion of what they are, these mistakes should lead to misses sometimes in Clinton’s favor, sometimes in Sanders’ favor.
We will retain missing out on early voters (4) for now, but note that exit polling does attempt to sample early voters. Since this is the most frequently recurring suggestion apart from youth enthusiasm, it will be addressed further with number (7).
The idea that the first full wave of polling is unweighted (5), as my interview with Joe Lenski shows, is just a flat-out myth. The polls are continually weighted throughout the day.
Additionally, Lenski told me that “[t]he exit poll results that are released around poll closing include nearly all of the voter interviews that are conducted during election day.” So much for (6) small sample size.
Lenski was most open to the suggestions of voter enthusiasm (2) or oversampling of young voters (7) skewing results somewhat. Older and less-educated voters tend to refuse more often per Edison observations over many cycles. But Edison accounts for this by taking rigorous demographic notes on all voters who refuse to participate and weighting its sample accordingly throughout the day:
“Our interviewers record the gender, approximate age and race of voters who decline to participate in the exit poll and the survey results are adjusted for the response rate by demographic groups so that the weighted results represent all voters. Typically younger voters are more likely to agree to participate in an exit poll than older voters so the percentage of older voters is typically adjusted upward to account for this non-response.”
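The adjustment Lenski describes can be illustrated with a minimal sketch. All tallies and vote shares below are hypothetical, only age is used (Edison also records gender and race), and the weighting rule shown, scaling each group by interviews plus refusals over interviews, is a generic non-response adjustment rather than Edison's actual procedure.

```python
# Hypothetical tallies at one set of precincts, split by approximate age.
groups = {
    "18-29": {"completed": 300, "refused": 100, "sanders_share": 0.70},
    "30+": {"completed": 400, "refused": 400, "sanders_share": 0.40},
}


def sanders_share(groups, adjust_for_refusals):
    """Estimated Sanders share, with or without the refusal adjustment."""
    total = 0.0
    sanders = 0.0
    for g in groups.values():
        if adjust_for_refusals:
            # Each completed interview also stands in for the refusals
            # observed in the same demographic group.
            weight = (g["completed"] + g["refused"]) / g["completed"]
        else:
            weight = 1.0
        total += weight * g["completed"]
        sanders += weight * g["completed"] * g["sanders_share"]
    return sanders / total


raw = sanders_share(groups, adjust_for_refusals=False)
adjusted = sanders_share(groups, adjust_for_refusals=True)

print(f"Unadjusted Sanders share: {raw:.1%}")
print(f"Adjusted for refusals:    {adjusted:.1%}")
```

In this made-up example older voters refuse more often, so the adjustment raises their weight and pulls the estimate away from the candidate favored by the young, which is the direction of correction described in the quotation.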
This helps to address numbers (2), (3), and (7). Enthusiasm, youth, and general sampling errors are known and addressed by exit polling methodology. This is why, even though “Trump Overwhelming Leads Rivals in Support from Less Educated Americans,” exit polling still gets it right. Edison takes account of the fact that older and less-educated voters tend to refuse to be interviewed and adjusts accordingly. Nearly every exit poll from poll closing time has gotten Trump’s margin versus his nearest competitor right within about 1% one way or the other.
Still Nate Cohn of the New York Times’ Upshot has insisted the polls are missing because of youth and early voting misses. Since Cohn, Lenski, Professors Fried and Lay, and Antonio Gonzalez all suggest the youth of Sanders’ supporters as a strong potential cause of exit polling misses, let’s take a close look at that possibility.
This is where the Alabama Test comes in. Alabama had a low turnout for 18-29-year-olds for this cycle, just 14% according to exit polling, and no early voting. No matter how radically you adjust the age turnout, you still cannot get anything like a fourteen point swing out of the mix. (Lenski did not respond to my follow-up question on the initial numbers for 18-29-year-olds for Alabama, but that number in the half-dozen initial versus final exit polls I have numbers for has never gone up or down by more than two or three percent. And Cohn’s argument is that this bias is reflected even in the adjustments.) Oklahoma at 12% is where Edison pegged the youth vote lowest, and it is, in fact, the only place that Sanders outperformed exit poll numbers at poll closing by more than 3.5%. Age demographics, as in Alabama, skewed considerably older in Georgia and South Carolina per exit polls where Edison missed by twelve points in each instance. In North Carolina, meanwhile, there was tons of early voting and an 18% 18-29-year-old turnout. On the theories enumerated above, North Carolina should have been a really bad night for Edison’s initial full exit poll.
The Edison margin for the gap between Clinton and Sanders was right within 1.7 percentage points at poll closing in North Carolina.
Just as voter enthusiasm (2) cannot explain why Edison gets it right for Trump but not Sanders, the age and early voting argument falls apart when applied to specific cases. Alabama should have been just fine according to Nate Cohn’s argument, as should have Georgia and South Carolina; North Carolina should have been way off. The opposite was true.
Likewise outside the South: Ohio was a bad miss for the exit polls but the youth turnout, perhaps on account of Spring Break, was projected low. In other words, it is a nice theory Nate Cohn floats, one Joe Lenski at Edison is open to even, but it doesn’t have much to show by way of explanatory power when you put it to use. Even if the exit polls did miss by a few percentage points in terms of young voters, there is nothing like enough room to explain ten-point-plus misses.
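The arithmetic behind "nothing like enough room" can be sketched as follows. If the overall margin is a blend of the margin among young voters and the margin among everyone else, then moving a slice of the electorate from one group to the other shifts the overall margin by the size of that slice times the gap between the two groups. The 3-point share change and Alabama's 14% youth share come from the figures above; the 80-point gap between age groups is a hypothetical value chosen to be generous.

```python
def margin_shift(share_change, young_margin, older_margin):
    """Change in the overall margin, in points, when `share_change` of the
    electorate is reclassified from the older group to the young group.

    Margins are in points and range from -100 to 100.
    """
    return share_change * (young_margin - older_margin)


# Extreme case: every young voter for Sanders, every older voter for Clinton,
# and the youth share off by 3 points of the electorate.
worst_case = margin_shift(0.03, 100, -100)

# Youth share error needed for a 14-point miss if the gap between
# age groups were a (hypothetical) 80 points.
needed_share_change = 14 / 80
alabama_youth_share = 0.14

print(f"Largest possible shift from a 3-point share error: {worst_case:.1f} points")
print(f"Share error needed for a 14-point miss: {needed_share_change:.1%}")
print("Exceeds Alabama's entire youth share:", needed_share_change > alabama_youth_share)
```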
What about closed primaries and provisional or affidavit ballots (8)? Are the misses worse or more likely in closed versus open primaries? (The affidavit or provisional ballot argument.) No. New York (closed) was way off, while Maryland (closed) is the most accurate result to date, with the first full exit poll missing by just 0.6%. Several double-digit misses (Georgia, South Carolina, Alabama, Mississippi) were in open primary states, while Massachusetts and Ohio are partially open primaries where the first full exit polls missed by eight and ten points respectively. I will, however, look at how this issue particularly impacted New York tomorrow. It could partly explain New York’s huge exit polling miss.
In short, none of these theories show strong explanatory power or correlation with the states where exit polling has missed against where it has gotten things right. The simplest answer is that exit polls not involving George W. Bush or Hillary Clinton tend to be quite accurate; ones that involve them are likely to be terrible.
But just because the available non-fraudulent explanations don’t seem to work, we are not automatically left to conclude that fraudulent explanations are the only way to go. Perhaps someone will come up with a new, non-fraudulent explanation that works. In the meantime, I will put fraudulent explanations to the same scrutiny in the next few articles in this series.