Update: The Facebook Primary, adjusted by USPD, nailed not only Michigan as an upset for Sanders, but also correctly called most of the counties as well as the other contests March 5-8. Kansas did go for Sanders at a much greater rate than I forecast with this model.
Weighted in a particular way, FiveThirtyEight.com’s “Facebook Primary” has been predictive of the outcome in virtually all fifteen Democratic Primary or Caucus contests thus far in 2016. A critical test of the schema awaits us this coming Tuesday, however. Ten polls in the last six weeks combine to predict an eighteen point loss for Senator Bernie Sanders in Michigan on March 8, but the Facebook Primary, weighted by U.S. Share Percentage Difference (USPD), suggests something on the order of a single digit loss for former Secretary of State Hillary Clinton, a stark difference of 20-25 percentage points.
What is USPD? How likely is it to be predictive of an outcome in the Democratic primaries? And why or why not? This is a rough go at a new way to forecast outcomes at the state level, especially where polling is non-existent or very poor. Perhaps over the next week to ten days it will perform far more poorly than it would have in the first fifteen contests and will thus appear to be so much bunk or noise.
First, a little background: FiveThirtyEight has become famous for fairly accurately predicting national outcomes in U.S. elections by using large data sets, particularly poll averages weighted by proximity to the date of the prediction and pollster strength based on analysis of previous predictions against actual outcomes. This year, it is using, and partially making public, data it obtains from Facebook rooted in the number of “likes” a particular candidate has vis-a-vis other candidates (both Republican and Democrat) and also the candidate’s average share nationally, by state, by county, and, in the case of several very large cities, at approximately the zip code level.
A disclaimer at the bottom of FiveThirtyEight’s Facebook Primary begins:
If Facebook likes were votes, Bernie Sanders would be on pace to beat Hillary Clinton nationwide by a nearly 3-to-1 margin and Donald Trump to garner more support than Ted Cruz and Marco Rubio combined. Anything seems possible this year, but, still, be careful how you interpret these numbers: Facebook likes are not votes.
But FiveThirtyEight is using the data in some tentative ways to fill in what would otherwise be substantial gaps. Ahead of Super Tuesday, for instance, there was only one poll in Minnesota for the Democratic side. It was from mid-January and predicted a 34 point loss for Bernie Sanders. FiveThirtyEight’s Super Guide for Super Tuesday Democratic Edition noted that polling was sparse, did not make a particular forecast for Minnesota, but said it was a state Sanders could pick up, in part because “Minnesota is one of Sanders’s better states according to Facebook ‘likes’.” Sanders won Minnesota by 23.4 percentage points, a staggering 57.4 point swing from the January 18-20 poll conducted by Mason-Dixon and published in the Minneapolis Star Tribune.
Here is a chart of the contests so far, then an explanation for what otherwise looks like gibberish.
So back to our Minnesota example: if you click on Minnesota, you’ll see something like the graphic to the right here. Below the chart on the left side of the graphic, you’ll see that I have clicked on “Share vs. U.S.” This helps offset the bias for candidates who started on Facebook earlier than other candidates or who, like Bernie Sanders, appeal strongly to a younger set more likely to be on Facebook. But it still doesn’t tell us much. In Minnesota, Sanders has 33% of the total “likes” share for all candidates, Democratic and Republican; Clinton has 10%. Clinton’s national average is 8%, while Sanders’s is 23%. This means Sanders is outperforming his national average by 10 points while Clinton outperforms hers by two points. This is not surprising. Minnesota is a bit of a swing state, but generally votes Democratic in national elections and is known for a strong progressive streak. But if we simply take the 10 to 2 advantage in raw points, this tells us little.
Look now at the numbers for Massachusetts: Clinton +6, Sanders +18. 18-6! Sanders wins big, as in Minnesota, right? No. In reality, Clinton won by 1.4 percentage points. But what if we take the percentage by which each candidate outperforms his or her own national average? Again, Massachusetts is decade over decade one of the most liberal states in the nation, so it’s not surprising that both Sanders and Clinton do better than their national averages. The U.S. Share Percentage Difference (USPD) would be Clinton +75% (6 points over a national average of 8), Sanders +78% (18 points over a national average of 23), predicting a Sanders victory but a very close outcome.
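To make the arithmetic concrete, here is a minimal sketch of the USPD calculation as I have described it. The function name `uspd` and the dictionaries are my own labels for illustration, not anything FiveThirtyEight publishes, and the Massachusetts state shares (14 and 41) are reconstructed by adding the quoted point differences to the national averages of 8 and 23.

```python
def uspd(state_share: float, national_share: float) -> float:
    """Percentage by which a candidate's state share of likes beats
    (positive) or trails (negative) that candidate's national share."""
    if national_share <= 0:
        raise ValueError("national_share must be positive")
    return (state_share - national_share) / national_share * 100.0


# National shares of all candidates' likes, as quoted in the text (percent).
NATIONAL = {"Clinton": 8, "Sanders": 23}

# Minnesota shares are quoted directly. Massachusetts shares are an
# assumption: national share plus the quoted difference (+6 and +18).
STATES = {
    "Minnesota": {"Clinton": 10, "Sanders": 33},
    "Massachusetts": {"Clinton": 14, "Sanders": 41},
}


def uspd_table(states, national):
    """Return {state: {candidate: USPD}} for every state and candidate."""
    return {
        state: {cand: uspd(share, national[cand]) for cand, share in shares.items()}
        for state, shares in states.items()
    }


for state, row in uspd_table(STATES, NATIONAL).items():
    print(state, {cand: f"{value:+.0f}%" for cand, value in row.items()})
```

Dividing by each candidate's own national share is what separates USPD from the raw point gap: Sanders's 18 point edge in Massachusetts shrinks to a near tie with Clinton once it is measured against his much larger national base.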
Massachusetts was a very close outcome, but Clinton was the winner. This is just one of two races out of fifteen so far where comparing the percentage differences is not predictive of the actual winner, but at least it predicted a close race. The other race is Virginia, where the model shows Clinton performing at -25% against her national average and Sanders performing at just -17% against his, but Clinton won by a wide margin of over 29 points. In two additional races, USPD would predict a fairly close race where there turned out to be something of a blowout (Oklahoma and Tennessee). Significantly, Sanders simply conceded the races in Virginia and Tennessee, failing to put significant advertising dollars or additional campaign staff on the ground in the final weeks of the race after Nevada. All told, it is really only the race in Oklahoma that seems not to fit the model, and even there, the predicted winner was at least correct.
One other note, before turning to Michigan and to what the USPD Facebook Primary model would suggest over the next week and a half for eleven upcoming races. To get to its whole numbers, FiveThirtyEight certainly has to do some rounding. Sanders has two verified Facebook pages with over 6.25 million “likes” combined. Clinton’s Facebook page has over 2.6 million. Evidence of this rounding may show up in Illinois’ count, where FiveThirtyEight says Clinton has 10% of the “likes” but is 3 points above her average. This makes sense if, for instance, she has 7.689 (8) percent of the “likes” nationally and 10.425 (10) percent of the likes in Illinois. It could also just be a clerical error. If it’s rounding, however, this could explain why Iowa shows a 13 point USPD percentage gap where Sanders was even and Clinton was up one. Perhaps without rounding, the USPD model would be more reflective of the 0.2 percent squeaker won by Clinton.
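A quick sketch shows how much the rounding could matter. The unrounded figures below (7.689 and 10.425) are the same hypothetical values used in the illustration above, not real FiveThirtyEight data, and `uspd` is just my own helper name for the percentage calculation.

```python
def uspd(state_share: float, national_share: float) -> float:
    """Percentage by which a state share beats or trails the national share."""
    return (state_share - national_share) / national_share * 100.0


# Hypothetical unrounded shares (percent), as in the illustration in the text.
national_exact = 7.689
illinois_exact = 10.425

# What a reader of the whole-number display would see.
national_shown = round(national_exact)
illinois_shown = round(illinois_exact)

# Subtracting the displayed numbers vs. rounding the true difference.
diff_of_shown = illinois_shown - national_shown
shown_diff = round(illinois_exact - national_exact)

# USPD computed from displayed numbers vs. from the unrounded ones.
uspd_from_shown = uspd(illinois_shown, national_shown)
uspd_from_exact = uspd(illinois_exact, national_exact)

print(f"displayed shares: {illinois_shown} vs {national_shown}")
print(f"difference of displayed shares: {diff_of_shown}")
print(f"rounded true difference: {shown_diff}")
print(f"USPD from displayed shares: {uspd_from_shown:.1f}%")
print(f"USPD from unrounded shares: {uspd_from_exact:.1f}%")
```

With single digit shares like Clinton's, a fraction of a point lost to rounding moves the USPD by ten points or so, which is why small gaps like Iowa's should be read loosely.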
To be frank, the polling in Michigan, while plentiful, is awful. Six of the ten most recent polls were all done by the same firm (Mitchell) for Fox 2 News, local to Detroit. Check out this craziness from their Democratic primary poll released March 2:
Federal law only permits us to call land lines. Because likely Primary voters are older, 54% are 60 or older and 86% are older than 50, we believe there are sufficient land line voters to get an accurate sample. We do not have to make any assumptions of likely voter turnout.
They’re only polling senior citizens in a race that includes a candidate (Sanders) winning voters under 50 by wide margins and where voters under 50 are regularly making up around 50% of the vote. For perspective, this means that Clinton is likely winning with people over fifty at approximately the same rate she was in Nevada (67-70% depending on the math), but instead of a 28 point lead as Mitchell/Fox would have it, Clinton won by just 5.5 points in Nevada. Of the other four polls, three will be older than three weeks by the date of the election (again, see Minnesota), and two of those three show much closer races, with Clinton leads of 10 points (PPP) and 13 points (ARG). The fourth poll, from MRG from February 23-27 (Clinton +20), has a sample size of just 218 Michigan voters. This means that with the margin of error, the race could have been in the single digits ten days before the election.
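For the MRG poll, the single digits claim can be checked with the textbook margin of error formula. This is a rough sketch under stated assumptions: a simple random sample, a 95% confidence level (z = 1.96), the worst case share of 50%, and the common rule of thumb that the uncertainty on a two candidate lead is about twice the uncertainty on one candidate's share. MRG's own published margin may differ.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error, in percentage points, on one
    candidate's share for a simple random sample of size n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n) * 100.0


sample_size = 218   # MRG sample size quoted in the text
lead = 20.0         # Clinton's lead in that poll, in points

moe = margin_of_error(sample_size)

# Rule of thumb (an assumption, and a slight overestimate): the margin
# on the lead is roughly double the margin on a single share.
lead_moe = 2 * moe
low, high = lead - lead_moe, lead + lead_moe

print(f"margin on each share: +/-{moe:.1f} points")
print(f"plausible range for the lead: {low:.1f} to {high:.1f} points")
```

Under those assumptions the low end of the range for Clinton's lead falls below ten points, which is all the argument above needs.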
Adding to the Michigan intrigue, the Clinton campaign has begun lowering expectations, suggesting that Sanders might eke out a victory in Michigan which would be offset by Clinton winning by a larger margin in Mississippi. So, let’s make a rough prediction for Michigan, as well as for the other states to vote between now and Tuesday, using USPD from the “Facebook Primary” and comparing them to similar states which have already voted. (I’ll add an update with the Ides of March states if this is reasonably successful at forecasting.) First, the chart above again, this time including the states to vote between now and March 15:
Louisiana and Mississippi look like they’ll go for Clinton by the same whopping margins as states like South Carolina and Alabama, in this case by 55 to 65 percentage points. In Nebraska, just north of Kansas and even slightly less Red, Clinton is just below her national average and Sanders is right at his. In Iowa, Clinton was one point above average, while Sanders was even. Let’s suggest this means a medium to high single digits victory for Sanders, by 4-9 points.
Maine looks even worse by Facebook “likes” for Clinton than New Hampshire was, but it is a closed primary (only registered Democrats can vote) where New Hampshire was open and Sanders picked up a ton of Independent voters. Let’s call it a 20-25 point win for Sanders according to USPD.
That leaves Michigan among the states voting between now and Tuesday. In the South, where Clinton has won by huge margins, African American voters have made up upwards of 50% of the voters while going for her by 4, 5, or even 6-to-1 margins. Michigan, however, has a substantial black population right near the national average of 13%. Even accounting for black voters regularly turning out and turning out almost exclusively for Democrats, it’s hard to see them making up more than 25% of the vote in Michigan, and Sanders has worked hard to be more competitive with black voters in the North. Detroit, by the way, looks better in the “Facebook Primary” for Sanders than did either Atlanta or Boston. This all presents a bit of a spanner. Still, the model definitely predicts a win for Sanders (+22 USPD in MI to Clinton’s -13 USPD), though at a substantially lower rate than in Colorado, where Clinton was down 13% but Sanders was +52 USPD. Taking the USPD difference from Colorado, together with a two or three to one margin for Clinton with black voters, let’s project a small win for Sanders of four to six points, a 22-24 point difference from current polling. FiveThirtyEight’s own projection is that Clinton has a 99% chance of winning on Tuesday in Michigan and that they expect it to be a 24.6 point victory.