Positively pondering pesky probabilities, perchance

One inspiration to start this “data-driven storytelling” blog was the pioneering work of Nate Silver and his fellow data journalists at FiveThirtyEight.com; their analyses are an essential “critical thinking” reality check to my own conclusions and perceptions. Indeed, when I finally get around to designing and teaching my course on critical thinking (along with my film noir course), the required reading would include Silver’s The Signal and the Noise and a deep dive into Robert Todd Carroll’s The Skeptic’s Dictionary. I will also include Ken Rothman’s Epidemiology: An Introduction; what drew me to epidemiology (besides my long career as a public health data analyst) was its epistemological aspect. By that I mean how the fundamental methods and principles of epidemiology allow us to critically assess any narrative or story.

To that end, I have been reading with great interest Silver’s 11-part series that “reviews news coverage of the 2016 general election, explores how Donald Trump won and why his chances were underrated by most of the American media.” And while I highly recommend the entire series of articles, the September 21 conclusion is the jumping-off point for my own observations about assessing the likelihood of various events.

**********

Let me begin with a passage from that article:

In recent elections, the media has often overestimated the precision of polling, cherry-picked data and portrayed elections as sure things when that conclusion very much wasn’t supported by polls or other empirical evidence.

I personally think investigative journalists are heroic figures who will ultimately save American democracy from its current self-induced peril. But they are trained in a very specific way: deliver the facts of a story with certainty and immediacy. In so doing, they are responding to media consumers with little patience for complex narratives suffused with uncertainty.

To quote Silver again, “a story can be 1. fast, 2. interesting and/or 3. true — two out of the three — but it’s hard for it to be all three at the same time.”

One narrative that developed fairly early about the 2016 presidential election campaign was that Democratic nominee Hillary Clinton was the all-but-inevitable victor. I wrote about one version of this flawed narrative here.

Reinforcing this narrative were election forecasts issued during the last weeks of the campaign that practically said “stick a fork in Trump, he is finished.” But as Silver rightly observes, some of these models were flawed because they failed to account for the “correlation in outcomes between [demographically similar] states.” For example, were Republican nominee Donald Trump to outperform his polls in Wisconsin on Election Day, he would likely also do so in Michigan, Minnesota and Iowa. And that is essentially what happened.

Still, because aggregating polls yields a more precise picture of the state of an election at a given point in time, I aggregated these 2016 election forecasts. Going into Election Day, here were some estimated probabilities of a Clinton victory, ranked lowest to highest.

FiveThirtyEight	71.4%
Betting markets	82.9%[1]
The New York Times Upshot	84.0%
DailyKos	92.0%
HuffingtonPost Pollster	98.2%
Princeton Election Consortium (Sam Wang)	99.5%

The average and median forecasts were both 88.0%. Remove the most skeptical forecast (which still had Clinton as a 5:2 favorite), and the average and median jump to 91.3% and 92.0%, respectively. By contrast, if you remove the least skeptical forecast, the average and median drop to 84.1% and 83.5%, respectively.

It is an understandable human tendency to look at a probability over 80% and “round up” from “very likely, but not guaranteed” to “event will happen.” And, under the frequentist definition of probability, we would be correct more than 80% of the time in the long run.

But we would not be correct as much as 20% of the time.

Ignoring Wang’s insanely optimistic forecast for various reasons, the “aggregate” forecast I had in mind on Election Day was that Clinton had about an 84% chance of winning.

The flip side, of course, was that Trump had about a 16% chance of winning.

A good way to interpret this probability is to think about rolling a fair, six-sided die.

Pick a number from one to six. The chance that if you roll the die, the number you picked will come up, is 1 in 6, or 16.7%.

On Election Day, Trump metaphorically needed to roll his chosen number…and he did.

But even if we take the Wang-inclusive average of 88%, that is still about a 1 in 8 chance. Throw eight slips of paper with the numbers one through eight written on them in a hat (I like fedoras, myself), pick a number and draw. If your number comes up (which will happen roughly 12% of the time over many draws), you win.

Trump picked a number between one and eight then pulled it out of our hypothetical fedora, and he won the election.

One way people misunderstand probability (and one of many reasons I am resolutely opposed to classical statistical significance testing) is by mentally converting “event x has a very low probability” (like, say, matching DNA in a murder trial: only a 1 in 2 million chance!) into “that event cannot happen.”

So, even the Wang forecast—which gave Trump only a 1 in 200 chance of winning—did NOT mean that Clinton would definitely win. It only meant that Trump had to pull a specific number between one and 200 out of our hypothetical fedora. He did, and he won.
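
The die and fedora pictures above all come from the same bit of arithmetic: a probability p is roughly a "1 in 1/p" chance. Here is a minimal Python sketch of that conversion; the function name one_in_n is my own label for illustration, and the three probabilities are simply the Trump chances quoted above.

```python
def one_in_n(probability: float) -> float:
    """Return N such that the probability equals '1 in N'."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / probability


# Trump's chances under the forecasts discussed above.
trump_chances = {
    "aggregate excluding Wang (84% Clinton)": 0.16,
    "six-forecast average (88% Clinton)": 0.12,
    "Wang forecast (99.5% Clinton)": 0.005,
}

for label, p in trump_chances.items():
    print(f"{label}: {p:.1%} is about 1 in {one_in_n(p):.1f}")
```

The 16% case sits right next to one face of a die, the 12% case next to one slip out of eight, and the 0.5% case is one slip out of 200.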

**********

On the other end of the spectrum is an overabundance of caution in assessing the likelihood of an event. This usually occurs when interpreting election polls.

In this post, I discussed Democratic prospects in the 2017 and 2018 races for governor.

One of the two governor’s races in November 2017 is in Virginia, where Democratic governor Terry McAuliffe is term-limited. The Democratic nominee is Lieutenant Governor Ralph Northam, and the Republican nominee is former Republican National Committee chair Ed Gillespie.

Here are the 13 public polls of this race listed on RealClearPolitics.com[2] taken after the June 13, 2017 primary elections:

Poll	Date	Sample	MoE	Northam (D)	Gillespie (R)	Spread
Monmouth*	9/21 – 9/25	499 LV	4.4	49	44	Northam +5
Roanoke College*	9/16 – 9/23	596 LV	4	47	43	Northam +4
Christopher Newport Univ.*	9/12 – 9/22	776 LV	3.7	47	41	Northam +6
FOX News*	9/16 – 9/17	507 RV	4	42	38	Northam +4
Quinnipiac*	9/14 – 9/18	850 LV	4.2	51	41	Northam +10
Suffolk*	9/13 – 9/17	500 LV	4.4	42	42	Tie
Mason-Dixon*	9/10 – 9/15	625 LV	4	44	43	Northam +1
Univ. of Mary Washington*	9/5 – 9/12	562 LV	5.2	44	39	Northam +5
Roanoke College*	8/12 – 8/19	599 LV	4	43	36	Northam +7
Quinnipiac*	8/3 – 8/8	1082 RV	3.8	44	38	Northam +6
VCU*	7/17 – 7/25	538 LV	5	42	37	Northam +5
Monmouth*	7/20 – 7/23	502 LV	4.3	44	44	Tie
Quinnipiac	6/15 – 6/20	1145 RV	3.8	47	39	Northam +8

Eight of these polls have Northam up between four and seven percentage points, including four of the last six. Two polls show a tied race. No poll gives Gillespie the lead.

And yet, here was the headline on Taegan Goddard’s otherwise-reliable Political Wire on September 19, 2017, referring to the just-released University of Mary Washington (Northam +5) and Suffolk polls (Even): Race For Virginia Governor May Be Close.

Granted, the two polls gave Northam an average lead of only 2.5 percentage points, which, without context, suggests a close race on Election Day. Furthermore, all three Political Wire Virginia governor’s race poll headlines since then have been on the order of: Northam Maintains Lead In Virginia.

Here is the thing, however. Most people (as I did) will equate “close” with “toss-up.” But there is a huge difference between “we have no idea who is going to win because the polls average out to a point or two either way” and “one candidate consistently has the lead, but the margin is relatively narrow.”

The latter is clearly the case in the 2017 Virginia governor’s race, with Northam’s lead averaging 4.4 percentage points in eight September polls within a narrow range (standard deviation [SD]=3.3). We are still more than five weeks from 2017 Election Day (November 7), so this is unlikely to be “herding,” the tendency of some pollsters to adjust their demographic weights and turnout estimates to avoid an “outlier” result (undermining the rationale for aggregating polls in the first place).

The problem comes when members of the media try to interpret the results of individual polls. They have absorbed the lesson of the “margin of error” (MoE) almost too well.

For example, the Monmouth poll conducted September 21-25, 2017 gives Northam a five percentage point lead, with a 4.4 percentage point MoE. Applying that MoE to both candidates’ vote estimates, we have 95% confidence that the “actual” result (if we had accurately surveyed every likely voter, not a sample of 499) is somewhere between Gillespie 48.4, Northam 44.6 (Northam down 3.8) and Northam 53.4, Gillespie 39.6 (Northam up 13.8). It is this range of possible outcomes, from a somewhat narrow Gillespie victory to a comfortable Northam win, that leads members of the media to imply through oversimplification that this race will be close, meaning “toss-up.”
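
To make that range concrete, here is a small Python sketch of the arithmetic, applying the MoE in opposite directions to the two candidates' shares. The helper name lead_range is hypothetical, and this is just the back-of-the-envelope reading described above, not a formal interval for the lead itself.

```python
def lead_range(leader_pct: float, trailer_pct: float, moe: float) -> tuple[float, float]:
    """Worst-case and best-case lead when the MoE is applied to both shares."""
    worst = (leader_pct - moe) - (trailer_pct + moe)  # leader low, trailer high
    best = (leader_pct + moe) - (trailer_pct - moe)   # leader high, trailer low
    return worst, best


# Monmouth, September 21-25: Northam 49, Gillespie 44, MoE 4.4
worst, best = lead_range(49.0, 44.0, 4.4)
print(f"Northam lead ranges from {worst:+.1f} to {best:+.1f} points")
```

Run as written, this gives the same endpoints as the paragraph above: Northam down 3.8 at one extreme and up 13.8 at the other.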

And yet, even within this poll, the probability (using a normal distribution, mean=5.0, SD=4.4) that Northam is ahead at all (by as little as 0.0001 percentage points) is 87.2%, making him a 7:1 favorite, about what Hillary Clinton was on Election Day 2016.

OK, maybe that was not the best example…

But when you aggregate the eight September polls, the MoE drops to about 1.3[3], putting the probability Northam is ahead at well over 99%. Even if the MoE only dropped to 3.0, the probability of a Northam lead would still be about 93%.
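
For anyone who wants to check these probabilities, here is a minimal Python sketch using only the standard library. It follows the approach described above, treating the lead as normally distributed with the MoE standing in for the SD; that is a simplifying assumption, not a formal model. The 4.1 starting MoE and the sample sizes come from footnote [3], and prob_ahead is my own name for the helper.

```python
from math import sqrt
from statistics import NormalDist


def prob_ahead(mean_lead: float, sd: float) -> float:
    """P(lead > 0) if the lead is modeled as Normal(mean_lead, sd)."""
    return 1.0 - NormalDist(mu=mean_lead, sigma=sd).cdf(0.0)


# Single poll: Monmouth, Northam +5 with a 4.4-point MoE used as the SD.
single_poll = prob_ahead(5.0, 4.4)

# Eight September polls: 4,915 respondents in total versus 499 in Monmouth,
# so the MoE shrinks by roughly the square root of that ratio (footnote [3]).
pooled_moe = 4.1 / sqrt(4915 / 499)
pooled = prob_ahead(4.4, pooled_moe)

# More cautious case: the MoE only shrinks to 3.0.
cautious = prob_ahead(4.4, 3.0)

print(f"Single poll:          {single_poll:.1%}")
print(f"Pooled (MoE {pooled_moe:.2f}):   {pooled:.2%}")
print(f"Cautious (MoE 3.0):   {cautious:.1%}")
```

The single-poll figure comes out at about 87%, the pooled figure well above 99%, and the cautious figure at about 93%, in line with the numbers quoted above.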

My point is this. Every poll needs to be considered not just as an item in itself (polls as NEWS!) but within the larger context of other polls of the same race. And in the 2017 Virginia governor’s race, the available polling paints a picture of a narrow but durable lead for Northam.

I have no idea who will be the next governor of Virginia. But a careful reading of the data suggests that, as of September 29, 2017, Lt. Governor Ralph Northam is a heavy favorite to be the next governor of Virginia, despite being ahead “only” 4 or 5 percentage points.

**********

Finally, here is an update on this post about the Democrats’ chances of regaining control of the United States House of Representatives (House) in 2018.

Out of curiosity, I built two simple linear regression models. One estimates the number of House seats Democrats will gain in 2018 only as a function of the change from 2016 in the Democratic share of the total vote cast in House elections. The Democrats lost the total 2016 House vote by 1.1 percentage points, so if they were to win the 2018 House vote by 7.0 percentage points, that would be an 8.1 percentage point shift.

Right now, FiveThirtyEight estimates Democrats have an 8.0 percentage point advantage on the “generic ballot” question (whether a respondent would vote for the Democratic or the Republican House candidate in their district if the election were held today).

My simple model estimates a pro-Democratic House vote shift of 9.1 percentage points would result in a net pickup of 26.7 House seats, a few more than the 24 they need to regain control. The 95% confidence interval (CI) is a gain of 17.0 to 36.4 seats.

But the probability that Democrats net AT LEAST 24 House seats is 71.1%, making the Democrats 5:2 favorites to regain control of the House in 2018.

My more complex model adds a variable that is simply 1 for a midterm election and 0 otherwise, as well as the product of this “dummy” variable and the change in Democratic House vote share. I hypothesized (correctly) that this relationship would be stronger in midterm elections.

This model estimates that a 9.1 percentage point increase from 2016 in the Democratic share of the House vote would result in a net gain of 31.8 seats. However, with two additional independent variables (and only 24 data points), the 95% CI is much wider, from a loss of 7.0 seats to a history-making gain of 68.3 seats.

Still, this translates to a 66.1% probability (2:1 favorites) the Democrats regain the House in 2018.
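
One way to see how a point estimate and a 95% CI turn into a probability like these is a normal approximation: back out a standard error from the width of the CI, then ask how much of the distribution sits at or above 24 seats. The Python sketch below assumes exactly that. It is an illustration rather than the precise calculation behind the figures above, and the helper name prob_at_least is hypothetical. It lands within about a point of both quoted probabilities, not exactly on them.

```python
from statistics import NormalDist

Z_95 = 1.96  # two-sided 95% critical value under a normal approximation


def prob_at_least(threshold: float, estimate: float, ci_low: float, ci_high: float) -> float:
    """P(gain >= threshold), with the standard error backed out of a 95% CI."""
    standard_error = (ci_high - ci_low) / (2 * Z_95)
    return 1.0 - NormalDist(mu=estimate, sigma=standard_error).cdf(threshold)


SEATS_NEEDED = 24

# Simple model: 26.7 seats, 95% CI 17.0 to 36.4
simple = prob_at_least(SEATS_NEEDED, 26.7, 17.0, 36.4)

# More complex model: 31.8 seats, 95% CI -7.0 to 68.3
complex_model = prob_at_least(SEATS_NEEDED, 31.8, -7.0, 68.3)

print(f"Simple model:  {simple:.1%}")
print(f"Complex model: {complex_model:.1%}")
```

The sketch also shows why the complex model gives the lower probability despite its higher point estimate: its much wider CI puts more of the distribution below 24 seats.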

Figure 1 shows the estimated probability the Democrats regain the House in 2018 using both models and a range of percentage point changes in House vote share from 2016.

Figure 1: Probability Democrats Control U.S. House of Representatives After 2018 Elections Based Upon the Change in Democratic Share of the House Vote, 2016-18

The simple model (blue curve) gives the Democrats no chance to recapture the House in 2018 until the pro-Democratic change in vote share reaches 6.5 percentage points, after which the probability rises sharply to a near-certainty at the 10.0 percentage point change mark. The more complex model (red curve), meanwhile, assigns steadily increasing chances for the Democrats, flipping to “more likely than not” at the 7.0 percentage point change mark; even at a truly historic 15 percentage point change, the complex model only gives the Democrats an 85.3% chance to recapture the House in 2018.

For the record, I lean toward the more complex model.

It is worth noting that in the current FiveThirtyEight estimate, 15.8% of the electorate is undecided or chose a third-party candidate (when an option). If the undecided vote breaks heavily toward the party not controlling the White House in a midterm election (one way electoral “waves” form), a 66-71% probability would likely be an underestimate of the Democrats’ chances of regaining control of the House in 2018.

And…apropos of nothing…Happy 51st Birthday to me (September 30, 2017)!!

Until next time…

[1] To be honest, I do not recall where I got this number from…possibly from fivethirtyeight.com or maybe from https://betting.betfair.com/politics/us-politics/…

[2] Accessed September 28, 2017

[3] The total number of voters sampled across these eight polls is 4,915, which is 9.85 times the 499 sampled in the Monmouth poll. The square root of 9.85 is 3.14. Dividing 4.1 by 3.14 gives you 1.31.