Looking in the mirror, 2020 Democratic nomination polls edition

Monthly since April 2019, I have updated my weighted-adjusted polling averages for the 2020 Democratic presidential nomination. You may read about my aggregation methods here, but a key difference between my algorithm and those used by some other polling aggregators (e.g. RealClearPolitics) is that I use every publicly-available poll (as listed on FiveThirtyEight.com) released since January 1, 2019.

This means I do not:

Exclude polls based on “quality,”
Drop polls from the algorithm after a certain period of time, or
Distinguish between polls of adults, registered voters and likely voters.

I do, however, give much more weight to polls from higher-quality pollsters (as measured by FiveThirtyEight) and those released more recently. I weight “quality” by converting grades to numeric equivalents (A+=4.3, A=4.0) then dividing by 4.3. And I weight more recent polls by dividing the number of days between the poll’s field dates midpoint and January 1, 2019 by the number of days between January 1, 2019 and the election being assessed. Finally, I have not seen any appreciable difference in candidate standing based upon what set of respondents is sampled.
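
A minimal sketch of these two weights in Python may make the arithmetic concrete. It rests on several assumptions not spelled out above: the numeric equivalents below A (A- = 3.7, B+ = 3.3, and so on) follow the usual grade-point pattern, the two weights are combined by multiplying them, and the end date of July 13, 2020 is picked only because it reproduces the example weights of 0.417 and 0.104 quoted in the poll recency section. All function and variable names are illustrative.

```python
from datetime import date

# Numeric equivalents of letter grades. Only A+ = 4.3 and A = 4.0 are stated
# in the text; the remaining values ASSUME the usual grade-point pattern.
GRADE_POINTS = {
    "A+": 4.3, "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
}

START = date(2019, 1, 1)  # first day from which polls are collected


def quality_weight(grade: str) -> float:
    """Letter grade converted to a number, then divided by 4.3."""
    return GRADE_POINTS[grade] / 4.3


def time_weight(field_midpoint: date, end_date: date) -> float:
    """Days from START to the field midpoint, over days from START to end_date."""
    return (field_midpoint - START).days / (end_date - START).days


def poll_weight(grade: str, field_midpoint: date, end_date: date) -> float:
    # ASSUMPTION: the two weights are multiplied; the text does not say how
    # they are combined.
    return quality_weight(grade) * time_weight(field_midpoint, end_date)


END = date(2020, 7, 13)  # ASSUMED end date, for illustration only

print(round(time_weight(date(2019, 8, 22), END), 3))  # 0.417
print(round(time_weight(date(2019, 2, 28), END), 3))  # 0.104
print(round(quality_weight("B"), 3))                  # 0.698
```

Under this scheme a poll never receives zero weight unless its field midpoint is January 1, 2019 itself, which is why older and lower-rated polls still contribute a little to every average.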

Simply put, I would rather have more information, even if some of it is of lower “quality” or “outdated,” than less. I would also prefer not to have to defend exclusion/inclusion criteria.

My aggregation process also does something no other polling average or selection process does. It yields a single score (national-and-state-weighted weighted-adjusted polling average, or NSW-WAPA) for each 2020 Democratic nomination candidate based upon the fact nominations are won through the accumulation of delegates committed to voting for them at the 2020 Democratic National Convention. These delegates are accrued at the state level, usually based upon the results of that state’s presidential primary or caucuses. NSW-WAPA combines state and national polling averages (WAPA), weighting WAPA from the early state contests of Iowa, New Hampshire, Nevada and South Carolina higher than those from later states (with Super Tuesday states weighted twice as high) and national WAPA lowest of all.

As a brief aside, while I try to keep my personal feelings as far from this site as I can, there are two things even the best political journalists do that annoy me to no end:

They call caucuses simply “caucus.” (Multiple such events, requiring time and effort, each called a “caucus,” will be held on the same day in such states as Iowa and Nevada.)
They refer to a complex, multi-stage, months-long nomination process as a “primary.” (“Primary” implies a single, national, one-day event in which every interested party member casts a ballot for the person they want to be their party’s presidential nominee. There is no such event.)

It is, frankly, lazy writing and reporting from people I otherwise respect and who should know better. I understand that “caucuses” is an awkward word to pronounce, and that “nomination process” is a mouthful, but that is no excuse for inaccuracy and imprecision.

OK, I have put my soapbox back in the closet. Thank you for listening.

**********

Having chastised political journalists, I now look in the mirror myself.

I believe my algorithmic approach (modeled to a large extent on the FiveThirtyEight approach) is the appropriate one, because it is both more comprehensive (individual higher-quality polls may be outliers while certain lower-quality polls may better reflect current preferences) and less prone to fluctuate wildly based on any single fluky poll or set of polls.

Nonetheless, it is important to ask if my algorithmic choices are biasing my NSW-WAPA in some way. By “bias,” I mean the mathematical difference between a calculated value and some platonic ideal “true” value.

There are three ways to think about this question:

What would happen to NSW-WAPA if I excluded “lower quality” polls entirely?
Does my current weighting scheme give outdated polls too much influence on what is essentially a snapshot of current voter preferences?
Am I correct that the type of voters sampled (e.g., registered vs. likely voters) does not make an appreciable difference in NSW-WAPA?

I altered my algorithm to reflect these three questions then compared the resulting NSW-WAPA to what I typically calculate. The results are summarized in the following sections.

**********

Pollster Quality. As noted, FiveThirtyEight assigns a letter grade to dozens of polling organizations based on how they conduct polls (e.g., do they call cellphones as well as landlines; do they use live callers or recordings, sometimes called “robo-polls;” do they randomly select subjects or are they Internet- or panel-based). They also assign a C+ to pollsters who did not appear in their May 2018 update.

I include A+-level pollsters like Monmouth University and Seltzer & Co. (gold standard of Iowa polling, extremely difficult in multi-candidate caucuses), C- (and lower)-level pollsters like Zogby Interactive/JZ Analytics (C), McLaughlin & Associates (C-) and Survey Monkey (D-), as well as all pollsters in between.

On balance, the polling is OK, averaging between B and B-, depending on the location; B is a good midpoint.

But what if I only included pollsters with at least a B rating from FiveThirtyEight (while still weighting as before)?

First, the set of polls drops essentially in half, from:

157 national polls to 62 (34 to 17[1] pollsters)
20 Iowa polls to 10 (10 to six pollsters)
22 New Hampshire polls to 11 (10 to six pollsters)
Five Nevada polls to two (four to two pollsters)
18 South Carolina polls to six (eight to five pollsters)
36 Super Tuesday polls (10 states) to 16 polls, with 0 polls from Alabama, Minnesota, Oklahoma, Tennessee or Virginia
32 polls from 14 other states to 10 polls from six other states (Michigan, Ohio, Florida, Wisconsin, Pennsylvania, Oregon).

The number of polls overall drops from 293 (national, 28 states) to 117 (national, 15 states). However, the overall quality rises from B/B- to A-/B+ (precisely the student I was at Yale).

Table 1 compares NSW-WAPA with and without “lower quality” pollsters for the 21 announced Democratic candidates:

Table 1: NSW-WAPA for declared 2020 Democratic presidential nomination candidates with and without pollsters rated B- or lower by FiveThirtyEight

Candidate	All Polls	Pollster Rating≥B	*Difference*
Biden	27.5	29.8	2.27
Sanders	16.0	15.8	-0.20
Warren	13.2	12.5	-0.73
Harris	9.2	8.8	-0.40
Buttigieg	7.5	6.5	-1.05
O’Rourke	2.6	2.7	0.13
Booker	2.2	2.2	-0.01
Klobuchar	1.5	1.5	-0.02
Yang	1.1	1.0	-0.10
Gabbard	0.9	0.7	-0.17
Steyer	0.6	0.2	-0.41
Castro	0.6	0.6	-0.02
Gillibrand	0.5	0.5	-0.05
Delaney	0.4	0.4	0.02
Bennet	0.3	0.3	-0.03
Williamson	0.3	0.3	0.04
Ryan	0.2	0.2	-0.04
Bullock	0.2	0.1	-0.07
de Blasio	0.1	0.0	-0.11
Messam	0.0	0.1	0.02
Sestak	0.0	0.0	0.01
DK/Other	14.1	15.0	1.16

On average, there is no appreciable difference (-0.04) based on the two criteria. Regardless of direction, the average candidate shift is just 0.28.
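
Both summary figures can be checked directly from the Difference column of Table 1. A short Python sketch, assuming the averages are taken over the 21 candidates and leave out the DK/Other row (which is what reproduces the numbers above); the variable names are illustrative:

```python
# Difference column of Table 1, Biden through Sestak (DK/Other excluded).
differences = [
    2.27, -0.20, -0.73, -0.40, -1.05, 0.13, -0.01,
    -0.02, -0.10, -0.17, -0.41, -0.02, -0.05, 0.02,
    -0.03, 0.04, -0.04, -0.07, -0.11, 0.02, 0.01,
]

# Signed mean: gains and losses cancel each other out.
mean_signed = sum(differences) / len(differences)

# Mean absolute shift: size of the typical move, regardless of direction.
mean_absolute = sum(abs(d) for d in differences) / len(differences)

print(round(mean_signed, 2))    # -0.04
print(round(mean_absolute, 2))  # 0.28
```

The signed mean sits near zero because one large gain offsets many small losses, so the absolute mean is the better gauge of how far a typical candidate moves.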

Most of that comes from former Vice President Joe Biden, whose NSW-WAPA is 2.27 points higher in the higher-quality polls (29.8) than among all polls. While I did not examine the data this way, this would imply an NSW-WAPA of “just” 25.0 using only the lower-quality polls, though he would still clearly be in first place, about nine points ahead of Vermont Senator Bernie Sanders (~16.2). No other candidate does appreciably better in the higher-quality polls, with the possible exception of former Texas United States House of Representatives (“House”) member Beto O’Rourke (2.75 vs. 2.62).

In fact, a number of candidates fared worse in the higher-quality polls relative to all polls, most notably South Bend, Indiana Mayor Pete Buttigieg (-1.05), Massachusetts Senator Elizabeth Warren (-0.73) and billionaire activist Tom Steyer (-0.41). However, the only one of the three whose relative positioning would change is Steyer; using only the higher-quality polls, he drops from 11th place to 17th place.

Curiously, the percentage of respondents selecting “don’t know/not sure” or an unlisted candidate rises 1.16 to 15.1 when only higher-quality polls are analyzed.

Overall, however, while my aggregation method may slightly underrate Biden’s position (and the percentage not choosing a listed candidate) and slightly overrate the positions of Buttigieg, Warren and Steyer, these differences are fairly minor.

**********

Poll recency. One simple way to down-weight older polls faster is to square the weight. For example, in my current algorithm, a poll whose field midpoint is August 22 (weight=0.417) is weighted four times as much as a poll whose field midpoint is February 28 (weight=0.104). Squaring each value, however, gives the more recent poll 16 times more weight (0.174 to 0.011).
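
The effect of squaring can be verified with the two example weights above. A short Python sketch (variable names are illustrative):

```python
# Simple time weights quoted in the text for field midpoints of
# August 22 and February 28.
recent, older = 0.417, 0.104

simple_ratio = recent / older          # how much more the recent poll counts
squared_ratio = recent**2 / older**2   # the same comparison after squaring

print(round(recent**2, 3), round(older**2, 3))  # 0.174 0.011
print(round(simple_ratio, 1))                   # 4.0
print(round(squared_ratio, 1))                  # 16.1
```

Squaring shrinks every weight between 0 and 1, but it shrinks the smaller (older) weights proportionally more, so the ratio between any two polls is itself squared.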

Table 2 compares NSW-WAPA using more gradual down-weighting to more rapid down-weighting for the 21 announced Democratic candidates:

Table 2: NSW-WAPA for declared 2020 Democratic presidential nomination candidates, simple time weighting vs. squared time weighting

Candidate	Simple time weight	Squared time weight	*Difference*
Biden	27.5	26.3	-1.22
Sanders	16.0	15.2	-0.75
Warren	13.2	13.7	0.48
Harris	9.2	9.1	-0.07
Buttigieg	7.5	7.5	-0.05
O’Rourke	2.6	2.2	-0.40
Booker	2.2	2.0	-0.23
Klobuchar	1.5	1.4	-0.15
Yang	1.1	1.1	0.00
Gabbard	0.9	0.9	0.01
Steyer	0.6	0.7	0.10
Castro	0.6	0.6	-0.01
Gillibrand	0.5	0.5	0.00
Delaney	0.4	0.4	-0.01
Bennet	0.3	0.3	0.03
Williamson	0.3	0.3	0.02
Ryan	0.2	0.2	-0.01
Bullock	0.2	0.2	0.02
de Blasio	0.1	0.1	0.01
Messam	0.0	0.0	0.00
Sestak	0.0	0.0	0.00
DK/Other	14.1	16.2	2.25

If anything, the differences are even smaller here: although the mean “actual” difference is -0.11, the mean shift, regardless of direction, was only 0.17.

Table 2 does suggest Warren is rising faster (13.7 vs. 13.2) than my more-conservative algorithm shows, while Biden (-1.22), Sanders (-0.75) and O’Rourke (-0.40) are dropping faster; New Jersey Senator Cory Booker and Minnesota Senator Amy Klobuchar also appear to be losing ground recently[2]. Meanwhile, the proportion not choosing any listed candidate seems to be increasing, suggesting greater volatility in the race than my algorithm suggests.

Again, however, these differences are minor.

**********

Likely vs. registered voters. If I were only analyzing national polls, this might make a meaningful difference; of the 157 national polls, just 88 limited their sample to likely voters. And while most of the polls eliminated by a likely-voter restriction are from lower-quality pollsters, the restriction also eliminates polls from Monmouth (A+), CNN/SSRS (A), IBD/TIPP (A-), Quinnipiac University (A-) and Reuters/Ipsos (B+).

However, just two polls (both by Gravis Marketing) in total from Iowa, New Hampshire, Nevada and South Carolina are of registered voters rather than likely voters. Moreover, just 18 polls from every other state combined (mostly from Pennsylvania and Texas) were not of likely voters, meaning the number of polls analyzed drops only from 293 to 204, with most of the excluded polls being national polls, which carry the lowest weight.

Not surprisingly, there is barely any difference between NSW-WAPA using all polls and only those of likely voters. The mild exception is Steyer dropping from 11th to 15th place (0.62 to 0.39), while the percentage not choosing a listed candidate drops from 14.1 to 13.4.

All combined. Just for fun, I limited the polls being analyzed to those which were from pollsters with at least a B rating AND sampled likely voters, AND I used the squared time weight. This left 32 national polls, 10 Iowa polls, 11 New Hampshire polls, two Nevada polls, six South Carolina polls, 12 Super Tuesday polls and six polls from all other states (Ohio dropped out of the set), for a total of just 79 polls.
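
The three restrictions can be sketched as a single filter-and-reweight step. This is only an illustration under stated assumptions: the `Poll` record and its field names are hypothetical, the grade-point values other than A+ = 4.3 and A = 4.0 follow the usual grade-point pattern, the quality and time weights are assumed to be multiplied together, and the three polls at the bottom are made up.

```python
from dataclasses import dataclass

# ASSUMED numeric equivalents (only A+ and A are stated in the text).
GRADE_POINTS = {
    "A+": 4.3, "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
}


@dataclass
class Poll:  # hypothetical record layout
    pollster: str
    grade: str          # FiveThirtyEight letter grade
    population: str     # "lv" likely voters, "rv" registered voters, "a" adults
    time_weight: float  # simple time weight, between 0 and 1


def restrictive_weights(polls):
    """Keep B-or-better, likely-voter polls; weight them with the squared time weight."""
    kept = []
    for p in polls:
        good_grade = GRADE_POINTS[p.grade] >= GRADE_POINTS["B"]
        if good_grade and p.population == "lv":
            # ASSUMPTION: quality weight times squared time weight.
            weight = (GRADE_POINTS[p.grade] / 4.3) * p.time_weight ** 2
            kept.append((p, weight))
    return kept


# Made-up polls, for illustration only.
polls = [
    Poll("Pollster X", "A+", "lv", 0.40),
    Poll("Pollster Y", "A-", "rv", 0.41),  # dropped: registered voters
    Poll("Pollster Z", "C", "lv", 0.42),   # dropped: rated below B
]

for poll, weight in restrictive_weights(polls):
    print(poll.pollster, round(weight, 3))  # Pollster X 0.16
```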

Table 3: NSW-WAPA for declared 2020 Democratic presidential nomination candidates, comparing original algorithm to most restrictive

Candidate	All Polls	Most Restrictive	*Difference*
Biden	27.5	29.8	2.29
Sanders	16.0	15.4	-0.61
Warren	13.2	13.4	0.12
Harris	9.2	8.7	-0.50
Buttigieg	7.5	6.8	-0.80
O’Rourke	2.6	2.2	-0.36
Booker	2.2	2.1	-0.14
Klobuchar	1.5	1.4	-0.16
Yang	1.1	1.0	-0.04
Gabbard	0.9	0.8	-0.12
Steyer	0.6	0.2	-0.38
Castro	0.6	0.5	-0.05
Gillibrand	0.5	0.5	-0.05
Delaney	0.4	0.5	0.06
Bennet	0.3	0.3	-0.01
Williamson	0.3	0.4	0.09
Ryan	0.2	0.1	-0.05
Bullock	0.2	0.1	-0.06
de Blasio	0.1	0.0	-0.12
Messam	0.0	0.1	0.02
Sestak	0.0	0.0	0.01
DK/Other	14.1	15.2	1.26

As with limiting polls to those from pollsters with a B rating or better, the average “actual” difference is -0.04, while the average shift, regardless of direction, is 0.29. Other than Biden (+2.29) and “unlisted/unsure” (+1.26), no candidate did measurably better when limiting the analysis to these 79 national and state polls. Buttigieg (-0.80), Sanders (-0.61), Harris (-0.50) and Steyer (-0.38) all did somewhat worse but, again, only Steyer’s rank changed (11th to 17th).

Conclusion. While my algorithm may somewhat understate Biden’s strength—and overestimate Steyer’s—while not fully capturing the rate at which Warren is gaining support, the overall differences are so minor I see no reason to alter it.

Until next time…

Postscript. For those who are curious, here is a comparison, as of August 28, 2019, between NSW-WAPA and the RealClearPolitics (RCP) averages, combining national and state polls, using my weighting scheme. I exclude New York Senator Kirsten Gillibrand—who dropped out on August 28—and Miramar, FL Mayor Wayne Messam, who is not included in the RCP averages; if a candidate’s average was not listed by RCP (i.e., it was less than 0.5%), I assigned her/him a value of 0.25%.

Table 4: Comparing NSW-WAPA to RealClearPolitics averages for declared 2020 Democratic presidential nomination candidates

Candidate	All Polls	RealClearPolitics	*Difference*
Biden	27.5	27.8	0.3
Sanders	16.0	15.9	-0.1
Warren	13.2	15.3	2.1
Harris	9.2	10.7	1.5
Buttigieg	7.5	6.4	-1.1
O’Rourke	2.6	1.8	-0.8
Booker	2.2	2.0	-0.2
Klobuchar	1.5	1.8	0.3
Yang	1.1	1.5	0.4
Gabbard	0.9	1.4	0.5
Steyer	0.6	2.0	1.4
Castro	0.6	1.0	0.4
Delaney	0.4	0.5	0.1
Bennet	0.3	0.3	0.0
Williamson	0.3	0.3	0.0
Ryan	0.2	0.4	0.2
Bullock	0.2	0.3	0.1
de Blasio	0.1	0.4	0.3
Sestak	0.0	0.2	0.2
DK/Other	14.1	10.1	-4.0

Other than my NSW-WAPA underestimating the polling strength of Warren, California Senator Kamala Harris and Steyer, and overestimating the strength of O’Rourke, relative to the RCP averages, these differences are minimal; the much lower percentage not choosing a listed candidate in the RCP average could easily result from assigning 0.25% (likely too high) to unlisted candidates. The average actual difference is just 0.1, and the average difference, regardless of direction, is 0.7.

The next comprehensive update will come just before the September 12, 2019 debate.

[1] Emerson College, Monmouth, GBAO, CNN/SSRS, ABC News/Washington Post, Quinnipiac, YouGov, Suffolk University, WPA Intelligence, YouGov Blue, NBC News/Wall Street Journal, Perry Undem-YouGov, SurveyUSA, Public Policy Polling (PPP), IBD/TIPP, Reuters/Ipsos, Fox News

[2] This may be illusory for Booker. Taking a simple average of national polls, Booker was at 1.7% between the first and second Democratic debates, but he has risen to 2.7% since then.