2020 Iowa Caucuses: How did my polling averages fare?

Given the extremely volatile polling for the 2020 Democratic presidential nomination following the conclusion of the Iowa Caucuses, I will not provide global monthly updates for the next few months. Instead, I will focus on the first handful of primaries and caucuses: Iowa on February 3, New Hampshire on February 11, Nevada on February 22, South Carolina on February 29, the 14 Super Tuesday contests on March 3, and so forth.

Also: I now weight polls conducted partially after February 3, 2020 either 1.333 or 1.667 times higher, and polls conducted entirely after February 3, 2020 two times higher, than polls conducted entirely before February 4, 2020.

On the night of February 3, 2020, I was sitting on my usual spot on our sofa, watching MSNBC and anticipating returns from that day’s Iowa Caucuses.

Iowa Visitor Center Sep 1990

Earlier that day, I had published my final WAPA (weighted-adjusted polling average) for the 11 declared Democratic presidential candidates, calculated four different ways (Table 1):

  • Using all 58 polls conducted since January 1, 2019
  • Using only the 45 polls released since the 1st Democratic debate on June 26, 2019
  • Using only the 21 polls released since the 5th Democratic debate on November 19, 2019
  • Using only the 15 polls released since the 7th Democratic debate on January 14, 2020

Table 1: Final Iowa Caucuses WAPA for declared 2020 Democratic presidential nomination candidates

Candidate All Polls Since 1st Debate Since 5th Debate Since 7th Debate
Biden 19.9 19.8 20.1 20.3
Sanders 18.4 18.8 21.0 22.7
Warren 17.1 18.1 15.6 15.6
Buttigieg 15.9 16.8 16.7 16.7
Klobuchar 6.9 7.3 9.1 9.7
Yang 3.0 3.2 3.6 3.9
Steyer 2.8 3.1 3.1 3.5
Gabbard 1.5 1.6 1.5 1.6
Bloomberg 0.4 0.4 0.6 0.5
Bennet 0.3 0.3 0.2 0.3
Patrick 0.0 0.0 0.0 0.1
DK/Other 13.8 10.6 8.5 5.2

Based solely on these numbers, one would reasonably draw the following conclusions:

  • United States Senator (“Senator”) from Vermont Bernie Sanders and Minnesota Senator Amy Klobuchar were rising in the polls heading into the Iowa Caucuses, as to a lesser extent were entrepreneur Andrew Yang and businessman Tom Steyer.
  • Massachusetts Senator Elizabeth Warren was declining in the polls.
  • No other candidate was moving in the polls one way or the other.

By 11:37 pm EST, however, I had grown tired of waiting for results other than successive waves of entrance polls, so I tweeted the following:

RIP, Iowa Caucuses (1972-2020)

I have defended their idiosyncrasies for decades, believing the retail aspects of campaigning there outweighed the low-turnout mischegoss of the process.

 No more.

 This is ridiculous.

 #IowaCaucuses #iowacaucus2020

I will not relitigate here the myriad problems the Iowa Democratic Party had with tabulating, validating and releasing three distinct measures:

  1. Initial headcount of support for each Democratic candidate (“Initial tally”)
  2. Post-realignment headcount of support for each Democratic candidate (“Final tally”)
  3. Allocation of “state delegate equivalents,” or SDE’s, the only measure ever previously reported

Moreover, my annoyance has abated since Monday night, primarily because I suspect these vote-reporting snafus revealed that the byzantine process of converting persons standing in rooms, then possibly standing in different parts of the room, into SDE’s has always been “riddled with errors and inconsistencies,” to quote a recent New York Times headline. And if this marks the beginning of the end of using caucuses to allocate delegates to each party’s nominating conventions, so be it; they are undemocratic, exclusionary and overly complex.

As for which states “should” come first in future presidential nominating processes, I am currently agnostic.

Three days later, we finally have near-final results from the Iowa Caucuses (Table 2):

Table 2: Near-final Iowa Democratic Caucuses results, February 3, 2020

Candidate Initial Tally Final Tally SDE’s
Biden 15.0 13.7 15.8
Sanders 24.8 26.6 26.1
Warren 18.4 20.2 18.0
Buttigieg 21.3 25.0 26.2
Klobuchar 12.7 12.3 12.3
Yang 5.0 1.0 1.0
Steyer 1.7 0.2 0.3
Gabbard 0.2 0.0 0.0
Bloomberg 0.1 0.0 0.0
Bennet 0.1 0.0 0.0
Patrick 0.0 0.0 0.0
Uncommitted 0.6 0.1 0.2

The following three tables list the arithmetic differences between each candidate’s final Iowa Caucuses WAPA and each of the three reported measures; positive values indicate better performance in the Caucuses than in the polls.

Table 3: Arithmetic difference between Initial Iowa Caucuses % of vote and Iowa Caucuses WAPA

Candidate All Polls Since 1st Debate Since 5th Debate Since 7th Debate Mean Difference

Biden -4.9 -4.8 -5.1 -5.3 -5.0
Sanders 6.4 6.0 3.8 2.1 4.6
Warren 1.3 0.3 2.8 2.8 1.8
Buttigieg 5.4 4.5 4.6 4.6 4.8
Klobuchar 5.8 5.4 3.6 3.0 4.5
Yang 2.0 1.8 1.4 1.1 1.6
Steyer -1.1 -1.4 -1.4 -1.8 -1.4
Gabbard -1.3 -1.4 -1.3 -1.4 -1.4
Bloomberg -0.3 -0.3 -0.5 -0.4 -0.4
Bennet -0.2 -0.2 -0.1 -0.2 -0.2
Patrick 0.0 0.0 0.0 -0.1 0.0
DK/Other -13.2 -10.0 -7.9 -4.6 -8.9

Initial tally. If the Iowa Caucuses were instead the Iowa Primary, this would have been the only vote reported. On this measure Sanders, Klobuchar and former South Bend, IN Mayor Pete Buttigieg averaged 4.5-4.8 percentage points (“points”) higher in the initial tally than in their WAPA. And the closer in time the polls were to the Iowa Caucuses, the more “accurate” the WAPA.

Warren (+1.8 points) and Yang (+1.6) also overperformed their WAPA in the initial tally, albeit by smaller margins. And for Warren, older polls were more predictive than recent polls.

By contrast, former Vice President Joe Biden did an average of 5.0 points worse in the initial Iowa Caucuses tally than his WAPA. Steyer and United States House of Representatives Member from Hawaii Tulsi Gabbard (-1.4 each) also performed somewhat worse than their WAPA.
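The arithmetic behind Table 3 is simple enough to reproduce. The following is a minimal sketch, not the spreadsheet actually used for this post: it takes the rounded WAPA values from Table 1 and the rounded initial-tally percentages from Table 2 for five candidates, subtracts one from the other, and averages the four differences. The function and variable names are illustrative assumptions.

```python
# WAPA values copied from Table 1 and initial-tally percentages from Table 2.
# Column order: all polls, since 1st debate, since 5th debate, since 7th debate.
wapa = {
    "Biden": [19.9, 19.8, 20.1, 20.3],
    "Sanders": [18.4, 18.8, 21.0, 22.7],
    "Warren": [17.1, 18.1, 15.6, 15.6],
    "Buttigieg": [15.9, 16.8, 16.7, 16.7],
    "Klobuchar": [6.9, 7.3, 9.1, 9.7],
}
initial_tally = {
    "Biden": 15.0,
    "Sanders": 24.8,
    "Warren": 18.4,
    "Buttigieg": 21.3,
    "Klobuchar": 12.7,
}


def differences(candidate):
    """Result minus WAPA for each polling window; positive means the candidate beat the polls."""
    return [initial_tally[candidate] - w for w in wapa[candidate]]


def mean_difference(candidate):
    """Unweighted mean of the four window-specific differences."""
    diffs = differences(candidate)
    return sum(diffs) / len(diffs)


for name in wapa:
    rounded = [round(d, 1) for d in differences(name)]
    print(name, rounded, round(mean_difference(name), 1))
```

Because the inputs are already rounded to one decimal place, a recomputed mean that falls exactly on a rounding boundary may differ from the published value by 0.1.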

Table 4: Arithmetic difference between Final Iowa Caucuses % of vote and Iowa Caucuses WAPA

Candidate All Polls Since 1st Debate Since 5th Debate Since 7th Debate Mean Difference

Biden -6.2 -6.1 -6.4 -6.6 -6.3
Sanders 8.2 7.8 5.6 3.9 6.4
Warren 3.1 2.1 4.6 4.6 3.6
Buttigieg 9.1 8.2 8.3 8.3 8.5
Klobuchar 5.4 5.0 3.2 2.6 4.1
Yang -2.0 -2.2 -2.6 -2.9 -2.4
Steyer -2.6 -2.9 -2.9 -3.3 -2.9
Gabbard -1.5 -1.6 -1.5 -1.6 -1.6
Bloomberg -0.4 -0.4 -0.6 -0.5 -0.5
Bennet -0.3 -0.3 -0.2 -0.3 -0.3
Patrick 0.0 0.0 0.0 -0.1 0.0
DK/Other -13.7 -10.5 -8.4 -5.1 -9.4

Final tally. Only three candidates improved their vote totals after supporters of non-viable candidates shifted to a viable candidate (one supported by at least 15% of attendees at a precinct caucus):

  • Buttigieg (+5,638 supporters; +3.7 points)
  • Warren (+2,238; +1.8)
  • Sanders (+2,155; +1.8)

These three candidates, as well as Klobuchar (-1,288; -0.4), performed better in the final tally than their WAPA, on average. As with the initial tally, WAPA using more recent polls was most predictive for Sanders, Buttigieg and Klobuchar, while WAPA using older polls was most predictive for Warren.

Biden, on the other hand, lost 2,693 supporters and dropped 1.3 points between the initial and final tallies; Yang and Steyer also lost considerable support between the initial and final tallies. For all three candidates, WAPA using earlier polls was most predictive.

Table 5: Arithmetic difference between Iowa Caucuses SDE % and Iowa Caucuses WAPA

Candidate All Polls Since 1st Debate Since 5th Debate Since 7th Debate Mean Difference

Biden -4.1 -4.0 -4.3 -4.5 -4.2
Sanders 7.7 7.3 5.1 3.4 5.9
Warren 0.9 -0.1 2.4 2.4 1.4
Buttigieg 10.3 9.4 9.5 9.5 9.7
Klobuchar 5.4 5.0 3.2 2.6 4.1
Yang -2.0 -2.2 -2.6 -2.9 -2.4
Steyer -2.5 -2.8 -2.8 -3.2 -2.8
Gabbard -1.5 -1.6 -1.5 -1.6 -1.6
Bloomberg -0.4 -0.4 -0.6 -0.5 -0.5
Bennet -0.3 -0.3 -0.2 -0.3 -0.3
Patrick 0.0 0.0 0.0 -0.1 0.0
DK/Other -13.6 -10.4 -8.3 -5.0 -9.3

SDEs. The same pattern holds for SDEs as for the final vote tally, with one minor modification:

  • Buttigieg, Sanders and Klobuchar outperformed their WAPA, with the difference decreasing with more recent polls
  • Warren outperformed her WAPA, with the difference increasing with more recent polls
  • Biden, Steyer and Yang underperformed their WAPA, with the difference increasing with more recent polls.

The bottom line. To evaluate these comparisons globally, I used the sum of the squared differences (“SSE”) between each WAPA value and the results value. Excluding “DK/Other,” Table 6 lists the SSE for each comparison; higher values indicate lower predictive power.

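Table 6: Sum of squared differences (SSE) between Iowa Caucuses WAPA and each reported measure, excluding DK/Other

Polling period Initial Tally Final Tally SDEs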
All Polls 136.5 240.5 224.9
Since 1st Debate 115.8 210.8 198.2
Since 5th Debate 88.3 190.4 168.0
Since 7th Debate 77.1 177.8 156.1
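The SSE values in Table 6 can be recomputed from the rounded figures in Tables 1 and 2. The following is a minimal sketch under that assumption: for each polling window and each reported measure, it squares the difference between result and WAPA for the 11 candidates (DK/Other excluded) and sums the squares. The names `WAPA`, `RESULTS` and `sse` are illustrative, not taken from the original workbook.

```python
WINDOWS = ["All Polls", "Since 1st Debate", "Since 5th Debate", "Since 7th Debate"]
MEASURES = ["Initial Tally", "Final Tally", "SDEs"]

# Table 1: WAPA by polling window, in the order given by WINDOWS.
WAPA = {
    "Biden": [19.9, 19.8, 20.1, 20.3],
    "Sanders": [18.4, 18.8, 21.0, 22.7],
    "Warren": [17.1, 18.1, 15.6, 15.6],
    "Buttigieg": [15.9, 16.8, 16.7, 16.7],
    "Klobuchar": [6.9, 7.3, 9.1, 9.7],
    "Yang": [3.0, 3.2, 3.6, 3.9],
    "Steyer": [2.8, 3.1, 3.1, 3.5],
    "Gabbard": [1.5, 1.6, 1.5, 1.6],
    "Bloomberg": [0.4, 0.4, 0.6, 0.5],
    "Bennet": [0.3, 0.3, 0.2, 0.3],
    "Patrick": [0.0, 0.0, 0.0, 0.1],
}

# Table 2: reported results, in the order given by MEASURES.
RESULTS = {
    "Biden": [15.0, 13.7, 15.8],
    "Sanders": [24.8, 26.6, 26.1],
    "Warren": [18.4, 20.2, 18.0],
    "Buttigieg": [21.3, 25.0, 26.2],
    "Klobuchar": [12.7, 12.3, 12.3],
    "Yang": [5.0, 1.0, 1.0],
    "Steyer": [1.7, 0.2, 0.3],
    "Gabbard": [0.2, 0.0, 0.0],
    "Bloomberg": [0.1, 0.0, 0.0],
    "Bennet": [0.1, 0.0, 0.0],
    "Patrick": [0.0, 0.0, 0.0],
}


def sse(window, measure):
    """Sum of squared (result - WAPA) differences across candidates.

    `window` indexes WINDOWS and `measure` indexes MEASURES.
    """
    return sum((RESULTS[c][measure] - WAPA[c][window]) ** 2 for c in WAPA)


for w, label in enumerate(WINDOWS):
    row = [round(sse(w, m), 1) for m in range(len(MEASURES))]
    print(label, row)
```

Squaring the differences means one large miss (Buttigieg's SDE share, for instance) weighs far more heavily than several small ones, which is part of why the final tally and SDE columns are so much larger than the initial tally column.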

WAPA was most predictive of the initial tally, which is not surprising given that poll respondents are asked which candidate they plan to support upon arriving at the caucus site, and not about second or third choices. WAPA was also slightly more predictive of the distribution of SDEs than of the final raw tally of supporters, though neither was especially predictive.

For each reported measure, WAPA was more predictive the closer the polls were to the Caucuses; I will admit this rather surprised me, given the candidate-specific differences detailed above. One explanation is that including older polls, however low-weighted, masks late polling movement of the kind seen for Sanders, Buttigieg and Klobuchar.

For now, however, I will continue to report multiple versions of WAPA, if only to see if this pattern holds for later contests.

Now, on to New Hampshire!

Until next time…

Rituals and obsessions: a brief personal history

It started with “Taxman” by The Beatles.

Its distorted vocal opening had gotten stuck in my head despite my stated antipathy toward the band—really more pose than position, in retrospect.

Whenever I run a bath, I like to be in the tub while the faucet(s) run. Until quite recently,[1] when the tub was nearly full, I would turn off the cold water and turn on the hot water to its scalding limit, counting down “one-two-three-four, one-two-three-four, one-two-three-four, one-two-three-four” in the same slow tempo as the opening of “Taxman.” Only then would I turn off the hot water and settle in for a steamy cleansing soak.

I realize the actual track opens with “one-two-three-four, one-two” before George Harrison sings “Let me tell you how it will be/There’s one for you, nineteen for me.”

But, hey, my ritual, my rules.

At some point, I stopped employing that ritual to start a bath—only to replace it with one for exiting a bath, even as most of the water had drained around me. During my senior year at Yale, two other seniors and I lived off-campus. Our second-floor walkup had a bathtub, which I used most nights. One night, for…reasons, before the water fully drained, I squatted down and scooped up some water, quickly shaking it out of my hands as though I had just washed my hands in a sink. I repeated that sequence twice, except on the third iteration, I stood up, shaking out my hands as I did so. Only then did I step onto the bath mat.

I have performed this ritual—or some slight variant of it—every single time I have exited a bathtub since the fall of 1987. It is not as though I expect something bad will happen if I do not do so—I am not warding off anxiety; when that particular coin is flipped, it lands on depression for me nearly every time. It is simply that having started doing it, I continued to do it, making it an essential part of my bathtub “routine.”

Funnily enough, I have yet to mention this routine to my psychotherapist.

**********

In a recent post, I detailed ways the Netflix series Stranger Things had resonated with me at a deeply personal level. As of the evening of December 26, my wife Nell and I had watched the entire series—25 episodes over three seasons—twice, the second time with our two pre-teen daughters. Nell’s pithy takeaway: “I would watch it again.” Our younger daughter may already have, quietly watching in her bedroom on her new iPad. She now very much wants her friends to watch the show so she can discuss it with them…or at least have them understand why she suddenly—and with great affection—calls folks, mainly me, “mouth breather” or “dingus.”

Meanwhile, over the course of winter break, a small army of Funko Pop! figures appeared in our home, which our younger daughter arranged in rough chronological order; the short video I took of the sequence is my first ever “pinned” tweet.

Stranger Things tower.JPG

Clearly, I am not the only member of this household now utterly obsessed with the admittedly-excellent series. And one peek inside our younger daughter’s room, decorated in true Hufflepuff fashion, will reveal I am not the only member of this household who easily becomes obsessed.

But I am one of only two members of this household legally old enough to purchase and/or consume alcohol, and I am the only one who refused to drink alcohol until well into my college years—even as my high school classmates would try to get me to join them in beer drinking as we stayed in hotels for Youth in Government or Model UN—because I was very wary of my obsessive nature. I was well aware how often I could not simply enjoy something—I had to fully absorb it into my life.

Indeed, once I did finally sample that first Molson Golden in the converted basement seminar room I shared with two other Elis sophomore year, I liked it far more than I would have anticipated from sampling my father’s watered-down beer at various sporting events. Age prevented me from drinking too much, though, until I turned 21 early in my senior year. On my birthday, those same off-campus roommates took me to a local eatery called Gentree. An utter novice at drinking anything other than beer, I had no clue what to order; the gin and tonic I settled upon did nothing for me. Shortly thereafter, after a brief flirtation with Martini and Rossi (I still do not know how that bottle appeared in our apartment), I tried my first Scotch whisky.

It was love at first sip.

Over the next few years, I never drank enough for anyone to become, you know, concerned, but I did feel like I needed to have a glass of J&B or Cutty Sark with soda water—usually lemon Polar Seltzer—every day. When a close friend came to visit me in the Boston suburb of Somerville in January 1992, he presented me with a bottle of Glenfiddich—one of the better single-malt Scotches—and it was like having a revelation within a revelation, as this photograph from that night depicts.

Glenfiddich Jan 1992.jpg

This photograph reminds me I spent the 1990s and a significant chunk of the following decade living in turtlenecks—of all colors—because I decided one day while getting my hair cut, I liked the way the white cloth band looked around my neck. You know, the one hair stylists use to keep freshly-cut hair from dropping inside your shirt.

Eventually, I settled on Johnnie Walker Black (light rocks, club soda on the side[2]) as my primary poison—though I also developed a taste for a port wine called Fonseca Bin 27. Between 1991 and 1993, I spent way too much time at the bar of a terrific restaurant called Christopher’s. In 2005, I used old credit card receipts, which I had stuffed into a desk drawer for years, to calculate I spent $1,939.23 there (roughly $3,500 in 2019 dollars) in just those three years—and that sum excludes cash payments. Apparently, a hallmark of being both obsessive and a math geek is the construction of Microsoft Excel spreadsheets to calculate inconsequential values.

It would be another 10 years before I worked Scotch into my emerging Friday night bath ritual—the one with the curated music and the darkness and the single large pine-scented candle from L.L. Bean and the lavender milk bath stuff and the way I would turn off every light before walking into the candle-lit bathroom with my full tumbler of Johnnie Walker Black, or 10-year-old Laphroaig on special occasions. Ahh, that delectably peaty aroma…

More recently, Nell and I moved away from beer and whisky, respectively, toward red wine, going so far as to join Wine of the Month Club. Well, I also developed a taste for rye whisky, be it neat, mixed with ginger ale or in an Old Fashioned.

The point of this borderline-dipsomaniac history is that my high school instincts about my obsessive nature were remarkably close to the mark. Prior to being diagnosed with depression, I self-medicated with alcohol far more than I ever wanted to admit to myself. Perhaps not coincidentally, I recently cut my alcohol consumption down to almost nothing, though my stated reason is the toll it was taking on my sinuses, which have had more than enough trouble already.[3]

**********

Family lore holds I learned to read at the age of 2½, which my elementary school educator wife tells me is physiologically impossible. Whenever it was, by the time I was eight or so, I had already amassed a solid library of books.

And then I learned about the Dewey Decimal System.

With that, it no longer sufficed to organize my books alphabetically by subject or author or title, or even to use the Library of Congress classification system. No, I had to Dewey-Decimalize them, which meant going to Ludington Library, where I spent a great deal of my childhood and teenage years, to photocopy page after page of classification numbers. I still have a few books from those days, penciled numbers in my childish handwriting on the first page just inside the cover. I even briefly ran an actual lending library out of my ground-floor playroom—the one rebuilt after the fire of March 1973.

Meanwhile, my mother, our Keeshond Luvey and I spent the summers of 1974 and 1975 living in the “penthouse” of the Strand Motel in Atlantic City, NJ; my father would make the 60-mile drive southeast from Havertown, PA most weekends. In those years, the roughly 2½ miles of Pacific Avenue between Albany and New Hampshire Avenues were dotted with cheap motels and past-their-time hotels. The Strand was one of the better motels, with a decent Italian restaurant just off the lobby, dimly lit with its semi-circular booths upholstered in blood-red leather; I drank many a Shirley Temple over plates of spaghetti there. In that lobby, as in every lobby of every motel and hotel along the strip, was a large wooden rack containing copies of a few dozen pamphlets advertising local attractions.

At first, I simply took a few pamphlets from the Strand lobby to peruse later. Then I wanted all of them. Then I began to prowl the lobbies—yes, at seven, eight years old I rode the jitney by myself during the day, at just 35¢ a ride—of every motel and hotel along Pacific Avenue, and a few along Atlantic Avenue one block northwest, collecting every pamphlet I could find. They were all tossed into a cardboard box; when the winter felt like it was lasting too long, I would dump the box out on my parents’ bed and reminisce.

In the year after that second summer, I became attuned to pop music, leaving Philadelphia’s premier Top 40 radio station, WIFI 92.5 FM, on in my bedroom for hours at a time, while I did homework, read or worked diligently on…projects.

Back in 1973, my parents had bought me a World Book Encyclopedia set, complete with the largest dictionaries I had ever seen. The W-Z volume had a comprehensive timeline of key events in world history. Late in 1976, I received a copy of the 1977 World Almanac and Book of Facts, which also had a comprehensive timeline of key events in world history. And I soon noticed some events were on one timeline but not the other.

Thus, in February 1977, with WIFI 92 as my personal soundtrack, I began to write out a collated timeline, drawing from both sources. Thirty-six lined notebook pages hand-written in pencil later, I had only gotten as far as June 30, 1841—so I decided to slap a red construction paper cover on it and call it Volume I.

Important Events and Dates.JPG

I assigned it Dewey Decimal value 909.

You could say I came to my senses—or I bought a copy of the astounding Encyclopedia of World History—because I never did “publish” a Volume II. In April 1978,[4] however, I wrote a similarly non-knowledge-advancing booklet—no cool cover this time—called 474 PREFIXES, ROOTS AND SUFFIXES. This volume, assigned Dewey Decimal number 423, was only 10 pages long, despite being more comprehensive.

**********

Even before I immersed myself in hours of 1970s Top 40 radio, I had heard bits and pieces of New Year’s Eve countdowns of the year’s top songs. The first one I remember hearing was at the end of 1974, because I heard Elton John’s “Bennie and the Jets,” which topped the Billboard Hot 100 in April 1974—though I could be mixing it up with John’s “Goodbye Yellow Brick Road,” released as a single the previous year.

In January 1980, Solid Gold debuted with a two-hour special counting down the top 50 songs of 1979. I was particularly curious to know the ranking of my favorite song at the time, Fleetwood Mac’s “Tusk;” if memory serves, it led off the show at #50. A few days earlier, my cousins and I had listened in the house we then shared to WIFI-92’s top 100 songs of 1979 countdown.

I was vaguely aware there were weekly magazines that tracked top songs and albums, but I did not buy a copy of Cashbox until late April 1980.[5] My Scotch whisky revelation nearly eight years later was a mere passing fancy compared to this slender combination of music and data. I pored over its charts for hours, even calling my best friend to all but read the singles and album charts to him; utterly uninterested, he was nonetheless very patient with my exuberance. That fall, I noticed that every Saturday, the Philadelphia Bulletin published that week’s Billboard top 10 singles, albums—and two other categories, possibly country and soul. Reading these charts—literally covering them with a napkin which I slid up to uncover each song/album from #10 to #1—became a staple ritual of my regular Saturday morning brunch with my father, from whom my mother had separated in March 1977. Not satisfied with reading them, I clipped each set of charts so I could create my own rankings along the lines of “top songs, September 1980 to March 1981.”

On December 31, 1980 and January 1, 1981, I heard two radio stations present their “Top 100 of 1980” countdowns. I listened to the first one with my cousins in my maternal grandmother’s apartment in Lancaster, PA; my mother and her sister were also there. The second one my mother and I heard in the car driving home, although we lost the signal halfway through the countdown; I still was able to hear one of my favorite songs then: “More Love” by Kim Carnes. The following weekend, I found a paper copy of yet another 1980 countdown while visiting the Neshaminy Mall with my mother and severely mentally-impaired sister, who lives near there. It was probably there I also found Billboard’s yearend edition, which I purchased—or my mother purchased for me.

After a delirious week perusing its contents, I obtained a copy of the first official weekly Billboard of 1981, for the week ending January 10—albeit released Tuesday, January 6. One week later, I bought the January 17 edition, then the January 24 edition, then the January 31 edition. In fact, I bought every single issue of Billboard for the next seven-plus years, ritualistically digesting its charts using the same uncovering method as the charts published in the Bulletin. I brought each issue to school with me, where my friends and I would pore over its contents during lunch period. Later, I happily scrutinized airplay charts from a selection of Top 40 radio stations across the country—I underlined particular favorites—while waiting to make deliveries for Boardwalk Pizza and Subs in the spring and summer of 1984.

On the few occasions I did not have the $4 purchase price, I sold an album or two to Plastic Fantastic, then located on Lancaster Avenue in Bryn Mawr, PA, to make up the difference; this was after cajoling my mother to drive me to the excellent newspaper and magazine store which then stood a short walk down Lancaster Avenue from Plastic Fantastic. While new issues of Billboard were released every Tuesday, in 1981 and 1982, I would have heard the new week’s Top 40 singles counted down the previous Sunday night on the American Top 40 radio program, then hosted by Casey Kasem.

Sometime in 1981, I began to compile weekly lists of the Top 10 groups, male artists and female artists…so it is not at all surprising that over winter break from my sophomore year of high school, I calculated my own “Top 100 of 1981” lists. In the days prior to Excel, this meant I gathered all 51 weekly issues (the final chart of the year freezes for a week) into what I would later call a “mountain of Billboards” on the floor of my bedroom—sometimes the mountain would migrate into the living room—and tallied every single and album that had appeared in the top 10 on blank sheets of paper, using acronyms to save my hands from cramping. I used a combination of highest chart position, weeks at that position, total weeks on the chart, and weeks topping such charts as Adult Contemporary, Rock, Country and Soul to generate my rankings. There would always be fewer than 100 singles or albums entering the top 10 in any given year, so I would then move into the top 20 for singles and top 30 for albums. I had ways—long since forgotten—of adding up an artist’s singles and albums “points,” allowing me to produce an overall top 100 artist countdown.

Digging into my record collection, and pestering friends for whatever tracks they had, on January 1, 1982, I sat in my bedroom with my cousin and DJ’d my first Top 100 countdown, using a snippet of “Lucifer” by Alan Parsons Project for “commercial breaks.”

That first year, I stuck to the primary charts, but ambition seized me over the next few years, and I began to contemplate creating sub-generic lists; I would usually run out of steam after a week or so, however. Fueling this obsessive data compiling were large navy mugs filled with a mixture of black coffee and eggnog. Even after enrolling at Yale in September 1984,[6] I would look forward to arriving back in our Penn Valley, PA apartment so I could dive into Billboard mountain and immerse myself in that year’s charts. I would come up for air to visit with family and friends, of course, but then it was right back into the pile, MTV playing on my bedroom television set.

Over the years, I never threw any issues away, which meant schlepping them with me on the Amtrak train from New Haven, CT to Philadelphia; my poor mother had to move giant piles of them twice, in 1986 (~275 issues) and 1987 (~325). They were a bit lighter then because I had gotten into the habit of taping some of the beautiful full-page ads depicting covers of albums being promoted that week. It started with Icehouse by Icehouse, then Asia by Asia; when my mother moved from our Penn Valley apartment, I had taped up a line of pages running nearly halfway around the walls of my bedroom.

Then, one week in September 1988, I did not buy the new edition of Billboard. Most likely, my musical tastes were shifting after I discovered alternative-rock station WHFS. Another explanation is that election data had been slowly replacing music chart data over the past four years. Moreover, I had landed on a new obsession: baseball, specifically the Philadelphia Phillies. Whatever the reason, I have not bought a Billboard since then, though I still have two Joel-Whitburn-compiled books from the late 1980s.

Besides the Phillies and American politics, I have had a wide range of obsessions since then, most recently film noir, Doctor Who, David Lynch/Twin Peaks and, of course, Stranger Things. My obsession with Charlie Chan is old news. But none of these had quite the immersive allure those piles of Billboards had in the 1980s.

Alas, my mother finally threw out all of them in the 1990s. While I wish she had at least saved the eight yearend issues, perhaps it is all for the best. Did I mention a college girlfriend once broke up with me—on Valentine’s Day no less—because I alphabetized my collection of button-down Oxford shirts by color, solids to the left of stripes?

Until next time…

[1] Nell reminds me that at some point in the year before our October 2007 wedding, she came into the bathroom while I was counting down. She apparently interrupted me because I told her, “Now I have to start again!”

[2] For reasons long since forgotten, I switched to Jack Daniels—bourbon—for a few years around 2000. I must have talked a lot about that being my default adult beverage order, because on a first date in December 2000, my soon-to-be girlfriend (my last serious relationship before Nell, for those keeping score at home) waited expectantly for me to ask for “that thing you always order.”

[3] I have long joked that if my upper respiratory system were a building, it would have been condemned decades earlier. In October 2011, I finally had surgery to repair a deviated septum and remove nasal polyps. I may still snore, but it no longer sounds like I am about to stop breathing.

[4] April 19, to be exact

[5] I remember “Rock Lobster” by The B-52’s being listed, which narrows the editions to April 19 and April 26.

[6] I was so obsessed with Billboard, I actually suggested I analyze its charts for a data analysis course I took my sophomore year. Not surprisingly, that was a non-starter with the professor.

Reaching milestones of my own invention

In my last post, I described how a great friend of mine and I exchange generous Amazon gift cards for our birthdays. One gift I have already used this year’s card to purchase is this four-DVD film noir box set:

Filn Noir collection.JPG

Filn Noir collection--titles.JPG

Of the four titles in this no-frills set (the only extras are trailers for every film except Storm Fear), the only one I had already seen was He Ran All the Way. Both the surprisingly-well-made Storm Fear and the classic He Ran All the Way are superb examples of what could be called “hostage noir.” Other examples would be Suddenly—featuring a spellbindingly psychotic Frank Sinatra; Blind Alley and its 1948 remake The Dark Past; and the underrated gem Dial 1119.

Witness to Murder, despite featuring Barbara Stanwyck, George Sanders and Gary Merrill, is a watered-down version of the brilliant Rear Window; what redeems it is mesmerizing black-and-white cinematography by the ground-breaking John Alton. The titular witness, Stanwyck, does her best with the material, including a hard-to-swallow romance with Merrill’s homicide detective. Sanders, however, is believably menacing and creepy as he-who-is-witnessed; no spoilers here, as the trailer itself reveals Sanders is the killer.

A Bullet For Joey is a 1955 film best described as “bonkers,” albeit generally entertaining. Edward G. Robinson, terrific as always, is wasted as a homicide Inspector—working in a Montreal which looks suspiciously like Los Angeles, and where nobody speaks with a Canadian accent. Audrey Totter looks bored, and George Raft is—well, George Raft, wooden yet strangely charming. Both Robinson and Raft had great success in early 1930s gangster films, but while Robinson seamlessly shifted to other roles, Raft always seemed stuck around 1931. To be fair, Raft is quite good as a homicide detective in a 1954 film I quite like called Black Widow, a rare example of color film noir from the “classic” era, roughly 1940 to roughly 1960.

**********

But wait, IS Black Widow a film noir?

Nearly two-and-a-half years ago, I wrote about the “personal journey” I had taken to become a devoted fan of film noir. Two months later, a conversation with my wife Nell about career paths inspired me to write the book I am close to finishing (working title: Interrogating Memory: Film Noir Spurs a Deep Dive into My Family’s History…and My Own). My original plan was simply to flesh out the multiple facets of my personal journey into book-length form, but it quickly morphed into a full-on investigation of…what the working title sums up nicely.

In that May 2017 film noir post, I introduced my quantitative film noir research project. Essentially, I collected as many published film noir lists (either in a book or on a credible website) as I could find. These lists could be explicit (encyclopedias, dictionaries, guides, filmographies) or implicit (discussed as film noir within the text of a book about film noir), and needed to include a minimum of 120 films.

Ultimately, I acquired 32 such lists, from which I created an Excel database of 4,825 films at least one “expert” labelled film noir, however indirectly. From these data I calculated a score cleverly called “LISTS,” which denotes how many lists feature that title. The idea is simple: the more film noir lists on which a film appears, the more widely it is considered film noir. Just to be perfectly clear, this is not a measure of how “noir” a film is, merely how often it is cited by acknowledged experts as noir. To date, no agreed-upon definition of “film noir” exists.
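For the curious, here is a minimal sketch of how a LISTS-style tally could be computed. The three tiny source lists and their names are hypothetical stand-ins (the real database was built in Excel from 32 lists), so treat this as an illustration of the counting idea only.

```python
from collections import Counter

# Hypothetical mini "source lists"; the real project drew on 32 much longer lists.
source_lists = {
    "list_a": {"Double Indemnity", "Kiss Me Deadly", "Black Widow"},
    "list_b": {"Double Indemnity", "The Maltese Falcon"},
    "list_c": {"Double Indemnity", "Kiss Me Deadly", "The Maltese Falcon"},
}


def lists_score(sources):
    """Count, for every title, how many source lists include it."""
    counts = Counter()
    for titles in sources.values():
        # set() guards against a title being repeated within a single list
        counts.update(set(titles))
    return counts


scores = lists_score(source_lists)
for title, n in scores.most_common():
    print(f"{title}: {n}")
```

A title's score rises by one for each list that cites it, no matter how prominently, which is why the score measures breadth of citation rather than how "noir" a film is.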

Somewhat to my surprise, only four films appear on all 32 lists: Double Indemnity, Kiss Me Deadly, The Maltese Falcon and The Postman Always Rings Twice; not surprisingly, these are exemplary films noir. Along those lines, only 201 titles (4.2%) appear on as many as 20 lists, and only 478 titles (9.9%) appear on as many as 12 lists; at the opposite end, just under half of the films appear on only one list.

Using additional information from 1) 13 shorter lists and 2) lists within lists, such as the 50-film Canon in The Rough Guide to Film Noir[i], I next calculated a score called “POINTS.” The maximum number of POINTS a film can receive is 67.5; Double Indemnity comes closest with 62.0 POINTS, followed by Out of the Past (59.0); The Maltese Falcon (58.0); Kiss Me Deadly (54.0) and Murder, My Sweet (53.5). As with LISTS, shockingly few films had as many as 20 POINTS (249, or 5.2%), while only 515 (10.7%) had as many as 12 POINTS. Just under half (48.2%) of films had only one POINT; by definition, they appeared on only one list as well.

You may review my 46 total sources and POINT-allotment system here: Film Noir Database Sources.

Based upon the similar distributions of LISTS and POINTS[ii], every title is classified as Universal (≥12 LISTS or POINTS), Debatable (>5, <12 LISTS or POINTS) or Idiosyncratic (≤5 LISTS or POINTS); the percentage of films in each category is roughly 10%, 10% and 80%, respectively.

So, to answer the question with which I opened this section: Black Widow has 7 LISTS and 8.5 POINTS, putting it squarely in the Debatable category. I encourage you to watch it and draw your own conclusions.
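The three-way classification can be sketched in a few lines. One assumption to flag loudly: the text says "LISTS or POINTS" without saying what happens when the two disagree, so this sketch lets the higher of the two decide; the function name is also just illustrative.

```python
def classify(lists, points):
    """Assign a category using the thresholds given in the text.

    Assumption (not stated in the post): when LISTS and POINTS fall in
    different bands, the higher of the two scores decides.
    """
    score = max(lists, points)
    if score >= 12:
        return "Universal"
    if score > 5:
        return "Debatable"
    return "Idiosyncratic"


print(classify(7, 8.5))    # Black Widow: 7 LISTS, 8.5 POINTS
print(classify(32, 58.0))  # The Maltese Falcon: 32 LISTS, 58.0 POINTS
```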

**********

When I first wrote about my film noir fandom “journey” in May 2017, I had seen 558 (11.6%) of the films in the database. Incrementally increasing the LISTS minimum from 1 to 20, the percentage of films I had seen increased steadily to 87.1%. And the films I had seen comprised well over 30% of total LISTS and 40% of total POINTS; unfortunately, I failed to record the precise percentages at the time.

However, through my recent viewing of Storm Fear, every one of those values has increased. I have now seen 698 (14.5%) of the 4,825 films in the database; that is 140 first-time film noir viewings in nearly 30 months, or nearly five titles a month. Updating the original breakdown:

Any film        698/4,825=14.5%

LISTS≥3        564/1,613 =35.0%

LISTS≥6        470/890    =52.8%

LISTS≥12       362/478    =75.7%

LISTS≥15      308/364    =84.6%

LISTS≥20      193/201    =96.0%
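The breakdown above is simple division, and a short sketch makes it easy to recheck. The counts are the ones reported in the breakdown; the dictionary layout and the helper name `pct` are merely illustrative.

```python
# (films seen, films in database) at each LISTS threshold, as reported above
breakdown = {
    "Any film": (698, 4825),
    "LISTS>=3": (564, 1613),
    "LISTS>=6": (470, 890),
    "LISTS>=12": (362, 478),
    "LISTS>=15": (308, 364),
    "LISTS>=20": (193, 201),
}


def pct(seen, total):
    """Percentage of films seen, rounded to one decimal place."""
    return round(100 * seen / total, 1)


for label, (seen, total) in breakdown.items():
    print(f"{label:<10} {seen}/{total:,} = {pct(seen, total)}%")
```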

As of this writing, the only films with LISTS≥20 I have yet to see are The Devil Thumbs a Ride, Suspense, Kiss the Blood Off My Hands, Rogue Cop, Nightmare, The Thief, New York Confidential and World For Ransom. The bottom line, however, is that the 698 films I have seen total 8,887 LISTS, or 46.3% of all LISTS in the database, putting me 705 total LISTS shy of a majority. I could reach that milestone by watching the top 40 films, by LISTS, I have yet to see, which I very much look forward to doing.

Meanwhile, when my DVD set arrived, I had seen 695 films totaling 10,735 POINTS, or 49.85% of all POINTS in the database. Witness to Murder (19 LISTS, 19 POINTS) got me to 49.94%, while A Bullet For Joey (10,10) got me to 49.98%. And…after watching Storm Fear (16,16), I was at 10,780 POINTS, which is 50.06% of the 21,534.5 POINTS in the database.
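The march past the 50% mark can be retraced with a running total. The figures are the ones quoted in the paragraph above; the variable names are just for illustration.

```python
TOTAL_POINTS = 21534.5  # all POINTS in the database, per the text
running = 10735.0       # POINTS already seen when the DVD set arrived

# (title, POINTS) in the viewing order described above
viewings = [
    ("Witness to Murder", 19),
    ("A Bullet For Joey", 10),
    ("Storm Fear", 16),
]


def share(points):
    """Percentage of all database POINTS represented by `points`."""
    return 100 * points / TOTAL_POINTS


history = [("before the set", running, share(running))]
for title, pts in viewings:
    running += pts
    history.append((title, running, share(running)))

for label, total, pct_seen in history:
    print(f"{label:<18} {total:>8.1f} POINTS  {pct_seen:.2f}%")
```

Only the third film tips the running share over one half, which matches the order of events in the text.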

Having seen a set of films comprising a majority of all POINTS in my film noir database is a milestone I invented, but that makes it no less fun to celebrate.

**********

Speaking of milestones…I am extremely reluctant to tout my blog statistics. I write on this site because I think I have something interesting to say, not for accolades or gaudy view numbers—not that I am averse to either, mind you.

This reticence, to be honest, stems in large part from the statistics themselves: as I approach the end of three years writing on this site, I have “only” 109 followers, and my posts have been viewed “only” 8,814 times. Still, the rate of increase for both (and the latter especially) has been steadily accelerating over time. And I greatly appreciate every single follower and view, even the fellow on Twitter who said that someone to whom he had shown this post (which I published two years ago today) had called it “trash.”

And, to be fair, a number of my posts have been (relatively) widely read. In fact, in September 2018, Film Noir: A Personal Journey became my second post to receive 100 views; it has now been viewed 148 times. One month later, this post on now-Associate-Justice Brett Kavanaugh became the third to reach that milestone, and last month it topped 200 views, my second-ever post to do so. It has now been viewed 215 times, while five posts in total have now topped 100 views—133 or more views, actually.

So which post beat “Personal Journey” to 100 views and “Kavanaugh” to 200 views?

It was one I wrote on a lark as I began to write the “Charlie Chan” chapter of my book, the one in which I describe how my love of classic black-and-white crime and mystery films was predicated upon my discovery—just shy of my 10th birthday—of the 20th Century Fox Charlie Chan films of the late 1930s and early 1940s[iii]. Collecting information about those films, I built an SPSS database containing, among other data, how various organizations and critics rated those films. Combining those data into a single value, I was able to “rank” every Charlie Chan film in relative quality from lowest to highest.

I published Ranking Every Charlie Chan Film on August 26, 2017 to what could best be described as crickets. It was viewed only seven times that month and only 23 times through the end of the year, close to the median 25 views my posts receive. By the end of April 2018, it had received 42 views, just over my post-average of 40.

But starting in July 2018, something happened. The post received 20 views that month, followed by 33, 34, 46, 55 and 53 views over the next five months; by the end of 2018, it had been viewed 299 times. And, of course, the more it was read, the higher it rose on Google searches, and so the more it was read. Over the first eight months of 2019, in fact, it was viewed an astonishing (to me, anyway) 823 times, or 103 times a month. And in July 2019, nearly two years after I published it, it crossed the 1,000-view threshold. As of this writing, it has been viewed 1,234 times.

Not coincidentally, if you Google “Charlie Chan films,” the 41st entry is my post; until recently it had been 16th, but I am not complaining one bit. And if you add the word “ranked” to the search, the very first entry is my post.

As esoteric and specific as that is, I am deeply humbled by it.

**********

There is one last thing.

I do not read or follow as many blogs as I am “supposed” to in order to be a “successful” blogger, but there are a handful whose latest posts I am always excited to see appear in my Inbox. In no particular order, they are:

In Diane’s Kitchen

bone&silver

MadMeg’s Musings

JulieCares

What these sites have in common, besides each author’s gracious reactions to my, at times, long-winded comments, is that they are all authored by women with uniquely interesting and powerful personal stories to tell. I always have something to learn from them.

Until next time…

[i] Ballinger, Alexander and Graydon, Danny. 2007. The Rough Guide to Film Noir. London, UK: Rough Guides, Ltd.

[ii] The correlation between the two scores is 0.983.

[iii] There is a lot more to this story, of course, mostly involving my relationship with my father, his gambling and an old family business, but I save that for the book itself.

When is a pleasure “guilty?”

I first watched The Cotton Club (Francis Ford Coppola, 1984) as a sophomore in college, under curious circumstances.

That year, I lived with two other men in a converted basement seminar room in Ezra Stiles College. The year before, that room had been occupied by a student we generally referred to as the “Saudi prince” (or was it “sheikh?”); I forget his actual nationality and title. He apparently purchased a great deal of electronic equipment—and by “purchased,” I mean “charged without ever paying”—which he used in secretive solitude.

All that remained when my friends and I moved into the room was the mid-1980s version of a big screen television. Another classmate lent us her early-model VCR, which made the fact that one of my roommates worked in the Audio-Visual department all the more valuable.

I do not remember how a copy of Café Flesh turned up in our room…but that was quite an education for me (the previews were a hoot), back when adult films were expected to have at least some coherent plot. The film made enough of an impression on me that I purchased the terrific Mitchell Froom soundtrack on vinyl.

The Key of Cool

But back to The Cotton Club. I recall vaguely enjoying it (it is a beautiful film), even though much of the historical “back story” eluded me[1]. I also remember hearing stories about how its production was more interesting than the movie itself.

I thought little about the film after that until I kept happening upon it on television in the mid-1990s. And when I sat and watched it from start to finish for a second time, I very much enjoyed it. So much so that I bought the excellent John Barry soundtrack, my first tentative foray into jazz (which I now love) and learned more about the historical “back story” I referenced earlier.

Yes, the plot is overly ambitious and convoluted[2]. Yes, it garbles and condenses and rewrites the compelling underworld history of late-1920s/early-1930s New York City (e.g., the film ostensibly ends in 1931 with the slaying of Dutch Schultz—which occurred on October 23, 1935). Yes, it is too long…or too short, depending how interested in the interweaving plot threads one is.

But I now rank it among my 10 or 20 favorite films, recently purchasing a DVD copy when I was unable to watch it on any of our streaming services. As it happens, I also have a copy of Café Flesh (on VHS), and I have previously discussed my continuing love of another critical non-favorite I recently purchased on DVD, Times Square.


One thing these three films have in common is a middling average score (on a 0-10 scale) on the Internet Movie Database (IMDB): 6.5 for Café Flesh and The Cotton Club and 6.7 for Times Square. For context, in his 2008 video guide[3], esteemed film critic Leonard Maltin gives The Cotton Club 2.5 stars (out of four) while giving Times Square a rating of “BOMB;” for obvious reasons, he does not include Café Flesh in his guide.

While not the worst-reviewed films ever (hello, Ed Wood!), neither are they among the greatest films ever made. Which begs the question (and setting aside the pornographic nature of Café Flesh) whether they could be characterized as “guilty pleasures.”

Which further begs the question: what makes a pleasure “guilty?”

**********

In this post, I gathered IMDB, RottenTomatoes (RT) and Maltin ratings data to “rank” the 47 Charlie Chan films released between 1926 and 1949. I decided to take the same approach with the larger universe of movies I like (loosely defined as “movies I have seen multiple times, to the best of my recollection”) to see if I could statistically distinguish “guilty pleasure” films (ones I love but to which critics/users respond with “meh” or worse) from critically-praised films I love (e.g., L.A. Confidential, The Maltese Falcon, numerous films directed by Alfred Hitchcock or Woody Allen[4], or starring The Marx Brothers), as well as from films SO bad they have become cult classics (and/or been parodied on Mystery Science Theater 3000).

To that end I compiled a list of 557 films I am fairly certain I have seen in their entirety twice (or, at least, have seen all the way through once and in large segments at other times). I excluded the Charlie Chan films discussed in the previous post[5].

For each film I entered its:

  • Title
  • Year of release (according to IMDB)
  • Length in minutes (ditto)
  • IMDB score and number of raters
  • Tomatometer score (% RT-sanctioned critics deeming film “fresh”), average critic rating (0-10) and number of critics
  • Audience Score (% RT users deeming film “fresh”), average user rating (0-5) and number of user raters
  • Number of stars assigned by Maltin[6], with BOMB = 0.

I included year of release[7] and length as a way to distinguish older, shorter films from more recent, longer films. The six ratings measures are slightly different ways to broadly measure a film’s perceived quality. I included three “number of raters” measures to see if there was a relationship between a film’s perceived quality and the number of viewers willing to take the time to quantify their opinions on-line[8].

I also divided the films into six broad categories[9]:

  • General  (64%)
  • Film Noir (19%)
  • Other Pre-1960 (7%)
  • Woody Allen[10] (5%)
  • Alfred Hitchcock  (3%)
  • Marx Brothers (2%)

Arguably, there is overlap between Film Noir (restricted for this analysis to films released between 1940 and 1959) and Alfred Hitchcock…and a few Other Pre-1960 films…but I am comfortable with these general categories.

I have complete data for 515 films. Eight films have no Maltin rating, either because they were released in 2008 or later (Frozen, Night at the Museum 2: Battle of the Smithsonian, The Spirit, Star Trek), are relatively obscure films noir (The Guilty, Night Editor—and the excellent Spanish film Muerte de un Ciclista [Death of a Cyclist]) or…I don’t know why (the charming 1992 film Jersey Girl). The latter four films also have no Tomatometer rating or critic average rating (along with 34 other films, primarily Film Noir); I entered “0” for the number of critic raters. All analyses were performed using Intercooled Stata 9.2[11].

Some of these variables do not follow a “bell curve” (or “normal”) distribution (Table 1). For example, while the average year of release is 1974, the median year (the value at which half of all values are lower, and half are higher) is 1982. The difference results from a “skew” towards earlier films.

Table 1: Summary statistics for Film Ratings Measures

Measure N Mean (SD*) Median Minimum Maximum
Year of Release 556 1974.1 (20.9) 1982 1920 2013
Length (mins.) 556 103.4 (17.8) 101.0 61 220
IMDB Score 556 7.1 (0.8) 7.2 4.2 9.0
# IMDB Raters 556 79,833.4 (184,390) 19,095 140 2,015,091
Tomatometer 517 77.1 (22.4) 85 0 100
Critic Rating 517 6.9 (1.4) 7.1 2.1 9.5
# Critics 556 40.0 (42.5) 30 0 342
RottenTomatoes Audience Score 556 71.9 (17.7) 76 20 96
RottenTomatoes User Rating 556 3.5 (0.4) 3.5 2.2 4.4
# RottenTomatoes User Raters 556 216,214.5 (1,965,582) 11,867.5 39 34,296,962
Maltin Stars 548 2.8 (0.7) 3 0 4

*SD=standard deviation, a measure of how tightly values cluster around the mean: the smaller the value, the tighter the clustering. In a normal distribution, 68% of values are within 1 SD, 95% are within 2 SD and 99.7% are within 3 SD.
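Those coverage figures for a normal distribution can be checked directly from the error function in Python's standard library. This is a small sketch of the textbook relationship, not part of the original Stata analysis, and the function name is illustrative.

```python
import math


def share_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {100 * share_within(k):.1f}%")
```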

Indeed, as Figure 1 shows, the distribution of release year is bimodal, meaning there are two “peaks” in the data: one in 1946-50, reflecting the preponderance of film noir titles among my multiple-viewing films, and one between roughly 1978 and 1999, my prime movie-attendance years (ages 11-33).

Figure 1: The Distribution of Year of Release is Bimodal


See here for the distribution of Length, in minutes

There is also heavy skew to the right (a long “right tail”) in the three “number of raters” measures, with the median consistently lower than the mean. In the most extreme case, while 452 films (81%) had between 29 and 99,999 RT user raters, 13 films had more than 1,000,000 raters, topping out at a staggering 30,984,432 RT user raters for Donnie Darko and 34,296,962 for Spider-Man. Not surprisingly, these three measures are positively related to each other: the average correlation[12] between them is a moderately high 0.41; the extreme right-skew of these measures is likely lowering the correlations. There is also a modest relationship between year of release, length and number of raters: films have gotten slightly longer over time (correlation [r]=0.25), while more recent films have more raters (mean r=0.22).

Here are the distributions of these variables:

IMDB raters

Critics

RT Users

The remaining seven variables were generally normally distributed (means≈medians). Thus, films averaged 103 minutes in length (one hour, 43 minutes), with approximately two-thirds of films (66%) between 88 and 113 minutes long; eight films were more than 2½ hours long, topped by JFK (three hours, nine minutes), It’s a Mad Mad Mad Mad World (three hours, 25 minutes) and The Ten Commandments (three hours, 40 minutes). Not surprisingly, the 33 films between 61 (Dick Tracy, Detective) and 79 minutes long had a mean year of release of 1943.5[13].

There was reassuring consensus between the ratings, as the means of IMDB score (7.1), critic rating (6.9), RT user rating (3.5 out of 5 = 7.0 out of 10), and Maltin stars (2.8 out of 4 = 7.1 out of 10) all converge around a “good, but not great” 7 out of 10. Moreover, values tended to cluster relatively tightly around the means (i.e., SD<<mean). Thus, 90% of IMDB scores were between 6.1 and 8.3, 80% of critic ratings were between 5.5 and 8.8, 93% of RT user ratings were between 5.8 and 8.4 (adjusted for a 0-10 scale), and 73% of films were assigned between 2½ and 3½ stars by Maltin (6.2-8.8 on a 0-10 scale). Fifty films I have seen more than once were assigned four stars by Maltin, whereas he rated only four of them “BOMB”[14]. The average correlation between the six pairs of ratings is 0.75, meaning there is broad agreement between IMDB users, critics, RT users and Maltin (though mean correlation jumps to 0.85 without Maltin’s scores).
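Putting ratings from different scales side by side only requires a proportional rescaling. Here is a minimal sketch (the helper name `to_ten` is illustrative): 2½ and 3½ Maltin stars map to 6.25 and 8.75 on a 0-10 scale, which the text rounds to 6.2 and 8.8.

```python
def to_ten(value, scale_max):
    """Rescale a rating from a 0..scale_max scale onto a 0..10 scale."""
    return 10 * value / scale_max


print(to_ten(3.5, 5))                  # mean RT user rating on a 0-5 scale
print(to_ten(2.5, 4), to_ten(3.5, 4))  # 2.5 and 3.5 Maltin stars (out of 4)
```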

IMDB scores

Tomatometer

Critic rating

Audience Score

RT User rating

Maltin Stars

The story is similar for the Tomatometer and Audience Scores, although the former is skewed by 50 films with a Tomatometer of 100 (Audience Scores top out at 96[15]); both measures have higher medians than means. On average, 77.1% of critics, but just 71.9% of RT users, rate a given film as “fresh.” Fully two-thirds (67%) of Tomatometers are 75 or higher, while a similar percentage of Audience Scores (65%) are between 67 and 94. The correlation between the two measures is 0.72.

Across all six ratings measures (15 pairs of measures), finally, the average correlation is 0.76; without Maltin’s ratings, the average jumps to 0.83 (mean r w/Maltin=0.64).
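An "average correlation" of this kind is just the mean of the Pearson correlation over every distinct pair of measures. The sketch below uses made-up ratings for five imaginary films on three measures (not the real data) and a hand-rolled `pearson` helper, purely to show the mechanics and why six measures produce 15 pairs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings for five films on three measures (illustration only)
ratings = {
    "imdb": [8.1, 6.5, 7.2, 5.3, 7.9],
    "critic": [8.8, 5.9, 7.0, 3.9, 8.2],
    "maltin": [4.0, 2.5, 3.0, 1.5, 3.5],
}

pairs = list(combinations(ratings, 2))
mean_r = sum(pearson(ratings[a], ratings[b]) for a, b in pairs) / len(pairs)
print(f"{len(pairs)} pairs, mean r = {mean_r:.2f}")

# Number of distinct pairs among six measures and among four measures
print(len(list(combinations(range(6), 2))), len(list(combinations(range(4), 2))))
```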

In general, however, the vast majority of these 557 films fall in a fairly narrow range between “not bad” and “fairly good.” Bear in mind, however, that this is the universe of films I have chosen to see again; this could easily skew all of the ratings values up slightly.

**********

To separate the films into “quality” categories, I used a technique called factor analysis[16].

Factor analysis groups variables into underlying “dimensions” (or “factors”). We have already seen evidence of two dimensions in these 11 measures: six (IMDB score, Tomatometer, critic rating, Audience Score, RT user rating, Maltin stars) are all fairly highly correlated with each other—and thus with a single dimension we could call “perceived quality,” while the three “numbers of raters” measures (plus year of release and length) are modestly correlated with each other—and thus with a single dimension we could call “public awareness.”

And that is precisely what the factor analysis revealed[17]. Two factors alone accounted for 95% of the total variance in these data, which is remarkably high.

The first factor (71%) was dominated by IMDB Score, Tomatometer, critic rating, Audience Score and RT user rating[18] as well as Maltin stars and year of release. This is clearly “perceived quality.” For each film, I determined how many SD above or below the mean (set to 0) its perceived quality (PQ)[19] is.
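To give a feel for what a score like PQ represents, here is a deliberately simplified sketch: standardize each measure, take a loading-weighted sum, then standardize the result so it reads as "SD above or below the mean." Everything here is an assumption for illustration: the four films are invented, only three measures are used, the 0.87 loadings are the reported lower bound standing in for the fitted values, and Stata's own factor scoring differs in its details.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical films (one position per film) on three of the ratings measures
measures = {
    "imdb": [8.3, 7.1, 5.3, 6.5],
    "tomatometer": [98, 77, 21, 60],
    "maltin": [4.0, 3.0, 1.5, 2.5],
}
# Illustrative loadings: 0.87 is a stand-in, 0.72 is the reported Maltin loading
loadings = {"imdb": 0.87, "tomatometer": 0.87, "maltin": 0.72}

z = {name: zscores(vals) for name, vals in measures.items()}
n_films = len(measures["imdb"])
raw = [sum(loadings[m] * z[m][i] for m in measures) for i in range(n_films)]

# Standardize the composite so each score is in SD units around a mean of 0
pq = zscores(raw)
print([round(score, 2) for score in pq])
```

A film that sits above average on every measure ends up with a positive score, and one below average on every measure ends up negative, which is the intuition behind the lists that follow.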

Here are the 17 films with PQ>1.5:

The Maltese Falcon (1941 version) 1.51
Chinatown 1.52
To Be or Not To Be (1942 version) 1.53
North by Northwest 1.53
It’s a Wonderful Life 1.56
Metropolis 1.58
Kind Hearts and Coronets 1.58
On the Waterfront 1.59
Rear Window 1.61
The Cabinet of Dr. Caligari 1.61
Double Indemnity 1.61
Citizen Kane 1.62
The General 1.65
The Third Man 1.66
Casablanca 1.70
Sunset Boulevard 1.75
M 1.78

Just to reiterate: these are not the best films ever made, nor are these my favorite films (to be honest, I don’t love Sunset Boulevard, and I burned out on It’s a Wonderful Life). They are simply the most highly-rated films I have seen multiple times. Nonetheless, this is a very impressive list of films, of which The Maltese Falcon is easily my favorite, followed by Rear Window.

In fact, on average, these films have an IMDB score of 8.3, a Tomatometer of 98.2 (all≥93; six=100), a critic rating of 9.1, an Audience Score of 93.1 and an RT user rating of 8.4 (on a 0-10 scale); three have 3½ Maltin stars[20], with the rest having four. These could all be considered “Classic” films, including three silent masterpieces (Metropolis, Caligari, The General), given their average release year of 1944; only Chinatown was released after 1970 (1974). These films were slightly longer than average (108 minutes).

At the other end of the spectrum—and now we are getting to the heart of the matter—are the 22 films with PQ<-2.0:

Who’s Harry Crumb? -2.06
Cookie -2.06
Doctor Detroit -2.07
The League of Extraordinary Gentlemen -2.10
Once Upon a Crime… -2.17
Sunset -2.19
Dog Park -2.24
Mannequin -2.24
Young Doctors in Love -2.26
City Heat -2.28
The Phantom -2.32
The Marrying Man -2.33
Thank God, It’s Friday -2.35
The Meteor Man -2.42
Mixed Nuts -2.55
The Gun in Betty Lou’s Handbag -2.59
Wholly Moses! -2.61
Random Hearts -2.63
Wild Wild West -2.69
Hexed -2.74
The Adventures of Rocky and Bullwinkle -2.87
The Opposite Sex and How to Live With Them -3.06

Poor Arye Gross, who starred in two 1993 films—Hexed, The Opposite Sex…—that are two of the three worst-rated of the 515 films with complete data (I suspect The Spirit, from 2008, would also be in this low-rent neighborhood). On average, these films have an IMDB score of 5.3, a Tomatometer of 21.3 (Once Upon a Crime… has the only Tomatometer of 0 in the group), a critic rating of 3.9, an Audience Score of 35.3 and an RT user rating of 5.0 (on a 0-10 scale); the average Maltin stars is 1.6, ranging from BOMB (n=3) to three (Cookie). These are relatively recent films, with an average release year of 1991; only Thank God, It’s Friday was released before 1980 (1978). Perhaps mercifully, these films averaged 98 minutes in length.

The three films closest to the mean of 0 are Murder by Decree, Everything You Always Wanted to Know About Sex *But Were Afraid to Ask and Heaven Can Wait, with PQ of -0.004, -0.004 and 0.004, respectively. All were released in the 1970s, with average scores similar to the overall averages.

As for The Cotton Club and Times Square, they had PQ of -0.68 and -0.85, respectively—definitely in the bottom 25% of films I have seen multiple times.

The second factor (24%), meanwhile, was dominated by critics (factor loading=0.78), IMDB users (0.73), year of release (0.54), length (0.39) and RT users (0.33). This is clearly “public awareness.” For each film, I determined how many SD above or below the mean (set to 0) its public awareness (PA) was. Topping the list, with a whopping 7.1, is The Dark Knight, followed by Batman Begins (4.9) and Spider-Man (4.4)—three blockbuster superhero films from the 2000s. At the other end of the spectrum are four films released between 1935 and 1943: Mad Love (-1.30), Journey Into Fear (-1.30), Room Service (-1.29) and the film I consider the first film noir of the classic era: Stranger on the Third Floor (-1.28).

From the perspective of guilty pleasures, however, this particular dimension is far less interesting than the first one.

Before determining what films are my “guiltiest pleasures,” here are mean PQ values by category:

Category # Films PQ
Other Pre-1960 36 1.11
Alfred Hitchcock 18 0.99
Marx Brothers 9 0.65
Film Noir 80 0.60
Woody Allen 25 0.28
General 347 -0.34

Given that 11 of the 17 top-rated films are in the Other Pre-1960 category, it is not surprising that these 36 (of 39 overall) films have the highest average PQ, followed by my favorite director, Alfred Hitchcock.

As noted above, I do not necessarily love—or even much like—every one of these 557 films; some I saw multiple times when I was young (e.g., The Apple Dumpling Gang, Hot Lead and Cold Feet) but barely remember now. And there are films I quite like that are NOT on this list simply because I have yet to see them a second time (e.g., The Shawshank Redemption, Zodiac, Shutter Island, Watchmen). But those latter films are generally well-rated (e.g., mean IMDB score=8.2), so they are hardly “guilty pleasures.”

And…finally…to discover which of these multiple-viewed films are my “guiltiest pleasures,” here are the films with PQ<-1.00 I would give a 5 (or maybe 4.5, out of 5) on the “how much I like it” scale.

  1. Thank God, It’s Friday
  2. Doctor Detroit
  3. The Shadow
  4. Radioland Murders
  5. Legal Eagles
  6. Tapeheads
  7. Mystery Men
  8. Empire Records
  9. The Secret of My Success
  10. Johnny Dangerously
  11. So I Married an Axe Murderer
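The selection rule behind a list like this is a two-condition filter. In this sketch the PQ values are the ones quoted earlier in the post, but the "liking" scores are hypothetical placeholders, and sorting from lowest PQ upward is an assumption about the ordering.

```python
# PQ values are quoted from the post; the "liking" scores (0-5) are hypothetical
films = [
    {"title": "Thank God, It's Friday", "pq": -2.35, "liking": 5.0},
    {"title": "Doctor Detroit", "pq": -2.07, "liking": 5.0},
    {"title": "Random Hearts", "pq": -2.63, "liking": 3.0},
    {"title": "M", "pq": 1.78, "liking": 4.5},
]


def guilty_pleasures(rows, pq_cutoff=-1.00, min_liking=4.5):
    """Keep films with PQ below the cutoff that are nonetheless liked a lot."""
    picks = [r for r in rows if r["pq"] < pq_cutoff and r["liking"] >= min_liking]
    # Assumption: order from lowest PQ upward, so the "guiltiest" comes first
    return sorted(picks, key=lambda r: r["pq"])


guilty = [r["title"] for r in guilty_pleasures(films)]
print(guilty)
```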

Each of these films is in the General category and was released during my prime movie-attendance years (1978-99), with a mean release year of 1989; I did not actually first view Thank God, It’s Friday and Empire Records until the last five or so years. They average 101 minutes in length, only slightly shorter than average. Their mean IMDB, critic and RT user ratings (on a 0-10 scale) are 6.0, 4.9 and 6.0, respectively, suggesting they are relatively more popular with the broader movie-watching public than with critics; this is echoed by having an average of only 1.8 stars from Maltin (median=2). By the same token, the average Audience Score for these 11 films (51) is higher than their average Tomatometer (43). Finally, they are far less well-known (or, at least, have fewer viewers willing to rate them online, even anonymously), averaging 19,604 IMDB raters (median=12,292), 28 critics (median=17; Mystery Men had 103) and 55,464 RT users (median=9,198).

As I hypothesized, while these films are certainly of lower perceived quality than the other 546 films I have seen multiple times, objectively they tend to fall in the middle of the “quality” spectrum, or even a hair above it: neither truly excellent nor truly awful.

They are mostly just…meh, according to the larger universe of film critics and casual fans, with the latter being just a bit more accepting of these films than the former.

And all I will say in defense of these films is that there is a fascinating temporal intersection in Thank God, It’s Friday when the late Donna Summer (near the height of her career), a pre-fame Debra Winger and a pre-Berlin Terri Nunn are all looking into the same bathroom mirror.

Finally, to come full circle: The Cotton Club and Times Square rank as “only” my 13th and 16th guiltiest film pleasures, respectively, using this very subjective (and subject to change) method. Still, that puts them in…good?…company.

Until next time…

[1] I expect to revisit this film in more detail in a later post, but for now I will simply say the film revolves around the legendary Harlem night club—owned by powerful bootlegger and fixer “Owney” Madden—between 1928 and 1931, when “Duke” Ellington, then Cab Calloway, directed the house band. A key subplot revolves around Arthur Flegenheimer (aka Dutch Schultz) and his violent takeover of the Harlem numbers rackets.

[2] The film follows two sets of brothers in conflict with each other—one white, one black—with one of the white brothers being close friends with one of the black brothers, while each of those two friends has a love affair blocked by external forces. The parallels are fascinating and complex—but they are only part of the overall storyline.

[3] Maltin, Leonard ed. 2008. Leonard Maltin’s Movie & Video Guide: 2008 Edition. New York, NY: New American Library.

[4] Despite my ambivalence about Allen as a human being, I still love many of his films.

[5] Only Charlie Chan at the Wax Museum has a complete set of RottenTomatoes values.

[6] For nine older films, I used the rating in the 2003 edition, as Maltin stopped including many older films in later editions.

[7] As well as date of release, which I do not analyze here.

[8] Recognizing that these primarily measure a film’s overall “visibility.”

[9] I could easily have added “starring John Cusack,” “Jerry Lewis,” “David Mamet,” “Star Trek,” “The Pink Panther,” “Batman,” “Coen Brothers.”

[10] Including What’s New Pussycat.

[11] StataCorp. 2005. Stata Statistical Software: Release 9. College Station, TX: StataCorp LP.

[12] A measure of linear association between two variables ranging from -1.00 (every time one increases, the other decreases) to 1.00 (every time one increases, the other increases).

[13] That said, Fritz Lang’s 1927 silent masterpiece Metropolis is a full 153 minutes long.

[14] Besides Times Square, they are Mannequin, The Opposite Sex and How to Live With Them and Thank God, It’s Friday.

[15] Pulp Fiction, Raiders of the Lost Ark, Star Wars Episode IV: A New Hope, The Usual Suspects

[16] I experimented with cluster analysis, which groups cases instead of variables, but found little of interest.

[17] Principal factors, with an orthogonal varimax rotation, forced to two factors.

[18] Each had a “factor loading” (essentially, correlation with the “underlying dimension”) ≥0.87. The factor loadings for Maltin stars and year of release were 0.72 and -0.52, respectively.

[19] Using the “Predict” command in Stata. In essence, it converts each variable to a “z-score” (mean=0, SD=1), recalculates the factor loadings, then sums each value weighted by the factor loadings.

[20] To Be or Not to Be, Kind Hearts and Coronets, The Cabinet of Dr. Caligari.

Organizing by themes VII: Words beginning with “epi-“

This site benefits/suffers/both from consisting of posts about a wide range of topics, all linked under the amorphous heading “data-driven storytelling.”

In an attempt to impose some coherent structure, I am organizing related posts both chronologically and thematically.

In this post, I sketched the winding road on which a 28-year-old man who had just resigned (without any degree) from a doctoral program in government ended up a 48-year-old with a doctorate in epidemiology.

And in this post, that degree turns out to be the endgame (for now), not the starting point.

In between those two points, that man found a genuine resting place in the field of epidemiology. So much so, that when his blog—OK, my blog—debuted in December 2016, I was already contemplating the need to publish an epidemiology “primer” to provide context for the many epidemiology-centered posts I just knew I would be writing.

Ultimately, there was only one such post, based upon an unsettling implication from my doctoral research.

This latter post appeared in April 2017, just three months before I decided to stop looking for an epidemiology-related position (or, at least, one that built upon my 19 years as a health-related data analyst and that was commensurate with my salary history and requirements, education and experience[1]) and focus on writing and my film noir research.

In this two-part series (which includes links to my doctoral thesis and PowerPoint presentations for each of its three component studies), I describe my experience at the 2017 American Public Health Association Annual Meeting & Expo. In January 2017, when I still considered myself an epidemiologist, I submitted three oral presentation abstracts (one for each doctoral thesis study). Two were accepted, albeit after I had announced my career shift. Nonetheless, I traveled to Atlanta, GA to deliver the two talks; the conference became a test of whether the “public health analyst” fire still burned in me the way it had.

APHA 2017 1

APHA 2017 2

Spoiler alert: not so much.

**********

Here is the thing, however.

I still love epidemiology in the abstract. As I wrote in my previous post: “In epidemiology, I had found that perfect combination of applied math, logic and critical thinking…”

In fact, I even have a secular “bible”:

modern epidemiology

In essence, epidemiology was both an analytic toolkit and an epistemological framework: critical thinking with some wicked cool math. Moreover, the notion of “interrogating memory” is informed by my desire to “fact-check” EVERYTHING–I am innately a skeptic.

Well–I was not ALWAYS a skeptic.

And much of my writing about contemporary American politics reflects my concern that the United States is facing an epistemological crisis.

Given my ongoing love for epidemiology (even if it is not currently how I make a living) and my desire to promote critical thinking, it is very likely I will revisit my doctoral field in the future on this blog.

Until next time…

[1] I hesitate to say that I was the victim of age discrimination (at the age of 50), since I cannot back up that assertion with evidence. I am on far safer ground noting that the grant-funded positions I occupied for most of the last two decades barely exist anymore.

An update on projected 2018 Democratic U. S. House seat gains

UPDATED Midnight EST, November 20, 2018. As of this writing, Democrats have netted 38 seats in the United States House of Representatives, with three races still to be called. Democrat Ben McAdams narrowly leads incumbent Republican Mia Love in Utah’s 4th Congressional District (CD), while Democrats trail narrowly in California’s 21st and Georgia’s 7th CD. For a fuller analysis of the 2018 midterm elections, see here. For more details on called and uncalled races, see here and here.

**********

With Democrat Conor Lamb’s narrow victory over Republican Rick Saccone in the March 13, 2018 special election to fill the United States House of Representatives (House) seat vacated by Republican Tim Murphy, there were 238 Republican-held seats, 193 Democratic-held seats and four open seats (2 Democratic, 2 Republican).

The two Democratic open seats (MI-13, NY-25) are likely to remain open until the entire House is up for election on November 6, 2018. In the last three presidential elections, the Democratic nominee won these Congressional districts (CD) by an average of 67.4 and 18.3 percentage points (points), respectively, so it is highly likely both seats will be won by Democrats. The seat vacated by Republican Trent Franks (AZ-08) will be filled in a special election on April 24, while that of Republican Patrick Tiberi (OH-12) will be filled in a special election on August 7. The last three Republican presidential nominees won these two CDs by an average of 22.7 and 10.2 points, respectively. The Arizona seat will likely remain Republican, though the Ohio seat could be close.

For now, however, let us assume that party control of these four CDs does not change. That would leave the partisan balance at 195 Democrats and 240 Republicans. This means that Democrats need to flip a net total of at least 23 seats to win a House majority after the 2018 midterm elections.

In a previous post, I listed projected Democratic net House seat gains based upon four models; I developed two of these models.

Briefly, I use the change in the margin by which Democrats won/lost the total nationwide vote for all 435 House seats (Democratic % – Republican %) from the previous Congressional elections (24 elections, 1970-2016). In 2016, the Democrats lost the national House vote 47.6% to 48.7%, for a margin of -1.1 points[1]. The current FiveThirtyEight estimate is that Democrats lead in generic Congressional polls (“If the election was held today, would you vote for the Democratic or Republican candidate in your district?”) by 6.5 points (13.1% unsure/other parties). If that were the actual election margin, it would equal a change of 6.5 – (-1.1) = 7.6 points.

In my “simple” ordinary least squares (OLS) regression model, I use only this value:

Estimated Democratic net seat gain = -1.63 + 3.11 * Change in Democratic margin

On average, every one percentage point increase in the change in Democratic margin nets Democrats 3.1 additional House seats. Plugging a 7.6 point change in Democratic margin into the formula yields:

-1.63 + 3.11 * 7.6 = 22.01

This model has Democrats falling one seat shy of a House majority, even after winning the national House vote by 6.5 points—a result that should give believers in “one person, one vote” pause.

The 95% confidence interval (CI) around this estimate is 13.3 – 30.8, meaning we can say with 95% confidence that a 7.6 point shift in Democratic national House margin would result in a net gain of between 13 and 31 seats. This translates to a 41.2% chance the Democrats net at least 23 House seats in November 2018.
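The post does not say how the 41.2% figure was derived. One reading that roughly reproduces it is a normal approximation whose standard error is backed out of the 95% CI; the sketch below rests on that assumption, and the function names are illustrative rather than anything from the original analysis.

```python
from statistics import NormalDist

# Coefficients of the simple model as quoted in the post.
INTERCEPT = -1.63
SLOPE = 3.11


def estimated_net_gain(change_in_margin: float) -> float:
    """Point estimate of Democratic net seat gain from the simple model."""
    return INTERCEPT + SLOPE * change_in_margin


def prob_at_least(threshold: float, estimate: float,
                  ci_low: float, ci_high: float) -> float:
    """P(net gain >= threshold), ASSUMING a normal distribution whose
    standard error is the 95% CI half-width divided by 1.96."""
    std_err = (ci_high - ci_low) / (2 * 1.96)
    return 1.0 - NormalDist(mu=estimate, sigma=std_err).cdf(threshold)


change = 6.5 - (-1.1)  # generic-ballot margin minus the 2016 margin
estimate = estimated_net_gain(change)
probability = prob_at_least(23, estimate, 13.3, 30.8)
print(f"estimate = {estimate:.2f} seats, P(gain >= 23) = {probability:.1%}")
```

The same helper works for any point estimate that comes with a 95% CI, so it can be reused with a different model's estimate and interval.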

Again—that is when Democrats win the national House vote by 6.5 points. Take your pick of what to blame: incumbency advantage (Republicans gained a net 63 House seats in 2010 before the most recent round of redistricting), geographic self-sorting (Democrats in urban areas, Republicans in rural/small town areas, suburbs up for grabs) and partisan gerrymandering.

In a slightly more complex model, I account for whether the Congressional election coincided with a midterm or a presidential election:

If Midterm: Estimated Democratic net seat gain = -2.77 + 3.41 * Change in Democratic margin + 2.433

If Presidential: Estimated Democratic net seat gain = -2.77 + 2.08 * Change in Democratic margin
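As a minimal sketch, the two formulas can be wrapped in a single function that switches on election type. It uses only the rounded coefficients quoted above; the function name is hypothetical.

```python
def complex_net_gain(change_in_margin: float, midterm: bool) -> float:
    """Estimated Democratic net seat gain under the midterm-adjusted model,
    using the rounded coefficients quoted in the post."""
    if midterm:
        return -2.77 + 3.41 * change_in_margin + 2.433
    return -2.77 + 2.08 * change_in_margin


# Same 7.6 point change in Democratic margin, evaluated both ways.
midterm_gain = complex_net_gain(7.6, midterm=True)
presidential_gain = complex_net_gain(7.6, midterm=False)
print(f"midterm: {midterm_gain:.1f}, presidential: {presidential_gain:.1f}")
```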

And here I make a confession.

In previous discussions of results from this complex model, I used an incorrect “intercept” term (-1.63 instead of -2.77), increasing estimated net Democratic House seat gains (by 1.14) and probabilities of House recapture.

Not a huge change, but I do strive for accuracy here.

Cutting to the chase, under the assumed 7.6 point change in national Democratic House margin, the complex model yields an estimated Democratic net gain of 25.6 seats (95% CI: -8.2 to 59.3[2]) with a 56.0% probability of regaining control of the House.

Here is Table 1 from my previous post updated to reflect the corrected “Berger 2” model and a required 23 net House seat gain for Democrats.

Table 1: Projections of 2018 Democratic net gains in House seats

Dem national House margin Abramson Brennan Berger 1 Berger 2
2% 19 5 8 (0%) 10 (12%)
4% 23 7 14 (1%) 17 (33%)
6% 27 13 20 (28%) 24 (52%)
8% 30 15 27 (78%) 31 (66%)
10% 34 21 33 (97%) 37 (75%)
11% 36 28 36 (99%) 41 (78%)
12% 38 31 39 (99+%) 44 (81%)
14% 42 41 45 (99+%) 51 (85%)
16% 46 56 52 (100%) 58 (88%)

And here is my updated graphic displaying the probability Democrats recapture the House given various changes in Democratic margin (with a dotted line indicating where House control is 50-50):

Figure 1: Probability Democrats control U. S. House of Representatives after 2018 elections given change in Democratic margin from 2016

Democratic Probability 2018 House capture

You may ask why I assessed change in margin rather than actual margin. The two values are correlated at r=0.55, meaning that there is a moderately strong linear association between them (i.e., more often than not, when one increases/decreases the other increases/decreases), so the models should be broadly similar.

The outcome of interest to me is whether Democrats net at least 23 House seats they did not win in 2016. For that to happen, Democratic 2016 margins in Republican-held seats must rise from, say, between -15 and 0 points to between 0 and +15 points. Republicans won 31 House seats in 2016 by ≤15 points, so a uniform swing (i.e., the margin in every CD changes by an identical amount) of 15 points toward the Democrats would net them eight more House seats than they need.

But let’s say the swing is not uniform. Perhaps the 195 Democratic-held seats see no change in margin in 2018, while the 240 Republican-held seats see a uniform shift of 15 points. That equates to an 8.3 point overall margin shift toward the Democrats, still netting them 31 seats (by comparison, my two models show an 8.3 point change in Democratic margin netting them 24-28 House seats).
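The overall shift in this scenario is a seat-weighted average of the two district-level shifts. The sketch below assumes every district casts the same number of votes (a simplification that the scenario appears to make implicitly), then feeds the result into the two "change in margin" formulas quoted earlier.

```python
DEM_HELD, REP_HELD = 195, 240   # seats assumed held by each party
SHIFT_DEM_HELD = 0.0            # points of swing in Democratic-held seats
SHIFT_REP_HELD = 15.0           # points of swing in Republican-held seats

# Seat-weighted average shift, ASSUMING equal turnout in every district.
overall_shift = (
    DEM_HELD * SHIFT_DEM_HELD + REP_HELD * SHIFT_REP_HELD
) / (DEM_HELD + REP_HELD)

# Rounded coefficients of the two "change in margin" models from the post.
simple_gain = -1.63 + 3.11 * overall_shift
complex_gain = -2.77 + 3.41 * overall_shift + 2.433  # midterm version

print(f"overall shift = {overall_shift:.1f} points")
print(f"model estimates: {simple_gain:.0f} and {complex_gain:.0f} seats")
```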

Now, I could simply compare the actual Democratic margin in the national House vote to the actual number of seats won. The simple model is:

Estimated Democratic seats = 217.92 + 4.61 * Democratic margin

And the more complex model is:

If Midterm: Estimated Democratic seats = 211.29 + 4.18 * Democratic margin + 9.877

If Presidential: Estimated Democratic seats = 211.29 + 5.90 * Democratic margin

Plugging the current FiveThirtyEight estimate (Democrats+6.5) into either model yields an estimated 248 Democratic House seats, a net gain of 53 seats! That is a far more optimistic projection than those estimated using change in Democratic margin (22-26).

Moreover, if the Democrats and Republicans simply break even in 2018 (1.1 point increase from 2016), the “actual margin” models would project a bare Democratic House majority (218-221 seats), a net gain of 23-26 seats. By contrast, the “change in margin” models would project a net Democratic gain of only 2-3 seats under this assumption.
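For readers who want to check these figures, here is a small sketch of the two "actual margin" formulas using the rounded coefficients quoted above. The function names and the 195-seat baseline constant are labels chosen for illustration.

```python
BASELINE_DEM_SEATS = 195  # Democratic seats assumed before the 2018 election


def seats_simple(margin: float) -> float:
    """Estimated Democratic seats from national margin alone."""
    return 217.92 + 4.61 * margin


def seats_complex(margin: float, midterm: bool) -> float:
    """Estimated Democratic seats, adjusting for midterm vs. presidential year."""
    if midterm:
        return 211.29 + 4.18 * margin + 9.877
    return 211.29 + 5.90 * margin


for margin in (6.5, 0.0):  # current generic-ballot estimate, then break-even
    simple = seats_simple(margin)
    complex_ = seats_complex(margin, midterm=True)
    print(
        f"margin {margin:+.1f}: {simple:.0f} and {complex_:.0f} seats "
        f"(net {simple - BASELINE_DEM_SEATS:+.0f} and "
        f"{complex_ - BASELINE_DEM_SEATS:+.0f})"
    )
```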

Statistically speaking, all four models are valid and account for more than 80% of the variance in the dependent variable (net seat gain, seats won), a very high value for such simple models. Various indicators,[3] meanwhile, suggest that the simpler models are a better fit to the data than the complex models.

For now, though, I stick with the “change in margin” models as they align more with other projections (and a back-of-the-envelope seat-by-seat examination) and have, to me, a stronger conceptual underpinning.

That leaves one final test of the “change in margin” models.

One way to test the reliability (measuring the same underlying concept over repeated measurements) of OLS regression models is to remove, say, the 1970 data, then rerun the regression. You would use the new model to estimate the value for the missing data point. Doing this for all (or a random subset) of the data points yields illustrative comparisons between actual and estimated values, as shown in Table 2.

Table 2: Estimated (after removing that year’s data) and actual net Democratic change in House seats, 1970-2016

Year % Point Change in Dem Margin Actual Net Dem House Seats Estimated – Actual Net Dem House Seats, Model 1 Estimated – Actual Net Dem House Seats, Model 2
1970 6.8 12 8.2 12.6
1972 -3.4 -13 0.8 3.6
1974 11.4 49 -17.9 -13.6
1976 -3.3 1 -13.6 -12.3
1978 -4.7 -15 -1.3 -1.5
1980 -6.0 -35 15.8 25.5
1982 9.1 27 -0.4 4.5
1984 -6.6 -16 -6.7 -0.6
1986 4.8 5 8.9 12.4
1988 -2.1 2 -10.6 -10.2
1990 0.1 7 -8.7 -7.6
1992 -2.8 -9 -1.4 0.5
1994 -12.0 -54 17.9 16.6
1996 7.1 3 19.1 12.4
1998 -1.2 4 -9.8 -9.2
2000 0.5 1 -1.1 -3.0
2002 -4.3 -8 -7.4 -7.8
2004 2.0 -2 6.9 3.8
2006 10.5 31 0.1 5.6
2008 2.6 23 -17.4 -22.8
2010 -17.2 -63 11.0 6.6
2012 7.9 7 17.6 9.9
2014 -7.1 -12 -12.8 -14.5
2016 4.7 6 7.5 1.2
Average, Raw Values 0.2 0.5
Average, Absolute Values 9.3 9.1
Average, Absolute Values, Midterms Only 8.7 9.4
Average, Absolute Values, Presidential Only 9.9 8.8

In the final two columns, a positive value means Democrats underperformed their estimated net House seat gain, while a negative value means they overperformed. Democrats underperformed and overperformed on both models 11 times each. In 1982 and 1992, the Democrats slightly overperformed on the simple model and slightly underperformed on the complex model.

The largest underperformances (≥10.0 in the simple model) were in 1980, 1994, 1996, 2010 and 2012; three were wave years (1980, 1994, 2010) in which Democrats averaged a net loss of 51 House seats. Similarly, the largest overperformances were in 1974, 1976, 1988, 2008 and 2014; strong waves occurred in 1974 and 2008 (Dems+36, on average) and 2014 (Dems-12). The pattern is not perfect, however, as the models were fairly close in the wave year of 2006 (Dems+31), missing by an average of three seats. Still, in 10 of 24 elections, the simple model missed the actual net change in Democratic House seats by ≥10 seats, a count that rises to 12 if a miss by either model is included.

On average, estimates were spot on, as overperformance and underperformance essentially cancel out. However, when you examine the absolute value (difference in either direction) of differences, the models do worse, averaging +/- 9 seats. The slight differences between midterm and presidential election years make little practical difference.
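The remove-one-year-and-refit procedure is commonly called leave-one-out cross-validation. Below is a minimal sketch of it for the simple model only, using the year, change in margin and actual seat change from the first three columns of Table 2. The helper names are illustrative, and results may differ slightly from the table because the inputs are rounded.

```python
# (year, change in Democratic margin, actual net Democratic seat change),
# copied from the first three columns of Table 2.
DATA = [
    (1970, 6.8, 12), (1972, -3.4, -13), (1974, 11.4, 49), (1976, -3.3, 1),
    (1978, -4.7, -15), (1980, -6.0, -35), (1982, 9.1, 27), (1984, -6.6, -16),
    (1986, 4.8, 5), (1988, -2.1, 2), (1990, 0.1, 7), (1992, -2.8, -9),
    (1994, -12.0, -54), (1996, 7.1, 3), (1998, -1.2, 4), (2000, 0.5, 1),
    (2002, -4.3, -8), (2004, 2.0, -2), (2006, 10.5, 31), (2008, 2.6, 23),
    (2010, -17.2, -63), (2012, 7.9, 7), (2014, -7.1, -12), (2016, 4.7, 6),
]


def fit_ols(points):
    """Return (intercept, slope) of a one-predictor least squares fit."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out_errors(data):
    """For each year, refit without it; return {year: estimated - actual}."""
    errors = {}
    for i, (year, x, y) in enumerate(data):
        rest = [(xi, yi) for j, (_, xi, yi) in enumerate(data) if j != i]
        intercept, slope = fit_ols(rest)
        errors[year] = intercept + slope * x - y
    return errors


errors = leave_one_out_errors(DATA)
mean_error = sum(errors.values()) / len(errors)
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)
print(f"mean error = {mean_error:.1f}, mean absolute error = {mean_abs_error:.1f}")
```

Refitting the complex model the same way would need a multi-predictor fit (midterm indicator plus an interaction term), which this single-predictor helper does not cover.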

Applying these average differences to the current projections (change from 2016 in Democratic national House margin of 7.6 points) yields:

Berger 1: 22.0 +/- 9.3 = 12.7 to 31.3 net Democratic House seats

Berger 2: 25.6 +/- 9.1 = 16.5 to 34.7 net Democratic House seats

Put another way: Democrats would need to win the national House vote by 9.8 points in the simple model, and by 8.5 points in the complex model, for the lower end of these projected ranges to be 23 seats (what Democrats need to net to control the House after the 2018 midterm elections).
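These thresholds can be approximated by solving each model for the change in margin at which the estimate minus the average absolute miss equals 23, then converting that change back to a national margin. This sketch uses only the rounded coefficients quoted in the post, so its output can differ from the figures in the text by about a tenth of a point.

```python
MARGIN_2016 = -1.1   # Democratic national House margin in 2016
TARGET = 23          # net seats needed for a House majority


def required_margin(intercept: float, slope: float, avg_abs_error: float) -> float:
    """National margin at which (estimate - avg_abs_error) equals TARGET."""
    required_change = (TARGET + avg_abs_error - intercept) / slope
    return required_change + MARGIN_2016


# Simple model: intercept -1.63, slope 3.11, average absolute miss 9.3.
simple_margin = required_margin(-1.63, 3.11, 9.3)
# Complex model, midterm version: intercept -2.77 + 2.433, slope 3.41, miss 9.1.
complex_margin = required_margin(-2.77 + 2.433, 3.41, 9.1)

print(f"simple: {simple_margin:.1f} points, complex: {complex_margin:.1f} points")
```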

This is not how a representative democracy is supposed to work.

Until next time…

[1] Using data from Dave Leip’s indispensable Atlas of U.S. Presidential Elections

[2] This extremely wide CI results from using only 24 data points to estimate three parameters.

[3] Among them adjusted r-squared, residual sums of squares, degrees of freedom.

Projected 2018 Democratic U.S. House seat gains

This piece (only available to subscribers) appeared earlier today on Taegan Goddard’s absolutely essential Political Wire.

A new Brennan Center report says “extreme gerrymandering” could cost Democrats control of the House unless they ride a massive blue wave.

Because of maps designed to favor Republicans, Democrats would need to win by a nearly unprecedented nationwide margin in 2018 to gain control of the House of Representatives. To attain a bare majority, Democrats would likely have to win the national popular vote by nearly 11 points. Neither Democrats nor Republicans have won by such an overwhelming margin in decades. Even a strong blue wave would crash against a wall of gerrymandered maps.

Yet this is misleading without also mentioning “the great sorting” of voters that has taken place over the last two decades. An equal, if not bigger, barrier to Democrats winning the House is the extreme urbanization of Democratic voters which leads to millions of “wasted” votes.

A Pew Research study shows:

Voters in urban counties have long aligned more with the Democratic Party than the Republican Party, and this Democratic advantage has grown over time. Today, twice as many urban voters identify as Democrats or lean Democratic (62%) as affiliate with the GOP or lean Republican.

Overall, those who live in suburban counties are about evenly divided in their partisan loyalties (47% Democratic, 45% Republican), little changed over the last two decades.

In addition, while mapping technology has made it easier for congressional maps to be gerrymandered in the redistricting process, the sorting of the electorate into distinct geographic areas makes it even easier.

Both phenomena — the use of gerrymandering during redistricting and the geographic sorting of voters — coexist and give Republicans an advantage in congressional elections.

What does this mean for the 2018 midterm elections? The Cook Political Report has found that in the last three election cycles, Democrats have won roughly 4% fewer seats than votes received nationally. If this trend holds true in 2018, then Democrats would need to win the House popular vote by roughly 7% to win the 23 seats they need to take a majority. This is similar to a projection made by Emory University political scientist Alan Abramowitz.

In contrast, the Brennan study, which looks at responsiveness of vote margins in individual states, suggests Democrats would need an 11% margin to take control of the House.

I have expressed mild skepticism about the role gerrymandering has played in the maintenance of Republican majorities in the United States House of Representatives (House) since 2011, writing this about “wasted” votes (which I call “extraneous votes” [ExV]):

In 2016, Democrats averaged 112,222 ExV and Republicans averaged 98,582, meaning Democrats averaged 13.8% more ExV than Republicans. Narrowing the analysis only to the 39 states where partisan redistricting is even possible closes the gap: 111,401 to 102,963, with Democrats averaging 8.2% more ExV. Further removing seats with candidate(s) of only one major party reduces the absolute gap to 97,701 to 89,970, with Democrats averaging 8.9% more ExV than Republicans.

This is additional evidence for the geographic self-sorting of Democrats, which I agree (along with the creation of majority-minority legislative districts under the Voting Rights Act) has enabled Republican gerrymandering. I would also observe that Republicans won a net total of 63 House seats (and House control) in 2010, before the current legislative district lines were drawn.

The piece concluded with a tabulation of projected 2018 Democratic net House seat gains given a range of Democratic national House vote margins using the “Abramson” and “Brennan” models. Democrats need a net gain of 23 House seats to recapture the majority.

Given my own research into the relationship between national House vote margin and House seats won (using change in Democratic share of the national House vote from two years earlier; Democrats lost the national House vote by 1.1 percentage points in 2016, despite netting six seats), I decided to append my projections to Goddard’s table. “Berger 1” uses percentage point change only, while “Berger 2” also adjusts for midterm vs. presidential election. For my projections, I display both the estimated net seat gain and, in parentheses, the probability Democrats net the 23 seats necessary to regain House control. In each column, boldfaced values represent Democratic House control.

Table 1: Projections of 2018 Democratic net gains in House seats

Dem national House margin Abramson Brennan Berger 1 Berger 2
2% 19 5 8 (0%) 11 (14%)
4% 23 7 14 (1%) 18 (36%)
6% 27 13 20 (28%) 25 (55%)
8% 30 15 27 (78%) 32 (68%)
10% 34 21 33 (97%) 39 (77%)
11% 36 28 36 (99%) 42 (80%)
12% 38 31 39 (99+%) 45 (82%)
14% 42 41 45 (99+%) 52 (86%)
16% 46 56 52 (100%) 59 (89%)

My projections fall in between those of Abramson and Brennan: Berger 1 requires Democrats to win the national vote by 6.8 percentage points, while Berger 2 requires a margin of only 5.4 percentage points (see Figure 1 below).

Figure 1: Probability Democrats Control U.S. House of Representatives After 2018 Elections Based Upon Change in Democratic National House Vote Share, 2016-18

Democratic Probability 2018 House capture

As of this writing (7:24 pm EST, March 26, 2018), the FiveThirtyEight estimate of Democratic advantage in the generic ballot is 5.7 percentage points (46.0 to 40.3%, down from a high of 13.3 percentage points on December 26, 2017). If that is the actual national House vote margin on November 6, 2018, Democrats would be projected to net between 12 and 27 House seats, depending on the model, meaning they would either fall well short of their goal or just eke it out.

Still, it is a sign of the current lopsided state of American electoral geography that a political party needs to win the national vote by between 4 and 11 percentage points just to break even in House seats.

Until next time…