Rating Doctor Who, post “reboot,” episode 2

In a previous post, I posed the following question (edited for brevity):

Which of the 131 Doctor Who episodes, nine Series and four Doctors have been the most (and least) admired since Rose first aired on March 26, 2005?

We learned that AI scores, IMDB ratings, and number of IMDB user-raters for each episode reveal:

  • It took time for audiences to warm to the Doctor Who reboot, while audiences were cool to the 12th Doctor until late in Series 9
  • Episodes later in Series 4—topped by the two-part Series ending The Stolen Earth/Journey’s End (AI scores=91)—were among the most admired when first aired, as is true of Series-ending episodes generally
  • Blink has become the most-admired episode ever, with an astonishing 9.8 IMDB rating and 12,881 user-raters (so far)
    • Two other “introduction for non-fans” episodes (The Girl in the Fireplace and Vincent and the Doctor) have also become very well-admired (9.3 each)
  • The 2nd half of Series 2 has three of the least well-admired episodes

AI scores and IMDB ratings were moderately-highly correlated (0.43), as were IMDB ratings and number of user-raters (0.47). Indeed, Doomsday, Silence in the Library/Forest of the Dead, The Stolen Earth/Journey’s End, The End of Time: Part Two, The Pandorica Opens/The Big Bang, A Good Man Goes to War and The Day of the Doctor remain among the most admired (and oft-rated), while Sleep No More and Love & Monsters are still best forgotten.

The next question is: Which episodes have become more admired over time, and which have lost their luster?

One good way to answer this question would be to compare every episode’s AI score to its current IMDB rating (as I did here: ai-score-vs-imdb-rating): if the latter is higher, the episode’s stature has increased, and if the latter is lower, the opposite has occurred.

This is tricky, however, because the two measures have different scales. One good solution is to convert each value to its “z-score.” A z-score is simply how many standard deviations (SD) above (+) or below (-) the average any value is; any measure converted to z-scores has an average of 0 and SD of 1. For example, my favorite episode, A Good Man Goes to War, has an IMDB rating of 9.1. Overall, IMDB ratings have an average of 8.1 and SD of 0.8. Subtracting 8.1 from 9.1, then dividing by 0.8, yields a z-score of about 1.2, meaning this IMDB rating is more than one SD higher than the average. For context, in a normal distribution (also called a “bell curve”), about 68% of all values are within 1 SD of the average, while about 99.7% are within 3 SD of the average.
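For anyone who wants to check the arithmetic, here is a minimal Python sketch of the z-score conversion described above. The function names are my own, and the use of the population SD (`pstdev`) rather than the sample SD is an assumption, since the post does not say which was used. The short list of ratings is purely illustrative, not the full set of 131 episodes.

```python
from statistics import mean, pstdev

def z_score(value, average, sd):
    """How many SDs a value lies above (+) or below (-) the average."""
    return (value - average) / sd

def to_z_scores(values):
    """Convert a list of values to z-scores using the list's own mean and SD.

    Assumption: population SD (pstdev); swap in statistics.stdev for the sample SD.
    """
    avg = mean(values)
    sd = pstdev(values)
    return [z_score(v, avg, sd) for v in values]

# Worked example from the text: A Good Man Goes to War (IMDB rating 9.1),
# with the rounded overall average (8.1) and SD (0.8).
good_man_z = z_score(9.1, average=8.1, sd=0.8)
print(round(good_man_z, 2))

# Illustrative ratings only: after conversion, the average is 0 and the SD is 1.
illustrative_ratings = [9.6, 9.0, 8.6, 7.6, 6.8, 6.2]
zs = to_z_scores(illustrative_ratings)
print(round(mean(zs), 6), round(pstdev(zs), 6))
```

With the rounded average and SD the result is 1.25; either way, the rating sits a bit more than one SD above the average.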

But let’s not get too bogged down (blogged down?) in statistical arcana…I can only ask you to bear with me so much.

Figure 1: AI Score vs. IMDB Rating (z-scores), Doctor Who episodes, 2005-16 (n=131)


According to Figure 1 above, 77 episodes have the same relative stature after multiple viewings and commentary as they did when they first aired: 40 episodes remain below-average in admiration (lower left quadrant), while 37 episodes remain above-average (upper right quadrant).
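The quadrant counts can be reproduced with a few lines of Python. This is only a sketch: the `quadrant` helper is my own naming, the sample pairs are hypothetical stand-ins for the real episode data, and treating a z-score of exactly 0 as "below average" is an arbitrary assumption.

```python
from collections import Counter

def quadrant(ai_z, imdb_z):
    """Name the Figure 1 quadrant for one episode.

    x-axis: AI score z-score; y-axis: IMDB rating z-score.
    Assumption: a z-score of exactly 0 counts as below average.
    """
    vertical = "upper" if imdb_z > 0 else "lower"
    horizontal = "right" if ai_z > 0 else "left"
    return f"{vertical} {horizontal}"

# Hypothetical (AI z-score, IMDB z-score) pairs, NOT the real episode data.
sample = [(-1.0, -0.5), (0.8, 1.2), (-1.9, 1.9), (0.6, -1.5), (0.3, 0.4)]

counts = Counter(quadrant(ai_z, imdb_z) for ai_z, imdb_z in sample)

# Episodes whose relative stature is unchanged sit in the lower left
# (still below average) or upper right (still above average) quadrant.
same_stature = counts["lower left"] + counts["upper right"]
print(dict(counts))
print(same_stature)
```

Run against the full list of 131 episodes, the same tally should give the 40 + 37 = 77 "unchanged" episodes noted above.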

However, 27 episodes that had a below-average AI score now have an above-average IMDB rating (upper left quadrant), topped by Heaven Sent, the middle episode of the three-part Series 9 finale: a full 3.7-unit increase from -1.9 (AI score=80) to 1.9 (IMDB rating=9.6)! Episodes with a similarly large jump in admiration include Hell Bent (+2.2; 82 to 9.0), the follow-up episode to Heaven Sent; Listen (+2.2; 82 to 9.0); The Girl in the Fireplace (+1.9; 84 to 9.3); The Husbands of River Song (+1.7; 82 to 8.6); The Empty Child (+1.6; 84 to 9.1); The Witch’s Familiar (+1.5; 83 to 8.7); Last Christmas (+1.3; 82 to 8.4); and A Christmas Carol (+1.3; 83 to 8.6). Six of these nine listed episodes are 12th Doctor episodes, and four are from Series 9 or the Christmas special that aired one month after the Series ended. The 12th Doctor is definitely growing on me, and there is evidence that his 2nd Series, at least, is gaining in stature with many fans.
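The "unit increase" for each episode is just its IMDB z-score minus its AI z-score. Here is a minimal sketch, with function and variable names of my own choosing. The IMDB z-score is recomputed from the rounded average (8.1) and SD (0.8) given earlier, so the result can differ slightly from the published figure; I assume the reported 3.7 comes from unrounded values.

```python
def stature_change(ai_z, imdb_z):
    """Change in relative stature; positive means more admired now than at first airing."""
    return imdb_z - ai_z

# Heaven Sent: AI z-score as reported in the post (-1.9);
# IMDB z-score recomputed from the rounded average and SD (an approximation).
IMDB_AVERAGE, IMDB_SD = 8.1, 0.8
heaven_sent_ai_z = -1.9
heaven_sent_imdb_z = (9.6 - IMDB_AVERAGE) / IMDB_SD

change = stature_change(heaven_sent_ai_z, heaven_sent_imdb_z)
print(f"IMDB z-score: {heaven_sent_imdb_z:.2f}, change: {change:.2f}")

# Ranking the changes reported in the post, largest increase first.
reported_changes = {
    "Heaven Sent": 3.7,
    "Hell Bent": 2.2,
    "Listen": 2.2,
    "The Girl in the Fireplace": 1.9,
    "The Husbands of River Song": 1.7,
    "The Empty Child": 1.6,
    "The Witch's Familiar": 1.5,
    "Last Christmas": 1.3,
    "A Christmas Carol": 1.3,
}
ranked = sorted(reported_changes.items(), key=lambda item: item[1], reverse=True)
for title, delta in ranked:
    print(f"{title}: {delta:+.1f}")
```

The recomputed change lands within about a tenth of a unit of the reported one, which is what you would expect from working with rounded inputs.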

Two other episodes merit attention for dramatically increasing in stature, while still being below average in IMDB rating. These are the very first post-reboot episodes: Rose (+2.6; 76 to 7.6) and The End of the World (+2.7; 76 to 7.7).

At the other extreme are 27 episodes that went in the opposite direction: from above-average in AI score to below-average in IMDB rating (lower right quadrant), including three with at least a 2-unit decrease in stature: The Lazarus Experiment and The Curse of the Black Spot (both -2.1; 86 to 6.8), and Daleks in Manhattan (-2.0; 87 to 7.2). Two of these three episodes (The Lazarus Experiment and Daleks in Manhattan) featured the 10th Doctor, as did four other episodes which dropped considerably in stature: Planet of the Dead (-1.8; 88 to 7.6), The Poison Sky (-1.7; 88 to 7.7), as well as Partners in Crime and The Doctor’s Daughter (both -1.6; 88 to 7.8).

Finally, two other episodes dropped from just-below-average in stature to well-below-average: Fear Her (-1.8; 83 to 6.2) and In the Forest of the Night (-1.5; 83 to 6.4). Other than the fact that seven of the nine episodes named here and in the previous paragraph feature the 10th Doctor, there is no obvious pattern I can discern as to which episodes most declined in stature.

In the final “episode” in this series, I will use these data to determine which of the nine Series and four Doctors since the 2005 reboot were—and are—the most- and least-admired.

Until next time…


Published by

Matt Berger

I am a data geek, writer, investigator and film noir devotee with academic training in political science (Yale BA, Harvard MA), biostatistics (Boston University SPH MA) and epidemiology (Boston University SPH PhD). In January 2021, I finished writing Interrogating Memory: Film Noir Spurs a Deep Dive Into My Family History...and My Own for which I seek literary representation and a publisher. In Chapter 6: So...What Is Film Noir, Again?, I analyze my film noir database, which contains 4,825 titles. My musical holy trinity is Genesis, Miles Davis and Stan Ridgway. I am a liberal Democrat who understands a thriving democracy requires at least two mature political parties. I grew up just outside Philadelphia, and I live just outside Boston, where my wife and daughters keep me happy, sane and grounded. Please ask me anything else you want to know.
