Measuring the Unmeasurable: Ranking One’s Favorite Music, Part 4

In the first three essays in this series (here, here, here), I detail the evolution of the 434 music “mixes” I have created since August 1981, when I put a speaker from my stereo turntable next to a hand-held tape recorder, hit “Record,” and captured 14 current favorite tracks onto a 60-minute cassette I called “My Stuff.” These mixes migrated from cassette to CD to 150-GB click-wheel iPod, with a handful of videocassettes added along the way. As of April 2024, I have put 3,625 tracks onto 9,270 “slots.” Thus, the average mix has 20.4 slots, or tracks, on it.[1]

At the end of 1992, it occurred to me that I could rank my favorite tracks using data from these mixes: the more times I put a track on a mix, the more I liked it, adjusting for “burnout.” Using hypotheses (and a very subjective measure called “Future Placement”) I detail in the earlier essays, I calculated my 100 favorite tracks in 1993, 1994 (twice), 1998 (twice), 1999, 2000 and 2004. I misplaced the 2000 list, though I can reconstruct which tracks appeared on it, if not the exact ordering; I recall #1 was “Coming Up You” by The Cars. The 2004 list was the first I made after rebuilding my Excel workbook from scratch so that it contained track listings from every mix cassette I recorded prior to “Boston Drive Vol. I-VI.” These six cassettes, which I recorded in August 1989, are an executive summary of the 56 mixes I had created over the previous eight years, plus nine first-time tracks. I listened to them during my drive from Philadelphia to Somerville, MA to start a doctoral program.

Through 2013, meanwhile, these were the only track-ranking data I had. That year, I bought a new PC, and I switched from Musicmatch Jukebox to iTunes, with its accompanying play count.[2] For the first time, I knew how many times I had played a track, albeit only on my iPod or computer – and only since 2013. At the same time, the number of tracks in my digital library increased sharply from 5,350 (July 2010) to 8,767 (March 2015) to 9,948 (April 2024) as I uploaded every CD I owned – and digital versions of most of my vinyl albums – into it, while I purchased hundreds of tracks on iTunes.

Thus, by the early 2020s, when I began to write these essays, I had two distinct data sources I could use to rank nearly 10,000 tracks in an objective way: 1) appearances on mixes and 2) plays recorded in iTunes. All I needed to do was to create a single “appearance” measure and a single “plays” measure, then combine them into a single “score” measure, which I could use not only to rank my favorite tracks, but also my favorite albums, artists, genres, years, etc.

Easy, right?

***********

Spoiler: it was not easy.

We start with data from the 434 mixes[3] I constructed between August 1981 and November 2023. After reviewing my hypotheses, I calculated seven values for each track:

  • COUNT: Number of appearances by track overall
  • YEARS: Number of distinct years track was recorded at least once
  • MIN: First year track was recorded
  • MAX: Most recent year track was recorded
  • MEAN: Mean of years track recorded at least once
  • MEDIAN: Median of years track recorded at least once
  • SD: Standard deviation of years track recorded at least once

The first measure needs no explanation.

The second measure helps to control for too many appearances in too short a period, resulting in “past burnout.” Since values on the first two measures are identical for 2,790 (77.0%) tracks, however, I calculated the geometric mean (square root of the product) of the two values, calling it COUNTYEARS.
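
As a sketch, the COUNTYEARS calculation looks like this in Python (the function name is mine; the actual work lives in an Excel workbook):

```python
import math

def countyears(count: int, years: int) -> float:
    """Geometric mean of total appearances (COUNT) and distinct
    appearance-years (YEARS): sqrt(COUNT * YEARS)."""
    return math.sqrt(count * years)

# Nine appearances spread over nine years keep full weight...
print(countyears(9, 9))   # 9.0
# ...while nine appearances crammed into one year are discounted.
print(countyears(9, 1))   # 3.0
```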

Two-thirds of the 3,625 tracks appear on only one (37.2%) or two mixes (29.9%), with another 24.4% appearing on between three and five mixes, and 7.3% between six and 10 mixes. Only 45 (1.2%) tracks appear on more than 10 mixes, topped by “When Ye Go Away” by The Waterboys and “Too Hot” by Kool and the Gang (22 appearances), “The Public Eye” by Mark Isham (23), and “The Carnival Is Over” by Dead Can Dance and “The Shadow” by Jerry Goldsmith (26). The Goldsmith and Isham movie themes reflect a growing appreciation for movie soundtracks. The mean number of appearances is 2.6, with a median of two.

The seventh measure reflects that I like tracks to which I return over decades more than those whose appearances cluster in a few months or years. Standard deviation is a better measure of this “spread” than range, which only captures the first and most recent appearances.

The four middle measures, finally, were an attempt to capture how long a track has been appearing on mixes (minimum); how much I like the track now (maximum); and whether appearances clustered on the earliest mixes, on the most recent mixes, or somewhere in between (mean, median). Given how similar mean and median are, I also calculated their geometric mean (MEANMED).

Immediately, I realized I had a problem. For nearly three decades, I had been fixated on appearances on mixes as the sole indicator of how much I liked a track. However, more than 6,000 tracks now under consideration never appeared on a mix; they were simply the other tracks on albums I owned. And while “0” is the obvious value for these tracks for COUNT, YEARS and COUNTYEARS, it is not at all obvious what value to assign these tracks for the other six measures. Not loving my options, I settled on a value of “1980,” the year before I began constructing mixes, as minimum, maximum, mean and median – and, thus, a value of “0” for standard deviation.

But this “solution” created an anomaly. The logic behind MIN is that the earlier a track first appeared on a mix, the more I like it; it has more continuous appeal. However, the non-mix tracks now had a minimum lower than the mix tracks, the opposite of what I wanted. One “solution” is to assign “1980” for MAX and “2024” for MIN. But this creates a new anomaly, as SD is now 22.0 for non-mix tracks, much higher (i.e., I like them more) than the mean of 3.1 for mix tracks. The resulting non-mix track MEAN, MEDIAN and MEANMED of 2002 was also not much lower than the mix-track mean of 2004.

Despite this flashing red methodological light, I remained convinced that when tracks appeared was necessary for an accurate, objective ranking of all ~10,000 tracks. Sticking with the values of “0” and “1980” for non-mix tracks, I used factor analysis to combine these nine measures into a single “Media Appearance” index.[4] I spare you the many permutations I went through as it slowly dawned on me that, for the most part, when a track appears on a mix tells me nothing about how much I like that track. My usual rules of track-making – alternating first-time tracks with returning tracks, nearly all of which had themselves only recently appeared for the first time – meant that all I was really measuring was when I first became aware of or acquired a track. The need to conserve just over half the available slots for first-time tracks severely limited how many older tracks could appear on any subsequent mix.

Even so, I still used “adjusted median” – how far in either direction MEDIAN was from the overall median (which, at least, had the salutary effect of minimizing this value for the non-mix tracks) – along with COUNTYEARS and SD to construct a first version of Media Appearance. This index had high face validity, meaning the way it ordered tracks made intuitive sense, but it led to two additional problems. One was that using factor analysis meant importing data into STATA, making the process more time-consuming.[5] Ultimately, I would like to simplify the process of assigning scores to tracks as much as possible.

We return to the other problem shortly, because it relates to how to combine Media Appearance (“MA”) with a measure of plays to create the elusive track score.

***********

Despite its maximal face validity – the more times I play a track, the more I like it (with the omnipresent caveat of burnout) – the plays measure presented its own challenges.

First, it was demonstrably incomplete, even beyond the severe mismatch in first track appearance (1981) and first play (2013). Putting a track onto a mix increased its number of plays, often substantially – so what do I do about the tracks on the 281 mixes (64.7%) created prior to 2013?

Second, it includes many plays that were not of my own choosing. On multiple occasions, for example, I accidentally started playing the alphabetized-by-artist tracks on my iPod, beginning with “A Woman’s Got the Power” by The A’s, not realizing I had done so until well into, say, the 29 Alan Parsons Project tracks. I also often found it easier to listen to birthday mixes in the car on my iPod. While nearly every track I curated for our children was one I had either put on one of my mixes, or could see doing so at some point, there were eight tracks I would never put onto a mix. But now they had artificially high plays values.

Meanwhile, I built one of every three mixes (147) – traditional, bathtub, family travel and Thanksgiving – between 2012 and 2017. Fifty tracks appeared more than five times on those mixes – substantially increasing their Plays value – with six appearing 10 or more times: the top five tracks by appearances overall plus “Beyond Belief” by Elvis Costello & the Attractions (10). There was thus an additional recency bias built into Plays. Related to this was my choice to put select “triplets” – three tracks appearing in succession on an album – onto one or more of these mixes: “Bye Bye Love,” “Moving in Stereo,” and “All Mixed Up” on The Cars by The Cars; “Stand or Fall,” “The Strain,” and “Red Skies” on Shuttered Room by The Fixx; and “Omega Man,” “Secret Journey” and “Darkness” on Ghost in the Machine by The Police. Each of these nine tracks thus had more plays than would otherwise be expected.

Next, because studio and live versions of the identical track and artist are listed separately in iTunes (mostly from 16 live albums – six by Genesis), I was artificially lowering plays for the studio versions of these tracks. When I constructed my Excel workbook, I collapsed studio and live versions (same track, same artist) into a single track. Doing this with plays (and summing studio and live plays for the same track/artist) reduces the number of tracks being analyzed to 9,563.

Finally, because my iPod is not synced to the desktop PC I needed to purchase in September 2018, plays only increase for tracks played in iTunes on my PC.

I addressed these issues in reverse chronological order, starting with capturing new plays – not just on my iPod, but through watching YouTube videos, etc. Now, every time I play a track somewhere other than on my PC, I “play” that track on my computer by clicking its beginning, then jumping to the final few seconds. And for score calculation, I collapse studio and live versions of the same track by the same artist into a single “track.”
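
The studio/live collapse is just a group-and-sum on (track, artist) pairs; a minimal Python sketch with invented example rows (my library actually lives in Excel, not code):

```python
from collections import defaultdict

def collapse_versions(rows):
    """Sum plays across studio and live versions of the same track
    by the same artist, yielding one row per (track, artist) pair."""
    totals = defaultdict(int)
    for track, artist, plays in rows:
        totals[(track, artist)] += plays
    return dict(totals)

# Invented example rows: (track, artist, plays)
library = [
    ("Mama", "Genesis", 12),   # studio version
    ("Mama", "Genesis", 5),    # live version
    ("Red Skies", "The Fixx", 8),
]
print(collapse_versions(library)[("Mama", "Genesis")])  # 17
```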

Meanwhile, as a first pass, I adjusted plays by…

  • Subtracting either 2.5, 5 or 10 plays for tracks by artists starting with the letter “A” through Any Trouble,
  • Either deducting about 25% of plays from tracks on birthday mixes or, for the eight never-mix tracks, setting their plays value to the median (3),
  • Deducting a percentage (between 5 and 20%) from plays for tracks on 2012-17 mixes based on number of appearances on those mixes, and
  • Deducting a small percentage of plays (~5%) from the “triplets.”

These adjustments allowed me to calculate Adjusted Total Plays (“ATP”), which I then converted to a z-score (number of standard deviations above or below the mean). Because I used factor analysis to create MA, I now had two scores with identical distributions: mean=0 and standard deviation=1. Furthermore, these scores were highly correlated (~0.85), meaning they measure the same underlying concept – how I rank-order favorite tracks.  Thus, all I had to do was sum ATP and MA to get Final Score.
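
The z-score conversion is standard; sketched here in Python with invented play counts:

```python
import statistics

def z_scores(values):
    """Express each value as standard deviations above or below the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD
    return [(v - mean) / sd for v in values]

# Invented play counts; the 40-play track sits well above the mean.
z = z_scores([3, 7, 2, 40, 3])
print(round(z[3], 2))  # 1.99
```

By construction, the resulting scores have mean 0 and standard deviation 1, matching the factor-analysis output for MA.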

Except, not so fast – because of the second problem with how I calculated MA.

While both scores had similar minima (between -0.500 and -1.000), ATP had a maximum of ~11.0 while MA had a maximum of ~5.0. Thus, if I simply summed the two values, I would artificially weight ATP roughly two times higher than MA. Nonetheless, I jerry-rigged another “solution”: multiply every value of MA by around 2 to make the two distributions even, then sum the two values.[6]
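
That rescaling amounts to stretching MA by the ratio of the two maxima; a Python sketch (the ~2x multiplier is simply whatever ratio falls out of the actual data, not a fixed constant):

```python
def rescale_ma(ma_scores, atp_max, ma_max):
    """Stretch MA by the ratio of maxima so the two distributions
    can be summed on even terms."""
    factor = atp_max / ma_max   # ~2.2 with the maxima below
    return [ma * factor for ma in ma_scores]

# Invented MA values, with maxima matching the text (~11 and ~5).
scaled = rescale_ma([5.0, 1.0, -0.5], atp_max=11.0, ma_max=5.0)
print(round(scaled[0], 6))  # 11.0 -- the top MA now matches the top ATP
```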

I now considered this sum “the” track ranking score. It had moderately high face validity, though it still tended to elevate tracks first acquired in the mid-2010s with a lot of early plays (e.g., “Ai No Corrida” by Chaz Jankel, which I like a great deal – but it is not in my Top 100). Every time I played a track, I updated ATP – increasing Final Score by about 0.057 – and I updated MA after making Thanksgiving cleanup mixes in 2022 and 2023. Otherwise, even though I was not wholly satisfied with this version of Final Score, other projects kept me from focusing on it.

***********

In mid-April, however, I decided I needed a mental break after writing (and discussing with two close friends) a new draft of Chapter 2 (Tragedy By the Oakford Bridge) of The West Philadelphia Story: An Immigrant Jewish Journey. So, I returned to the calculation of Final Score, determined to both simplify and finalize it.

In short, I decided it was time to do three things.

One, assign some number of plays to the tracks appearing on the first 281 mixes, adjusted for time.

Two, simplify the construction of MA so that it did not involve factor analysis.

Three, slightly adjust – using a simple, relatively-objective multiplier – any sum of ATP and MA.

Taking these actions in order, here is how I assigned additional plays to tracks on the 1981-2012 mixes:

  • Every track on a cassette or video mix (n=151, 1981 to 2003) received 20 additional plays, with 23 exceptions:
    • 10 Plays: Stuff and Such Vol. XXXI, Punk Becomes New Wave, Stuff and Such Vol. LXXXIV-LXXXV, half of 1997’s In Memoriam I-92 Rock of the 80s (if repeated from the 1994 version)
    • 25 Plays: The Anger Mix (Stuff and Such Vol. XXVI)
    • 30 Plays: I-92 Mix (1983), Boston Drive Vol. I-VI, and Stuff and Such Vols. XXX,[7] XLII, XLIII, XLVI, LIII, LIV, LXII, LXXXII, LXXXIII
  • Every track on CDs made between 2003 and 2009 (n=84, CD Stuff Vol. I-LXXXIII) received 10 additional plays, with 17 exceptions:
    • 15 Plays: CD Stuff IV, V, XVI-XVIII, XXIV-XXVI, XXX, XLII, L, LVII, LIX, LXV
    • 20 Plays: CD Stuff III, X, LXIV
  • Every track on CDs made between 2010 and 2012 (n=18, CD Stuff LXXXIV-CI) received 5 additional plays, with one exception:
    • 15 Plays: CD Stuff C
  • Every track on a bathtub mix made in 2006 or 2007 (n=19) received 2 additional plays, with two exceptions:
    • 10 Plays: First 2006, Final 2007
  • Every track on the three wedding mixes (October 2007) received 5 additional plays.
  • Every track on the 2008 and 2009 bathtub mixes (n=2) received 5 additional plays.
  • Every track on the 2010-2012 bathtub mixes (n=4) received 20 additional plays.

These additional plays were weighted by time ((Year-1980)/33), which had the effect of weighting more recent plays about twice as much as these “older” plays. That said, because mix cassettes were relatively harder to make, I gave them twice as many plays in general, evening out the two effects.
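
The time weighting is a single formula; a Python sketch (the function name is mine):

```python
def weighted_additional_plays(base_plays: float, mix_year: int) -> float:
    """Weight assigned pre-2013 plays by (Year - 1980) / 33, so plays
    on older mixes count proportionally less than recent ones."""
    return base_plays * (mix_year - 1980) / 33

# Boston Drive (1989): 30 base plays, weighted down to ~8.2.
print(round(weighted_additional_plays(30, 1989), 1))  # 8.2
```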

Mixes with higher additional plays were particularly beloved cassettes and/or CDs I played most while bathing on Friday evenings. Other than the repeated tracks on the two 1997 I-92 In Memoriam cassettes, the four cassettes with only 10 plays assigned to their tracks were either not very compelling (XXXI), made for a special purpose only (Punk Becomes New Wave) or not recorded with Dolby noise reduction (LXXXIV-LXXXV).

I also assigned up to five additional plays, also weighted by year, to just under 200 tracks appearing on other non-mix CDs (bought or burned for other purposes) played while bathing on Friday evenings. I played a LOT of music during baths that lasted as long as four hours.

When I had finished assigning additional time-weighted plays to ~2,900 tracks, I had nearly doubled ATP to ~100,000. Having added these pre-2013 plays, I removed all of the earlier adjustments except subtracting up to nine plays for accidental plays and deducting two-thirds of plays from the eight never-mix birthday-mix tracks.

Going forward, this is how I plan to calculate ATP.

Meanwhile, once I realized that SD efficiently captures the “chronologic spread” I was trying to capture with MIN, MAX, MEAN, MEDIAN and MEANMED, and that COUNTYEARS efficiently captures “burnout-adjusted appearances,” I simply multiplied the two values together – adding one to SD to account for values of “0” – and calculated a z-score.
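
The simplified MA thus reduces to one product plus a z-score; a Python sketch with hypothetical track values:

```python
import math
import statistics

def raw_ma(count, years, sd):
    """COUNTYEARS * (SD + 1): burnout-adjusted appearances times
    chronologic spread. Adding 1 to SD keeps mix tracks whose
    appearances all fall in a single year (SD = 0) from zeroing out."""
    return math.sqrt(count * years) * (sd + 1)

def ma_scores(tracks):
    """Convert raw MA values to z-scores."""
    raw = [raw_ma(c, y, s) for c, y, s in tracks]
    mean, sd = statistics.mean(raw), statistics.pstdev(raw)
    return [(r - mean) / sd for r in raw]

# Hypothetical (COUNT, YEARS, SD-of-years) values; the last row is a
# never-on-a-mix track, which lands at the bottom of the distribution.
tracks = [(26, 15, 12.0), (2, 2, 1.0), (1, 1, 0.0), (0, 0, 0.0)]
print([round(z, 2) for z in ma_scores(tracks)])
```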

And, voila! MA now ranges from -0.354 to 12.240, while ATP now ranges from -0.601 to 13.283, close enough to add without additional weighting. Moreover, they are correlated at a solid 0.75, high enough to indicate they measure the same underlying concept, but with some interesting differences. Going forward, this is how I plan to calculate MA…in Excel, not STATA.

Which left only what I call “Historic Adjustment.”

To each of the 162 tracks that appeared on a “Favorite Tracks” list I compiled in 1981, I assigned “1,” unless it was in the “upper echelon” or ranked #2-#6 in a separate ranking (2), or in the “top echelon” or ranked #1 (3). To each of the tracks I deemed a pre-1981 favorite, I assigned “1.” To each of the 80 vinyl 45 RPM singles I own, I assigned “1” for the A-side, as well as “0.5” to 17 well-liked B-sides; I also assigned “2” to five 12” singles. I treated the eight Top 100 rankings from 1993 to 2004 as follows: I assigned “3” to the number one track, “2” to the nine other top 10 tracks and “1” to the 90 remaining tracks. However, because the 1994-2000 rankings are not statistically independent of the 1993 ranking, I weighted each of these values by one-half. When I returned to the Boston area in 2005, I ranked my 100 favorite tracks of my 4+ years in the Philadelphia area. And in 2011, I ranked my 200 favorite tracks since returning to the Boston area. For these latter rankings, the #1 track, the remaining top 10 tracks and all other tracks were assigned “3,” “2” and “1,” respectively. Each of these values was weighted by time ((Year-1980)/43), except for singles, which were assigned a weight of 0.25.

Finally, I created a catch-all “Other” category, assigned a weight of 0.33. Thus, a track is assigned “1” if it appears in my Interrogating Memory “soundtrack,” or I put it on “Nell’s Survival Mix” in March 2006, or I once acquired its sheet music, or for a wide range of personal reasons. And in a nod to how my construction of traditional mixes evolved from a random process to one that was carefully thought out – including having the very first track on the mix (or on each mix in a set constructed at the same time[8]) be particularly special – I assigned “3” to the first track on a multi-mix set and “2” to the first track on every other mix in the set, as well as to the first track on a standalone mix. I also assigned “2” to the last track in a multi-mix set – I always left a special track to be the closer. I designate Stuff and Such Vol. XXXVI – constructed in July 1993 for a trip to Florida to visit my aunt – as the starting point for assigning these values, though, as with all of these measures, I will rethink them in the future.

I calculated Historic Adjustment – which can only ever increase a track’s rank (and is the closest I plan to come to “subjective adjustment”) – by summing the time-weighted values for each track, dividing by 10, then adding 1. As of this writing, Historic Adjustment allows me to increase a track score by up to 37.4% (“Blue in Green” by Miles Davis). The 919 tracks with Historic Adjustment>1 average an increase of just 5.9% (median=3.4%).

Multiplying this value by the sum of ATP and MA (or dividing, if the sum is negative) yielded my new FINAL SCORE.
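
Assuming the ATP + MA sum described earlier, the last two steps can be sketched in Python (the helper names and example values are mine):

```python
def historic_adjustment(weighted_values):
    """Sum a track's time-weighted historic values, divide by 10,
    add 1: a multiplier that can only ever help a track."""
    return sum(weighted_values) / 10 + 1

def final_score(atp, ma, ha):
    """Scale the combined score by HA; for negative scores, dividing
    by HA >= 1 moves the score toward zero, which is still upward."""
    combined = atp + ma
    return combined * ha if combined >= 0 else combined / ha

# "Top echelon" on the 1981 list (3), weighted by (1981 - 1980) / 43,
# plus one unweighted example value of 1.0:
ha = historic_adjustment([3 * (1981 - 1980) / 43, 1.0])
print(round(final_score(2.0, 1.5, ha), 3))  # 3.874
```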

The resulting ranking was a revelation. Not perfect, but FAR closer than I ever came before. And this is without any kind of post-hoc doubling and halving so that track order better fits my pre-conceived ideas. I have cursorily assessed the validity of these rankings by sorting tracks by year and artist, and the rankings still have very high face validity. Best of all, I had greatly simplified the process by 1) adding mix plays prior to 2013 and 2) letting go of the notion that when a track appeared substantively impacted how much I like it.

In the next essay in this series, I will unveil my Top 100 tracks (as of May 2024) – followed by my favorite artists and albums, and rankings of all three by year.

Until next time…and if you like what you read here, please consider making a donation. Thank you.


[1] This value is higher by 399 than the count given in the previous essay because I added other mixes – and might have miscalculated.

[2] Whatever play counts Musicmatch Jukebox recorded were lost when my PC crashed.

[3] I exclude special birthday mixes I made for our two children because I selected tracks for them, not for me.

[4] Using STATA 9.2.

[5] I have the same issue with creating Perceived Quality scores for films.

[6] I condense a lot of calculation trial and error here.

[7] This is arguably the best mix I ever made.

[8] Given how easy it had become to burn tracks quickly onto a CD from iTunes, I burned up to 11 mix CDs for trips to Philadelphia in the 2010s.
