This website presents a consolidated list of the albums and songs that have been on top of the music charts since 2000. It employs charts from the USA, UK, Germany, France, Canada, Australia, Italy and Spain to work out the top songs and albums for every day, month and year between 1 Jan 2000 and 31 May 2018. This is version 0.3.0039 of the data and was processed on Thu, 07 Jun 2018 20:00:54 GMT.
The site consolidates 1,290,478 weekly chart entries (735,401 about albums and 555,077 about songs). These entries contain 31,485 distinct artist names, 32,703 song names and 42,632 album names. The site lists the overall top 1000 artists, 2000 songs and 2000 albums (each with a plot to show how success varies through time) as well as providing charts for every year, month and even every day between 1 Jan 2000 and 31 May 2018.
This website is the modern equivalent of //tsort.info/, a website that combines music charts starting in 1900. For that site the scarcity of source data (especially for the years before 1950) means that every available chart has to be employed. In contrast, this site can rely on the fact that since the year 2000 extensive and trustworthy charts have been available for all the largest music markets. This allows us to estimate the worldwide top 10 songs for every single day.
Most of the pages on this site provide summary information, for example the daily, monthly and annual charts, and details about the top songs, albums and artists. All pages have a comment box at the bottom to allow you to ask questions, point out mistakes and suggest improvements.
This page is different from the rest of the site, in that it contains quite a lot of text (the other pages attempt to minimise text in order to make finding the information easier). This page provides the background to explain where the original data came from, how the site is generated and who we are. It has the following sections:
In addition to providing web pages we have also created some data files that let the reader do further analysis (should they want to). The following data files are available for download:
Following a suggestion from one of our readers we also provide CSV files listing the top songs and albums of each decade covered:
All this data is copyrighted and should only be used if the original source (i.e. this web site) is credited and the version number is mentioned (which for this data is 0.3.0039). We have no issue with people using this data for personal use, or with people who use the data and tell everyone where it came from. Those who use these results and attempt to pass them off as their own work are liable to have action taken against them.
The charts used are the best ones available for each country. We have only used charts that are considered "notable" according to the definitions used by Wikipedia, which has limited the data that we can get hold of. At the moment the following chart data is being used:
|Country||Source||Coverage|
|USA||//www.billboard.com||Top 100 songs and top 200 albums each week|
|UK||//www.officialcharts.com||Top 100 songs and albums each week|
|Germany||//germancharts.com||Number 1 songs and top 15 albums up to 2007. Top 10 songs and albums after Jan 2007|
|France||//lescharts.com||At least top 100 songs and top 75 albums each week|
|Canada||Wikipedia and //www.billboard.com||RPM number 1 singles for 2000. Billboard top 100 singles after 2007. Top 10 albums from 2000 - 2007 and at least top 25 albums after 2007|
|Australia||//australian-charts.com||Top 50 songs and albums each week|
|Italy||//italiancharts.com||Top 20 songs and albums each week|
|Spain||//spanishcharts.com||Top 100 albums each week from 2005 and top 50 songs (only top 20 between 2000 and 2008)|
|Netherlands||//www.dutchcharts.nl||Top 100 albums and singles each week|
Many of these sites have dates with no entries and individual positions missing in otherwise good charts. In addition the names of artists, songs and albums are inconsistent (both between charts and even from week to week in the same chart).
The first and most obvious challenge with all these sources is inconsistency. Take as an example the song "It wasn't me". In the US, UK and French charts we have the following entries:
|Chart||Date||Position||Artist||Title|
|Billboard||16 Dec 2000||4||Shaggy Featuring Ricardo 'RikRok' Ducent||It Wasn t Me|
|Billboard||13 Jan 2001||2||Shaggy Feat. Ricardo 'RikRok' Ducent||It Wasn't Me|
|UK Chart||18 Feb 2001||37||Shaggy||It Wasn't Me|
|UK Chart||25 Feb 2001||37||Shaggy Ft Rikrok||It Wasn't Me|
|France||17 Mar 2001||4||Shaggy feat. RikRok||It Wasn't Me|
To a human being it is obvious that all these entries refer to the same song, by the same combination of artists. It is, however, quite hard for a computer program to correctly identify that fact without also generating false positive matches. Even the fact that the name in the first row is missing the apostrophe would make a naïve program fail. When entries are laid out next to each other these types of anomalies are easy to spot. In this case these five entries were hidden in the 100 positions listed in each of the weeks shown.
During the period covered by this site there was a large number of songs that were attributed to "featured" artists. This makes matching artist names particularly difficult (and also makes properly assigning the credit rather problematic). We've attempted to assign each song to a combination of artists in a consistent way (and not use the word "featuring").
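To illustrate why this matching is hard, here is a minimal Perl sketch of one crude approach. It is not the method used by this site: the helper names `squash` and `match_key` are invented for this example, and keying on the lead artist plus a stripped-down title would certainly produce false positives on real data.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only lower-case letters and digits, so that
# "It Wasn t Me" and "It Wasn't Me" both become "itwasntme".
sub squash {
    my ($text) = @_;
    $text = lc $text;
    $text =~ s/[^a-z0-9]//g;
    return $text;
}

# Hypothetical helper: build a crude key from the lead artist (the text
# before any "featuring" marker) and the squashed title.
sub match_key {
    my ($artist, $title) = @_;
    my ($lead) = split /\s+(?:featuring|feat\.?|ft\.?)\s+/i, $artist;
    return squash($lead) . '|' . squash($title);
}

# The five chart entries for "It Wasn't Me" listed above.
my @entries = (
    [ "Shaggy Featuring Ricardo 'RikRok' Ducent", "It Wasn t Me" ],
    [ "Shaggy Feat. Ricardo 'RikRok' Ducent",     "It Wasn't Me" ],
    [ "Shaggy",                                   "It Wasn't Me" ],
    [ "Shaggy Ft Rikrok",                         "It Wasn't Me" ],
    [ "Shaggy feat. RikRok",                      "It Wasn't Me" ],
);

my %keys;
$keys{ match_key(@$_) } = 1 for @entries;

# All five entries collapse to a single key.
print join(", ", sort keys %keys), "\n";   # prints: shaggy|itwasntme
```

The sketch throws away the featured artists entirely, which is exactly the information needed to assign credit properly, so a real implementation would need to do considerably more.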
Say we select a random date, for the sake of argument the 12th Feb 2009 (just because it was the 200th anniversary of the birth of a great man). What did the various pop charts look like on that particular Thursday?
In these eight countries the top 5 songs were those shown above. Now, before we go any further, I would like to apologise to any reader who understands Japanese, as I am confident that the entries as listed contain some stupid mistakes. There are a few other things that are obvious, even from this short list:
If we want to consolidate these types of charts then a few ground rules will be required. First, the correct handling of non-ASCII characters raises all sorts of issues, so all entries will have to be converted to ASCII. In cases where accents should be used it is easy enough for the reader to fix the issues later. Secondly, the chart from Japan is just too difficult to convert. Luckily this doesn't impact us much, as most Japanese hits appear nowhere else. It is unfortunate, but to make the processing tractable we just have to ignore Japanese charts until someone shows us how to deal with them.
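One common way to do this kind of ASCII conversion in Perl is sketched below. This is an assumption about how it could be done, not a description of the site's own code. It uses the core `Unicode::Normalize` module to decompose accented letters and then drops the combining marks. The function name `to_ascii` is invented for this example.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: strip accents and replace anything that still
# falls outside ASCII with a question mark.
sub to_ascii {
    my ($text) = @_;
    my $decomposed = NFD($text);          # "e-acute" becomes "e" + combining accent
    $decomposed =~ s/\p{Mn}//g;           # drop the combining marks
    $decomposed =~ s/[^\x00-\x7F]/?/g;    # whatever is left and non-ASCII becomes "?"
    return $decomposed;
}

# \N{U+00E9} is e-acute and \N{U+00E0} is a-grave.
print to_ascii("D\N{U+00E9}j\N{U+00E0} Vu"), "\n";   # prints: Deja Vu
```

Letters that have no decomposition (such as "ø" or "ß") end up as "?" in this sketch, so a real conversion would also need a hand-written substitution table for those.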
Having identified all the songs that were in the charts on a particular day, how should we present them? One option would be to just list them, but each chart has many more than these 5 entries and we have more than just these 8 charts, so that could be excessive. The obvious way to identify the "biggest hit" is to assign each item some kind of score and only list the highest achievers.
In order to work out which song to give most credence to we have to convert positions in weekly charts into some kind of "score". Obviously the number 1 record in a particular chart gets more points than the number 2 record and so on, and just as clearly the difference between say the 10th and 11th position is greater than that between the 30th and 31st. So we know that the score function has to be some kind of descending curve that flattens out, like the curve of 1/x or 0.5^x.
Luckily we have some data that relates sales volume of music to positions in a chart, taken from our companion site, //tsort.info/. If we try to match this curve with either a reciprocal or a power law curve then the shape fails to match, because both those curves overemphasise the "middle ranking" positions.
The simplest formula that we found to match this data is:
When we actually tried using this formula to assign scores we found that it overestimated the impact of the lower positions in the chart, so in our calculations we replaced the 3 with 2.5 and applied it only to positions in the top 300:
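The formulas themselves are not reproduced in this text. Judging from the Perl code shown later on this page, and from the statement that the 3 was replaced with 2.5, they presumably have the following form. The symbols $p$ (chart position) and $s(p)$ (relative score, with a number 1 scoring 1) are notation chosen for this sketch.

```latex
% Original fit, then the adjusted version used in the calculations
s_{\text{fit}}(p) = 1 - \frac{\log_{10} p}{3}
\qquad\longrightarrow\qquad
s(p) = 1 - \frac{\log_{10} p}{2.5}, \quad 1 \le p \le 300
```

With the divisor 2.5 the score would reach zero at $p = 10^{2.5} \approx 316$, which fits with stopping at position 300.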
Since the year 2000 the IFPI (International Federation of the Phonographic Industry) has published an annual report that provides estimates of the revenue generated by the music industry in different countries. This does not provide a complete chart for all years so we have had to interpolate some values. With those interpolations we have the following (numbers are millions of US dollars):
These values provide a scaling factor for comparing a chart entry in one place with one in a different year and location. It gives us a measure of the importance of different countries. It also gives us confidence that if we have taken account of the top 10 countries any undiscovered entries in countries below the Netherlands will have little impact on the overall results. This is important since we don't have a reliable source for Brazil's charts.
This also allows us to see how the revenues of the music business have dramatically dropped since the year 2000. Another interesting element is that revenue has dropped by 65% in the USA and almost 50% in the UK while over the same period Germany's revenue only dropped 40%. This means that in 2013 Germany overtook the UK to become the 3rd largest market. In 2007 Australia overtook Canada to become the 6th largest. So the relative importance of these different territories changed between 2000 and 2014.
If we average the revenue from each country over this period it is obvious that the sales from the top five countries dominate the total figures. Out of those five the obvious anomaly is Japan. As we have already said, most music that is a hit there does not show up in any of the other major countries.
There are two reasons why the total scores for songs, albums and artists can be inconsistent with the sum of their parts. The main one is the way scores get assigned to contributing artists. Suppose for example we consider the song "Blurred Lines" by Robin Thicke, T.I. & Pharrell Williams. On its own this gets a score of something like 16,500, so the artist responsible gets assigned that score. In this case the artist "Robin Thicke, T.I. & Pharrell Williams" only had one hit but this was sufficient to make them artist number 281. But the artists "Robin Thicke" and "Pharrell Williams" were notable enough on their own to have individual entries. So we assign an additional 8,250 to each of them for the work they put in to "Blurred Lines". We have had to take a very simplistic approach here: the headline artist gets the full tally of points, and any further artists get 50% of the points (however many of them there are). This factor means that an artist who appears in numerous collaborations might get a higher score than one above them in the list.
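The "Blurred Lines" example can be written out as a short Perl sketch. It assumes, based on that example, that the "headline artist" is the credited act as a whole and that each individual with their own entry receives half the score. The function name `assign_credit` and its argument layout are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical sketch: the credited act gets the full score and each
# named individual with their own entry gets half of it.
sub assign_credit {
    my ($credited_act, $score, @individuals) = @_;
    my %credit = ( $credited_act => $score );
    for my $name (@individuals) {
        next if $name eq $credited_act;   # a solo act is not credited twice
        $credit{$name} = $score / 2;
    }
    return %credit;
}

# Score of "something like 16,500", as quoted in the text.
my %credit = assign_credit(
    'Robin Thicke, T.I. & Pharrell Williams', 16_500,
    'Robin Thicke', 'Pharrell Williams',
);

printf "%-40s %6d\n", $_, $credit{$_} for sort keys %credit;
```

In this sketch the song hands out 33,000 points in total across three artist entries, which is why artist totals need not equal the sum of the song scores.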
The second anomaly has to do with the way the "success curves" are generated. In order to create the effect we wanted we had to divide the year into a number of equally sized chunks. Processing on a daily basis would give too much detail, using weeks would make handling the switch from one year to the next "challenging", and months have odd lengths through the year. The underlying fact is that unfortunately 365¼ days won't divide evenly into anything. So what we did was split each year into 40 sections (9.125 days each for most years and 9.15 days for leap years). Because we can only process individual days this means that the actual periods are sometimes 9 days and other times 10 days. When we stitch these together there is the potential for some rounding errors.
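A minimal sketch of one way to split a year into 40 sections is shown below. The rounding rule (truncating towards zero) and the name `section_of_day` are assumptions made for this example. The site may place the section boundaries differently.

```perl
use strict;
use warnings;

# Hypothetical helper: map a 1-based day of the year to a section
# numbered 0 to 39.
sub section_of_day {
    my ($day_of_year, $days_in_year) = @_;
    return int( ($day_of_year - 1) * 40 / $days_in_year );
}

for my $days_in_year (365, 366) {
    # Count how many days fall into each of the 40 sections.
    my @length = (0) x 40;
    $length[ section_of_day($_, $days_in_year) ]++ for 1 .. $days_in_year;

    # Summarise how many sections have each length.
    my %sections_with;
    $sections_with{$_}++ for @length;
    print "$days_in_year days: ",
        join(", ", map { "$sections_with{$_} sections of $_ days" }
                   sort { $a <=> $b } keys %sections_with), "\n";
}
```

Every section ends up with either 9 or 10 days, matching the description above, so neighbouring sections are not quite comparable in length.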
So now we have all the elements required to calculate a score over a given period. We first split the timeframe into individual days, then assign scores to each song or album that appears in any chart on that day. The score assigned is calculated from the position and the music revenue in that particular country for the year in question. Here is the Perl code used:
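    # Annual music revenue for this country and year, in millions of US dollars
    my $year_val_M = market_size($country,$year);
    # Value of a number 1 for one day (1% of the daily take), in thousands of dollars
    my $num1_val_k = ($year_val_M*10)/365;
    # Scale down by chart position using a base-10 logarithm
    return $num1_val_k*(1-log($position)/(log(10)*2.5));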
In this case the year value is returned in millions of dollars. We assume that a number one is worth 1% of the daily take (this just makes the maths easier). The position then adjusts that (so number 2 is worth 0.88%, number 3 0.81% and so on). We sum up all the scores from charts that day and that provides an overall score for that item on that day in all the charts it appears in.
By adding up all the scores from days in a given month (or year) we can get an idea of which songs and albums were the most successful for any time period.
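The sketch below wraps the Perl snippet above in a function and sums it over several charts for one day. The market sizes are invented placeholders, not the IFPI figures, and the country codes, the `daily_score` name and the treatment of positions beyond 300 as scoring zero are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Placeholder market sizes in millions of US dollars. These numbers are
# invented for this example and are NOT the values used by the site.
my %MARKET_M = ( US => 5000, UK => 1500 );

sub market_size {
    my ($country, $year) = @_;
    return $MARKET_M{$country};   # this stub ignores the year
}

# Same calculation as the snippet above, with the assumed cut-off at 300.
sub daily_score {
    my ($country, $year, $position) = @_;
    return 0 if $position > 300;
    my $year_val_M = market_size($country, $year);
    my $num1_val_k = ($year_val_M * 10) / 365;
    return $num1_val_k * (1 - log($position) / (log(10) * 2.5));
}

# Hypothetical entries for one song on one day: [country, year, position]
my @entries = ( [ 'US', 2009, 1 ], [ 'UK', 2009, 3 ] );

my $total = 0;
$total += daily_score(@$_) for @entries;
printf "Score for the day: %.1f\n", $total;
```

Repeating this for every day in a month or year, and adding the daily totals, gives the kind of period score described above.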
This site should be considered as a complement to the //tsort.info/ site. This site deals only with charts that are more recent than 1 Jan 2000, while that site is especially good for combining chart positions from multiple sources over a longer period to create reliable results.
|Feature||This site||tsort.info|
|Period||1 Jan 2000 - 31 May 2018||1900 - Last year, but results become unreliable after about 2009|
|Charts||Only notable charts, one for each major country||Any chart that is both available and seems reliable, this becomes more relaxed for earlier time periods and countries that are seldom covered|
|Chart Types||Weekly charts only||Any chart that is applicable: including weekly, monthly and annual charts, lists of gold records, Oscar, Grammy, Brit, Juno and other awards and so on|
|Calculation||Combine daily scores using a justified model and the known music market size in each country||Custom tuned algorithm that provides reasonable results for the wide range of input dates and chart types employed|
|Style||Clean, data focused (sans-serif)||Detailed, focused on interesting results (serif)|
Most of the charts available from before 2000 do not have weekly data (or where they do it is hard to deal with). So the tsort.info algorithm relies on having a relative position and possibly a count of the duration. For the Chart2000 data we are in a much better position: there is enough data to allow us to track the position of songs and albums for every individual day since 2000 in most of the major countries.
We attempt to keep this site as up to date as possible by gathering and processing the data at the start of each month. This covers the charts up to the end of the previous month, so when you visit, last month's charts should be available.
We rely on data from a number of sources and we believe that our use of these public resources is "Fair Use" because this site presents a transformative view of a small proportion of each source's overall content. This site is an "original work of authorship" that presents a "compilation work". The content here consolidates facts presented within the sources without duplicating the particular form that the sources present them in (and facts are not copyrightable).
The sources provide more than 1,000,000 entries providing weekly positions of 27,000 artist names with 28,000 song titles and 36,500 album names. The output data here lists about 18,000 entries (about 8,000 albums and 9,000 songs) assigned to 2,000 song artists and 1,200 album artists (900 or so artists have both song and album entries).
You can contact us to discuss any issues, suggest data fixes or complain about the positioning of your favourite artist by filling in the form at the bottom of each page.