While LabROSA also has playlists from OpenNap, they carry no preference information: a song is either on a person's playlist or it is not. I've been using Last.fm's API to try to remedy this situation. First, I gathered the top listeners for each of the 400 artists in the USPop2002 set. Over the past couple of weeks I have been pulling each listener's weekly chart lists and combining them to get the total number of plays of each song per listener. While the number of plays may not be a direct measure of preference (or rating), it is reasonable to assume that people listen to songs they like more than the ones they do not like. At the moment, I have only downloaded about 4000 listeners (I have to download several pages per listener, and Last.fm requests a 1 second wait between requests). Also, artist names appear in several different variants. Rap and hip-hop seem to be exceptionally bad, since those artists seem unable to do any song without a guest star.
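To make the bookkeeping concrete, here is a rough sketch of the three pieces described above: waiting between requests, collapsing artist-name variants, and summing weekly charts into per-song play counts. This is not my actual download script. The names `fetch_all`, `normalize_artist`, and `total_plays` are made up for illustration, the `fetch` callable stands in for whatever actually calls the Last.fm API, and the guest-star pattern is an assumption that only covers the "feat." / "ft." / "featuring" spellings.

```python
import re
import time
from collections import Counter

# Assumed guest-star spellings; real chart data will need more cases.
_GUEST = re.compile(r"\s*[\(\[]?\b(?:feat\.?|featuring|ft\.?)\s.*$", re.IGNORECASE)


def normalize_artist(name):
    """Drop guest-star credits, then collapse case and whitespace."""
    base = _GUEST.sub("", name)
    return " ".join(base.lower().split())


def fetch_all(items, fetch, delay=1.0, sleep=time.sleep):
    """Call fetch(item) for each item, pausing `delay` seconds between calls.

    `fetch` is a placeholder for the real API request; `sleep` is injectable
    so the loop can be exercised without actually waiting.
    """
    results = []
    for i, item in enumerate(items):
        if i > 0:
            sleep(delay)  # no pause before the first request
        results.append(fetch(item))
    return results


def total_plays(weekly_charts):
    """Sum play counts over a listener's weekly charts.

    weekly_charts: iterable of charts, each a list of (artist, track, plays).
    Returns a Counter keyed by (normalized artist, lowercased track title).
    """
    totals = Counter()
    for chart in weekly_charts:
        for artist, track, plays in chart:
            key = (normalize_artist(artist), track.strip().lower())
            totals[key] += plays
    return totals
```

With this, a week charted as "Ja Rule feat. Ashanti" and another charted as "Ja Rule" end up under the same key, so their plays are added together rather than counted as two artists.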
There's tons of data to play with, but for now, let's look at which artists are popular. Note: there are still thousands of users to download, and some artists' top 50 listeners have not been reached yet. These results should be taken with caution so that we don't leap to Montauk monster conclusions (it's a raccoon, let it go people).
One neat thing occurred in the top 5 artists: Beatles, Radiohead, Pink Floyd, David Bowie, Queen. Only the Beatles and possibly David Bowie have had enough users downloaded from their own top-listener lists to explain such high results. Indeed, it appears that the other artists would be just as popular if I had taken a random group of users (note: I'm sure the Beatles will show the same pattern once I extract more pages).
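One way to check that kind of claim is to split each artist's plays by where the listener came from: the artist's own top-listener list, or some other artist's list. The sketch below is only an illustration of that split. The function name `split_plays`, the input shapes, and the toy numbers in the usage note are all made up, not taken from the downloaded data.

```python
from collections import defaultdict


def split_plays(plays, seeded_from):
    """Split each artist's play count by listener origin.

    plays: iterable of (user, artist, count) tuples.
    seeded_from: dict mapping user -> set of artists whose top-listener
        list that user was collected from.
    Returns {artist: {"own": plays by the artist's own top listeners,
                      "other": plays by everyone else}}.
    """
    out = defaultdict(lambda: {"own": 0, "other": 0})
    for user, artist, count in plays:
        origin = "own" if artist in seeded_from.get(user, ()) else "other"
        out[artist][origin] += count
    return dict(out)
```

For example, with made-up inputs `plays = [("u1", "Queen", 10), ("u2", "Queen", 40)]` and `seeded_from = {"u1": {"Queen"}, "u2": {"Radiohead"}}`, the result for Queen is 10 "own" plays and 40 "other" plays. A large "other" share is what you would expect from an artist who is popular with listeners in general rather than just with their own top fans.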
I'll have more later.