Have you ever loaded a playlist of your favorite songs, hit random play and settled back expecting a well-mixed listen only to find that four out of the first ten songs are all by the Beatles? Do you ever find yourself clicking the shuffle button multiple times to try and remedy this, all the while suspecting the player harbors some secret love for the repeated artist?
If so, you are not alone. In this article, Brian puts his own suspicions to the test using iTunes, specifically analyzing the algorithm it uses to <i>play higher rated songs more often</i>.
I'll tell you how "anal" I have configured my iTunes using smart playlists. I have rated my entire collection. First let me establish the meanings of the star ratings: 0 = OnTheGo Exiled, 1 = Yet to be rated, 2 = Low Rotation, 3 = Medium Rotation, 4 = Super Rotation, 5 = OnTheGo Rated.
The idea behind it: all imported music gets 1 star. When on the road, if I like the tune, I update the rating to 5 stars. If I absolutely hate a tune, I'll exile it from playback by updating the rating to 0. When I get home and the stats are synchronised, I have a smart playlist called 'OnTheGo Rated'. I can then review the 'approved' songs and assign a more appropriate rating. This results in 3 smart playlists for Low Rotation, Medium Rotation and Super Rotation, much like a typical radio station or music video TV channel.
In the party shuffle mode, I assign another smart playlist, which 90% of the time is set to 'Rotation Low .. Super'. This makes sure that my party shuffle only plays approved tracks. If I only want "major hits", I pick a narrower selection using the smart playlist 'Rotation Medium .. Super' or just 'Rotation Super'. Furthermore, I've set up 'Best Of Genre' smart playlists, which enable me to narrow the selection to a specific genre.
The option 'Play higher rated songs more often' is used throughout the party shuffle mode, ensuring that more popular tracks are played more often. However, because I use 0 stars and 5 stars for OnTheGo rating and 1 star as the initial rating, the party shuffle mode only draws on the 2..4 star rated tracks. The majority of the tracks reside in the 2-star category, which for this statistical experiment is a big shift, as the article claims that '3-star' is the center of gravity. I'm not a big math freak, so perhaps you could tell me whether or not this has a positive influence on the 'Play higher rated songs more often' option.
Oh, to make the playlist mania complete: using this strategy, I've also created these smart playlists: 'Rated but never played', which lists all tracks that have a 2..4 rating but have never been played, either in iTunes or on the iPod itself, and an 'OTG to-be-rated' playlist, which features all 1-star tracks (the initial import rating). The latter is a handy tool to force yourself to review tracks and assign ratings, either in the comfort of home or on the road, using the OTG-exiled (0 stars) and OTG-rated (5 stars) ratings for easy rating ;). Oh, and to complete it all, I've also set up an 'Airplay top 100'. I'd like to hear all your comments on my implementation and the mathematical implications.
I question the validity of assuming a bell curve distribution for the song ratings.
It's a self-selected group... why would you import/purchase songs you don't like? I suspect the curve is skewed significantly.
I have no ratings for any of the 4363 songs in my library. Using regular shuffle, iTunes will play the same songs twice within a couple of hours. I do not have any duplicates. Also, the next day or the next time I restart iTunes, it will play many of the same songs it played the previous day or session. I have watched and checked and watched over the last 6 months, and this random thing is not so random.
...of time and man-power. The math behind it is simple and plain, the article just proves that itunes functions work.
I personally don't understand why people praise itunes so much as if it were some artefact. I think itunes is not geeky at all.
The formula in the article is unnecessarily complex. The evidence points to the following explanation, provided by Bert 690 in the Slashdot discussion of this story:
OK, after a bit more thinking, you were indeed very close. It appears the actual formula is:
points(0 stars)=1
points(1 stars)=3
points(2 stars)=4
points(3 stars)=5
points(4 stars)=6
points(5 stars)=7
probability(X stars) = points(X stars) / 26
This yields the following probabilities, listed alongside the observed values from the article with their 95% confidence intervals.
p(5 star)=.2692 [.270 +- .0038]
p(4 star)=.2308 [.230 +- .0036]
p(3 star)=.1923 [.189 +- .0033]
p(2 star)=.1538 [.154 +- .0031]
p(1 star)=.1154 [.118 +- .0027]
p(0 star)=.0385 [.039 +- .0016]
As you can see, each computed probability falls within the 95% confidence interval, so there's a good chance this is the correct formula.
Boy do I have too much time on my hands today.
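For anyone who wants to re-run the arithmetic in the quoted comment, here is a minimal Python sketch. The point values and the observed figures are copied from the lists above; the function names and the idea of expressing each gap in "half-widths" of the interval are only illustrative choices, not something taken from the article or from iTunes.

```python
from fractions import Fraction

# Points per star rating, as proposed in the quoted Slashdot comment.
POINTS = {0: 1, 1: 3, 2: 4, 3: 5, 4: 6, 5: 7}

# Observed frequency and 95% half-width per rating, copied from the list above.
OBSERVED = {
    5: (0.270, 0.0038),
    4: (0.230, 0.0036),
    3: (0.189, 0.0033),
    2: (0.154, 0.0031),
    1: (0.118, 0.0027),
    0: (0.039, 0.0016),
}


def predicted(points):
    """Probability per rating: its points divided by the total points."""
    total = sum(points.values())
    return {stars: Fraction(p, total) for stars, p in points.items()}


def deviation_in_half_widths(points, observed):
    """Gap between prediction and observation, in units of the half-width."""
    pred = predicted(points)
    return {
        stars: abs(float(pred[stars]) - obs) / half_width
        for stars, (obs, half_width) in observed.items()
    }


if __name__ == "__main__":
    pred = predicted(POINTS)
    dev = deviation_in_half_widths(POINTS, OBSERVED)
    for stars in sorted(POINTS, reverse=True):
        print(f"{stars} stars: predicted {float(pred[stars]):.4f}, "
              f"gap {dev[stars]:.2f} half-widths")
```

A gap below 1.0 means the predicted value lies inside the quoted interval. Five of the six ratings come out below 1.0; the 3-star value comes out at almost exactly 1.0, so it sits on the edge of its interval, a detail that the four-decimal rounding in the list hides.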
All of which reminds me of a question I found interesting in Information Theory...
What do we mean by random?
What definition do people use? Please make it as mathematically exact as you can...
Does the "play count" number affect the weight of the songs at all? Can another test be run with several tracks of the same rating, but with varying play counts?
One thing I noticed was that you only used six songs in your test. What I wonder is whether you named the songs 1, 2, 3, 4, 5, 6 corresponding to the number of stars, 1 being the zero-star song and 6 being the five-star song. The reason I mention this is that I had thought preference might be given on the basis of song title or artist, as listed alphabetically. I also wonder if there might be some collinearity with the number of times a song was previously played. The effect would be that initially the first six songs might be played rather randomly because of the small sample size, and from then on each song's play count might influence how often it is played in the future.
Another point I would like to make is that with a sample size of six, it is very hard to get a statistically significant outcome. Not that I'm hating on what you did, but I would like to see (or do) a study of, say, 300 samples, randomly titled and randomly rated, and then played thousands of times to get a better description of the data.
From my understanding, most users have at least a thousand songs, and from my statistics classes in college, it is quite evident that you need a much larger sample size to actually represent the real population.
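On the sample-size question, it may help to separate the number of tracks from the number of plays that were counted. Under the usual normal approximation for a proportion, the width of a 95% confidence interval depends on the number of plays, not on the number of tracks. The sketch below assumes that approximation; whether the article's intervals were computed this way is an assumption, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # two-sided 95% critical value of the standard normal


def half_width(p, n_plays):
    """Half-width of a 95% interval for an observed share p over n_plays."""
    return Z_95 * math.sqrt(p * (1.0 - p) / n_plays)


def plays_needed(p, target_half_width):
    """Smallest number of plays giving at most the target half-width."""
    return math.ceil(Z_95 ** 2 * p * (1.0 - p) / target_half_width ** 2)


if __name__ == "__main__":
    # How the interval for a track played about 27% of the time narrows.
    for n in (100, 1_000, 10_000, 100_000):
        print(f"{n:>7} plays: +- {half_width(0.27, n):.4f}")
    # Plays implied by a half-width of 0.0038 at a share of 0.27.
    print(plays_needed(0.27, 0.0038))
```

Read this way, a half-width of .0038 on a share of .270 corresponds to a count of plays on the order of 50,000, so six tracks can still give tight estimates. What six tracks cannot show is whether title, artist or play count also influence the selection, which is the separate concern raised above.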
Well... today iTunes can help you out, my friend. There is a new iTunes in town, and it will help you lose that rating system of yours, which I feel bad about. I mean, I've gone through my whole library naming every song to perfection, and I know that was a pain. But the ratings too? Damn, man, I feel for you. Anyway, I hope you enjoy the new iTunes!
peace.
J3
It's possible, on a Mac at least, to have ratings between 0 and 100; where 0 corresponds to no star, 20 to one star, and so on up to five stars.
What I'd like to see is this experiment repeated with 101 tracks, each with a different rating, just to see if this is taken into account by iTunes...
Any takers?
I just put my iPod on shuffle on a playlist I made. I might just believe in the same voodoo he was trying to put to death in the article, but I think the selection algorithm might have certain extra criteria. If it doesn't, then at the very least it's possible that a program to create playlists could.
With the 18 song playlist on shuffle, the songs seemed to follow a few trends:
- tracks got slower, and then faster (a decrease, and then an increase in BPM)
- the only instrumental track on the list ("Bean-E-Man" by DJ Logic), and a track with sparse lyrics that are mixed in almost in the background ("Pulk/Pull Revolving Doors" by Radiohead) occurred next to each other, around number 12 out of 18 for the track.
- in both instances where an artist appeared twice on the playlist, the two songs were played successively. In both instances, both songs came from the same album. These instances were "Beautiful" and "Batman and Robin" from <i>Paid tha Cost to be Da Bo$$</i> by Snoop Dogg, and "Award Tour" and "Electric Relaxation" from <i>Midnight Marauders</i> by A Tribe Called Quest.
This might not just be a coincidence: the BPM for each of these albums remains somewhat constant, and the tones/frequencies recorded are also consistent; Snoop Dogg's voice is very distinctive, as are those of Q-Tip and Phife Dawg. The title track from Midnight Marauders even says the entire album is "Bass Heavy".
Keep in mind as well that each and every studio-recorded mp3 file was mixed in stereo, with each instrumental and vocal track given a unique distribution between right and left, most likely using digital equipment. If a computer can put something like this together, then it most likely can take something like this apart.
Though computers are not capable of things like "mood" or "preference", they can be and have been used to recognize things like audible frequency and beats per minute. I'm not sure whether iTunes or the software on the iPod uses these things to compute the optimal order for songs to occur in. I do know, however, that it's possible.
I'll say this, too: I liked my iPod's order for the songs more than mine.
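One way to judge the back-to-back observation is to ask how often it would happen by chance. The sketch below assumes a uniformly random shuffle of 18 tracks in which two artists each have exactly two songs; it says nothing about what the iPod actually does, and the function names are made up for the example. It computes the exact chance that each pair ends up adjacent and cross-checks it with a simulation.

```python
import math
import random
from fractions import Fraction


def exact_all_pairs_adjacent(n_tracks, n_pairs):
    """Chance that every one of n_pairs two-song pairs is adjacent.

    Glue each pair into one block: (n_tracks - n_pairs)! orderings of the
    units, times 2 internal orders per pair, out of n_tracks! shuffles.
    """
    favourable = math.factorial(n_tracks - n_pairs) * 2 ** n_pairs
    return Fraction(favourable, math.factorial(n_tracks))


def simulate_all_pairs_adjacent(n_tracks, n_pairs, trials, seed=0):
    """Monte Carlo estimate; tracks 2k and 2k+1 form pair number k."""
    rng = random.Random(seed)
    order = list(range(n_tracks))
    hits = 0
    for _ in range(trials):
        rng.shuffle(order)
        position = {track: i for i, track in enumerate(order)}
        if all(abs(position[2 * k] - position[2 * k + 1]) == 1
               for k in range(n_pairs)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(float(exact_all_pairs_adjacent(18, 1)))  # one given pair adjacent
    print(float(exact_all_pairs_adjacent(18, 2)))  # both pairs adjacent
    print(simulate_all_pairs_adjacent(18, 2, trials=100_000))
```

Under these assumptions, a single given pair lands side by side in 1 shuffle out of 9, and both pairs do so in 2 shuffles out of 153, a little over 1%. That is uncommon but not remarkable for one listen, particularly since the pattern was noticed after the fact.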





did you do this? by bradsmith
Brian, is this an article that you posted, or did you run this experiment yourself? I heard a while back about somebody determining that the algorithm Apple used was indeed random by design. But man, if you ran this experiment, kudos. You've got some time on your hands up in Dallas, bro!