Django experiments

I’ve been playing about with Django and Last.fm data, with a view to eventually upgrading my historical charts. It’s fun: I now know that I play the most music in March and November (university deadline time!), that my top three artists of 2010 so far are Animal Collective, Grizzly Bear and The Delgados, and that my favourite three discoveries are Starless & Bible Black, RM Hubbert and Soweto Kinch.

I also discovered that reading a foreign key attribute on an object returned by a Django QuerySet, for example to use it as a dictionary key, makes Django fetch the related row from the database. I had something like the following:

class WeekData(models.Model):
    artist = models.ForeignKey(Artist)
    plays = models.PositiveIntegerField()
    ..


from collections import defaultdict

# Total plays per artist, keyed by the related Artist object
tracking = defaultdict(int)
for week in WeekData.objects.all():
    tracking[week.artist] += week.plays

The dictionary update was taking ages. Confused, I enabled MySQL’s query logging and found 25,000 lines like the following:

SELECT `id`, `name` FROM `muncher_artist` WHERE `id` = 22
SELECT `id`, `name` FROM `muncher_artist` WHERE `id` = 23
SELECT `id`, `name` FROM `muncher_artist` WHERE `id` = 24
SELECT `id`, `name` FROM `muncher_artist` WHERE `id` = 25

At which point I realised that using week.artist as the key makes Django fetch the artist from the database for every single row, meaning 25,000 useless database queries and a really, really slow function. Perhaps I was too hopeful in expecting Django to be clever.

Changing the last line to key on artist_id, the plain integer column that is already loaded with each WeekData row:

tracking[week.artist_id] += week.plays

sped the function up by a factor of ten and lets me produce images like this in reasonable time:

It’s showing which artists occupy which chart positions as the weeks go by. The dark black line is for Fleet Foxes .. seems I went pretty mad for them :o
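
For the curious, the difference between keying on week.artist and week.artist_id can be sketched without Django at all. This is only a toy simulation of the lazy lookup, not Django's actual implementation: the QUERY_LOG list stands in for MySQL's log, the ARTIST_TABLE contents are made up, and the total_by_object and total_by_id helpers are names invented for the illustration.

```python
from collections import defaultdict

QUERY_LOG = []  # stands in for MySQL's query log
ARTIST_TABLE = {22: "Artist A", 23: "Artist B"}  # made-up rows for illustration


class Artist:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    # Like Django models, compare and hash by primary key
    def __eq__(self, other):
        return isinstance(other, Artist) and self.id == other.id

    def __hash__(self):
        return hash(self.id)


class WeekData:
    def __init__(self, artist_id, plays):
        self.artist_id = artist_id  # plain integer, loaded with the row
        self.plays = plays
        self._artist = None

    @property
    def artist(self):
        # First access on each instance costs one simulated query
        if self._artist is None:
            QUERY_LOG.append(
                "SELECT `id`, `name` FROM `muncher_artist` WHERE `id` = %d"
                % self.artist_id
            )
            self._artist = Artist(self.artist_id, ARTIST_TABLE[self.artist_id])
        return self._artist


def total_by_object(weeks):
    """Key on the related object: one lookup per row."""
    tracking = defaultdict(int)
    for week in weeks:
        tracking[week.artist] += week.plays
    return tracking


def total_by_id(weeks):
    """Key on the integer column: no extra lookups."""
    tracking = defaultdict(int)
    for week in weeks:
        tracking[week.artist_id] += week.plays
    return tracking


weeks = [WeekData(22, 5), WeekData(23, 3), WeekData(22, 2)]
by_id = total_by_id(weeks)
queries_for_ids = len(QUERY_LOG)
by_object = total_by_object(weeks)
queries_for_objects = len(QUERY_LOG) - queries_for_ids
print(queries_for_ids, queries_for_objects)
```

Both versions produce the same totals; only the number of simulated queries differs, one per row when keying on the object and none when keying on the id.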