Thursday, August 12, 2004 - Posts

Playing with playlists: Using XSLT and XPath with the iTunes library

A recent survey on The Register showed that Rush was the average developer’s favorite artist. I don’t consider myself an “average developer”, at least not when it comes to musical taste.
I use the Apple iTunes media player, which has a nifty “hidden feature”: it stores the library as an XML file.
The XML document has a rather odd structure. It is not your typical XML tree but more of a serialized dictionary object. This means that the elements in the document are limited to keys and values. These key-value pairs are composed of sibling elements, where the first element is the key and the next is the value.
I guess you can’t blame Apple for not using a more intuitive structure; after all, the document is a representation of an in-memory data structure.

<plist version="1.0">
       <dict>
             <key>Major Version</key>
             <integer>1</integer>
             <key>Minor Version</key>
             <integer>1</integer>
             <key>Application Version</key>
             <string>4.5</string>
             <key>Music Folder</key>
             <string>file://localhost/C:/Documents%20and%20Settings/anders.noras/My%20Documents/My%20Music/iTunes/iTunes%20Music/</string>
             <key>Library Persistent ID</key>
             <string>E4F5BED92B419441</string>
             <key>Tracks</key>
             <dict>
                    <key>35</key>
                    <dict>
                           <key>Track ID</key>
                           <integer>35</integer>
                           <key>Name</key>
                           <string>The Step</string>
                           <key>Artist</key>
                           <string>!!!</string>
                           <key>Album</key>
                           <string>!!!</string>
                           <key>Kind</key>
                           <string>MPEG audio file</string>
                           <key>Size</key>
                           <integer>8857518</integer>
                           <key>Total Time</key>
                           <integer>369005</integer>
                           <key>Track Number</key>
                           <integer>1</integer>
                           <key>Date Modified</key>
                           <date>2004-03-08T12:46:09Z</date>
                           <key>Date Added</key>
                           <date>2004-05-01T13:57:18Z</date>
                           <key>Bit Rate</key>
                           <integer>192</integer>
                           <key>Sample Rate</key>
                           <integer>44100</integer>
                           <key>Location</key>
                           <string>file://localhost/C:/Documents%20and%20Settings/anders.noras/My%20Documents/My%20Music/Complete/!!!/!!!%20-%20!!!/01-The%20Step.mp3/</string>
                           <key>File Folder Count</key>
                           <integer>-1</integer>
                           <key>Library Folder Count</key>
                           <integer>-1</integer>
                    </dict>
             </dict>
       </dict>
</plist>
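
The sibling-pair layout can also be read directly from a general-purpose language. The following is a minimal Python sketch, not something I used for this post, that pairs each key with the element following it using only the standard library. The helper name dict_to_pairs and the trimmed sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the same shape as the iTunes library (illustrative only).
SAMPLE = """<plist version="1.0">
<dict>
    <key>Major Version</key><integer>1</integer>
    <key>Tracks</key>
    <dict>
        <key>35</key>
        <dict>
            <key>Track ID</key><integer>35</integer>
            <key>Name</key><string>The Step</string>
            <key>Artist</key><string>!!!</string>
        </dict>
    </dict>
</dict>
</plist>"""


def dict_to_pairs(dict_elem):
    """Turn a plist <dict> element into a Python dict (hypothetical helper).

    Children alternate key, value, key, value, so they are walked two at a time.
    Nested <dict> values are converted recursively; all other values stay as text.
    """
    children = list(dict_elem)
    result = {}
    for key_elem, value_elem in zip(children[0::2], children[1::2]):
        if value_elem.tag == "dict":
            result[key_elem.text] = dict_to_pairs(value_elem)
        else:
            result[key_elem.text] = value_elem.text
    return result


library = dict_to_pairs(ET.fromstring(SAMPLE).find("dict"))
track = library["Tracks"]["35"]
print(track["Artist"], "-", track["Name"])
```

The sketch keeps every value as a string; Python's plistlib module can read the same format with proper types if that is what you need.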

As you can see, the structure is not the easiest to extract data from by using XPath expressions. However, since the document follows a strict key-value pair structure it is a simple task to write an XSLT script to make it easier to do queries against the node tree.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:template match="/">
        <tracks>
            <xsl:apply-templates select="plist/dict/dict/dict"/>
        </tracks>
    </xsl:template>
    <xsl:template match="dict">
        <track>
            <xsl:apply-templates select="key"/>
        </track>
    </xsl:template>
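    <xsl:template match="key">
        <!-- Name the element after the key (spaces become underscores) and
             take its value from the element that directly follows the key. -->
        <xsl:element name="{translate(text(), ' ', '_')}">
            <xsl:value-of select="following-sibling::*[1]"/>
        </xsl:element>
    </xsl:template>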
</xsl:stylesheet>

Using the above XSLT script you can transform the document into the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<tracks>
       <track>
             <Track_ID>35</Track_ID>
             <Name>The Step</Name>
             <Artist>!!!</Artist>
             <Album>!!!</Album>
             <Kind>MPEG audio file</Kind>
             <Size>8857518</Size>
             <Total_Time>369005</Total_Time>
             <Track_Number>1</Track_Number>
             <Date_Modified>2004-03-08T12:46:09Z</Date_Modified>
             <Date_Added>2004-05-01T13:57:18Z</Date_Added>
             <Bit_Rate>192</Bit_Rate>
             <Sample_Rate>44100</Sample_Rate>
             <Location>file://localhost/C:/Documents%20and%20Settings/anders.noras/My%20Documents/My%20Music/Complete/!!!/!!!%20-%20!!!/01-The%20Step.mp3/</Location>
             <File_Folder_Count>-1</File_Folder_Count>
             <Library_Folder_Count>-1</Library_Folder_Count>
       </track>
</tracks>
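
If you don't have an XSLT processor at hand, the same reshaping can be approximated in Python. This is a rough sketch under the assumption that every track sits at plist/dict/dict/dict, as in the sample above. The function name flatten_tracks and the shortened sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the shape of the iTunes library (illustrative only).
SAMPLE = """<plist version="1.0"><dict>
  <key>Tracks</key>
  <dict>
    <key>35</key>
    <dict>
      <key>Track ID</key><integer>35</integer>
      <key>Name</key><string>The Step</string>
      <key>Total Time</key><integer>369005</integer>
    </dict>
  </dict>
</dict></plist>"""


def flatten_tracks(plist_root):
    """Build <tracks><track>...</track></tracks> from a plist tree (hypothetical helper)."""
    tracks = ET.Element("tracks")
    # Same path the stylesheet selects, relative to the <plist> root element.
    for track_dict in plist_root.findall("dict/dict/dict"):
        track = ET.SubElement(tracks, "track")
        children = list(track_dict)
        for key_elem, value_elem in zip(children[0::2], children[1::2]):
            # Element names cannot contain spaces, so swap them for underscores.
            field = ET.SubElement(track, key_elem.text.replace(" ", "_"))
            field.text = value_elem.text
    return tracks


tracks = flatten_tracks(ET.fromstring(SAMPLE))
print(ET.tostring(tracks, encoding="unicode"))
```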

This is a vanilla XML document. With the “new and improved” iTunes library document you can run all sorts of XPath queries.
I have extracted the songs I’ve listened to the most from the start of August until today, and as the following “chart” shows, I’m not your average Rush fan.

Artist                                      Song title
Tubby Hayes & The Paul Gonsalves All Stars  Don't Fall Off The Bridge
Seiji                                       Untitled
Troublemakers                               Le Bocal
Incognito                                   The 25th Chapter
Cowboy Junkies                              One Soul Now
Whirlwind                                   Full time thing (between dusk and dawn)
Magic Number                                That day fest. Rachel Foster
Radar One                                   Don't run away
Landspeed                                   Got To Get Your Love (Special Kenny Dope Gonzales Edit)
Anthony Hamilton                            I Tried
The Divine Comedy                           My Imaginary Friend
The Sound Providers                         Braggin’ Boasting Ft. Little Brother
Bettye Swann                                The Dance Is Over
Calexico                                    Si Tu Disais
Dublex Inc                                  Mentiras
General Levy                                Mad Them
Talib Kweli                                 Get By (Blackbeard Vocal)
Vanessa Freeman                             Lifeline
Edson Frederico                             Bobeira
Anthony Hamilton                            Better Days
Jackson 5                                   Ben (Hiroshi Fujiwara & K.U.D.O.'s HF Remix #2)
Usher                                       Can You Help Me (Bootleg Vocal House Mix)
Johnny Copeland                             Sufferin' City
Doris Monteiro                              A peteca
Peven Everett                               See Saw
Fertile Ground                              Homage (Yesterday)
South                                       Mend These Trends
Louie Vega                                  Brand New Day featuring Blaze
The Roots                                   Duck Down
Nancy Wilson                                A Lot Of Livin' To Do
Brandy                                      Afrodisiac
Quintessence                                1st Impressions
Wande De Sah                                Só Danço Samba
Nina Simone                                 O-O-Child
Vanessa Freeman                             Cover Me
Blackalicious                               Rock The Spot
Chiara Mastroianni & Benjamin Biolay        L'Arizona
Gilberto Gil                                Queremos guerra
Chiara Mastroianni & Benjamin Biolay        Tete à Claques
Hot Chip                                    The Beach Party
I Beats & L'armonica Di Franco Di Gemini    Ciao Dal Mur
Bruno E                                     Dado
George Soul                                 Get Involved

I used Visual XPath to run a variety of expressions on the library.

To extract all songs played more than twice, you can use this query:
tracks/track[Play_Count > 2]

To extract all songs played on a particular date, you can use a query like this:
tracks/track[starts-with(Play_Date_UTC,'2004-08-12')]
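
The two queries can also be mimicked without an XPath tool. Python's built-in ElementTree only supports a small XPath subset, without numeric comparisons or starts-with(), so this sketch does the filtering in plain Python instead. The sample tracks, play counts, and dates are made up, the function names are hypothetical, and the code assumes that a track without a Play_Count element has never been played.

```python
import xml.etree.ElementTree as ET

# Made-up data in the flattened <tracks>/<track> shape (illustrative only).
SAMPLE = """<tracks>
  <track><Name>Song A</Name><Play_Count>5</Play_Count><Play_Date_UTC>2004-08-12T07:15:00Z</Play_Date_UTC></track>
  <track><Name>Song B</Name><Play_Count>3</Play_Count><Play_Date_UTC>2004-08-10T21:30:00Z</Play_Date_UTC></track>
  <track><Name>Song C</Name></track>
</tracks>"""

root = ET.fromstring(SAMPLE)


def played_more_than(tracks_root, minimum):
    """Names of tracks whose Play_Count exceeds minimum (missing count treated as 0)."""
    return [
        t.findtext("Name")
        for t in tracks_root.findall("track")
        if int(t.findtext("Play_Count", default="0")) > minimum
    ]


def played_on(tracks_root, date_prefix):
    """Names of tracks whose Play_Date_UTC starts with date_prefix, e.g. '2004-08-12'."""
    return [
        t.findtext("Name")
        for t in tracks_root.findall("track")
        if t.findtext("Play_Date_UTC", default="").startswith(date_prefix)
    ]


print(played_more_than(root, 2))
print(played_on(root, "2004-08-12"))
```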

Watching webcasts in the middle of the night

I have a 3½-week-old daughter who really enjoys keeping her father up at night. The only place she is calm is on my shoulder. As a result, I’m often sitting on the living room couch for an hour or so in the middle of the night. As you all know, late-night television is not particularly interesting, so I try to watch some recorded presentations on the web instead.

Lately I’ve been checking out the TechEd 2004 webcasts and Clemens Vasters’ excellent session on message-oriented distributed systems from the Finnish EMEA Architect Forum.

If you have some spare time, these presentations are well worth looking at.