Batch Downloading of Lyrics for MP3


I recently wanted to try some synchronized lyric plugins for music playback. Unlike MP3/Ogg files, synchronized lyrics are very hard to find. I figured my best bet would be to download plain-text lyrics and sync them myself. It’s not a hard process; it just takes a few minutes.

Actually downloading the lyrics, though, can be slow. Many sites provide user-contributed lyrics, but they’re not set up for easy downloading. Before starting to write my own Perl parser, I searched CPAN and came up with David Precious’ outstanding Lyrics::Fetcher package. If you install his Bundle::Lyrics::Fetcher, you get the main fetcher as well as the LyricWiki, AZLyrics, and AstraWeb modules.

Once I had this installed, the script to do this for all my music was quite simple. The only configuration needed is the first variable, @dirs: list every directory where you keep music. Note that this script currently only works for MP3 files; it could easily be extended to read tags from other formats, however.
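To make the configuration and the "do we already have lyrics?" check concrete, here is a minimal sketch of how the sidecar lyric paths can be derived from an MP3 path. The helper name lyric_candidates, the second directory, and the case-insensitive suffix match are my own assumptions for illustration; the script below only strips a lowercase ".mp3".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper (not part of the script below): given the path to an
# MP3, return the .lrc and .txt paths that would sit next to it.
sub lyric_candidates {
    my ($mp3_path) = @_;

    # Assumption: match the extension case-insensitively so "Track.MP3"
    # is handled the same way as "Track.mp3".
    my ( $file, $dir, $suffix ) = fileparse( $mp3_path, qr/\.mp3/i );
    return ( $dir . $file . ".lrc", $dir . $file . ".txt" );
}

# Example configuration with more than one (made-up) music directory
my @dirs = ( "/Music/Songs", "/Music/Albums" );

my ( $lrc, $txt ) = lyric_candidates("$dirs[0]/Artist - Title.mp3");
print "$lrc\n$txt\n";
```

If either of the two returned paths already exists, there is nothing to fetch for that song.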

Source Code Download

Source Code:

#!/usr/bin/perl -w
# use modules
use MP3::Tag;
use File::Find;
use Lyrics::Fetcher;
use File::Basename;
##############################
# User Defined Parameters    #
##############################
# set up array of directories to search for songs
# Add new directories by adding them quoted, with commas
@dirs = ("/Music/Songs");
# Read the list of the filenames already processed
# (the file will not exist on the first run, so only read it if it opens)
if ( open(IN, "</tmp/lrc.txt") ) {
  while (<IN>) {
    chomp;
    $index{$_} = 1;
  }
  close(IN);
}
# Open the file to store the files processed
open(INDEX, ">>/tmp/lrc.txt" ) or die "Cannot open /tmp/lrc.txt: $!\n";
# look for files in each directory
find( \&processMP3, @dirs );
#Close and quit
close(INDEX);
# this function is called every time a file is found
sub processMP3 {
    # if the file has an MP3 extension
    if (/\.mp3$/) {
        $mp3 = MP3::Tag->new($_);
        # Skip files that were already processed
        if (exists($index{$_}) )
        {
                print "Skipping $_ since it already was processed\n";
                return;
        }
        # Skip files for which we have lyrics already
        $filename = $File::Find::name;
        ($file,$dir,$suffix) = fileparse($filename, ".mp3");
        $lyricname = $dir . $file . ".lrc";
        if (-e $lyricname)
        {
                print "Skipping $_ since we already have lyrics\n";
                return;
        }
        $lyricname = $dir . $file . ".txt";
        if (-e $lyricname)
        {
                print "Skipping $_ since we already have lyrics\n";
                return;
        }
        # Extract ID3 Information
        # autoinfo() draws on whatever tags (and filename) it can find;
        # its values are the fallback if there is no ID3v2 tag
        ( $song, $track, $artist, $album ) = $mp3->autoinfo();
        $mp3->get_tags();
        if (exists $mp3->{ID3v2})
        {
                $id3v2 = $mp3->{ID3v2};
                $song = $id3v2->title();
                $artist = $id3v2->artist();
        }
        # Clean up the names, somewhat
        $artist =~ s/([\[\]\`"])//g;
        $artist = nice_string($artist);
        $song   = nice_string($song);
        $artist =~ s/(.)/(ord($1) > 127) ? "" : $1/egs;
        $artist =~ s/\ \/\ //g;
        $song   =~ s/(.)/(ord($1) > 127) ? "" : $1/egs;
        $song   =~ s/\ \/\ //g;
        print "Processing $song by $artist\n";
        # Fetch everything, escaping characters in the title
        $lyrics = Lyrics::Fetcher->fetch($artist,quotemeta($song));
        if ( $lyrics )
        {
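                # Write the lyrics next to the MP3; report a failed open
                if ( open( OUTFILE, ">", $lyricname ) )
                {
                        print OUTFILE $lyrics;
                        close(OUTFILE);
                }
                else
                {
                        print "\t*** Could not write $lyricname: $!\n";
                }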
        }
        else
        {
                print "\t*** No lyrics found ($Lyrics::Fetcher::Error)!\n";
        }
        # Add an entry to the list of processed files
        print INDEX "$_\n";
        # clean up
        $mp3->close();
    }
}
# Turn wide and control characters into printable escape sequences
sub nice_string {
    join(
        "",
        map {
            $_ > 255
              ? sprintf( "\\x{%04X}", $_ )    # wide character: \x{...}
              : chr($_) =~ /[[:cntrl:]]/
              ? sprintf( "\\x%02X", $_ )      # control character: \x..
              : chr($_)                       # everything else as itself
        } unpack( "U*", $_[0] )               # unpack Unicode characters
    );
}
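
The script keeps a plain-text index of processed files so that a second run does not hit the lyrics sites again for the same songs. Here is a minimal, standalone sketch of that resume logic. The helper names load_index and mark_processed are hypothetical, and keying the index by full path is my assumption; the script above records only the bare filename in $_, so two identically named files in different directories would share an entry.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: load previously processed names into a hash.
# A missing index file just means nothing has been processed yet.
sub load_index {
    my ($path) = @_;
    my %seen;
    if ( open( my $in, "<", $path ) ) {
        while ( my $line = <$in> ) {
            chomp $line;
            $seen{$line} = 1;
        }
        close($in);
    }
    return \%seen;
}

# Hypothetical helper: append one processed name to the index.
sub mark_processed {
    my ( $path, $name ) = @_;
    open( my $out, ">>", $path ) or die "Cannot append to $path: $!\n";
    print {$out} "$name\n";
    close($out);
}

# Demo against a throwaway index file rather than /tmp/lrc.txt
my $index_path = tempdir( CLEANUP => 1 ) . "/lrc.txt";
mark_processed( $index_path, "/Music/Songs/a.mp3" );

my $seen = load_index($index_path);
for my $song ( "/Music/Songs/a.mp3", "/Music/Songs/b.mp3" ) {
    print exists $seen->{$song} ? "skip $song\n" : "fetch $song\n";
}
```

Inside a File::Find callback, the full path would come from $File::Find::name.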
