Tuesday, January 13, 2015

Create your own movie database easily with SmartMovieDB!

You can directly jump to the git repository that hosts this project :)


There's a pretty nice source over the internet, regarding the creation of your own movie database.
You can read in detail its usage and functionality over the fore-mentioned link, but, generally, the following schematic explains how it works:

That is, you give as an input a text file that contains the desired movies' names and with the help of the perl scrapper, you get an output of an sql format with all the sql commands that need to be done to your database.

This means of course some extra effort, so why don't we go for an automatic process? What about an sql script runner, which takes as an input an SQL file, connects to the specified database and runs the generated from the scrapper script, queries?  It would be nice if we could stick with Perl, but I didn't had the luxury to waste a lot of time on Perl-MySQL connection, so I chose the easy way, which is Java and fortunately, there is an easy way to do it, again with the support of an open source project (we 'll here need only one part of it, as you noticed, too).

I'm choosing to start with the Java project, which is responsible for reading an sql file and executing the included queries to a given database connection:


  • DatabaseConnection.java defines a valid database connection.
  • ScriptRunner.java is responsible for the convertion of sql statements into databse queries.
  • MainClass.java coordinates the game, as the script runner's instance needs a connection argument, which is actually passed by calling the getConnection() static method of DatabaseConnection.java
And that's is! We 're good to go with the script runner's part.

What is left now, is to find a way to connect the script runner (that takes an sql file and executes the existing queries into a db) with the perl scrapper.

According to the scrapper's part, I'd firstly like to introduce a small parenthesis: I support open source software, so I'm here using the OMDB Api, instead of the iMDB's API, so, together with some updates that had to be done to the scrapper script, here is the updated version of getMovieData.pl:

#!/usr/bin/perl -w
use strict; 
use XML::Simple;
use Data::Dumper;

my $xml = new XML::Simple;

die "Please make that you the movie title is provided into quotes!\n" if(!@ARGV);
my $movie = shift;
$movie =~ s/\s/+/g;

my $cmd = "curl http://www.omdbapi.com/?t=$movie&y=&plot=short&r=xml";
my $movieData = `$cmd`;
my $data = $xml->XMLin( $movieData );

my $released = escapeSingleQuote($data->{movie}->{released});
my $rating = escapeSingleQuote($data->{movie}->{imdbRating});
my $director = escapeSingleQuote($data->{movie}->{director});
my $genre = escapeSingleQuote($data->{movie}->{genre});
my $writer = escapeSingleQuote($data->{movie}->{writer});
my $runtime = escapeSingleQuote($data->{movie}->{runtime});
my $plot = escapeSingleQuote($data->{movie}->{plot});
my $imdb = escapeSingleQuote($data->{movie}->{imdbID});
my $title = escapeSingleQuote($data->{movie}->{title});
my $votes = escapeSingleQuote($data->{movie}->{imdbVotes});
my $poster = escapeSingleQuote($data->{movie}->{poster});
my $year = escapeSingleQuote($data->{movie}->{year});
my $rated = escapeSingleQuote($data->{movie}->{rated});
my $actors = escapeSingleQuote($data->{movie}->{actors});

my $tstamp = time();

print "INSERT INTO movie_collection VALUES (NULL , '$title', '$year', ";
print "'$rated', '$released', '$genre', '$director', '$writer', '$actors', '$plot', ";
print "'$poster', '$runtime', '$rating', '$votes', '$imdb', '$tstamp');\n";

sub escapeSingleQuote {
 my $str = shift;
 $str =~ s/\'/\\'/g;
 return $str;
}

Once the sql file creation is done , we 'll make a system call from Perl to run the exported jar file of our Java project. This means that we need to add the following line in the end of the batch.pl script:   
#!/usr/bin/perl
while(<>){ 
 my $cmd = "perl getMovieData.pl \"$_\""; 
 system($cmd);
}
system "java -jar absolute/path/to/the/exported/jar/file.jar"
Finally, everything is fired up from command line, so, keeping in mind the exact locations of the required files (movies list, jar and generates sql's file), execute the following command:
perl batch.pl movielist.txt > sqlInserts.sql

For more details, there's also a git repository that hosts this project :)