I have some data describing some experiments and I would like to produce some reports of the data describing various aspects. Lets see how easy it is to do this using Rails. I'm not familar with Rails so I'll be putting in links to tutorials and documentation that I've found along the way.
I have a single database table that is already in SQL format (it was exported from MySQL). So the first step is to insert this data into a postgres database.
> sudo -s > su - postgres > createuser rcole Shall the new user be allowed to create databases? (y/n) y Shall the new user be allowed to create more new users? (y/n) y > exit > exit > createdb jobs CREATE DATABASE
The next step is to create the database. Since I have an SQL dump sent to me by Robert persumably I can just insert it. (Note that MySQL and Postgres have different syntax for auto-incrementing unique id's on table entries).
> psql jobs rcole -f debian.sql psql:debian.sql:19: ERROR: syntax error at or near "EXISTS" at character 15 ...
Hmm, that didn't work, there were lots more errors, but lets look at the first one.
DROP TABLE IF EXISTS `jobs`;
Hmm, it would appear that this MySQL instruction isn't supported by Postgres. Let's see what Google has to say. Google says you can't conditionally drop tables in Postgres, you have to muck around with stopping errors and such. Not too nice.
Well we don't have the table so I'll just comment out the line.
> psql jobs rcole -f debian.sql psql:debian.sql:31: ERROR: syntax error at or near "`" at character 14 ...
This is generated from the following statement.
CREATE TABLE `jobs` ( `id` int(10) unsigned NOT NULL auto_increment, `job_type` varchar(10) NOT NULL default '', `job` varchar(255) NOT NULL default '', `insert_date` timestamp NOT NULL default '0000-00-00 00:00:00', `start_date` timestamp default NULL, `finish_date` timestamp default NULL, `state` varchar(10) default NULL, `machine` varchar(20) default NULL, `batch_date` date NOT NULL default '0000-00-00', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Well the engine stuff can go. And the 'auto_increment' attribute needs to become 'serial', but not in Postgres 7.3 where it must become 'default nextval('jobs_serial')'. We move the primary key statement into the definition for job types. It also seems that those backticks are no good. Time fields are also different in Postgres: 'timestamp' becomes 'timestamp without timezone'. The zero time also got change to allballs. It is also apparent after a while that 0000-00-00 is not a valid date and so postgres rejects it saying out of range. Instead one must convert 0000-00-00 to 000-01-01 or perhaps null, probably better is to use now.
DROP TABLE jobs; SET datestyle TO iso; CREATE TABLE jobs ( id integer default nextval('jobs_serial') not null primary key, job_type varchar(10) NOT NULL default '', job varchar(255) NOT NULL default '', insert_date timestamp without time zone NOT NULL default now(), start_date timestamp without time zone default NULL, finish_date timestamp without time zone default NULL, state varchar(10) default NULL, machine varchar(20) default NULL, batch_date date NOT NULL default now() ) ;
The next error involved locking the table jobs. In postgres presumably we don't need to lock the table. I'm not sure what operations are considered atomic, but I guess until a commit is performed the operation is atomic, but is it single access? Probably not.
The next error involved a comma between the values being inserted into the table. It seems that postgres only allows one item to be inserted at a time into the table unless the data is being read from an external file. So I used emacs and a macro to split the insert statement up into multiple statements.
Here I'm following the process describing in the http://www.onlamp.com/pub/a/onlamp/2005/01/20/rails.html?page=2 which goes through a very basic Rails Application.
Rails is basically a framework that seperates out different aspects of a web-application. Data is stored in a backend database. Views of the data are constructed and method for modifying the data are also constructed. The view part has two orthogonal dimensions: the visual style of the application; the presentation of tables; the collection of information on pages.
Well we should expect these orthogonal facets to unfold as we use Rails to construct an application. The first step is to create an empty application.
rails jobs
Now we test that it worked. We run the server and then visit And then visit: with our browser.
cd jobs ruby script/server
So now we create a controller. A controller will essentially give us a space in which we can construct pages. The following also sets up a my_test directory in views indicating that views are organised by controllers, i.e. the controller is functionally dependent on the view.
ruby script/generate controller MyTest emacs app/controllers/my_test_controller.rb &
So we can add the following method to the controller that shows that each method of the controller responds to suitable URLS.
class MyTestController < ApplicationController def index render_text "Hello World" end end
Now to make a connection to the database we need to contruct a model with Rails
ruby script/generate model debian
Our database was called jobs. But I got confused here and called it debian on the rails side. It doesn't matter that the name doesn't match because the name gets specified in a configuration file a little bit later. Anyway now we have to generate a controller that we will connect to the debian model.
ruby script/generate controller debian emacs app/controllers/debian_controller.rb &
Now we add scaffolding to the debian controller. This will essentially add methods for new, old, etc. I expect that this is essentially like using a mixin, it imports methods and perhaps also sets up a variable.
class DebianController < ApplicationController scaffold :jobs end
Now testing the new debian controller we get an error.
Errno::ENOENT in Debian#index No such file or directory - /var/run/mysqld/mysqld.sock script/server:51
Ok, so we need to modify the model to make it refer to a postgres database. The relevent file seems to be: config/database.yml
development: adapter: mysql database: rails_development host: localhost username: root password: # Warning: The database defined as 'test' will be erased and # re-generated from your development database when you run 'rake'. # Do not set this db to the same as development or production. test: adapter: mysql database: rails_test host: localhost username: root password: production: adapter: mysql database: rails_production host: localhost username: root password:
We need to change this so that it refers to postgres.
development: adapter: postgresql database: jobs host: localhost username: rcole password: # Warning: The database defined as 'test' will be erased and # re-generated from your development database when you run 'rake'. # Do not set this db to the same as development or production. test: adapter: postgresql database: jobs_test host: localhost username: rcole password: production: adapter: postgresql database: jobs_production host: localhost username: rcole password:
A change to the database means that the server must be restarted. Not that the other modifications to the ruby scripts are picked up automatically.
ruby script/server
Now trying to load the page reports an access problem.
PGError in Debian#index FATAL: IDENT authentication failed for user "rcole" script/server:51
Hmm, there's a problem here. I don't want to put my password into the rails application. So I need to create a new user, give them a password and give them LOCAL access to the database.
> psql jobs rcole jobs=# create user rails CREATE USER jobs=# grant all on jobs to rails ; GRANT alter user rails password 'xxxxxxxx'; ALTER USER
Obviously I didn't really set the password to 'xxxx', but rather something else ;) Now I have to go back and put rails as the user of the database. Now don't forget when the database info is updated the server has to be reset.
Now the postgres database has to be updated to accept password authentication for the rails user from localhost. The relevent file to edit is: file:///etc/postgresql/7.4/main/pg_hba.conf
local all postgres ident sameuser # # All other connections by UNIX sockets local all rails md5 local all all ident sameuser # # All IPv4 connections from localhost host all rails 127.0.0.1 255.255.255.255 md5 host all all 127.0.0.1 255.255.255.255 ident sameuser
Now the scafolding doesn't work. Presumably the scaffold must conform the name to the table and also the name of the model. Rather than bothering with the scaffolding lets instead start trying to construct some usefull methods.
Jumping forward in the model we see that it continues to use the scaffolding to fetch the relevent view. So it looks like we need go back and construct a new model.
script/destroy model debian script/generate model job script/destroy controller debian script/generate controller job emacs app/controllers/job_controller.rb &
We reinsert the scaffolding by editing: app/controllers/debian_controller.rb
class JobController < ApplicationController scaffold :job end
Ok, so now the scaffolding works, in so far as the table is being displayed quite competently to the user. The table could be made much more readable by using a style. What are the options?
Well the tutorial suggests that we construct our own view. Since rails automatically did such a good job from the schema data this seems a bit of pity, however lets run with it and see where it gets us.
First we add a method to the controller to app/controllers/job_controller.rb.
class JobController < ApplicationController scaffold :job def list @jobs = Job.find_all end end
Incidently it would seem that this is going to fetch the entire table into memory. It isn't clear how lazy the application framework is. We now need to construct a view: app/views/job/list.rhtml
<html> <head> <title>List of All Jobs</title> </head> <body> <h1>Jobs Used to Compute MinUnsat and Other Things</h1> <table border="1"> <tr> <td width="20%"><p align="center"><i><b>Id</b></i></td> <td width="20%"><p align="center"><i><b>Type</b></i></td> </tr> <% @jobs.each do |job| %> <tr> <td><%= link_to job.id, :action => "show", :id => job.id %></td> <td><%= job.type %></td> </tr> <% end %> </table> <p><%= link_to "Create new job", :action => "new" %></p> </body> </html>
Ok, so that made something pretty limited and ugly. It displayed all of the jobs. The tutorial adds a bit of functionality that was already provided by the frameworkd that isn't very relevent to what I want to do so I'm skipping it.
The next thing of interest is the use of layouts. Hopefully these will allow us to specify headers and footers across the application and to insert reference to a style sheet.
First thing is to edit the job controller again to add a layout: app/controllers/job_controller.rb.
class JobController < ApplicationController layout "standard-layout" scaffold :job def list @jobs = Job.find_all end end
The use of a string instead of a symbol is curious here. What is that all about? Maybe it will become clear. So we now need to add some custom layout: app/views/layouts/standard-layout.rhtml
<html> <head> <title>Debian Jobs</title> </head> <body> <h1>Debian Jobs</h1> <%= @content_for_layout %> <p> <%= link_to "Create new job", :controller => "job", :action => "new" %> <%= link_to "Show all jobs", :controller => "job", :action => "list" %> </p> </body> </html>
Ok so now the layout is fighting the content. The both have header and footer information. The API for active records can be found here: It is possible to use the method find(:all, :offset = 0, :limit => 10). So we modify app/controllers/job_controller.rb.
class JobController < ApplicationController layout "standard-layout" scaffold :job def list @jobs = Job.find(:all, :offset => 0, :limit => 10) end end
So now we want to display a histogram using rails. There is a good SVG library for constructing histograms. For the later purpose of including these files in a latex document vector graphics is good, but SVG may be a poor choise because of the lack of conversion to PDF.
After looking around it seems that it is quite difficult to convert SVG to PDF. In fact it is quite difficult to construct embedded PDF. The easiest thing to do is to use the svg_graph library to produce png for inclusion in latex documents since few tools are capable of transforming SVG into either eps or pdf without rasterising. There was a program in Cairo, but of course Cairo is broken, and the program isn't supported and doesn't compile any more.