A quick update

I’ve kind of let this blog wither on the vine for the past eighteen months, but I’m still elsewhere on the web: I continue to review theatre for both Musical Theatre Review and The Reviews Hub (the new name for The Public Reviews).

But as those who know me are aware, those are my evening pursuits: these days, my full-time job is as a software developer for a company that monitors online discussions about scholarly and academic works.

I used to mix both code and criticism posts on this blog, but now I have a new location for the former – Scott Codes, a GitHub-hosted blog that currently has a paltry two entries, but will shortly get more.

The blog posts I’m planning for over there will touch on the duality of my working experience, looking at how the worlds of software development and journalism have enough similarities that one discipline can learn from the other. I’ll be kicking that investigation off with a 25-minute session at the next meeting of the London Ruby User Group (LRUG) entitled “Hack like a journalist”:

News reporters are trained in techniques to produce stories that are concise, well structured, easy to follow and with a consistent house style. How can those same techniques help us write better code?

If you’re a Ruby developer, I look forward to seeing you there.

Touch me: cleaning up Rails association code

Yeah, this is one of those posts where I stop wittering on about what I’ve seen at the theatre or in the cinema, and talk about computer code I’ve been writing. Odd mix for a blog, I know: welcome to my world…

In a Rails project I’m currently working on, I have a parent object that has a has_many collection of children. In this case, they’re all time based, so let’s call the parent class Calendar and the child class Event:

class Calendar < ActiveRecord::Base
  has_many :events
end

class Event < ActiveRecord::Base
  belongs_to :calendar
end

(As you may have guessed, these aren’t the real class names. Confidentiality, and all that.)

Many operations on our Calendar object rely on the overall date range between the first and last dates in the list of events. Now we can, if we want to, calculate those on the fly using associations:

class Calendar < ActiveRecord::Base
  has_many :events

  def first_date
    events.order('starts_on ASC').first.starts_on
  end
end

…and so on. Again, we have quite a few of these methods, and they’re actually a bit more complicated than this.
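As a rough illustration of the on-the-fly calculation, here is a minimal plain-Ruby sketch with no database involved. The `Event` struct, the `ends_on` attribute, and the `last_date` and `date_range` methods are assumptions made for the example; only `first_date` and `starts_on` come from the code above. It also shows one detail worth handling: a calendar with no events, where calling `.first.starts_on` on an empty collection would raise.

```ruby
require 'date'

# Hypothetical stand-in for the Event model; ends_on is an assumed attribute.
Event = Struct.new(:starts_on, :ends_on)

# Plain-Ruby stand-in for the Calendar model, holding its events in an array.
class Calendar
  attr_reader :events

  def initialize(events = [])
    @events = events
  end

  # Earliest start date, or nil when there are no events.
  def first_date
    events.map(&:starts_on).min
  end

  # Latest end date, or nil when there are no events.
  def last_date
    events.map(&:ends_on).max
  end

  # Overall range covered by the events, or nil for an empty calendar.
  def date_range
    return nil if events.empty?
    first_date..last_date
  end
end

calendar = Calendar.new([
  Event.new(Date.new(2015, 3, 10), Date.new(2015, 3, 12)),
  Event.new(Date.new(2015, 2, 1), Date.new(2015, 2, 3))
])
calendar.first_date # the 1 February start date
calendar.last_date  # the 12 March end date
```

Every call walks the whole list of events, which is fine in memory but means a query per call when the events live in a database.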



So I’ve been thinking lately that I should scrap this blog. I so rarely use it any more – instead using my [work blog](http://blogs.thestage.co.uk/tvtoday/), [Twitter](http://twitter.com/scottm), Facebook, Flickr and UncleTomCobleyAndAll.com. Most of the traffic is to a few old posts, whether it’s to [the `simply_helpful` plugin](http://matthewman.net/2006/09/04/new-rails-feature-simply_helpful/) whose functions are now built into Rails 2.x, or [long-forgotten memes](http://matthewman.net/2005/05/03/bad-wolf-hunting/).

Maybe I’ll do something new with this place. In time.

For now, on the subject of expiration, I’m linking to [this comment](http://railspikes.com/2008/9/29/an-experiment-with-page-caching#comment-1738) at the base of a post about Rails’ different caching methods, and their expiration techniques.

> One very useful way to avoid the dog pile is to expire behind. Similar to a write behind cache, on checking a ttl and finding the content expired, still serve up the stale content for that request, but also asynchronously start rebuilding the content via a queue/background worker. If you randomize your ttl’s a bit this results in very even system load.

To me, this makes great sense, especially in the context of an application I’m working on at the moment. So I’m linking it here not so much so I can find it again, but in the hope that the concept’ll permeate into my subconscious as I’m coding.
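To help the idea sink in, here is a minimal in-memory sketch of "expire behind" in plain Ruby. It is not taken from the linked comment or from any Rails caching API: the `ExpireBehindCache` class, its `ttl`, `jitter` and `clock` parameters, and the single background worker thread are all assumptions made for illustration. A real application would more likely use a shared cache store and a proper job queue.

```ruby
require 'thread'

# Sketch only: serves stale values while a background worker rebuilds them.
class ExpireBehindCache
  Entry = Struct.new(:value, :expires_at)

  # ttl and jitter are in seconds; clock is injectable so tests can fake time.
  def initialize(ttl:, jitter: 0.0, clock: -> { Time.now.to_f })
    @ttl = ttl
    @jitter = jitter
    @clock = clock
    @entries = {}
    @rebuilding = {}
    @lock = Mutex.new
    @jobs = Queue.new
    # A single worker processes rebuild jobs in the order they were queued.
    @worker = Thread.new { loop { @jobs.pop.call } }
  end

  def fetch(key, &builder)
    entry = @lock.synchronize { @entries[key] }
    # Nothing cached yet, so there is no stale copy to serve: build inline.
    return store(key, builder.call) if entry.nil?

    # Expired: queue a rebuild, but still answer this request with stale data.
    enqueue_rebuild(key, builder) if @clock.call >= entry.expires_at
    entry.value
  end

  # Blocks until every job queued so far has run (handy in tests).
  def drain
    done = Queue.new
    @jobs << -> { done << true }
    done.pop
  end

  private

  def store(key, value)
    # Randomising the ttl spreads expiries out, which evens the load.
    expires_at = @clock.call + @ttl + rand * @jitter
    @lock.synchronize do
      @entries[key] = Entry.new(value, expires_at)
      @rebuilding.delete(key)
    end
    value
  end

  def enqueue_rebuild(key, builder)
    first = @lock.synchronize do
      next false if @rebuilding[key] # a rebuild is already queued
      @rebuilding[key] = true
    end
    return unless first

    job = lambda do
      begin
        store(key, builder.call)
      rescue StandardError
        # Keep serving the stale value and let a later request retry.
        @lock.synchronize { @rebuilding.delete(key) }
      end
    end
    @jobs << job
  end
end

cache = ExpireBehindCache.new(ttl: 300, jitter: 60)
cache.fetch(:front_page) { 'expensive rendered page' }
```

Only the very first request for a key pays the build cost inline. After that, an expired entry triggers at most one queued rebuild, however many requests arrive while it is stale, which is what avoids the dog pile.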

Eager loading objects in a Rails has_many :through association

I’m still working on the Rails application that was the source for [this tutorial](http://www.matthewman.net/articles/2006/01/06/rails-activerecord-goes-through) — which loads of developers who I really respect keep linking to, so I must have done something right there!

Anyway, I’m going to blog this bit of code in the hope that it will help me remember it.

As stated before, I have `Production` and `Venue` objects, which are linked by means of intermediary `Run` objects. Minimal code:

    class Production < ActiveRecord::Base
      has_many :runs
      has_many :venues, :through => :runs
    end

    class Venue < ActiveRecord::Base
      has_many :runs
      has_many :productions, :through => :runs
    end

    class Run < ActiveRecord::Base
      belongs_to :production
      belongs_to :venue
    end

Now, when listing the productions, I need to display a small amount of information about when and where each production is on. The rules that determine what gets displayed vary, and aren’t particularly relevant to the discussion here. What matters to me is that, when listing (say) 20 productions, with run and venue information included, all the data is collected in a single SQL query.

As with standard joins, you can use the `find` method’s `:include` option to include associated objects within the query. But I wasn’t having much luck pulling in both the runs and the venues: the runs would be detected, but iterating through the productions would trigger a fresh SQL query to find the relevant venue information.

Today, I found what I had to do to get it to work. The format for the find needs to be:

    @productions = Production.find(:all, ... , :include => {:runs => :venue})

Lo and behold, all the requisite data is collected in a single query, which substantially cuts down on the page load time on my development box.

I’m not quite sure why the hash works — or, to be honest, how I stumbled upon the solution: I’ve been scouring the internet all day, yet for some reason I can’t find the page in my browser history. All I know is that it’s not an approach I’d have stumbled upon myself, as it’s not necessarily as intuitive as most ActiveRecord code is. However, there is a certain elegance to it, with the hash’s arrow symbolising the relationship from one object type to the next, which should hopefully enable me to remember the construction in future.
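One way to picture why the hash form works: each key names an association on the current model, and its value names associations to load from that association’s own model. The sketch below is plain Ruby, not ActiveRecord’s actual implementation. `association_paths` is a hypothetical helper that expands an `:include`-style value into the chains of associations it describes, and `:venue` is assumed to be the association name as declared on `Run` (`belongs_to :venue`).

```ruby
# Expand a nested include spec into the list of association chains it names.
# Hypothetical helper for illustration; ActiveRecord does not expose this.
def association_paths(spec, prefix = [])
  case spec
  when Symbol
    [prefix + [spec]]
  when Array
    spec.flat_map { |item| association_paths(item, prefix) }
  when Hash
    spec.flat_map do |parent, children|
      parent_path = prefix + [parent]
      # The parent association itself, then everything nested beneath it.
      [parent_path] + association_paths(children, parent_path)
    end
  else
    raise ArgumentError, "unsupported include spec: #{spec.inspect}"
  end
end

association_paths(:runs)               # => [[:runs]]
association_paths({ :runs => :venue }) # => [[:runs], [:runs, :venue]]
```

Read that way, the hash says "load each production’s runs, and for each of those runs load its venue", which is why the nested data can be gathered up front rather than one venue at a time.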

New Rails feature: simply_helpful

UPDATE: Thanks to everybody who’s linking to this blog post, especially the mighty DHH himself. It’s worth pointing out that simply_helpful development is proceeding apace. Some of the bugs mentioned have been fixed; there are additional helpers that aren’t covered here; and, as a plugin that’s very, very new, simply_helpful may well change at short notice.

UPDATE 2: The company I work for is looking for Rails developers who are also comfy maintaining PHP code. View our job ad for more. (Done. Look forward to our first big public Rails app sometime in the summer!)

UPDATE 3: DHH linked to me again — thanks David, and welcome to all you hundreds of new visitors!

A new plugin appeared in the Rails SVN overnight, it seems. Named simply_helpful, the intention seems to be to simplify some of the most common uses of helper functions.

I suspect that these changes have been implemented as a plugin so that they don’t impinge on anybody running Edge Rails until they’re 100% stable and 100% backwards-compatible. So, if you want to explore the changes right now, you’ll have to install the plugin in your project yourself:

script/plugin install simply_helpful

With the new plugin installed, some basic helpers get simplified. For example, in my index.rhtml for my Venues controller, I used to have:

  <tr><th>Name</th> <th>City</th> <th>Postcode</th></tr>
  <%= render :partial => 'venue', :collection => @venues %>

An aside: that’s a greatly simplified version of the table; I’ve cut out a lot of stuff that would get in the way of seeing what simply_helpful is doing.

But, given that Rails can tell that my @venues array is populated with Venue instances, the new helper doesn’t need me to explicitly name the partial I’m using:

  <tr><th>Name</th> <th>City</th> <th>Postcode</th></tr>
  <%= render :partial => @venues %>

So the :collection key becomes redundant (we’re passing an Array, so a collection can be assumed), as does the naming of the partial itself: simply_helpful automatically derives the location from the passed objects’ class. In this case, as it’s a collection of Venue objects, the partial is assumed to be venues/_venue.rhtml.

Along with the collection partials rendering as outlined above, you also get the following (from what my limited Ruby-brain can tell from deciphering the as-yet-undocumented code):

DOM ID attributes

<%= dom_id(object, prefix = nil) %>

Will provide an HTML ID attribute value for an onscreen object based on the object’s class and database ID. For example, dom_id(@person), where @person is a Person instance with a database ID of 123 would return person_123. If the object hasn’t been saved to the database yet, the id will be new_object_name.
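To make the naming rule concrete, here is a small plain-Ruby sketch of the behaviour described above. It is not the plugin’s code: `sketch_dom_id` and `underscore_name` are hypothetical names, `Person` is a stand-in struct rather than an ActiveRecord model, and the optional prefix argument is left out because I haven’t checked how the plugin combines it with the rest of the ID.

```ruby
# Hypothetical stand-in for a model; a nil id means "not saved yet".
Person = Struct.new(:id)

# Turns a class name such as ProductionType into production_type.
def underscore_name(klass)
  klass.name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Sketch of the rule described in the text, without the prefix argument.
def sketch_dom_id(object)
  name = underscore_name(object.class)
  object.id.nil? ? "new_#{name}" : "#{name}_#{object.id}"
end

sketch_dom_id(Person.new(123)) # => "person_123"
sketch_dom_id(Person.new(nil)) # => "new_person"
```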

By tying this feature into enhancements for RJS helpers, it means that you can now directly link DOM objects on the page to the data objects they represent. The following are now identical in meaning:


Pretty nifty.

DOM Class names

<%= dom_class(object) %>

Creates a CSS class name based on the object’s class — so a ProductionType object would get a DOM class of production_type.

On its own, this new function maybe isn’t as exciting as the dom_id and render improvements, but it’s used in the next feature, and it’s encouraging good practice by having clear, readable and consistent class names for your DOM objects.

Form blocks

The new form_for syntax combines with the new Restful Routes, to greatly simplify the form setup:

  • The form’s action points to an update or create action, depending on whether your object has previously been saved.
  • The form will automatically get a DOM id of new_object or edit_object_# (where object is the object’s class name, and # is its database ID), depending on whether it’s a new form or not.
  • The form will also automatically get a class name of either new_object or edit_object (again, where object is your object’s class name).
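The three rules above can be sketched in a few lines of plain Ruby. This is an illustration of the conventions only, not how the plugin is implemented: `sketch_form_attributes` is a hypothetical helper, `Venue` here is a stand-in struct, and the path is built by naively adding an "s" to the class name, where the real thing relies on the Restful Routes.

```ruby
# Hypothetical stand-in for a model; a nil id means "not saved yet".
Venue = Struct.new(:id)

# Derives the form's action, HTTP method, DOM id and class from the object.
def sketch_form_attributes(object)
  name = object.class.name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  collection_path = "/#{name}s" # naive pluralisation, for illustration only

  if object.id.nil?
    # New record: post to the collection to create it.
    { :action => collection_path, :method => 'post',
      :id => "new_#{name}", :class => "new_#{name}" }
  else
    # Existing record: put to its own URL to update it.
    { :action => "#{collection_path}/#{object.id}", :method => 'put',
      :id => "edit_#{name}_#{object.id}", :class => "edit_#{name}" }
  end
end

sketch_form_attributes(Venue.new(123))
# => {:action=>"/venues/123", :method=>"put", :id=>"edit_venue_123", :class=>"edit_venue"}
```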

This means that calls to form_for get massively simplified. For example, this line would now work for both new and existing Venue instances:

<% form_for @venue do |f| %>

For an existing venue object, that would produce:

<form action="http://localhost:3000/venues/123" class="edit_venue" id="edit_venue_123" method="post">
<input name="_method" type="hidden" value="put" />

whereas the same line of code would, for a new venue object, produce:

<form action="http://localhost:3000/venues" class="new_venue" id="new_venue" method="post">
<input name="_method" type="hidden" value="post" />

Unfortunately, as of right this second (revision 5050), it’s a teensy bit broken. There’s an assumption of an :html option hash being present, so I could only get it to work by changing the above line to <% form_for @venue, :html => {} do |f| %> (this is now fixed).

It also doesn’t support custom form builders yet, which I’m using in my project to automatically attach labels to fields, and to visually demarcate required fields from optional ones (I fixed this! Yay me! :-) )

But then, simply_helpful is less than a day old — these bugs will get ironed out pretty quickly. And when they do, Rails will have yet another way in which it encourages you to write simpler, better code, in a way that’s 100% backwards-compatible.

Credit where it’s due: this post by Marcel Molina Jr. helped me figure out my way round the plugin code.

Rails: ActiveRecord goes :through

Sad geek that I am, I’ve fallen a little bit in love with Ruby on Rails, the application development framework that’s ideal for building database-based web applications.

While Rails is relatively new – version 1.0 was released just last month – there are exciting things just waiting round the corner. I’m in the process of taking our existing UK theatre listings database, which currently runs with PHP and MySQL, and porting it over to Rails. In the meantime, I’ll be taking the opportunity to tidy up the database schema a bit. In doing so, I’m going to be able to take advantage of some new mechanisms only available in Edge Rails – the pre-release version of the platform.

With our database, we have two main types of object to keep track of:

* venues (e.g., the West Yorkshire Playhouse or Menier Chocolate Factory)
* productions (e.g. Phantom of the Opera, a touring production)

For productions that tour a number of venues, we construct a run sheet – a list of venues that the production will be appearing in, alongside the start and end dates of each run. Similarly, venues need a list of productions appearing at their space. In other words, both these relationships between Venue and Production objects can be represented by a Run object.

In Rails code, the three objects – and their immediate relationships with one another – are represented thus:

    class Production < ActiveRecord::Base
      has_many :runs
    end

    class Venue < ActiveRecord::Base
      has_many :runs
    end

    class Run < ActiveRecord::Base
      belongs_to :production
      belongs_to :venue
    end

So far, so good. It explains the concept in ideological terms. But what really dragged me down in PHP was managing the answers to questions that a web site visitor would have:

* What’s on at my favourite theatre next month?
* I enjoyed this production. Where will it tour next? Maybe my friends around the country should know about it...

For the real answers to human questions, we need to have a closer relationship between our Venue and Production objects. Now, in PHP – and, to an only slightly lesser extent, in Rails 1.0 – that would mean dropping down to a pure SQL level. In Rails 1.1, the `has_many` method has a couple of new parameters that allow us to effectively bridge our intermediate object, the Run, when necessary.

### Talking it :through

A basic bridging is as simple as adding a new `has_many` call on the two main objects, with a `:through` parameter which specifies the intermediate object. So our code now becomes:

    class Production < ActiveRecord::Base
      has_many :runs
      has_many :venues, :through => :runs # <= new
    end

    class Venue < ActiveRecord::Base
      has_many :runs
      has_many :productions, :through => :runs # <= new
    end

    class Run < ActiveRecord::Base
      belongs_to :production
      belongs_to :venue
    end

(We could also add the usual extra conditions, such as an :order parameter, to determine the order in which items get retrieved; for now, I’ll leave them out to make the code as simple as possible.)

Now, to get an array of all the productions that a venue has on its books, we can just say:

    venue.productions

However, if a production plays more than once at a given venue – which has been known to happen – we’ll get one copy of a Production object for each visit to a venue.

If we just want one object per production, we can build that into the relationship:

    class Venue < ActiveRecord::Base
      ...
      has_many :productions, :through => :runs,
               :select => 'DISTINCT *'
    end

Behind the scenes, Rails sorts out the SQL necessary. Very impressive, when you’re used to hand-rolling it all yourself. _(Since this article was originally written, you can also add a `:uniq => true` option, which filters out duplicates after the records have been retrieved. Which method of eliminating duplicates works best for you will depend on your application.)_

But that’s not the best part.

### Association extensions

Our online database extends nearly two years back, and can only get larger. How can we answer some of the following questions?

* I’m going to be staying in Leeds sometime next Autumn. What productions might I be able to see at the West Yorkshire Playhouse?
* Where can I find the touring production of Saturday Night Fever three weeks from now?

A new feature, called association extensions, allows us to define code that only applies in the context of this relationship. Now, bear with me on this: there’s little to no documentation yet (such is the burden of working on cutting-edge technologies) and what there is doesn’t really go into detail. But this evening I had a little epiphany on how it can help me.

Let’s replace the Venue object’s relationship with its many productions, and add a little spice. (Now, this code works for me; it’s probably not the most efficient or most database-neutral. So shoot me, I’m still a learner!)

    class Venue < ActiveRecord::Base
      ...
      has_many :productions, :through => :runs,
               :select => 'DISTINCT *' do
        def between(start_date = nil, end_date = nil)
          date_where = []
          if start_date
            date_where << %(runs.ends_on >= '#{start_date.strftime("%Y-%m-%d")}')
          end
          if end_date
            date_where << %(runs.starts_on <= '#{end_date.strftime("%Y-%m-%d")}')
          end
          where_clause = date_where.join ' AND '
          find :all, :conditions => where_clause
        end
      end
    end

So now, if we want to present a quick look at all productions that are on at a given venue within a certain time frame, we can in a single line:

    @venue = Venue.find_by_name("Royal Opera House")
    @productions = @venue.productions.between(1.week.ago, 1.month.from_now)

Even better, the same `do…end` block of code could be used in the Production object to filter its Venue objects. Rails allows me to hive this off into a separate module, so both classes can share the same block of code and the same filtering techniques.
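As a hedged sketch of what the shared part might look like, the date filtering can be pulled into a module that only builds the conditions, leaving each association to pass them to its own finder. `DateRangeConditions` and its `for_range` method are hypothetical names, the `runs.starts_on` and `runs.ends_on` columns are the ones used above, and the wiring into the two `has_many` declarations is left out. Unlike the version above, this one returns an array with `?` placeholders (a form that `:conditions` accepts) rather than interpolating the dates into the SQL string.

```ruby
require 'date'

# Hypothetical shared module: builds array-style conditions for a date window.
module DateRangeConditions
  DATE_FORMAT = '%Y-%m-%d'

  # Returns [sql_fragment, *bind_values]; the fragment is empty with no dates.
  def self.for_range(start_date = nil, end_date = nil)
    clauses = []
    values = []

    if start_date
      # The run must still be on at, or after, the start of the window.
      clauses << 'runs.ends_on >= ?'
      values << start_date.strftime(DATE_FORMAT)
    end

    if end_date
      # The run must have started by the end of the window.
      clauses << 'runs.starts_on <= ?'
      values << end_date.strftime(DATE_FORMAT)
    end

    [clauses.join(' AND '), *values]
  end
end

DateRangeConditions.for_range(Date.new(2006, 1, 1), Date.new(2006, 2, 1))
# => ["runs.ends_on >= ? AND runs.starts_on <= ?", "2006-01-01", "2006-02-01"]
```

Because both associations go through the same runs table, the same conditions apply whether you start from a venue or from a production.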

Now, you can say there are other ways to do this, and you’d be right. You can argue that stepping over the Run object like this causes problems that going a more ‘traditional’ route would avoid. You’d be right, too, in a way.

For me, though, in this particular case, the Run object is the least important of the three objects in the model. What can be thought of as “conventional” approaches give the run an undue level of priority in the scheme of things. The new features of Rails give me the opportunity to structure my model far more closely to how our customers think of our information. And that can only be a good thing.

Besides, who can’t love this line of code?

@productions = @venue.productions.between(1.week.ago, 1.month.from_now)

Hell, that’s a line of code you can show to management and they’d understand it. That’s priceless.