Jon Leighton: Rails Codesmith

Articles

Tracking down method definitions in Ruby »

One of my favourite features of Ruby 1.9 is the #source_location method of Proc and Method. Let me explain. Often I am confronted with a large code base (usually Rails), and want to figure out exactly where some method on some object is defined. Rails has many dark corners, and sometimes finding things isn’t entirely straightforward.

A trick that I often use is to get a Method object for the method that I want to find, and then to print out its location. Like this:

class Foo
  def bar
    "omg, where am I?"
  end
end

p Foo.new.method(:bar).source_location # found you!
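To make the return value concrete, here is a small sketch. The Greeter class is made up for illustration. For a method written in Ruby, source_location gives a two-element array of file path and line number. For a method implemented in C it has nothing to report.

```ruby
# Greeter is a hypothetical class used only for illustration.
class Greeter
  def hello
    "hello"
  end
end

# Methods defined in Ruby report [file, line].
file, line = Greeter.new.method(:hello).source_location
puts "#{file}:#{line}"

# Methods implemented in C (such as String#upcase on CRuby) have no
# Ruby source, so there is no location to return.
p "abc".method(:upcase).source_location # nil on CRuby
```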

Recently I was trying to find out where the #flash method of ActionDispatch::TestRequest was defined. It’s not defined in the main files that contain the definitions for ActionDispatch::TestRequest or ActionDispatch::Request (which is the superclass).

So I pulled out my standard tool:

r = ActionDispatch::TestRequest.new
p r.method(:flash).source_location

It didn’t work:

ArgumentError: wrong number of arguments (1 for 0)

It turns out that ActionDispatch::Request defines its own #method method, which returns the HTTP method of the request and takes no arguments. It overrides Ruby’s built-in #method, which is why passing :flash raised an ArgumentError.

However, all was not lost! Ruby also makes it possible to get a reference to a method definition that isn’t bound to any particular object. This is called an UnboundMethod, and you can’t call it until you bind it to some object. So I was able to get an unbound reference to the original method method and then bind it to my object. Like so:

r = ActionDispatch::TestRequest.new
meth = Object.instance_method(:method)
p meth.bind(r).call(:flash).source_location # found it!
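The same trick can be tried without loading Rails. The sketch below uses a made-up FakeRequest class that shadows #method in the same way. The locate helper is my own name for illustration, not something Ruby or Rails provides.

```ruby
# FakeRequest is a hypothetical stand-in for a class that shadows
# Ruby's built-in #method with a zero-argument method of its own.
class FakeRequest
  def method
    "GET"
  end

  def flash
    {}
  end
end

# Hypothetical helper: fetch the original #method as an UnboundMethod,
# bind it to the object, then use it to look up the named method.
def locate(object, name)
  Object.instance_method(:method).bind(object).call(name).source_location
end

request = FakeRequest.new

begin
  request.method(:flash)
rescue ArgumentError => e
  puts "shadowed: #{e.message}"
end

file, line = locate(request, :flash)
puts "#{file}:#{line}"
```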

It turns out that the Request class is reopened in actionpack/lib/action_dispatch/middleware/flash.rb and the flash method gets defined there.

How to prevent Ruby's test/unit library from autorunning your tests »

Today I had a situation where I wanted to use perftools.rb to profile a test suite, which was written with Ruby’s test/unit library.

test/unit installs an at_exit hook which runs all the tests before Ruby exits. This was problematic, because it meant that all the tests ran after perftools had already finished its profile.
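To illustrate the mechanism, here is a miniature of an autorunning test library. TinyRunner and its stop_auto_run flag are hypothetical and are not how test/unit is actually written. The point is only that an at_exit hook fires after everything else has finished, unless a flag tells it not to.

```ruby
# TinyRunner is a hypothetical miniature of an autorunning test library.
# It is not test/unit's real implementation.
class TinyRunner
  @stop_auto_run = false
  @tests = {}

  class << self
    attr_accessor :stop_auto_run

    # Registers a named test whose block returns a truthy value on success.
    def test(name, &body)
      @tests[name] = body
    end

    # Runs every registered test and returns a hash of name => true/false.
    def run
      results = {}
      @tests.each { |name, body| results[name] = body.call ? true : false }
      results
    end
  end
end

# The library registers this hook when it is loaded, so by default the
# tests only start once the rest of the program has finished.
at_exit { TinyRunner.run unless TinyRunner.stop_auto_run }

TinyRunner.test(:truth) { 1 + 1 == 2 }

# Opt out of the hook and run explicitly, for example inside a profiler.
TinyRunner.stop_auto_run = true
results = TinyRunner.run
p results
```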

So I wanted to turn off the autorunning and explicitly run the test suite. I found out how to monkey-patch test/unit to achieve this, and figured it might be useful to someone in the future, so:

Ruby 1.9

require 'test/unit'

class Test::Unit::Runner
  @@stop_auto_run = true
end

class FooTest < Test::Unit::TestCase
  def test_foo
    assert true
  end
end

Test::Unit::Runner.new.run(ARGV)

Ruby 1.8

require 'test/unit'

module Test::Unit
  def self.run?; true; end
end

class FooTest < Test::Unit::TestCase
  def test_foo
    assert true
  end
end

Test::Unit::AutoRunner.run

Hashes and encapsulation »

Earlier today I made an off-the-cuff remark about encapsulation on twitter:

FYI: if you're accessing hash elements via obj.hashthings['foo'] rather than obj.hashthing('foo'), you're breaking encapsulation

Then Josh, not realising that I was simply casting judgement down from my ivory tower, asked for more than 140 characters’ worth of explanation:

@jonleighton FYI its nice if you explain a bit more :) blog post?

So, I agreed. Here is that explanation.

Suppose you have an object representing a DOM element. DOM elements have attributes, which you store as a hash. Something like this:

class DOMElement
  attr_reader :attributes

  def initialize
    # ...
    @attributes = Hash[parse_attributes.map { |name, value|
      [name, DOMAttribute.new(name, value)]
    }]
  end
end

In your code that uses DOMElement, you access an attribute like this:

element.attributes['margin-left']

If you try to access an attribute that does not exist, nil will be returned.

Later on, you realise that you are checking for nil in lots of places where you use attributes. Listening to the code smell, you decide that missing attributes should instead return a null object.

What do you do? Well, your only option is to add a default proc to the Hash:

@attributes = Hash[parse_attributes.map { |name, value|
  [name, DOMAttribute.new(name, value)]
}]
@attributes.default_proc = proc { NullDOMAttribute.new }

You have now made your object impossible to marshal without using insane hackery, since procs cannot be marshalled.
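The marshalling problem is easy to reproduce in isolation. This is a small sketch using plain hashes rather than the DOMElement class above:

```ruby
# A plain hash round-trips through Marshal without trouble.
plain = { 'margin-left' => '10px' }
copy = Marshal.load(Marshal.dump(plain))

# A hash with a default proc cannot be dumped, because the proc
# itself cannot be serialised.
with_default = Hash.new { |_hash, _key| :missing }

begin
  Marshal.dump(with_default)
  dumped = true
rescue TypeError => e
  dumped = false
  puts "could not marshal: #{e.message}"
end
```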

The more important point, though, is that by allowing users to dip into the @attributes hash like this, you are failing to encapsulate the internal implementation of your DOMElement object.

This presents another problem: an unrelated dodgy bit of code could remove an attribute from the element, without your DOMElement object realising!

el.attributes.delete('margin-left')

In my experience, if you pass the same hash around between lots of different objects, it is quite likely to get mutated at some point, and when that happens it will affect all the other objects relying on it.

Yes, you could call @attributes.dup every time the attributes method is called, but that’s hardly very efficient when we only want to get at a single attribute.

If the code had been written in an encapsulated manner to begin with:

class DOMElement
  def initialize
    # ...
    @attributes = Hash[parse_attributes.map { |name, value|
      [name, DOMAttribute.new(name, value)]
    }]
  end

  def attributes
    @attributes.dup
  end

  def attribute(name)
    @attributes[name]
  end
end

It would be straightforward to add the null object without further consequences:

def attribute(name)
  @attributes.fetch(name) { NullDOMAttribute.new }
end
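Putting the pieces together, a runnable sketch might look like the following. DOMAttribute, NullDOMAttribute and the raw_attributes constructor argument are assumptions made for the example (the snippets above leave parse_attributes and the attribute classes undefined).

```ruby
# Hypothetical attribute class; the original leaves it undefined.
DOMAttribute = Struct.new(:name, :value)

# Hypothetical null object standing in for a missing attribute.
class NullDOMAttribute
  def name
    nil
  end

  def value
    nil
  end
end

class DOMElement
  # raw_attributes is an assumed stand-in for parse_attributes.
  def initialize(raw_attributes)
    @attributes = Hash[raw_attributes.map { |name, value|
      [name, DOMAttribute.new(name, value)]
    }]
  end

  # Callers get a copy, so they cannot mutate the internal hash.
  def attributes
    @attributes.dup
  end

  # Single lookups go through here and never return nil.
  def attribute(name)
    @attributes.fetch(name) { NullDOMAttribute.new }
  end
end

element = DOMElement.new('margin-left' => '10px')

# Deleting from the returned copy leaves the element untouched.
element.attributes.delete('margin-left')

present = element.attribute('margin-left').value
missing = element.attribute('color').value
```

Here present is still "10px" after the delete, and missing is nil without any nil check at the call site.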

So, exposing internal hashes can be convenient, but you should always think twice before doing so and realise that you might regret the decision at a later date.

Poltergeist: A PhantomJS driver for Capybara »

This announcement is coming way later than I had originally intended. Last October I started experimenting with the idea of writing a driver for Capybara that would use PhantomJS as the browser.

Initially the biggest problem was addressing the issue of how to communicate between a Ruby process and a PhantomJS process. But then it hit me: PhantomJS gives you a browser environment, so you can do everything you can do in a browser, and you can do Web Sockets in a browser. So I used Web Sockets.

After hacking away for a while I eventually had a pretty complete driver. But I was being plagued by segfaults that were coming from WebKit’s JavaScriptCore JS engine. Thus began months of poking C++ code and getting far too comfortable with gdb.

I tried the Qt 4.8 RC (it has since been properly released) and found that the JSC segfault had gone, but now there was a new segfault. After much hair-pulling I found a workaround. But I was still left with another problem: it wasn’t possible to attach files to <input> elements against Qt 4.8. After yet more hair-pulling I found the culprit for that one, too.

Which leads me to the point where I am now finally happy to invite you, dear reader, to try out my humble little Capybara driver. Let me know how you get on!

initialize_clone, initialize_dup and initialize_copy in Ruby »

Ruby has two methods for creating shallow copies of objects: clone and dup.

In general clone is meant to be a more “exact” mechanism for copying objects, whereas dup is often implemented by simply creating a new instance of the relevant class with the appropriate parameters. Both clone and dup copy the tainted state of the object. clone copies the singleton class (if any), whereas dup does not.
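A quick sketch of the singleton class difference, using a bare Object with a singleton method (the greet method is just an example name). It also shows that clone preserves the frozen state by default, while dup does not.

```ruby
original = Object.new

# Define a method on the singleton class of this one object.
def original.greet
  "hi"
end

original.freeze

cloned = original.clone
dupped = original.dup

# clone carries the singleton class and the frozen state across.
p cloned.respond_to?(:greet) # true
p cloned.frozen?             # true

# dup copies neither.
p dupped.respond_to?(:greet) # false
p dupped.frozen?             # false
```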

Initialising copies

Sometimes it is useful to be able to specify some initialisation code that should be run when an object is copied. For example, if an object tracks its own internal state in some way, you may wish to reset this state when the object is copied.

Ruby 1.9 has 3 methods to help you do this: initialize_clone, initialize_dup and initialize_copy. At present, there is no documentation (that I can find), so I had to do a bit of digging through C code to work out the exact behaviour.

The implementation is expressed by the following pseudo-code:

class Object
  def clone
    clone = self.class.allocate

    clone.copy_instance_variables(self)
    clone.copy_singleton_class(self)

    clone.initialize_clone(self)
    clone.freeze if frozen?

    clone
  end

  def dup
    dup = self.class.allocate
    dup.copy_instance_variables(self)
    dup.initialize_dup(self)
    dup
  end

  def initialize_clone(other)
    initialize_copy(other)
  end

  def initialize_dup(other)
    initialize_copy(other)
  end

  def initialize_copy(other)
    # some internal stuff (don't worry)
  end
end

initialize_copy runs for both clone and dup, but it is called by initialize_clone and initialize_dup. Therefore, if you implement your own version of initialize_clone or initialize_dup, it is advisable to call super to make sure that initialize_copy is also called.
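Here is a small sketch of that advice in practice. Counter and its copied_via attribute are invented for the example. initialize_copy resets the state shared by both kinds of copy, and the two more specific hooks call super so that it still runs.

```ruby
# Counter is a hypothetical class used only for illustration.
class Counter
  attr_reader :count, :copied_via

  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end

  # Runs for both clone and dup: reset the state on the copy.
  def initialize_copy(source)
    super
    @count = 0
  end

  # Runs only for dup. super reaches initialize_copy.
  def initialize_dup(source)
    super
    @copied_via = :dup
  end

  # Runs only for clone. super reaches initialize_copy.
  def initialize_clone(source)
    super
    @copied_via = :clone
  end
end

counter = Counter.new
counter.increment

dupped = counter.dup
cloned = counter.clone

p [counter.count, counter.copied_via] # [1, nil]
p [dupped.count, dupped.copied_via]   # [0, :dup]
p [cloned.count, cloned.copied_via]   # [0, :clone]
```

Leaving out either super call would mean the copy keeps the original count, because the instance variables are copied across before the hooks run.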

Ruby 1.8

Ruby 1.8 behaves in roughly the same way, but it does not have initialize_dup or initialize_clone built in.

It would be possible to implement some sort of backport in pure Ruby, but it would be harder to get the semantics to be identical.

Ouch, head hurts

Yeah, never mind really. I was just curious and thought I’d share my findings.