Jon Leighton

Hashes and encapsulation

Earlier today I made an off-the-cuff remark about encapsulation on twitter.

FYI: if you're accessing hash elements via obj.hashthings['foo'] rather than obj.hashthing('foo'), you're breaking encapsulation

Then Josh, not realising that I was simply casting judgement down from my ivory tower, asked for more than 140 characters worth of explanation:

@jonleighton FYI its nice if you explain a bit more :) blog post?

So, I agreed. Here is that explanation.

Suppose you have an object representing a DOM element. DOM elements have attributes, which you store as a hash. Something like this:

class DOMElement
  attr_reader :attributes

  def initialize
    # ...
    @attributes = Hash[parse_attributes.map { |name, value|
      [name, DOMAttribute.new(name, value)]
    }]
  end
end

In your code that uses DOMElement, you access an attribute like this:

element.attributes['margin-left']

If you try to access an attribute that does not exist, nil will be returned.

Later on, you realise that you are checking for nil in lots of places where you use attributes. Listening to the code smell, you decide that missing attributes should instead return a null object.

What do you do? Well, your only option is to add a default proc to the Hash:

@attributes = Hash[parse_attributes.map { |name, value|
  [name, DOMAttribute.new(name, value)]
}]
@attributes.default_proc = proc { NullDOMAttributes.new }

You have now made your object impossible to marshal without using insane hackery, since procs cannot be marshalled.

The more important point, though, is that by allowing users to dip into the @attributes hash like this, you are failing to encapsulate the internal implementation of your DOMElement object.

This presents another problem: an unrelated dodgy bit of code could remove an attribute from the element, without your DOMElement object realising!

el.attributes.delete('margin-left')

If you pass around the same hash between lots of different objects, that hash is quite likely to get mutated at some point in my experience, and when that happens it will affect all the other objects relying on it.

Yes, you could call @attributes.dup every time the attributes method is called, but that’s hardly very efficient when we only want to get at a single attribute.

If the code had been written in an encapsulated manner to begin with:

class DOMElement
  def initialize
    # ...
    @attributes = Hash[parse_attributes.map { |name, value|
      [name, DOMAttribute.new(name, value)]
    }]
  end

  def attributes
    @attributes.dup
  end

  def attribute(name)
    @attributes[name]
  end
end

It would be straightforward to add the null object without further consequences:

def attribute(name)
  @attributes.fetch(name) { NullDOMElement.new }
end

So, exposing internal hashes can be convenient, but you should always think twice before doing so and realise that you might regret the decision at a later date.

9 January 2012

Comments