Standup 12/01/2008: Fun with libxml
Interesting Things
- Libxml has been giving us some more strange behavior on Linux. If you do
parser = XML::Parser.new
parser.string = '<foo></foo>'
document = parser.parse
# Now watch me fail, but only on Linux!
parser.string = '<bar></bar>'
document = parser.parse
We're hosting a MagLev tech talk today compliments of Martin McClure.
Joseph Palermo has won the annual Pivotal Labs Mustache Competition. Granted, he was the only entry. But don't let that affect your admiration of his work. Photo to follow.
Standup 11/26/2008: Assorted Incompatibilities
Interesting Things
The garbage collection bug we encountered on Monday has popped up in new places:
Rspec version < 1.1.11 + Linux ruby 1.8.6 patchlevel > 114 = (small explosion)
libxml version 0.5.4 (11 versions out of date) + Linux ruby 1.8.6 patchlevel > 114 = (another small explosion)
Also, speaking of libxml:
- ActsAsSolr was incompatible with more recent versions of libxml, but David has submitted a fix which has been incorporated into mattmatt's github fork.
Ask for Help
"Turning off cache-busting for test environment replaces cache-busting junk with ??. Is there a way to make it replace it with nothing instead?"
With cache busting on, you would have to test like this:
admin_button.should have_tag("a > img[src ='/images/v3/admin/image_name.png?12121212?']")
which is awkward.
We turned it off by adding the following to config/environments/test.rb (google said to do it)
ENV['RAILS_ASSET_ID'] = ''
But, then we had to do this:
admin_button.should have_tag("a > img[src ='/images/v3/admin/image_name.png??']")
Is there a way to turn off cache-busting that doesn't put weird/ugly question marks at the end of your image names?
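One workaround that sidesteps the question rather than answering it: strip the query string from the src before asserting on it, so the spec does not care whether the asset ID is a timestamp, empty, or absent. This is only a sketch; strip_asset_id is a hypothetical spec helper, not a Rails or RSpec API.

```ruby
# Hypothetical spec helper (not part of Rails or RSpec): drop everything from
# the first "?" onward so assertions can compare against the bare asset path.
def strip_asset_id(src)
  src.sub(/\?.*\z/, "")
end

puts strip_asset_id("/images/v3/admin/image_name.png?12121212")
puts strip_asset_id("/images/v3/admin/image_name.png??")
puts strip_asset_id("/images/v3/admin/image_name.png")
```

All three calls yield the same bare path, so a single expectation covers both the cache-busting and the non-cache-busting configuration.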
"with_tag & have_tag don't work in non-controller (e.g. model) rspecs?"
Correct, they don't work there. Use Hpricot instead. Or maybe assert_elements.
Tracker 101 Screencast
If you're new to Tracker, or are considering trying it out but haven't signed up yet, check out the Tracker 101 screencast. Thanks to Ian McFarland for recording it.
Better View Testing with Elementor
We've got a few mantras at Pivotal. One of them has to do with testing all the time. It's a Good Thing, for sure. Until recently though, I had always appended a tacit "except for views" to the end of it. The reason for my reservations wasn't that view tests can be brittle. Any test can be brittle. I didn't like testing views because the tests I was writing never really seemed to describe the code I was writing. Let's look at a typical view test to see what I mean:
describe "/posts/index.html.erb" do
  def render_view
    render "/posts/index.html.erb"
    response.body
  end

  before(:each) do
    assigns[:posts] = [
      stub_model(Post, :name => "First!", :body => "first body."),
      stub_model(Post, :name => "Second!", :body => "second body.")
    ]
  end

  describe "assertions using have_tag" do
    it "renders posts" do
      render_view
      response.should have_tag(".post", 2)
    end

    it "renders post headers" do
      render_view
      response.should have_tag(".post .post-name", "First!", 1)
      response.should have_tag(".post .post-name", "Second!", 1)
    end

    it "renders post bodies" do
      render_view
      response.should have_tag(".post .post-body", "first body.", 1)
      response.should have_tag(".post .post-body", "second body.", 1)
    end
  end
end
This snippet uses the have_tag helper. It's somewhat slow, and to my eyes, expresses intent about as well as an apple floating in a top-hat filled with perfume. The "tag" is a selector? The content filter is just the second argument? And the last argument is the amount of "tag" the response should have? You can test like this, but why would you?
I've also seen a more manual pattern, using a library like Hpricot or Nokogiri to parse the response body, then asserting on the results of that:
describe "assertions using Nokogiri" do
  def doc
    @doc ||= Nokogiri(render_view)
  end

  it "renders posts" do
    doc.search('.post').should have(2).nodes
  end

  it "renders post headers" do
    headers = doc.search('.post .post-name')
    headers.should have(2).elements
    headers.detect { |element| element.text == "First!" }.should_not be_nil
    headers.detect { |element| element.text == "Second!" }.should_not be_nil
  end

  it "renders post bodies" do
    # Bodies live under .post-body, not .post-name.
    bodies = doc.search('.post .post-body')
    bodies.should have(2).elements
    bodies.detect { |element| element.text == "first body." }.should_not be_nil
    bodies.detect { |element| element.text == "second body." }.should_not be_nil
  end
end
It's faster, since it's not using have_tag, but still not very expressive. CSS selectors are still littered across the it statements, but at least it's only once per test. Still, using detect to find content is no good. And I don't think CSS selectors have any business in it statements at all. That seems like asserting on the name of a method being called, not its behavior.
The solution!
Given my problems with the above approaches, I created a gem that allows the following assertion syntax:
it "renders posts" do
  result.should have(2).posts
end

it "renders post headers" do
  result.should have(2).post_headers
  result.should have(1).post_header.with_text("First!")
  result.should have(1).post_header.with_text("Second!")
end

it "renders post bodies" do
  result.should have(2).post_bodies
  result.should have(1).post_body.with_text("first body.")
  result.should have(1).post_body.with_text("second body.")
end
What's a result? And how does it know how many posts, post_headers, and post_bodies it has? The result is defined in a before block like so:
require 'elementor'
require 'elementor/spec'

include Elementor

attr_reader :result

before(:each) do
  @result = elements(:from => :render_view) do |tag|
    tag.posts ".post"
    tag.post_headers ".post .post-name"
    tag.post_bodies ".post .post-body"
  end
end
The elements method allows you to name your CSS selectors using the tag block argument. The tag object uses method_missing to register your names. The :from option specifies a method to be called that will return some raw markup.
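To make the method_missing trick concrete, here is a minimal sketch of how a block argument can turn arbitrary method names into registered selectors. It is not Elementor's actual implementation; SelectorRegistry and named_selectors are hypothetical names invented for this illustration.

```ruby
# Hypothetical registry (not Elementor's real code): any unknown method called
# with a single string argument is recorded as name => CSS selector.
class SelectorRegistry
  attr_reader :selectors

  def initialize
    @selectors = {}
  end

  def method_missing(name, *args)
    if args.length == 1 && args.first.is_a?(String)
      @selectors[name] = args.first
    else
      # Anything else is a genuine mistake; raise the usual NoMethodError.
      super
    end
  end
end

# Hypothetical stand-in for the block form: yields a registry and returns
# whatever names the block registered.
def named_selectors
  registry = SelectorRegistry.new
  yield registry
  registry.selectors
end

names = named_selectors do |tag|
  tag.posts ".post"
  tag.post_headers ".post .post-name"
end

puts names[:posts]
puts names[:post_headers]
```

The appeal of this design is that the selector strings live in one place, and the examples only ever mention the names.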
Naming selectors alone was a huge win for me, but there are a few other cool bits about the @result object. First, you get to use the with_text helper for filtering content. You'll also get a with_attrs helper for filtering based on a hash of attribute values.
The project is called Elementor, and you can install it like so:
[sudo] gem install elementor
The code is on GitHub: github.com/nakajima/elementor (and you can see the CI build here). Take a look at the specs for all of the examples of what you can do. Hopefully, you'll find it as useful as I have. If not, please share your reasons in the comments!
Hide .svn files in changes view of RubyMine
You have probably noticed that the changes view is unusable in RubyMine because of all the .svn files showing.
Well, if you refresh the changes view, they go away... Hurray!
Standup 11/24/2008: Changes to GC in Ruby 1.8.6?
Ask for Help
"Does anyone know about recent changes to Ruby 1.8.6 garbage collection?"
We're seeing a strange test failure on one of our CI boxes:
[BUG] object allocation during garbage collection phase ruby 1.8.6 (2008-08-11) [i686-linux]
ruby -v gives this:
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux]
On the developer's OSX machine, which doesn't experience this bug, ruby -v gives:
ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]
There have apparently been changes to garbage collection in 1.8.7 and 1.9 -- does anyone know if these have been backported to 1.8.6 somewhere between patchlevels 114 and 287?
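When comparing machines like this, it can help to print the same details from inside the failing process rather than trusting whichever ruby is first on the PATH. A small sketch using only Ruby's built-in constants; interpreter_summary is a hypothetical helper name.

```ruby
# Hypothetical helper: build a one-line description of the running
# interpreter, similar in spirit to the output of `ruby -v`.
def interpreter_summary
  "ruby #{RUBY_VERSION} (#{RUBY_RELEASE_DATE} patchlevel #{RUBY_PATCHLEVEL}) [#{RUBY_PLATFORM}]"
end

puts interpreter_summary
```

Dropping a line like this into the CI run's setup makes it obvious which patchlevel actually executed the specs.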
New York Standup 11/24/2008
Interesting
Rails 2.2.2 is released!
Even Rails 2.2.2 isn't always threadsafe. I found this out by running a script with JRuby from the command line. The script loaded the Rails environment and then launched two threads that simply tried to resolve an ActiveRecord class constant. Fireworks (in the form of LoadError) ensued deep inside of const_missing. I'll post the full example later today.
Tsearch2 is now built into Postgres (as of 8.3). This means you must remove the metadata from your tables, since Postgres now stores it in a separate place.
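On the thread-safety item above: the general hazard is two threads racing through a lazy const_missing hook at the same time. The following is a plain-Ruby sketch of one way to guard such a hook with a lock. It is not the script mentioned above and not how Rails loads constants; LazyModels, LOCK, and LOADS are hypothetical names.

```ruby
require 'monitor'

# Hypothetical module that defines constants lazily, one at a time.
module LazyModels
  LOCK  = Monitor.new
  LOADS = [] # records each time a constant is actually created

  def self.const_missing(name)
    LOCK.synchronize do
      # Another thread may have defined the constant while we were waiting.
      return const_get(name) if const_defined?(name, false)
      LOADS << name
      const_set(name, Class.new)
    end
  end
end

# Two threads resolve the same constant concurrently.
threads = 2.times.map { Thread.new { LazyModels::Post } }
classes = threads.map(&:value)

puts classes.uniq.size
puts LazyModels::LOADS.inspect
```

Both threads end up with the same class object, and the constant is created only once. Loading everything eagerly before starting any threads avoids the question entirely.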
Unicode Transliteration to Ascii
Matthew O'Connor and I recently worked on a project that sent SMS messages to mobile customers. Unfortunately the SMS aggregator we used on the project rejected messages with non-ASCII characters.
One approach we considered was to strip our messages of any characters that were not ASCII and send them as is. After looking through some of the rejected messages we realized most of the problems occurred with Unicode punctuation. Instead of simply deleting the characters, we tried transliterating them to their ASCII equivalents.
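For comparison, the plain stripping approach is short to write on Ruby 1.9 or later, where String#encode can drop characters that have no ASCII equivalent. This is a sketch under that assumption, not code from the project; strip_non_ascii is a hypothetical name.

```ruby
# Hypothetical helper, assuming Ruby 1.9+ (String#encode is not in 1.8):
# remove every character that cannot be represented in US-ASCII.
def strip_non_ascii(utf8_text)
  utf8_text.to_s.encode("US-ASCII", :invalid => :replace, :undef => :replace, :replace => "")
end

puts strip_non_ascii("“foo” ñ bar")
```

Curly quotes and accented letters simply disappear, which is exactly the loss of information that transliteration avoids.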
Our first approach used Iconv:
require 'iconv'

module SmsEncoder
  def self.convert(utf8_text)
    text = Iconv.iconv("US-ASCII//TRANSLIT", "UTF-8", utf8_text).first
    text.gsub(/`/, "'")
  rescue Iconv::Failure
    ""
  end
end
For some reason the backtick ` also caused problems so we converted that after using Iconv.
This approach worked perfectly on OS X, but as soon as we moved to the Linux servers the libiconv behavior changed, and most untranslatable characters became question marks instead of empty strings.
Instead of wrestling with libiconv, we looked for a solution written entirely in Ruby. We found unidecode, which got us most of the way there. Unidecode did a little more than we wanted, though, and translated Chinese and Japanese characters to their approximate sounds, e.g. 今年1月 gets transliterated to Jin Nian 1Yue.
We decided to only transliterate extended Latin characters, punctuation and money symbols.
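The restriction itself comes down to a codepoint range check, which can be sketched on its own without unidecode. TRANSLITERABLE_RANGES and transliterable? are hypothetical names for this illustration; the block boundaries follow the Unicode BMP roadmap.

```ruby
# Hypothetical whitelist of Unicode blocks whose characters are worth
# transliterating. Plain ASCII is left out because it needs no conversion.
TRANSLITERABLE_RANGES = [
  0x00A0..0x00FF, # Latin-1 Supplement
  0x0100..0x017F, # Latin Extended-A
  0x0180..0x024F, # Latin Extended-B
  0x2000..0x206F, # General Punctuation
  0x20A0..0x20CF  # Currency Symbols
].freeze

# True when the single character's codepoint falls inside a whitelisted block.
def transliterable?(character)
  codepoint = character.unpack("U").first
  TRANSLITERABLE_RANGES.any? { |range| range.include?(codepoint) }
end

puts transliterable?("ñ")
puts transliterable?("€")
puts transliterable?("今")
```

Anything outside these blocks, such as CJK text, is dropped rather than spelled out phonetically.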
Here is the final code with the unidecode monkey patch:
require 'set'
require 'unidecode'

module SmsEncoder
  def self.convert(utf8_text)
    Unidecoder.decode(utf8_text.to_s).gsub("[?]", "").gsub(/`/, "'").strip
  end
end

module Unidecoder
  class << self
    def decode(string)
      string.gsub(/[^\x20-\x7e]/u) do |character|
        codepoint = character.unpack("U").first
        if should_transliterate?(codepoint)
          CODEPOINTS[code_group(character)][grouped_point(character)] rescue ""
        else
          ""
        end
      end
    end

    private

    # c.f. http://unicode.org/roadmaps/bmp/
    CODE_POINT_RANGES = {
      :basic_latin => Set.new(32 .. 126),
      :latin1_supplement => Set.new(160 .. 255),
      :latin1_extended_a => Set.new(256 .. 383),
      :latin1_extended_b => Set.new(384 .. 591),
      :general_punctuation => Set.new(8192 .. 8303),
      :currency_symbols => Set.new(8352 .. 8399),
    }

    def should_transliterate?(codepoint)
      # Union the per-block sets with Set#| so no Enumerable#sum extension is needed.
      @all_ranges ||= CODE_POINT_RANGES.values.inject { |all, range| all | range }
      @all_ranges.include? codepoint
    end
  end
end
and tests:
class SmsEncoderTest < Test::Unit::TestCase
  def test_transliteration_of_blank
    assert_equal "", SmsEncoder.convert(nil)
    assert_equal "", SmsEncoder.convert("")
  end

  def test_transliteration_of_whitespace
    assert_equal "", SmsEncoder.convert(" \t\n")
  end

  def test_transliteration_of_text_surrounded_by_space
    assert_equal "abc", SmsEncoder.convert(" abc ")
  end

  def test_transliteration_of_ascii
    orig_text = "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
    conv_text = SmsEncoder.convert(orig_text)
    assert_equal orig_text.gsub(/`/, "'"), conv_text
  end

  def test_transliteration_of_unicode_punctuation
    utf8_text = "“foo” ‹foo› ‘foo’ ,foo, –foo— {foo} (foo) `foo`"
    ascii_text = SmsEncoder.convert(utf8_text)
    assert_equal "\"foo\" <foo> 'foo' ,foo, foo-- {foo} (foo) 'foo'", ascii_text
  end

  def test_transliteration_of_common_latin1_characters
    utf8_text = "ñ ò ^ ¡ ¿ Æ æ ß Ç §"
    ascii_text = SmsEncoder.convert(utf8_text)
    assert_equal "n o ^ ! ? AE ae ss C SS", ascii_text
  end

  def test_transliteration_of_money_characters
    utf8_text = "€ £ $ ¥"
    ascii_text = SmsEncoder.convert(utf8_text)
    assert_equal "EU PS $ Y=", ascii_text
  end

  def test_untransliterable_characters
    utf8_text = "ɏ \x1f \x01 \x00 Ʌ \x7f"
    ascii_text = SmsEncoder.convert(utf8_text)
    assert_equal "", ascii_text
  end

  def test_transliteration_of_chinese_characters
    utf8_text = "ウェブ全体から検索"
    ascii_text = SmsEncoder.convert(utf8_text)
    assert_equal "", ascii_text
  end
end
Standup 11/21/2008: Pro Bono Airwaves
Interesting Things
- Rails reminder: flash[:notice] = "Good Job" will survive a redirect, while flash.now[:notice] = "Good Job" will not. In general, flash.now is used when you render a template without a redirect, such as when a form submit has validation errors.
- Good Books: Several folks have recommended JavaScript: The Good Parts.
- Pro bono: Would anyone like to help out KUSF for free? Their new website project has been stalled for a year.
Ask for Help
"How do you get Selenium to work with Firefox 3?"
If you know how, pull the jar files out of a later release and use those. Good luck!
Why I test.
This is a cross post from my personal blog, because I'd like to hear from other Pivots about why they test.
First about unit and integration tests.
- I write unit tests for focused feedback; i.e. tell me exactly what broke. To keep them focused I try to keep them orthogonal which usually means using fakes of any collaborators.
- I write integration tests where I need more safety than a unit test will offer. They seem even more important when I’m stubbing and mocking a lot in a dynamically typed language like Ruby.
- I write both types of tests to help convey intent and understand the problem better. I TDD with either a unit test or integration test, whichever feels natural.
Then about interaction and state based testing.
- I pick the approach that feels natural at the time, favouring neither by default. I struggle with rules about when to use which.
- I dislike interaction tests that look suspiciously similar/symmetrical/coupled to the code they refer to. I expect a test to earn its right to exist, and therefore add to the size of the codebase and build’s time, by either conveying intent that is difficult to express in the code itself (which is why I love the term ‘example‘) or addressing some other consciously identified risk.
So, now it's your turn - why do you test?







