Pivotal Labs

Chad Woolley's blog



How You Can Learn to Stop Worrying and Love Continuous Integration

Posted by Chad Woolley on Tuesday August 05, 2008 at 02:23AM

I just had a discussion with a co-Pivot about the resentment that many teams develop about Continuous Integration - especially when the release process requires a green tag from CI, and a broken build is standing in the way.

Slim Pickens, Dr. Strangelove

As anyone who has worked with me will attest, I'm hardcore on CI and consider any team which leaves a build red for longer than a workday to be sorely lacking in discipline.

OK, OK, there are always extenuating circumstances, but I still believe that most resentment of CI stems from underlying antipatterns and smells, rather than problems with CI itself. For example:

  • "The Customer Has To See This Feature RIGHT NOW": Frequent releases are a great thing, but if you cannot wait for a green build to deploy, you have some deeper problems. Often, this is because a team doesn't manage customer expectations well. The customer should understand that CI is a critical part of the Agile process which ensures that only reliable, quality releases get pushed to staging or production. Any problem which is preventing a green build should be fixed before the release is deployed. If the customer is not willing to allow you that time and flexibility, perhaps they are too addicted to new features, and the entire team needs to have a heart-to-heart about Code Debt in the next retrospective.
  • "It Works For Me, But Fails On CI": The important question is which environment is more like production - your development environment or CI? If you are developing on Windows or Mac, and your production box is some other flavor of *nix, then your CI box should be as close as possible to production. Ideally, you should be able to log on to the CI instance and debug the failing test there. Usually, your CI box is not configured correctly. If it is hard to keep your CI environment in sync with production, then perhaps you should look into automation (because you KNOW you or your sysadmin will probably forget to do the same thing when you push to production, right?). If the problem is that your development environment is not the same as production, and it is a legitimate problem, then CI just saved you some stress on the next deploy.
  • "Intermittent" Failures: Same deal as the prior point. CI runs your tests much more than you do. For web apps, it hopefully runs them in more browsers than you do. In my experience, many "intermittent" bugs are real bugs which are just very hard to isolate. It could be an AJAX bug that only happens when the site is run remotely, not via localhost. It could be a performance problem which only shows up on a slower system, not your fastest-on-the-market dev box. It could be a dependency on an external resource that happens to be unavailable sometimes, such as a web service, remote storage, etc. Again, just being aware of these issues puts you ahead of the game. For browser bugs, dig in and find out WHY it is failing intermittently. It may be a real bug. For intermittent outages of external resources, you may just have to live with it, but you don't have to live with the intermittent failures in CI. Mock out the resource or disable the tests in the CI environment. Yes, this is OK, especially if you leave them enabled in your development environment. Another option is to automatically repeat these tests a few times with a delay, and only fail the entire build if they fail repeatedly. Big services like Amazon or Google might drop a request occasionally, but still respond to a subsequent request.
  • Slow Test Suites: This is an insidious problem, because once your suite is slow, it is often a monumental effort to make it fast again. It is much better to be proactive, and monitor any slow-running tests like a hawk, relentlessly mocking out slow resources or replacing broad functional tests with faster, more targeted unit tests. You can also always split your tests into different suites, running your fastest tests continuously, and the entire slow deploy-test suite only nightly or periodically. As long as your customer isn't addicted to immediate features, it should be fine to only deploy from nightly builds.
  • The Failing Test That "Doesn't Matter": This is my pet peeve. Whenever I break CI, I fix it ASAP. If I ignore a "minor" broken test, the next thing I check in may be a major FUBAR which gets past my local tests for some reason (see prior points). Some who know me might even say it is LIKELY to be a major FUBAR. The point is, I don't trust myself or my local box, I trust CI. Now, if ANOTHER developer breaks the build, and tries to tell me they are not going to fix it because it is a "minor" problem, that really chaps my hide. They are ripping huge holes in my nice safety net, forcing me to expend much more time and attention on the tests that I run on my local environment, and causing me more stress and work in general. Stop making excuses, and fix the damn build NOW, or comment out the failing test.
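The "repeat these tests a few times with a delay" option from the intermittent-failures point could look roughly like the sketch below. The helper name with_retries, its default attempt count, and its default delay are assumptions made for illustration; they do not come from any particular test framework. It simply re-runs a block that raises a StandardError and only lets the error through once every attempt has failed.

```ruby
# Hypothetical helper (not part of any test framework): re-run a block
# that talks to a flaky external resource, and only raise once every
# attempt has failed.
def with_retries(attempts = 3, delay = 1)
  tries = 0
  begin
    tries += 1
    yield
  rescue StandardError
    # Out of attempts: let the last error propagate and fail the build.
    raise if tries >= attempts
    sleep delay
    retry
  end
end

# Usage sketch: a simulated remote call that drops the first two requests.
requests = 0
response = with_retries(3, 0.1) do
  requests += 1
  raise "service unavailable" if requests < 3
  "response on request #{requests}"
end
puts response
```

Note that this only retries errors raised as StandardError. Some test libraries signal assertion failures with exceptions outside that hierarchy, so whether a failed assertion is retried depends on the framework in use.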

Now, I'm sure that all of the above points can be debated or shown to be inapplicable in a specific situation. Plus, if you are dealing with imperfect CI and development tools (which is always the case), you will have some degree of pain which is directly attributable to CI. It would be great to hear about some of these situations in the comments.

Bottom Line: Integration is always one of the most painful parts of software development. Doing integration with high quality and low risk is even harder. Most developers who have been on a non-Agile project of any significant size have experienced days-long integration hell and ulcer-inducing all-night production deployments. Continuous Integration doesn't make that pain and stress go away, but it does break it down into small, bite-sized pieces that can be easily handled on a daily basis. All for the low, low cost of being proactive and disciplined, which makes you a better developer anyway.

Best practices for managing external plugins/libraries which live in Git?

Posted by Chad Woolley on Tuesday July 29, 2008 at 08:34PM

This question came up at standup. I've put some thought into it, so I thought I'd throw it out for discussion and see what other people think.

At Pivotal, we use svn:externals and the Third-Party Branch Pattern (from the SCM Patterns Book) to manage easy, reliable cross-project updates of common plugins which live elsewhere (rubyforge, etc), without having to manually update in every project or be at the mercy of the RubyForge svn repo going down or being slow. When everything was on SVN, this worked well; we had some rake tasks which made it easy to update the branch from the latest trunk version in the external vendor repository. However, more stuff like Desert is now moving to GitHub.

One option is to just check the entire Git repo into subversion. This is labor-intensive, though. It also loses some of the benefits of the Third-Party branch pattern, such as being able to easily preserve local patches while still pulling in changes from a new vendor version. We could probably update our Rake tasks to handle Git as well as SVN.

Another option is to switch all our projects to Git, but we aren't quite ready for that across the board.

Plus, this is a problem even if you move your parent project to Git. For example, there is no easy Git equivalent to having a svn:external which automatically updates to the latest version of an external repository, which is a useful approach for Continuous Integration or automatically pulling the latest Edge Rails into your project on every update.

Here's some links from people working on related issues, but please comment if you have any ideas or something that works well for you:

Evan Phoenix at Mountain West Ruby Conf

Posted by Chad Woolley on Monday March 31, 2008 at 05:20AM

I just attended (and thoroughly enjoyed) the Mountain West Ruby Conf, where Evan Phoenix gave a powerful keynote speech.

He talks about the status of Rubinius, and makes some profound observations on modern open source culture and community.

Here's some highlights, but if you participate in open source, and especially if you help run an open source project, I highly recommend that you watch the video:

  • Community: Rubinius' Giant Spec Suite, and its value not only as a living language specification for different implementations of Ruby itself, but as a "gateway drug" which provides a low barrier to get new contributors addicted to open source.
  • Trust: Asking for forgiveness vs. permission, and the Rubinius commit policy, where any accepted patch gets you commit rights. You can always roll back a change, and debate is healthy.
  • Worth: The impact of annoying fifteen year olds who make a lot of noise versus "core" contributors.
  • Ego: You are not the project, Mr. Ego! The importance of being wrong and admitting your faults in public.
  • Innovation: Fostering innovation and debate vs closely holding the mythical "Keys to the Castle".

ruby-debug in 30 seconds (we don't need no stinkin' GUI!)

Posted by Chad Woolley on Tuesday January 08, 2008 at 08:30AM

Alfonso Bedoya, Treasure of the Sierra Madre

Many people (including me) have complained about the lack of a good GUI debugger for Ruby. Now that some are finally getting usable, I've found I actually prefer IRB-style ruby-debug to a GUI.

There's good tutorial links on the ruby-debug homepage, and a very good Cheat sheet, but I wanted to give a bare-bones HOWTO to help you get immediately productive with ruby-debug.

Install the latest gem

$ gem install ruby-debug

Install the cheatsheet

$ gem install cheat
$ cheat rdebug

Set autolist, autoeval, and autoreload as defaults

$ vi ~/.rdebugrc
set autolist
set autoeval
set autoreload

Run Rails (or other app) via rdebug

$ rdebug script/server

Breakpoint from rdebug

(rdb:1) b app/controllers/my_controller.rb:10

Breakpoint in source

require 'ruby-debug'
debugger
my_buggy_method('foo')

Catchpoint

(rdb:1) cat RuntimeError

Continue to breakpoint

(rdb:1) c

Next Line (Step Over)

(rdb:1) n

Step Into

(rdb:1) s

Continue

(rdb:1) c

Where (Display Frame / Call Stack)

(rdb:1) where

List current line

(rdb:1) l=

Evaluate any var or expression

(rdb:1) myvar.class

Modify a var

(rdb:1) @myvar = 'foo'

Help

(rdb:1) h

There are many other commands, but these are the basics you need to poke around. Check the Cheat sheet for details.

This can also be used directly from any IDE that supports input into a running console (such as IntelliJ IDEA).

That should get you started. So, before you stick in another 'p' to debug, try out ruby-debug instead!

The Power of Versions (Monkey Patches Targeted with Friggin Laser Beams!)

Posted by Chad Woolley on Friday January 04, 2008 at 05:32PM

We all love to Monkey Patch Rails and other Ruby apps. However, we sometimes want to target these patches to the specific versions where they are needed.

Here's the easiest way to do this, via RubyGems' built-in version requirement support. The version 0.11.0 should indeed be greater than version 0.9.0:

irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> Gem::Version::Requirement.new(['> 0.9.0']).satisfied_by?(Gem::Version.new('0.11.0'))
=> true

Notice that you can't do this with string comparison, because with a per-character comparison, 1 is not greater than 9:

irb(main):001:0> '0.11.0' > '0.9.0'
=> false
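For readers on a more recent RubyGems, where the requirement class is exposed as Gem::Requirement rather than the Gem::Version::Requirement name used in this post, the same check might be sketched as follows. The helper name satisfies? is hypothetical and exists only for this illustration.

```ruby
require 'rubygems'

# Hypothetical helper: true if `version` satisfies `requirement`,
# e.g. satisfies?('0.11.0', '> 0.9.0').
def satisfies?(version, requirement)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# Version-aware comparison treats 11 as greater than 9.
puts satisfies?('0.11.0', '> 0.9.0')                         # true
puts Gem::Version.new('0.11.0') > Gem::Version.new('0.9.0')  # true

# Plain string comparison goes character by character, so '1' < '9' wins.
puts '0.11.0' > '0.9.0'                                      # false
```

Gem::Version objects are comparable, so for a simple "newer than" test the direct comparison on the second line may be all that is needed; the requirement form earns its keep when the condition is a string such as '< 1.99.0' that comes from configuration.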

Here's a little class which puts some helper and example methods around this approach (these methods are all in real use for some of our multipart mailer hacks):

module Pivotal
  class VersionChecker
    def self.current_rails_version_matches?(version_requirement)
      version_matches?(Rails::VERSION::STRING, version_requirement)
    end

    def self.version_matches?(version, version_requirement)
      Gem::Version::Requirement.new([version_requirement]).satisfied_by?(Gem::Version.new(version))
    end

    def self.rails_version_is_below_2?
      Pivotal::VersionChecker.current_rails_version_matches?('<1.99.0')
    end

    def self.rails_version_is_below_rc2?
      Pivotal::VersionChecker.current_rails_version_matches?('<1.99.1')
    end

    def self.rails_version_is_1991?
      Pivotal::VersionChecker.current_rails_version_matches?('=1.99.1')
    end
  end
end

(note: some angle brackets changed due to code formatting bug)

Here's an example of how you'd use this:

if Pivotal::VersionChecker.rails_version_is_below_2?
  # do some backward compatibility stuff
  # or handle bugs that have been fixed in Rails > 2
end

Note that this is only possible now that Rails has started using a more sensible strategy for versioning edge gems and improved support for using advanced versioning with RAILS_GEM_VERSION.

For many projects, this may be overkill. It is useful at Pivotal, though, where many projects may be on different Rails versions, but still want to use the latest common core libraries (and monkey patches) without having to upgrade Rails for their app.

This isn't only useful for monkey patching. It can be handy for any library that wants to be backward- or forward-compatible with its dependencies. I've used this approach at Pivotal and on my personal projects to have Continuous Integration automatically run my tests against multiple dependency versions, without having to change anything other than the CI project name:

'GemInstaller Continuous Integration automatically running against multiple versions of RubyGems'

There are numerous other related topics for discussion in this area, such as the power of versions or the wisdom of freezing, but I'll save those for future posts. Even if you do freeze the trunk of Rails/plugins/gems, since the version is included in the source, this approach should work barring any conflicts with trunk changes since the last release.

Happy Versioning!

Using Search and Replace Regular Expressions to Convert from Test::Unit to Rspec

Posted by Chad Woolley on Saturday June 02, 2007 at 10:34PM

I was just converting some Test::Unit tests to Rspec, and these regexps were handy. In one file, they handled 51 out of 53 lines, saving my fingers a lot of work. Tests can take an infinite variety of formats, so these obviously won't apply to everything, but they do illustrate how to use regexp substitution. This is using TextMate; your regexp implementation may vary...

from -> to
search string
replace string

def test_foo -> it "test_foo" do
def (test_[a-z_]*)
it "$1" do

assert !foo -> foo.should_not be_true
assert !(.*)$
$1.should_not be_true

assert foo -> foo.should be_true
assert (.*)$
$1.should be_true

assert_equal foo, bar -> bar.should == foo
assert_equal (.*), (.*)$
$2.should == $1
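If you would rather run the substitutions from a script than from an editor, the same four rules could be applied with Ruby's String#gsub, roughly as sketched here. The names RULES and convert are made up for this illustration. Ruby writes backreferences in the replacement string as \1 instead of TextMate's $1, and the "assert !" rule has to run before the plain "assert" rule so that negated assertions are not swallowed by the more general pattern.

```ruby
# Hypothetical conversion table: [search pattern, replacement] pairs,
# applied in order. Single-quoted strings keep \1 and \2 literal so
# gsub can expand them as backreferences.
RULES = [
  [/def (test_[a-z_]*)/,      'it "\1" do'],
  [/assert !(.*)$/,           '\1.should_not be_true'],
  [/assert (.*)$/,            '\1.should be_true'],
  [/assert_equal (.*), (.*)$/, '\2.should == \1']
]

# Apply every rule to the whole source text, one after another.
def convert(source)
  RULES.inject(source) do |text, (pattern, replacement)|
    text.gsub(pattern, replacement)
  end
end

test_unit_source = <<-SOURCE
def test_foo
  assert !foo
  assert bar
  assert_equal 1, baz
end
SOURCE

puts convert(test_unit_source)
```

As with the editor version, this only covers the simple one-line forms; anything it misses still needs converting by hand.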

Avoiding Constants in Rails

Posted by Chad Woolley on Monday April 16, 2007 at 04:51AM

In his post "Redefining Constants" ( http://www.pivotalblabs.com/articles/2007/04/14/redefining-constants ), Brian Takita describes how to redefine Rails constants at test time. He points out that "it's all dirty", and that "...maybe the storage service can be an attribute that can be changed for individual tests.".

In a comment, I suggested that a global configuration object would be a better approach, and here's an example. It still uses a constant (as opposed to a global singleton object), but the constant is an object (a hash) which contains other values and objects. This avoids the need to redefine constants to use different values at test-time.

Create a sample Rails app

$ rails railsdi
$ cd railsdi/
$ ls
$ script/generate controller Sample
$ # create development/test databases

Declare the configuration hash

First, add a constant in boot.rb. Just ignore the warning to not modify boot.rb - it's not talking about you. Put this at the beginning, right after the section that defines RAILS_ENV:

boot.rb

REGISTRY = {}

Set per-environment defaults

Set any values or objects you want in the registry:

development.rb

REGISTRY[:key] = "development_value"

test.rb

REGISTRY[:key] = "test_value"

production.rb

REGISTRY[:key] = "production_value"

Verify that the correct values are used in each environment

Make a simple controller and view to verify the values are set per-environment:

sample_controller.rb

class SampleController < ApplicationController
  def index
    @registry_value = REGISTRY[:key]
  end
end

sample/index.rhtml

Rails Environment: <%= RAILS_ENV %>
Registry Value: <%= @registry_value %>

Start up the app in development and production environments, and hit http://localhost:3000/sample

Verify that registry values can be overridden at test time

sample_controller_test.rb

  def test__can_redefine_registry_value
    REGISTRY[:key] = 'overridden_value'
    get :index
    assert_equal 'overridden_value', assigns['registry_value']
  end
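One thing to watch with a test like this: the override stays in REGISTRY after the test finishes, so later tests in the same process would see 'overridden_value' too. A possible way to contain it is a small helper that restores the hash afterwards. The sketch below is plain Ruby with a stand-in REGISTRY so it runs on its own; the helper name with_registry is hypothetical, and in a real app the constant would be the one declared in boot.rb.

```ruby
# Stand-in for the REGISTRY constant declared in boot.rb / test.rb.
REGISTRY = { :key => "test_value" }

# Hypothetical helper: apply overrides for the duration of the block,
# then put the registry back the way it was, even if the block raises.
def with_registry(overrides)
  saved = REGISTRY.dup
  REGISTRY.merge!(overrides)
  yield
ensure
  REGISTRY.replace(saved)
end

with_registry(:key => 'overridden_value') do
  puts REGISTRY[:key]   # overridden_value
end
puts REGISTRY[:key]     # test_value
```

Saving and restoring the hash in the test case's setup and teardown methods would be another way to get the same effect.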

Summary

I think this is a pretty good approach, and it feels a lot like testing in an app that uses a Dependency Injection/Registry architecture (in other words, simple to override anything you want). I'd be interested to hear if there are any situations that could not use this approach, and would have to fall back to defining constants in the environment files.

It would also be interesting to hear if anyone has had success integrating a Rails application with a Dependency Injection approach (using Needle or a home-grown solution).

-- Chad

Subversion gotcha - deleted folders not shown in diff by default

Posted by Chad Woolley on Tuesday March 20, 2007 at 06:47PM

Say you have two tags you want to diff, and one has a deleted directory. If you do an 'svn diff', you won't see the deleted directory UNLESS you give the '--summarize' option:

svn diff --summarize http://host/project/tags/old_version http://host/project/tags/new_version