Pivotal Labs

How You Can Learn to Stop Worrying and Love Continuous Integration

Posted by Chad Woolley on Tuesday August 05, 2008 at 02:23AM

I just had a discussion with a co-Pivot about the resentment that many teams develop about Continuous Integration - especially when the release process requires a green tag from CI, and a broken build is standing in the way.

Slim Pickens, Dr. Strangelove

As anyone who has worked with me will attest, I'm hardcore on CI and consider any team which leaves a build red for longer than a workday to be sorely lacking in discipline.

OK, OK, there are always extenuating circumstances, but I still believe that most resentment of CI stems from underlying antipatterns and smells, rather than problems with CI itself. For example:

  • "The Customer Has To See This Feature RIGHT NOW": Frequent releases are a great thing, but if you cannot wait for a green build to deploy, you have some deeper problems. Often, this is because a team doesn't manage customer expectations well. The customer should understand that CI is a critical part of the Agile process which ensures that only reliable, quality releases get pushed to staging or production. Any problem which is preventing a green build should be fixed before the release is deployed. If the customer is not willing to allow you that time and flexibility, perhaps they are too addicted to new features, and the entire team needs to have a heart-to-heart about Code Debt in the next retrospective.
  • "It Works For Me, But Fails On CI": The important question is which environment is more like production - your development environment or CI? If you are developing on Windows or Mac, and your production box is some flavor of *nix, then your CI box should be as close as possible to production. Ideally, you should be able to log on to the CI instance and debug the failing test there. Usually, the culprit is that your CI box is not configured correctly. If it is hard to keep your CI environment in sync with production, then perhaps you should look into automation (because you KNOW you or your sysadmin will probably forget to do the same thing when you push to production, right?). If the problem is that your development environment is not the same as production, and it is a legitimate problem, then CI just saved you some stress on the next deploy.
  • "Intermittent" Failures: Same deal as the prior point. CI runs your tests much more than you do. For web apps, it hopefully runs them in more browsers than you do. In my experience, many "intermittent" bugs are real bugs which are just very hard to isolate. It could be an AJAX bug that only happens when the site is run remotely, not via localhost. It could be a performance problem which only shows up on a slower system, not your fastest-on-the-market dev box. It could be a dependency on an external resource that happens to be unavailable sometimes, such as a web service, remote storage, etc. Again, just being aware of these issues puts you ahead of the game. For browser bugs, dig in and find out WHY it is failing intermittently. It may be a real bug. For intermittent outages of external resources, you may just have to live with it, but you don't have to live with the intermittent failures in CI. Mock out the resource or disable the tests in the CI environment. Yes, this is OK, especially if you leave them enabled in your development environment. Another option is to automatically repeat these tests a few times with a delay, and only fail the entire build if they fail repeatedly. Big services like Amazon or Google might drop a request occasionally, but still respond to a subsequent request.
  • Slow Test Suites: This is an insidious problem, because once your suite is slow, it is often a monumental effort to make it fast again. It is much better to be proactive, and monitor any slow-running tests like a hawk, relentlessly mocking out slow resources or replacing broad functional tests with faster, more targeted unit tests. You can also always split your tests into different suites, running your fastest tests continuously, and the entire slow deploy-test suite only nightly or periodically. As long as your customer isn't addicted to immediate features, it should be fine to only deploy from nightly builds.
  • The Failing Test That "Doesn't Matter": This is my pet peeve. Whenever I break CI, I fix it ASAP. If I ignore a "minor" broken test, the next thing I check in may be a major FUBAR which gets past my local tests for some reason (see prior points). Some who know me might even say it is LIKELY to be a major FUBAR. The point is, I don't trust myself or my local box, I trust CI. Now, if ANOTHER developer breaks the build, and tries to tell me they are not going to fix it because it is a "minor" problem, that really chaps my hide. They are ripping huge holes in my nice safety net, forcing me to expend much more time and attention on the tests that I run on my local environment, and causing me more stress and work in general. Stop making excuses, and fix the damn build NOW, or comment out the failing test.
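The "repeat these tests a few times with a delay" idea from the intermittent-failures point can be sketched in a few lines of Ruby. This is only a rough illustration, not something from a real CI tool: the helper name with_retries, its attempts and delay parameters, and the simulated flaky service are all made up for the example.

```ruby
# Hypothetical helper (not part of any library): run a block up to `attempts`
# times, sleeping `delay` seconds between tries, and re-raise the error only
# if every attempt fails.
def with_retries(attempts: 3, delay: 1)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError
    raise if tries >= attempts # out of attempts: let the failure break the build
    sleep delay
    retry
  end
end

# Simulated flaky external resource: fails on the first two calls, then works.
calls = 0
result = with_retries(attempts: 3, delay: 0) do
  calls += 1
  raise IOError, "service unavailable" if calls < 3
  :ok
end
```

A test that talks to a remote service could wrap just that call in with_retries, so a single dropped request does not turn the whole build red, while a service that stays down still fails the build.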

Now, I'm sure that all of the above points can be debated or shown to be inapplicable in a specific situation. Plus, if you are dealing with imperfect CI and development tools (which is always the case), you will have some degree of pain which is directly attributable to CI. It would be great to hear about some of these situations in the comments.

Bottom Line: Integration is always one of the most painful parts of software development. Doing integration with high quality and low risk is even harder. Most developers who have been on a non-Agile project of any significant size have experienced days-long integration hell and ulcer-inducing all-night production deployments. Continuous Integration doesn't make that pain and stress go away, but it does break it down into small, bite-sized pieces that can be easily handled on a daily basis. All for the low, low cost of being proactive and disciplined, which makes you a better developer anyway.

Best practices for managing external plugins/libraries which live in Git?

Posted by Chad Woolley on Tuesday July 29, 2008 at 08:34PM

This question came up at standup. I've put some thought into it, so I thought I'd throw it out for discussion and see what other people think.

At Pivotal, we use svn:externals and the Third-Party Branch Pattern (from the SCM Patterns Book) to manage easy, reliable cross-project updates of common plugins which live elsewhere (rubyforge, etc), without having to manually update in every project or be at the mercy of the RubyForge svn repo going down or being slow. When everything was on SVN, this worked well; we had some rake tasks which made it easy to update the branch from the latest trunk version in the external vendor repository. However, more stuff like Desert is now moving to GitHub.

One option is to just check the entire Git repo into subversion. This is labor-intensive, though. It also loses some of the benefits of the Third-Party branch pattern, such as being able to easily preserve local patches while still pulling in changes from a new vendor version. We could probably update our Rake tasks to handle Git as well as SVN.

Another option is to switch all our projects to Git, but we aren't quite ready for that across the board.

Plus, this is a problem even if you move your parent project to Git. For example, there is no easy Git equivalent to having a svn:external which automatically updates to the latest version of an external repository, which is a useful approach for Continuous Integration or automatically pulling the latest Edge Rails into your project on every update.

Here are some links from people working on related issues, but please comment if you have any ideas or something that works well for you:

The Law of Demeter is a Piffle

Posted by Nick Kallen on Wednesday May 07, 2008 at 09:15PM

One of The Blabs' most controversial articles was Lovely Demeter, Meter Maid, in which Pivotal and Thoughtworks battle over which Agile consultancy has the better understanding of the Law of Demeter, and which has better hair and music taste (seriously).

I have never found this "law" very persuasive.

  • The bizarre, culturally loaded analogy about a paperboy and a wallet says nothing insightful about encapsulation boundaries as they arise in real software systems.
  • The blogosphere's endless scholastic hermeneutics of the law's 4 allowances for message sending is a masturbatory philosophical enterprise with nothing relevant to real-world software.
  • The Mockist's insistence on easy mockability is of dubious merit--build better mocking frameworks!
  • And the few practical, real merits that arise from following the Law of Demeter are better arrived at using other techniques, such as "Tell, Don't Ask".

Here are two examples, one where I violate the Law of Demeter, and another where I don't.

I wrote this code recently, in flagrant violation of the Law:

cookies[:store_id] = @login.store.id

Suppose @login is not an ActiveRecord object, so it does not automatically have a #store_id method. Should I create a delegator for this?

class Login
  def store_id
    store.id
  end
end

This is pretty silly. The store_id is not an attribute of the login; rather it's an attribute of the store, and the store is an attribute of the login. The delegator is needless code cruft to replace a dot with an underscore; it smells of the endless boilerplate Java code of my youth. Demeter be damned.

On the other hand, here is a refactoring I did, incidentally complying with the law of Demeter:

Here is the original, Demeter-violating code:

def find_attribute_given_name(name)
  attributes.detect { |a| a.name_or_alias == name }
end

The call to == here is the violation of Demeter. I later replaced this with:

attributes.detect { |a| a.named?(name) }

The latter complies with the "law". And it's much better code. But was I led to this improvement by Demeter? No, I was led to it by a better understanding of the encapsulation boundaries of the object (#name_or_alias became private) and by a desire to have my code be more terse and clear. a.named?(name) is the most terse explanation of the intended computation that I can think of.
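To make the encapsulation point concrete, here is a minimal sketch of what the object on the other side of that call might look like. The post only names #named? and #name_or_alias; the Attribute class, its constructor, and its instance variables are assumptions made up for this illustration.

```ruby
# Hypothetical Attribute class. Only #named? and #name_or_alias come from the
# post; the class name, constructor, and instance variables are assumptions.
class Attribute
  def initialize(name, alias_name = nil)
    @name = name
    @alias_name = alias_name
  end

  # Tell, don't ask: the attribute itself decides whether it answers to a name.
  def named?(candidate)
    name_or_alias == candidate
  end

  private

  # Private, so callers can no longer reach in and compare it themselves.
  def name_or_alias
    @alias_name || @name
  end
end

attributes = [Attribute.new(:id), Attribute.new(:first_name, :given_name)]
found = attributes.detect { |a| a.named?(:given_name) }
```

With #name_or_alias private, the old a.name_or_alias == name form would raise a NoMethodError, so the terse version is the only one callers can write.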

Demeter be damned.

Ruby Pearls vol. 1 - The Splat

Posted by Nick Kallen on Wednesday April 23, 2008 at 05:03AM

Over the next week or so I'll be sharing Ruby idioms and flourishes that I quite like. Today I'll show a few tiny uses of splat! that make me tremble with delight.

Splat! - For Beginners

Splat! is the star (*) operator, typically used in Ruby for defining methods that take an unlimited number of arguments:

def sprintf(string, *args)
end

It can also be used to convert an array to the multiple-argument form when invoking a function:

some_ints = [1,2,3]
sprintf("%i %i %i", *some_ints)
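For anyone who wants to paste both uses into irb, here is a small self-contained sketch. The method join_all is a made-up example (defining your own sprintf would shadow the built-in one); format is Ruby's built-in equivalent of sprintf.

```ruby
# Hypothetical method: the splat collects any number of trailing arguments
# into the array `parts`.
def join_all(separator, *parts)
  parts.join(separator)
end

some_ints = [1, 2, 3]

explicit = join_all("-", 1, 2, 3)      # arguments passed one by one
expanded = join_all("-", *some_ints)   # array expanded into separate arguments
formatted = format("%i %i %i", *some_ints)
```

Both calls to join_all receive the same three arguments; the splat at the call site just unpacks the array first.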

Splat! - For Wizards

Array to Hash Conversion

The best use of splat! for invoking an infinite-arity function I've ever seen is the recipe for converting an array to a hash. Suppose you have an array of pairs:

array = [[key_1, value_1], [key_2, value_2], ... [key_n, value_n]]

You would like to produce from it the hash: {key_1 => value_1, ...}. You could inject down the array (everybody loves inject), but there is a better way:

Hash[*array.flatten]

Amazing, right? This relies on the fact that the Hash class implements the [] (brackets) operator and behaves thusly:

Hash[key_1, value_1, ...] # => { key_1 => value_1, ... }
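One thing to watch for with this recipe: flatten is recursive, so it only works as written when the values are not themselves arrays. The sketch below uses made-up sample data to show the recipe, the pitfall, and two alternatives (flatten(1) and passing the array of pairs straight to Hash[]), both of which assume a Ruby new enough to support them.

```ruby
# Sample data invented for the example.
pairs = [[:a, 1], [:b, 2]]
flat = Hash[*pairs.flatten]            # same as Hash[:a, 1, :b, 2]

# When the values are arrays, a full flatten unpacks them as well.
nested = [[:a, [1, 2]], [:b, [3, 4]]]
mangled = Hash[*nested.flatten]        # same as Hash[:a, 1, 2, :b, 3, 4]

# Flattening only one level keeps the values intact.
shallow = Hash[*nested.flatten(1)]

# Hash[] also accepts the array of pairs directly, with no splat at all.
direct = Hash[nested]
```

Here mangled ends up pairing 2 with :b, which is almost certainly not what you wanted, while shallow and direct both give { :a => [1, 2], :b => [3, 4] }.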

Heads or tails?

Splat! can be used for more than just method definition and invocation. My personal favorite use is destructuring assignment. I read this in Active Record's source code recently:

  def sanitize_sql_array(ary)
    statement, *values = ary
    ...
  end

This is invoked when you do something like User.find(:all, :conditions => ['first_name = ? and last_name = ?', 'nick', 'kallen']). Splat! is used here to get the head and tail of the conditions array. Of course, you could always use shift, but the functional style used here is quite beautiful. Consider another example:

first, second, *rest = ary
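Here is what those assignments actually produce, as a small sketch with sample arrays made up for the example. Unlike shift, destructuring leaves the original array untouched.

```ruby
conditions = ['first_name = ? and last_name = ?', 'nick', 'kallen']

# Head and tail, as in sanitize_sql_array.
statement, *values = conditions

# Any number of leading names can be peeled off before the splat.
first, second, *rest = [1, 2, 3, 4, 5]

# When nothing is left over, the splat variable is an empty array.
only, *nothing = [42]
```

After this runs, values is ['nick', 'kallen'], rest is [3, 4, 5], nothing is [], and conditions still has all three elements.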

One final trivium (#to_splat aka #to_ary)

You can actually customize the behavior of the splat operator. In Ruby 1.8, implement #to_ary; in 1.9 it's #to_splat. For example:

class Foo
  def to_ary
    [1,2,3]
  end
end

a, *b = Foo.new
a # => 1
b # => [2,3]

This also works for method invocation:

some_method(*Foo.new) == some_method(1,2,3)

When I first learned this at RubyConf I thought this was mind-blowing. I have since never used it.
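If you try this on a later Ruby, the hooks may not line up exactly as described. As far as I can tell, destructuring assignment still consults #to_ary, while a splat in a method call consults #to_a. The sketch below hedges by defining both; the alias and the collect helper are assumptions added for the example.

```ruby
class Foo
  def to_ary
    [1, 2, 3]
  end

  # Assumption: on later Rubies, *obj in a method call looks for #to_a,
  # so expose the same array under that name too.
  alias_method :to_a, :to_ary
end

# Hypothetical helper that just returns whatever arguments it received.
def collect(*args)
  args
end

a, *b = Foo.new              # destructuring assignment
collected = collect(*Foo.new) # splat at the call site
```

With both methods defined, a is 1, b is [2, 3], and collect receives three separate arguments.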

Ninja Patching jQuery

Posted by Nick Kallen on Tuesday April 01, 2008 at 11:23PM

Jonathan and I love jQuery's extended pseudo-selectors:

  • :input - Matches all input, textarea, select and button elements.
  • :text - Matches all input elements of type text.
  • :password - Matches all input elements of type password.
  • :hidden - Matches all elements that are hidden, or input elements of type "hidden".
  • :visible - Matches all elements that are visible.
  • and so on

These aren't actually part of the CSS spec, but they're incredibly useful and can be chained:

$(':input:visible') // => finds all visible inputs

We wanted to customize the behaviors of :text and :visible:

  • We wanted :text to return both <input type="text"> AND <textarea>
  • We wanted :visible to return elements that aren't directly display:none or visibility:hidden, nor are their parents display:none or visibility:hidden

So, we decided to customize this behavior:

jQuery.extend(jQuery.expr[":"], {
  text    : "(a.tagName=='INPUT' && a.type=='text') || (a.tagName=='TEXTAREA')",
  visible : '"hidden"!=a.type && jQuery.css(a,"display")!="none" && jQuery.css(a,"visibility")!="hidden" && (jQuery(a).parent(":hidden").size() == 0)',
  hidden  : 'document != a && ("hidden"==a.type || jQuery.css(a,"display")=="none" || jQuery.css(a,"visibility")=="hidden" || (jQuery(a).parent(":hidden").size() > 0))'
});

So how would you like to ninja-patch jQuery's custom pseudo-selectors?

Evan Phoenix at Mountain West Ruby Conf

Posted by Chad Woolley on Monday March 31, 2008 at 05:20AM

I just attended (and thoroughly enjoyed) the Mountain West Ruby Conf, where Evan Phoenix gave a powerful keynote speech.

He talks about the status of Rubinius, and makes some profound observations on modern open source culture and community.

Here are some highlights, but if you participate in open source, and especially if you help run an open source project, I highly recommend that you watch the video:

  • Community: Rubinius' Giant Spec Suite, and its value not only as a living language specification for different implementations of Ruby itself, but as a "gateway drug" which provides a low barrier to get new contributors addicted to open source.
  • Trust: Asking for forgiveness vs. permission, and the Rubinius commit policy, where any accepted patch gets you commit rights. You can always roll back a change, and debate is healthy.
  • Worth: The impact of annoying fifteen year olds who make a lot of noise versus "core" contributors.
  • Ego: You are not the project, Mr. Ego! The importance of being wrong and admitting your faults in public.
  • Innovation: Fostering innovation and debate vs closely holding the mythical "Keys to the Castle".

ruby-debug in 30 seconds (we don't need no stinkin' GUI!)

Posted by Chad Woolley on Tuesday January 08, 2008 at 08:30AM

Alfonso Bedoya, Treasure of the Sierra Madre

Many people (including me) have complained about the lack of a good GUI debugger for Ruby. Now that some are finally getting usable, I've found I actually prefer IRB-style ruby-debug to a GUI.

There are good tutorial links on the ruby-debug homepage, and a very good Cheat sheet, but I wanted to give a bare-bones HOWTO to help you get immediately productive with ruby-debug.

Install the latest gem

$ gem install ruby-debug

Install the cheatsheet

$ gem install cheat
$ cheat rdebug

Set autolist, autoeval, and autoreload as defaults

$ vi ~/.rdebugrc
set autolist
set autoeval
set autoreload

Run Rails (or other app) via rdebug

$ rdebug script/server

Breakpoint from rdebug

(rdb:1) b app/controllers/my_controller.rb:10

Breakpoint in source

require 'ruby-debug'
debugger
my_buggy_method('foo')

Catchpoint

(rdb:1) cat RuntimeError

Continue to breakpoint

(rdb:1) c

Next Line (Step Over)

(rdb:1) n

Step Into

(rdb:1) s

Continue

(rdb:1) c

Where (Display Frame / Call Stack)

(rdb:1) where

List current line

(rdb:1) l=

Evaluate any var or expression

(rdb:1) myvar.class

Modify a var

(rdb:1) @myvar = 'foo'

Help

(rdb:1) h

There are many other commands, but these are the basics you need to poke around. Check the Cheat sheet for details.

This can also be used directly from any IDE that supports input into a running console (such as IntelliJ IDEA).

That should get you started. So, before you stick in another 'p' to debug, try out ruby-debug instead!

The Power of Versions (Monkey Patches Targeted with Friggin Laser Beams!)

Posted by Chad Woolley on Friday January 04, 2008 at 05:32PM

We all love to Monkey Patch Rails and other Ruby apps. However, we sometimes want to target these patches to the specific versions where they are needed.

Here's the easiest way to do this, via RubyGems' built-in version requirement support. The version 0.11.0 should indeed be greater than version 0.9.0:

irb(main):001:0> require 'rubygems'
=> true
irb(main):002:0> Gem::Version::Requirement.new(['> 0.9.0']).satisfied_by?(Gem::Version.new('0.11.0'))
=> true

Notice that you can't do this with string comparison, because with a per-character comparison, 1 is not greater than 9:

irb(main):001:0> '0.11.0' > '0.9.0'
=> false
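The same comparison can be written as a plain script. This sketch makes one substitution you should know about: it uses Gem::Requirement rather than the Gem::Version::Requirement name from the irb session above, on the assumption that the shorter constant is the one available in your RubyGems. It also shows that Gem::Version objects can be compared directly.

```ruby
require 'rubygems'

newer = Gem::Version.new('0.11.0')
older = Gem::Version.new('0.9.0')

# Gem::Version compares segment by segment as numbers, so 11 beats 9.
version_is_newer = newer > older

# Assumption: Gem::Requirement stands in for Gem::Version::Requirement here.
requirement = Gem::Requirement.new('> 0.9.0')
satisfied = requirement.satisfied_by?(newer)

# The plain string comparison, for contrast.
string_is_newer = '0.11.0' > '0.9.0'
```

The requirement form earns its keep once you need operators like '< 2.0' or '~> 1.2' rather than a single greater-than check.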

Here's a little class which puts some helper and example methods around this approach (these methods are all in real use for some of our multipart mailer hacks):

module Pivotal
  class VersionChecker
    def self.current_rails_version_matches?(version_requirement)
      version_matches?(Rails::VERSION::STRING, version_requirement)
    end

    def self.version_matches?(version, version_requirement)
      Gem::Version::Requirement.new([version_requirement]).satisfied_by?(Gem::Version.new(version))
    end

    def self.rails_version_is_below_2?
      Pivotal::VersionChecker.current_rails_version_matches?('<1.99.0')
    end

    def self.rails_version_is_below_rc2?
      Pivotal::VersionChecker.current_rails_version_matches?('<1.99.1')
    end

    def self.rails_version_is_1991?
      Pivotal::VersionChecker.current_rails_version_matches?('=1.99.1')
    end
  end
end

Here's an example of how you'd use this:

if Pivotal::VersionChecker.rails_version_is_below_2?
  # do some backward compatibility stuff
  # or handle bugs that have been fixed in Rails > 2
end
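To see the whole pattern run outside of Rails, here is a self-contained sketch of a version-targeted monkey patch. Everything in it is invented for illustration: FakeLib stands in for whatever library you are patching, its VERSION and greeting method are made up, and version_matches? is a standalone helper that uses Gem::Requirement, which I'm assuming is the constant your RubyGems provides.

```ruby
require 'rubygems'

# Hypothetical stand-in for a third-party library with a known bug.
module FakeLib
  VERSION = '1.2.6'

  def self.greeting
    'helo' # pretend this typo was fixed upstream in 2.0
  end
end

# Standalone helper for this sketch (assumes Gem::Requirement is available).
def version_matches?(version, requirement)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# Apply the patch only to the versions that actually need it.
if version_matches?(FakeLib::VERSION, '< 2.0')
  module FakeLib
    def self.greeting
      'hello'
    end
  end
end
```

Once the library is upgraded past 2.0, the condition goes false and the patch quietly stops applying, which is the laser-beam part.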

Note that this is only possible now that Rails has started using a more sensible strategy for versioning edge gems and improved support for using advanced versioning with RAILS_GEM_VERSION.

For many projects, this may be overkill. It is useful at Pivotal, though, where various projects may be on different Rails versions, but still want to use the latest common core libraries (and monkey patches) without having to upgrade Rails for their app.

This isn't only useful for monkey patching. It can be handy for any library that wants to be backward- or forward-compatible with its dependencies. I've used this approach at Pivotal and on my personal projects to have Continuous Integration automatically run my tests against multiple dependency versions, without having to change anything other than the CI project name:

'GemInstaller Continuous Integration automatically running against multiple versions of RubyGems'

There are numerous other related topics for discussion in this area, such as the power of versions or the wisdom of freezing, but I'll save those for future posts. Even if you do freeze the trunk of Rails/plugins/gems, since the version is included in the source, this approach should work barring any conflicts with trunk changes since the last release.

Happy Versioning!

bookmark_fu: drop-in Iconistan

Posted by Joe Moore on Saturday December 22, 2007 at 12:50AM

We just implemented bookmark_fu on Pivots and the experience was very smooth, taking only a few minutes. We now have an "Iconistan" of social bookmarking chiclets for either remembering or promoting content on Digg, reddit, del.icio.us -- almost 20 sites in all.



Install via the normal plugin install process (the -x installs it as an svn:external):

#> ruby script/plugin install -x svn://rubyforge.org/var/svn/pivotalrb/bookmark_fu/trunk/bookmark_fu

I did have one issue -- the script/plugin install script pulled all the code down but ultimately failed because we have multiple versions of Rails on our development machine (about 5); this seemed to confuse the install script. No problem, though: I ran the install.rb script manually:

#> script/runner vendor/plugins/bookmark_fu/install.rb

Thoughts on Linus Torvalds's Git Talk

Posted by Joe Moore on Wednesday December 12, 2007 at 08:00PM

At Pivotal Labs last week we watched Linus Torvalds's Google talk about Git, the Source Code Management (SCM) system he wrote and uses to manage the Linux kernel code.

I've watched it twice now and here are some thoughts, based on quotes and themes from the video.

"I Never Care About Just One File"

Linus stated that one of the reasons Git was wonderful for him is that, as a high level code maintainer, he needs to merge thousands of files at once. In fact, he stated that he never cares about just one file.

Not so for me. As an in-the-trenches developer, my whole life is caring about just one file, over and over again. When I merge, I care about each file because, since I work on small teams and with small codebases, there is a fairly high likelihood that my changes will collide with those from another developer.

"The Repository Must Be Decentralized.... You Must Have a Network of Trust"

Linus made the point that central repositories suck for large projects where the morons must not have commit access -- only the super privileged are allowed to commit code back to the repo. He argues that Git is better because it is a decentralized network of repositories -- there is no central master, only Some Dudes who have repositories. Usually there is Some Dude In Charge, like Linus, and everyone tends to pull code from them. To update the "master" code version, Some Dude In Charge pulls code from the repositories owned by Some Other Wicked Smart Dudes, who have most likely pulled code from Some Other Trusted Dudes (And One Gal), and so on. Thus, rather than limit access to just the hand-selected few, everyone has their own local copy of the repository, and the smart merge from the smart who merge from the smart, resulting in some kind of official or de facto version.

While I like the local copy of the repo idea, Pivotal does not work the way Linus describes... but Pivotal is weird, in a good way. We all have full commit rights. Our network of trust is everyone. The Dude In Charge is named Continuous Integration. CI makes the official versions. CI runs the tests. CI makes sure that the deploy process works. I'm sure that we could coerce Git into working in a centralized-like way, where it merges automatically from the individual developers and runs the builds, but I'm not sure if that would be forcing a square peg into a penguin-shaped hole.

"Some Companies Use Git And Don't Even Know It"

Linus described how developers at some companies use Git on their development machines, committing their changes and merging fellow developers' changes with Git, then pushing those changes to central SVN repos. He rather mocked this, but it actually sounds like a good solution: developers merge, so use the tool that's good at that. CI machines and deploy machines love centralized master repositories, so use that for those jobs.

"It Does Not Matter How Easy It Is To Branch, Only How Easy It Is to Merge"

Well said. I never thought about that before but he is completely right. I could never put my finger on why I never branch in SVN, even though it's practically 'free' and easy to do. Now it's obvious: who cares how easy it is to branch when merging sucks? Git is supposed to make merging incredibly easy because Git is content-aware rather than just file-aware... or something like that. I'll believe it when I see it, but if Git really does make merging highly divergent branches easy then I'll give it a try.

Joe's Take

I'd like to try Git, especially if it makes branching and merging those branches as easy as Linus suggests, but I don't think that Pivotal would get as much benefit out of it as large, distributed open source projects. A 'really big' project might have 10 developers, not thousands, and all must have commit rights. Our network of trust goes like this: if you are here, we trust you; if we don't trust you, you have to leave. And the idea of having to merge directly from my fellow developers sounds like a pain in the ass... why would I want to merge from 3 separate pairs when I can pull code from the central repo and be reasonably sure (thanks to CI) that it is clean and green? Hopefully I'll be able to answer those questions soon by using Git on a project.

(Note: originally posted on my personal blog at http://40withegg.com/2007/12/11/thoughts-on-linus-torvalds-s-git-talk)
