Collapsing Migrations
(6:30 pm: updated to use mysqldump) (12/14/07: updated to remove db:reset since the Rails 2.0 version now does something different.) (12/15/07: updated to not set ENV['RAILS_ENV'] since that gets passed down to child processes)
There was an old hacker who lived in a shoe; she had so many migrations she didn't know what to do. Every time her build ran clean, she spent a whole minute staring at the screen.
Fortunately, she read this blog post and now her db:setup task is so fast she's started building multiple test environments so she can run tests in parallel!
Figure out what migration to collapse to. This number should be less than or equal to the oldest deployed version of your app. E.g. if most of your deployments are on version 348 but there's one client running a branch that's only up to version 298, then pick 298 (or 297 if you're afraid of off-by-one errors). For this example we will use 100.
Install
lib/tasks/db.rakeandlib/db_tasks.rb(source below)Clear the development database by running
rake db:clear
Dump the development structure by running
rake db:dump
Delete all the migrations up to and including your target version. Here's a sneaky awk script that deletes everything up to and including 100. (Go ahead and run it, it won't bite, and you can always revert.)
ls db/migrate/ | awk '{split($0, a, "_"); if(a[1]<=100) print $0}' | xargs svn rm
Create a new migration called "100_collapsed_migrations.rb" using the following template.
100_collapsed_migrations.rb:
class CollapsedMigrations < ActiveRecord::Migration
def self.up
sql = <<-SQL
# development_structure.sql goes here
SQL
execute("SET FOREIGN_KEY_CHECKS=0")
sql.split(";").each do |statement|
execute(statement)
end
ensure
execute("SET FOREIGN_KEY_CHECKS=1")
end
def self.down
raise IrreversibleMigration
end
end
Open up db/development_dump.sql and copy its entire contents into your clipboard, then paste it above the "SQL" line in your new migration 100.
Search for the statement that creates the schema_info table and remove it.
Mine looks like this:
CREATE TABLE `schema_info` (
`version` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Set up your databases and run your tests.
rake db:setup test
Congratulations! Your migrations are now blazingly fast, just like back in the (scaff)old days. You can run "
rake db:setup" any time you get a svn update that looks like it may have done something funky to your schema, rather than shying away from that minute-long migration and just hoping your tests still pass.
Why do we need to use db:dump rather than db:schema:dump? Well, unfortunately, db:schema:dump doesn't dump everything. It misses CONSTRAINT statements and also seems to get the charset wrong (although that may have been a function of how I constructed the db in my test). And db:structure:dump misses any data that may have been added by your migrations.
Here's my current db.rake. Unfortunately, it only works with MySQL, but if you want to make it support your favorite DB (or even your least favorite) then please go right ahead.
Oh, and that part about multiple test environments and parallellized tests? Stay tuned... :-)
db.rake:
require "db_tasks"
namespace :db do
def tasks
(@db_tasks ||= DbTasks.new(self))
end
desc "Drop and recreate database"
task :clear => :environment do
tasks.clear
end
desc "Clear and migrate dev and test databases, and load fixtures into development db"
task :setup => :environment do
tasks.setup
end
desc "Dump the current environment's database schema and data to, e.g., db/development_dump.sql (optional param: FILE=foo.sql)"
task :dump => :environment do
if ENV['FILE']
tasks.dump ENV['FILE']
else
tasks.dump
end
end
desc "Load an sql file (by default db/development_dump.sql). (Optional param: FILE=foo.sql)"
task :load => :environment do
if ENV['FILE']
tasks.load ENV['FILE']
else
tasks.load
end
end
end
db_tasks.rb:
# This creates a duplicate of the database config for a db config as defined in database.yml.
# For example, if the "test" database is named "myapp_test",
# for clone number 0, the new environment is named "test0", and the database is "myapp_test0".
# All other settings are preserved (esp. username and password).
module ActiveRecord
class Base
def self.clone_config(original_config, worker_number)
original = configurations[original_config.to_s]
raise "Could not find conguration '#{original_config}' to clone" if original.nil?
worker_config = original.dup
worker_config["database"] += worker_number.to_s
configurations["#{original_config}#{worker_number}"] = worker_config
end
end
end
class DbTasks
def initialize(rake)
@rake = rake
end
def init
connect_to('development')
clear_database
migrate_database
dump
test_environments.each do |test_db|
if test_db =~ /([0-9]+)$/
clone_test_config($1.to_i)
end
connect_to(test_db)
clear_database
load
end
end
# db:clear -> drop and create db for RAILS_ENV
def clear
clear_database
end
# db:setup -> drop, create, and migrate dbs for test and development environments, and import fixtures into development
def setup
init
connect_to 'development'
load_fixtures
end
def dump(file = "#{RAILS_ROOT}/db/#{environment}_dump.sql")
puts "Dumping #{database} into #{file}"
system "mysqldump #{database} -u#{username} #{password_parameter} --default-character-set=utf8 > #{file}"
end
def load(sql_file = "#{RAILS_ROOT}/db/development_dump.sql")
puts "Loading #{sql_file} into #{database}"
query('SET foreign_key_checks = 0')
sql_file = File.expand_path(sql_file)
IO.readlines(sql_file).join.split(";").each do |statement|
query(statement.strip) unless statement.strip == ""
end
query('SET foreign_key_checks = 1')
end
protected
def clone_test_config(worker_num)
ActiveRecord::Base.clone_config("test", worker_num)
end
def connect_to(environment)
ActiveRecord::Base.establish_connection(environment)
@environment = environment
Object.const_set(:RAILS_ENV, environment)
# Note: don't set ENV['RAILS_ENV'] since that gets passed down to invoked tasks (including 'rake test')
end
def environment
(@environment ||= RAILS_ENV)
end
def test_environments
environments = ['test']
if Object.const_defined?(:TEST_WORKERS)
TEST_WORKERS.times do |worker_num|
environments << "test#{worker_num}"
end
end
environments
end
def load_fixtures
puts "Loading fixtures into #{environment}"
Rake::Task["db:fixtures:load"].invoke
end
def clear_database
puts "Clearing #{environment} database"
sql = "drop database if exists #{database}; create database #{database} character set utf8;"
cmd = Q|mysql -u#{username} #{password_parameter} -e "#{sql}"|
# puts "executing #{cmd.inspect}"
system(cmd)
end
def migrate_database
puts "Migrating #{environment} database"
ActiveRecord::Migration.verbose = false
Rake::Task["db:migrate"].invoke
end
def config(env = environment)
ActiveRecord::Base.configurations[env]
end
def query(sql)
ActiveRecord::Base.connection.execute(sql)
end
def database
config["database"]
end
def username
config["username"]
end
def password
config["password"]
end
def password_parameter
if password.nil? || password.empty?
""
else
"-p#{password}"
end
end
def execute(cmd)
puts "\t#{cmd}"
unless system(cmd)
puts "\tFailed with status #{$?.exitstatus}"
end
end
def system(cmd)
@rake.send(:system, cmd)
end
end








A lot of people use migrations to seed data as well as structure. Regardless of whether this is a good practice, can you tell us whether the structure dump includes data, or just schema?
remove
Sweet, I was just thinking about collapsing our hefty migrations at work. They're getting to be a bit of a pain with lots of seed data migrations for really outdated schemas. This is a good excuse to do it and then fixturize our dev seed data.
remove
The structure dump does not include data, which makes a certain amount of sense because loading fixtures wipes out all data regardless of whether it was added by migrations or later. But it is a flaw with the current code, since some tables may not be fixturized, and I'm working right now on making it use mysqldump rather than db:structure:dump.
Too bad, I was hoping to use the built-in Rails rake tasks as much as possible to make it more shareable (and easier to port to other db engines).
remove
There have been some recent threads on the rails-core list about this topic (db setup and seed data).
We've been using db:setup -- which drops, creates, and migrates test and dev dbs, and loads fixtures into dev -- for about 1-1/2 years now and it works well for us. Some of our use cases are:
Basically, there are so many little things involved with database setup that it's nice to have a single, simple command you can run to make sure you're up to date. "Did you run db:setup?" becomes a common refrain when someone's complaining of a broken build.
Also, we use (abuse?) migrations by doing two things that are apparently not done by the rails core team: seed data (mostly for things like enumerations, but sometimes things like the admin user or the default data set), and extra SQL like referential integrity constraints or fancy indexes (which aren't supported by the db:schema output).
remove
More thread: http://groups.google.com/group/rubyonrails-core/browse_thread/thread/fb324a9f722cdf7e?hl=en
remove
The new Rails task "db:migrate:reset" does almost what db:setup does. It drops, creates and migrates -- but it only does it for one database (by default, development). db:setup does it for test and development, which means it can serve the purpose of "make it work" after a questionable update or during a continuous build.
remove