Unnecessary downtime is bad. To allow for small, frequent deploys, we need to avoid downtime, no matter how short. That means we can’t restart all of the mongrels on all of our app servers at the same time. What we want is a rolling restart: the ability to restart the mongrels one app server at a time.

Capistrano assumes that you want to run a given task in parallel on all servers within a role. For a rolling restart, we want to serialize the restart task instead. Here is a way to run a task serially (across your servers) within cap:

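# Runs the given block as a cap task against one server at a time.
def serialize_task_for(task_name, servers)
  servers.each do |server|
    puts "    Performing #{task_name} for #{server} at #{Time.now}..."
    # (re)define the task so that it targets only the current server
    task(task_name, :hosts => server) do
      yield server
    end
    # invoke the task that was just defined, by name
    eval(task_name)
  end
end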

Now that we have this, we can do a rolling restart to bounce our mongrels across our app servers:

namespace :deploy do
  desc "Rolling restart. Restart the mongrels, one app server at a time."
  task :rolling_restart, :roles => :rolling_restart do
    servers = find_servers(:roles => :app)
    serialize_task_for('rolling_restart_for_a_single_server', servers) do |server|
      run("sudo monit restart all -g lumoslabs")
      # wait longer than it normally takes a mongrel to start up
      sleep(70)
      # close the open SSH sessions; the next run opens fresh ones
      teardown_connections_to(sessions.keys)
      done = false
      # poll monit until enough mongrels report as running;
      # num_mongrel_instances is not defined here and must be set elsewhere
      until done
        run("sudo monit summary | grep mongrel | awk '{print $3}' | grep running | wc -l") do |channel, stream, data|
          done = data.split.first.to_i >= num_mongrel_instances
          sleep(10) if !done
        end
        teardown_connections_to(sessions.keys)
      end
    end
  end
end
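
The readiness check above is packed into a shell pipeline and an open-ended loop, so here is a minimal plain-Ruby sketch of the same idea that can be tried outside of cap. It assumes `monit summary` prints one line per process in the form `Process 'name' status`. The sample text, the helper names `running_mongrel_count` and `wait_until`, and the attempt limit are all hypothetical and not part of the recipe above.

```ruby
# Hypothetical sample of `monit summary` output; the real format may
# differ between monit versions.
SAMPLE_SUMMARY = <<~EOS
  Process 'mongrel_8000'   running
  Process 'mongrel_8001'   running
  Process 'mongrel_8002'   not monitored
  Process 'nginx'          running
EOS

# Mirrors: grep mongrel | awk '{print $3}' | grep running | wc -l
def running_mongrel_count(summary)
  summary.each_line
         .select { |line| line.include?("mongrel") }          # grep mongrel
         .map { |line| line.split[2] }                        # awk '{print $3}'
         .count { |status| status.to_s.include?("running") }  # grep running | wc -l
end

# Polls the block until it returns true, giving up after max_attempts
# tries. The sleeper argument exists only so the wait can be stubbed out.
def wait_until(max_attempts:, interval:, sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do |attempt|
    return true if yield
    sleeper.call(interval) if attempt < max_attempts - 1
  end
  false
end
```

Unlike the loop in the recipe, `wait_until` returns false once its attempts are used up, so a mongrel that never comes back does not stall the deploy forever. Whether to abort or carry on at that point is app specific.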

And if we take this to its logical conclusion, we can now do an uninterrupted deploy to our website:

namespace :deploy do
  desc "Update the site without taking it offline. Does not run migrations."
  task :uninterrupted do
    transaction do
      update_code
      web.compile_stylesheets
      symlink
    end
    rolling_restart
  end
end

When a rolling restart or uninterrupted deploy is appropriate depends on the app and needs to be thought through. How to do an uninterrupted deploy with migrations is a future topic.