No-downtime deploy with capistrano, Thin and nginx
As I mentioned a couple of weeks ago, I've been working on a tutorial about thinking through problems in graphs, and since it's a Sinatra application I thought thin would be a decent choice of web server.
In my initial setup I had the following nginx config file which was used to proxy requests on to thin:
/etc/nginx/sites-available/thinkingingraphs.conf
upstream thin {
  server 127.0.0.1:3000;
}
server {
  listen 80 default;
  server_name _;
  charset utf-8;
  rewrite ^\/status(.*)$ $1 last;
  gzip on;
  gzip_disable "MSIE [1-6]\.(?!.*SV1)";
  gzip_types text/plain application/xml text/xml text/css application/x-javascript application/xml+rss text/javascript application/json;
  gzip_vary on;
  access_log /var/www/thinkingingraphs/shared/log/nginx_access.log;
  error_log /var/www/thinkingingraphs/shared/log/nginx_error.log;
  root /var/www/thinkingingraphs/current/public;
  location / {
    proxy_pass http://thin;
  }
  error_page 404 /404.html;
  error_page 500 502 503 504 /500.html;
}
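For that config to take effect it needs to be enabled and nginx reloaded. Something like this should do it, assuming a Debian/Ubuntu-style nginx install (which is where the sites-available layout comes from):
$ sudo ln -s /etc/nginx/sites-available/thinkingingraphs.conf /etc/nginx/sites-enabled/thinkingingraphs.conf
$ sudo nginx -t
$ sudo service nginx reload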
I had an upstart script which started the thin server...
/etc/init/thinkingingraphs.conf
script
  export RACK_ENV=production
  export RUBY=ruby
  cd /var/www/thinkingingraphs/current
  exec su -s /bin/sh vagrant -c '$RUBY -S bundle exec thin -p 3000 start >> /var/www/thinkingingraphs/current/log/production.log 2>&1'
end script
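The job can then be started and checked with upstart's usual commands ('thinkingingraphs' being the job name defined by the file above), and the application log tailed to see what it's doing:
$ sudo start thinkingingraphs
$ sudo status thinkingingraphs
$ tail -f /var/www/thinkingingraphs/current/log/production.log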
... and then I used the following capistrano script to stop and start the server whenever I was deploying a new version of the application:
config/deploy.rb
namespace :deploy do
  task(:start) {}
  task(:stop) {}
  desc "Restart Application"
  task :restart do
    sudo "stop thinkingingraphs || echo 0"
    sudo "start thinkingingraphs"
  end
end
The problem with this approach is that some requests receive a 502 response code while the server is restarting:
$ bundle exec cap deploy
$ while true; do curl -w %{http_code}:%{time_total} http://localhost/ -o /dev/null -s; printf "\n"; sleep 0.5; done
200:0.076
200:0.074
200:0.095
502:0.003
200:0.696
I wanted to try to put together a no-downtime deploy script, and I came across a couple of posts which helped me work out how to do it.
The first step was to make sure that I had more than one thin instance running so that requests could be sent to one of the other ones while a restart was in progress.
I created the following config file:
/etc/thin/thinkingingraphs.yml
chdir: /var/www/thinkingingraphs/current
environment: production
address: 0.0.0.0
port: 3000
timeout: 30
log: log/thin.log
pid: tmp/pids/thin.pid
max_conns: 1024
max_persistent_conns: 100
require: []
wait: 30
servers: 3
daemonize: true
onebyone: true
We've set the number of servers to 3, which will spin up 3 instances on ports 3000, 3001 and 3002.
The other property we need to set is 'onebyone', which means that when we restart thin it will take the instances down one at a time, so the other two can keep handling incoming requests while each one restarts.
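To sanity-check that all three instances have come up we can hit each port directly, using the same curl approach as before (a quick loop like this should show a 200 from each):
$ for port in 3000 3001 3002; do curl -s -o /dev/null -w "$port:%{http_code}\n" http://127.0.0.1:$port/; done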
I changed my upstart script to look like this:
/etc/init/thinkingingraphs.conf
script
  export RACK_ENV=production
  export RUBY=ruby
  cd /var/www/thinkingingraphs/current
  exec su -s /bin/sh vagrant -c '$RUBY -S bundle exec thin -C /etc/thin/thinkingingraphs.yml start >> /var/www/thinkingingraphs/current/log/production.log 2>&1'
end script
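One thing to watch: since the upstart definition itself changed, the job has to be stopped and started once for the new thin configuration to be picked up - the same commands the old capistrano task was running:
$ sudo stop thinkingingraphs || echo 0
$ sudo start thinkingingraphs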
I also had to change the capistrano script to call 'thin restart' instead of stopping and starting the upstart script:
config/deploy.rb
namespace :deploy do
  task(:start) {}
  task(:stop) {}
  desc "Restart Application"
  task :restart do
    run "cd #{current_path} && bundle exec thin restart -C /etc/thin/thinkingingraphs.yml"
  end
end
Finally I had to make some changes to the nginx config file so that if a request to one thin instance failed (because it was being restarted) it would be passed on to one of the other instances, using the proxy_next_upstream directive:
/etc/nginx/sites-available/thinkingingraphs.conf
upstream thin {
  server 127.0.0.1:3000 max_fails=1 fail_timeout=15s;
  server 127.0.0.1:3001 max_fails=1 fail_timeout=15s;
  server 127.0.0.1:3002 max_fails=1 fail_timeout=15s;
}
server {
  listen 80 default;
  server_name _;
  charset utf-8;
  rewrite ^\/status(.*)$ $1 last;
  gzip on;
  gzip_disable "MSIE [1-6]\.(?!.*SV1)";
  gzip_types text/plain application/xml text/xml text/css application/x-javascript application/xml+rss text/javascript application/json;
  gzip_vary on;
  access_log /var/www/thinkingingraphs/shared/log/nginx_access.log;
  error_log /var/www/thinkingingraphs/shared/log/nginx_error.log;
  root /var/www/thinkingingraphs/current/public;
  location / {
    proxy_pass http://thin;
    proxy_next_upstream error timeout http_502 http_503;
  }
  error_page 404 /404.html;
  error_page 500 502 503 504 /500.html;
}
We've also changed the upstream definition to list all three thin instances, with max_fails and fail_timeout set so that an instance which stops responding (because it's being restarted) is temporarily taken out of rotation and requests go to one of the others.
When I deploy the application now there is no downtime:
$ bundle exec cap deploy
$ while true; do curl -w %{http_code}:%{time_total} http://localhost/ -o /dev/null -s; printf "\n"; sleep 0.5; done
200:0.094
200:0.095
200:0.082
200:0.102
200:0.080
200:0.081
The only problem is that upstart now seems to have lost its handle on the thin processes, and from what I can tell there isn't a master process that upstart could track, so I'm not sure how to wire this up.
Any ideas welcome!