There are many options for Ruby on Rails hosting, but Heroku is one that we recommend to clients and use internally for Ruby on Rails applications. Using Heroku? There are steps you can take to get more performance out of your Heroku application.
To optimize your application, you first need to measure your application’s performance. There are many ways to do this, including Heroku’s New Relic add-on. If you don’t worry about any other metric, you should at least be concerned with the average response time for your application.
You want this number to be as low as possible. You can monitor this number to see how changes to your code affect the responsiveness of your application. You can also use this number to figure out how many dynos your application will need for a given load.
First, we will need to know how many requests a dyno can serve in a second.
requests_per_second_per_dyno = 1000 / average_response_time_in_milliseconds
Knowing the number of requests per second that a dyno can handle will allow you to figure out how many dynos you will need in order to handle a certain level of traffic. Suppose you know from New Relic that your site gets about 20,300 requests per minute and your average response time is 243 ms.
Doing the math:
requests_per_second_per_dyno = 1000 / 243.0 # ~4.12 requests per second per dyno
requests_per_minute_per_dyno = 4.12 * 60 # 247.2 requests per minute per dyno
dynos_needed = 20_300 / 247.2 # 82.12... dynos
So if you want to handle 20,300 requests per minute, you’re going to need at least 83 dynos, because 82 would fall just short.
But let’s say you want to handle twice as many requests in a minute. You wouldn’t be able to solve this problem simply by adding more dynos, because Heroku currently limits you to 100 dynos for the web process. Instead, you have to reduce the average response time of your application. If you could cut this number down to 123 ms per request from 243 ms per request, you’d have roughly doubled your capacity without adding any more dynos.
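The arithmetic above can be wrapped in a few small Ruby helpers. This is only a sketch: the method names are made up for illustration, and it assumes each dyno serves one request at a time, as in the calculation above.

```ruby
# Requests a single dyno can serve per minute at a given average
# response time, assuming it handles one request at a time.
def requests_per_minute_per_dyno(avg_response_ms)
  (1000.0 / avg_response_ms) * 60
end

# Dynos needed for a target load, rounded up because you cannot
# run a fraction of a dyno.
def dynos_needed(requests_per_minute, avg_response_ms)
  (requests_per_minute / requests_per_minute_per_dyno(avg_response_ms)).ceil
end

# Slowest average response time (in ms) that still fits the target
# load onto a fixed number of dynos.
def max_response_ms(requests_per_minute, dynos)
  (60_000.0 * dynos) / requests_per_minute
end

puts dynos_needed(20_300, 243)             # 83
puts dynos_needed(40_600, 243)             # 165, above the 100-dyno limit
puts max_response_ms(40_600, 83).round(1)  # 122.7
```

The last line shows why the response time has to drop to roughly half: serving twice the traffic on the same 83 dynos needs an average of about 123 ms or less.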
So how do you decrease response times? Common methods include:
Cache views when possible.
Add database indexes for slow queries where possible.
However, at some point it will become very hard to shave milliseconds off this number and you may wonder what else you can do (besides leaving Heroku).
The Unicorn HTTP server can help you increase the number of requests per second that a dyno can handle. Unicorn allows you to have multiple workers processing requests on one dyno.
How many workers can one dyno have? It depends on the memory usage of your application. To figure out how many workers your dyno can handle, you need to know how much memory a single worker uses. New Relic’s dyno graph will show you this number. Keep in mind that your dyno is limited to 512 MB of memory, so to make use of two workers, the average memory usage of a single worker would need to be at or below about 250 MB. The lower your application’s memory usage, the more workers a dyno can handle. If your application can handle 600 requests per minute with one Unicorn worker, it can handle roughly 1200 requests per minute with two workers, 1800 with three workers, and so on.
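As a rough sketch of that sizing rule, the helpers below divide the dyno's memory by the memory one worker uses. The method names are hypothetical, and the estimate ignores the Unicorn master process and any headroom for memory growth, so treat the result as an upper bound.

```ruby
DYNO_MEMORY_MB = 512 # dyno memory limit quoted above

# Upper bound on the number of Unicorn workers that fit in one dyno,
# given the average memory a single worker uses.
def workers_per_dyno(memory_per_worker_mb, dyno_memory_mb = DYNO_MEMORY_MB)
  raise ArgumentError, "memory per worker must be positive" unless memory_per_worker_mb.positive?

  (dyno_memory_mb / memory_per_worker_mb.to_f).floor
end

# Idealized capacity of one dyno, assuming throughput scales
# linearly with the number of workers.
def capacity_per_dyno(requests_per_minute_per_worker, workers)
  requests_per_minute_per_worker * workers
end

workers = workers_per_dyno(250)
puts workers                          # 2
puts capacity_per_dyno(600, workers)  # 1200
```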
Increasing the number of Unicorn workers rather than the dynos allows you to mitigate some of the pains associated with random routing, because you’re increasing the chance of routing the request to a free worker.
When configuring Unicorn for Heroku, there are a couple of values you want to pay special attention to:
worker_processes - This tells Unicorn how many workers you want to run per dyno. Use your average memory usage per dyno to figure out what number is best for you. If this number is 1, consider using something else besides Unicorn.
timeout - Heroku times out requests at 30 seconds. This number should be 30 at the maximum. If you don’t want your application waiting 30 seconds to timeout a long-running request, you could set this number even lower.
preload_app - Set this to true so that the application is loaded once in the master process before the workers are forked. If you are using ActiveRecord, you will want to call ActiveRecord::Base.connection.disconnect! in the before_fork block and ActiveRecord::Base.establish_connection in the after_fork block. This ensures that each worker opens its own database connection instead of sharing the one inherited from the master.
listen - listen takes a port and a configuration hash. One of the options is backlog. This number defaults to 1024. The documentation states:
“If you are running Unicorn on multiple machines, lowering this number can help your load balancer detect when a machine is overloaded and give requests to a different machine.”
This is exactly the case with Heroku. You will need to experiment with this number to see what works best for your application, but we have gotten good results by setting this number in the single digits. If you are using the default setting for the backlog, one slow request could potentially affect more than 1,000 requests lined up in the worker’s queue.
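Put together, a configuration along these lines might look like the sketch below. The file path, the WEB_CONCURRENCY environment variable, the fallback port, and the specific values for timeout and backlog are assumptions chosen for illustration, not recommendations. Measure your own application before settling on numbers.

```ruby
# config/unicorn.rb (assumed location)

# Workers per dyno. Reading this from an environment variable lets you
# change it without a deploy. The default of 2 is only a placeholder.
worker_processes Integer(ENV["WEB_CONCURRENCY"] || 2)

# Keep this at or below Heroku's 30-second request timeout.
timeout 25

# Load the application in the master before forking workers.
preload_app true

# A small backlog lets an overloaded dyno turn requests away sooner.
# Heroku supplies the port in the PORT environment variable.
listen Integer(ENV["PORT"] || 3000), backlog: 8

before_fork do |_server, _worker|
  # Drop the master's database connection so workers do not share it.
  ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
end

after_fork do |_server, _worker|
  # Each worker opens its own database connection.
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord::Base)
end
```

With a file like this in place, the web process would typically be started with something like bundle exec unicorn -c ./config/unicorn.rb.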
If you’re looking for ways to improve your application’s performance on Heroku, first make sure you are measuring performance and looking for ways to optimize your application. If your application’s memory footprint allows, consider using Unicorn to double, triple or maybe even quadruple the number of requests your web dynos can handle. If you decide to give Unicorn a try, be sure to dig into the tuning docs so your unicorns are tuned to 11.