I spent weeks researching ways to build an Ajax chat server for a Rails app. The info is out there, but very fragmented. Nowhere did I find a single resource that explained the whole picture, each piece I needed, and why. Hopefully this can be that resource for you. I do not claim to be an expert, but I did build one, and it works very well. I guess all I claim is, “here’s what I did, hope it gives you some ideas.” Please point out errors and make suggestions. YMMV. I assume you have a good grasp of Ruby, JavaScript, and Web-related technologies in general.

Because it’s so lengthy, I’ve broken this up into two parts. Part 1 deals with the server side, while Part 2 deals with tying it into the client side.

Up front I’ll tell you that you’ll need a separate app (ideally on a subdomain) to handle the long polling. It probably should not be Rails, and it should not run on Apache. What we’ll end up with is an Async Sinatra app running on Thin, reverse-proxied through Nginx.

We’ll touch on WebSockets, short vs. long polling, Nginx, Thin, EventMachine, and Sinatra with async_sinatra, and then walk through my app.

WebSockets will one day save us all

WebSockets are the future, providing full-duplex asynchronous push communications between Web browser and Web server, as well as food, shelter and love to everyone on Earth. Unfortunately the future isn’t here yet. Ajax long-polling is a cobbled-together hack until WebSockets and flying cars arrive. When they do, I recommend em-websocket.
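For when that day comes, here’s roughly what a tiny em-websocket server looks like. This is an echo-style sketch in the spirit of the gem’s README; the host, port and messages are placeholders.

  require 'em-websocket'

  EM.run do
    EM::WebSocket.start(:host => "0.0.0.0", :port => 8080) do |ws|
      ws.onopen    { ws.send "Hello! You're connected." }   # fires after the handshake
      ws.onmessage { |msg| ws.send "Pong: #{msg}" }         # echo whatever the client sends
      ws.onclose   { puts "Connection closed" }
    end
  end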

Short vs. long polling

Regular, or what I’ll call “short” polling is the youngest kid on Christmas morning pestering, “Can I open my presents now? Can I open my presents now?? Can I open my presents now???!!?” The webserver parents get so overwhelmed that they eventually shut down and stop responding. Long polling is the disinterested teenager who, with headphones on, requests “Tell me when we’re going to open presents.” That one question just hangs there until the parents are ready.
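To make the analogy concrete, here’s a rough sketch in the Sinatra style we’ll get to later. new_messages is a made-up placeholder for however you check for fresh data; the point is only the shape of the two approaches.

  # Short polling: the browser asks every couple of seconds, and we answer
  # immediately, usually with nothing.
  get '/short_poll' do
    new_messages.to_json        # most of the time this is just "[]"
  end

  # Long polling: the browser asks once, and we hold the request open until
  # there's something to say (or we give up after a minute).
  get '/long_poll' do
    120.times do
      break if new_messages.any?
      sleep 0.5                 # ties up a whole Apache process -- see below
    end
    new_messages.to_json
  end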

Nginx

Apache’s a great Web server. It’s a wunderkind of the Open Source world, probably only rivaled in success by GNU/Linux. But it is not an asynchronous server. It’s a process-based server, and long polling will bring it to its knees as Apache forks process after process to handle the onslaught of long-running requests.

Apache can do almost anything imaginable; Nginx does the six things you need, and 20x faster. Nginx handles reverse-proxies and load-balancing through TCP or Unix sockets, URL rewrites, SSL, gzipping, easy Cache-Control headers – everything most Web apps need. And of course it’s a great general-purpose Web server. Here’s a good intro to the basics.

But the killer feature here is that it can easily handle thousands of concurrent requests while using only a few megabytes of memory. Your Nginx virtual host would look something like the one below. Brilliantly simple, isn’t it?

upstream  polling-app {
  server  unix:/path/to/thin.sock;
  # Sockets are faster than your TCP/IP stack. Use them if you can!
  #server  127.0.0.1:3000;
}

server {
  listen  80; ## listen for ipv4
  server_name  polling.myapp.com;

  access_log  /var/log/nginx/polling.access.log;
  error_log  /var/log/nginx/polling.error.log;

  location / {
    root /path/to/root/not/sure/it/matters/because/no/files/are/served;
    proxy_pass http://polling-app;
  }
}

Personally, I’ve dropped Apache and switched everything over to Nginx. But if you’re not comfortable doing that, I’d recommend running Nginx in front on 80/443. Your polling app would look like the above example. For everything else, you can switch Apache to use port 8080 (or whatever), and have n Nginx virtual hosts reverse-proxying to your n Apache virtual hosts over port 8080.
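If you take that approach, each pass-through virtual host only needs a few lines; something like this, where the names and the 8080 port are placeholders for your own setup:

server {
  listen  80;
  server_name  www.myapp.com;

  location / {
    # Apache now answers on 8080 instead of 80
    proxy_pass http://127.0.0.1:8080;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}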

Thin

Mongrel? Passenger? Unicorn, Rainbows! or Zbatery? Thin is one of those guys. It’s a Rack app server, running your app’s code behind your Web server. (Heck, it can even act as a full Web server with SSL support!) Thin handles requests asynchronously with EventMachine. It can handle a lot at once, which is why it and Nginx are a great pair for this. (Rainbows! and Zbatery might work as drop-ins, but I have more experience with Thin.) Thin’s also a Ruby gem, making it wicked-super-easy to install. In fact it’s probably the least complicated piece of this whole thing. Just install it, write a small config file for your polling app, and you’re done.

Here’s a brief tutorial I wrote on configuring your Thin apps and getting them to start automatically when your system boots up.

Thin is a little unique in that it can be bound to a port or a Unix socket. Since Nginx can reverse-proxy to a socket, and sockets are generally faster than climbing through the TCP/IP stack, I’d recommend using them if possible. You can find details in thin -h.
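For reference, here’s roughly what a config for the polling app might look like (thin config can generate one for you). The paths are placeholders, the socket line matches the upstream in the Nginx config above, and I’d treat the bumped timeout as an assumption worth testing, since long-polling requests hang around much longer than a typical request.

---
chdir: /path/to/polling-app
rackup: config.ru
environment: production
socket: /path/to/thin.sock   # or use address/port to bind to TCP instead
pid: tmp/pids/thin.pid
log: log/thin.log
daemonize: true
servers: 1
timeout: 90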

EventMachine

EventMachine is what makes all of this possible; a working understanding is critical. The main page of their docs is very good, so give it a read. Here’s a quick taste of deferring a job to a background thread:

  require 'eventmachine'

  EventMachine::run do
    puts "There's a job to do!"

    job = lambda do
      i = 0
      while i < 10000
        i += 1
      end
      i
    end
    callback = lambda do |num|
      puts "Job done; it counted to #{num}!"
      EventMachine::stop # stop the reactor so the script can exit
    end

    puts "Starting job..."
    EventMachine::defer job, callback
    puts "Job started!"
    puts "Let's do other stuff while that's running!"
    puts "Other stuff..."
  end

That’s a dumb example, but it should get the point across. While the job is counting to 10,000, you can do other stuff. When it’s done, it prints out the result. Read over the docs for a whole lot more info.

Sinatra and async_sinatra

Sinatra will be the meat (or tofu, if that’s your thing) of our polling server. It’s a micro-framework written in Ruby. Comparing it to Rails, you might say it handles routes and controllers, but everything else is up to you or optional Rack middleware. It has a great intro and docs, so I’ll let you peruse those at your leisure. But because I’m such a good chap, here’s a small example:

  # Defines a GET action at "/hello"
  get '/hello' do
    sleep 10
    'Hello!'
  end

  # Defines a POST action at "/bienvendidos"
  post '/bienvendidos' do
    'Bienvendidos!'
  end

Notice sleep 10. Pretend that’s instead a very important, intensive operation that takes about 10 seconds. If you GET /hello and then immediately POST to /bienvendidos, your /bienvendidos request will have to wait on /hello to finish. Put a pin in that.

Async Sinatra is a small yet powerful gem allowing Sinatra to dip down into Thin’s EventMachine-driven innards and deliver responses asynchronously. In short, this means we can easily handle a whole bunch of long-running connections at once. Converting the above example, we would have:

  # Defines a GET action at "/hello"
  aget '/hello' do
    big_job = lambda { sleep 10; 'Hello!' }
    callback = lambda { |result| body result } # body() is what actually sends the async response
    EM.defer big_job, callback
  end

  # Defines a POST action at "/bienvendidos"
  apost '/bienvendidos' do
    body 'Bienvendidos!'
  end

Pull that pin out. If you try the same test here, /bienvendidos will return right away while /hello works in the background. As you may have guessed, EM is just a handy alias for EventMachine.

My App

Now that you have all the pieces, I’ll show you how I put them together. To understand where my code is coming from, and where yours may want to differ, a brief explanation of what I’m polling and how it’s being used is in order. The larger purpose of the Rails app is unimportant, but one requirement was a real-time, persistent group chat/message board/notification area which I called “Walls.” Groups of users have access to certain Walls. Messages posted to these Walls are stored in the database and can be reviewed at any time. But when users are signed in, they should be able to communicate in real time (long-polling).

For efficiency, I have only one job hitting the database, once every second. This job stores a hash like {1 => 57, 2 => 67, 3 => 355}, where 1, 2 and 3 are Wall ids and 57, 67 and 355 are the latest message ids from those Walls. For even more efficiency, this job only runs while at least one client is connected. We’ll call this The Global Hash.

Each browser connection (polling request) sends a similar hash containing the latest message ids it has. We’ll call this The Local Hash. While the client’s connected, every 0.5 sec, the polling request sees if The Global Hash has any newer message ids than The Local Hash. If so, it grabs those messages from the db and returns them to the browser.

In the code below, you’ll notice the AppPoller class does most of the heavy lifting. That code is very application-specific and probably wouldn’t do you much good. With that in mind, I’m only showing you the Sinatra code, which should be more than enough to give you some ideas.
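That said, to make the Sinatra code easier to follow, here’s a hypothetical skeleton of the kind of AppPoller I mean. The method names match what the code below calls; Wall.latest_post_ids, Post.newer_than and the user lookup are stand-ins for your own queries, and thread-safety is hand-waved entirely.

  class AppPoller
    @global_hash = {}   # The Global Hash: {wall_id => latest_post_id}
    @clients = []

    class << self
      def get_user(session_id)
        User.find_by_session_id(session_id)   # however you map sessions to users
      end

      def add_client(user)
        @clients << user
      end

      def drop_client(user)
        @clients.delete(user)
      end

      def has_clients?
        @clients.any?
      end

      # The one job that actually hits the database on a timer
      def poll!
        @global_hash = Wall.latest_post_ids   # e.g. {1 => 57, 2 => 67, 3 => 355}
      end

      # Cheap, in-memory check: does the Global Hash have anything newer than
      # this client's Local Hash?
      def posts_since?(local_hash)
        local_hash.any? { |wall_id, post_id| @global_hash[wall_id].to_i > post_id.to_i }
      end

      # Only called once posts_since? says there's something new
      def posts_since(local_hash)
        Post.newer_than(local_hash)   # the real query for the new messages
      end
    end
  end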

config.ru

require './app'
run Pollster

app.rb

require 'rubygems'
require 'sinatra/async'
# Requires EventMachine, your db connection, your "AppPoller" class, etc.
require './config/boot'

# When the reactor starts...
EM.next_tick do
  # Run this every 1 second
  EM.add_periodic_timer(1) do
    # If anyone's connected, poll the database for new messages.
    # Take the last message id from each wall and store it in a hash like {1 => 56, 2 => 77}
    AppPoller.poll! if AppPoller.has_clients?
  end
end

class Pollster < Sinatra::Base
  register Sinatra::Async

  # Define a helper for routing the OPTIONS HTTP verb.
  # Browsers (should) send an OPTIONS request to get Access-Control-Allow-* info.
  def self.http_options(path, opts={}, &block)
    route 'OPTIONS', path, opts, &block
  end

  # Ideally this would be in http_options below. But not all browsers send
  # OPTIONS pre-flight checks correctly, so we'll just send these with every
  # response. I'll discuss what some of them mean in Part 2.
  before do
    response.headers['Access-Control-Allow-Origin'] = 'http://myapp.com' # Needs the scheme; if you need multiple domains, just use '*'
    response.headers['Access-Control-Allow-Methods'] = 'GET, POST, OPTIONS'
    response.headers['Access-Control-Allow-Headers'] = 'X-CSRF-Token' # This is a Rails header, you may not need it
  end

  # We need something to respond to OPTIONS, even if it doesn't do anything
  http_options '/' do
    halt 200
  end

  # The root path will serve as a kind of "ping" for our clients.
  # We'll respond to everything with JSON.
  aget '/' do
    response.headers['Content-Type'] = 'application/json'
    body '{"ack": "huzzah!"}'
  end

  # Technically we should use GET, but POST makes it less susceptible to abuse
  apost '/' do
    response.headers['Content-Type'] = 'application/json'

    # Find the user. This is left as an exercise for you.
    user = AppPoller.get_user(params[:session_id])
    unless user
      body '{"errors": ["Invalid user!"]}'
      halt 400
    end

    # Find/parse the last post ids.
    # This is a hash like {1 => 56, 2 => 77} where 1 and 2 are Wall id's,
    # and 56 and 77 are the latest message id's this user has for those
    # walls.
    # user.resolve_last_post_ids is for security, stripping out any 
    # walls the user isn't supposed to have access to. Another exercise for you.
    last_post_id = user.resolve_last_post_ids(params[:last_post_id])
    unless last_post_id.any?
      body '{"errors": ["Invalid parameters!"]}'
      halt 400
    end

    # This is the job that will keep checking for new messages for this
    # user's walls
    pollster = proc do
      AppPoller.add_client user
      time, new_posts = 0, false

      # After a minute, most browsers or proxies will have severed the connection,
      # and we don't want this job running forever.
      until time > 60
        # This just compares the user's latest post_id's to the global hash, so it's very cheap.
        new_posts = AppPoller.posts_since?(last_post_id)
        break if new_posts
        sleep 0.5
        time += 0.5
      end

      # If there were new posts, grab them from the database
      new_posts ? AppPoller.posts_since(last_post_id) : []
    end

    # This job takes the new posts (if any), converts them to JSON,
    # and sends the response.
    callback = proc do |new_posts|
      AppPoller.drop_client user
      walls = {:walls => {}}
      new_posts.each do |p|
        walls[:walls][p.wall_id] ||= {:posts => []}
        walls[:walls][p.wall_id][:posts] << p.to_hash
      end
      body walls.to_json
    end

    # Begin asynchronous work
    EM.defer(pollster, callback)
  end
end

You can run this with rackup -s thin config.ru, or through the Thin config described earlier. Either way it needs to be Thin (or another EventMachine-based server), or the async routes won’t work.
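Once it’s up, a quick sanity check against the “ping” route (9292 is rackup’s default port; use your polling subdomain once Nginx and Thin are in front):

  curl http://localhost:9292/
  # => {"ack": "huzzah!"}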

Stay tuned for Part 2!