Sunday, October 16, 2011

Writing a minimalistic web server using EventMachine

I wanted a small, efficient web server that can handle a large number of HTTP requests and is easy to play around with. EventMachine is a good candidate, together with the HTTP parser provided by Thin. I didn't know much about Thin when I started this exercise, but it is also built on top of EventMachine and is gaining in popularity according to this State of the stack article.

But anyway, if you want a simple bare-bones web server with no fluff, continue reading...

Before we start with the actual implementation, we need to install some Ruby gems. This is easily done by running the following commands in your shell:

gem install eventmachine
gem install thin

The server is a pretty standard EventMachine setup and looks like this:

# FILE: server.rb

require 'rubygems'
require 'eventmachine'
require_relative './thin_http_parser.rb'

# Freeze some HTTP header names & values
KEEPALIVE = "Connection: Keep-Alive\r\n".freeze

class RequestHandler < EM::Connection
  def post_init
    @parser = RequestParser.new
  end

  def receive_data(data)
    handle_http_request if @parser.parse(data)
  end

  def handle_http_request
    p [@parser.env, @parser.body.string]
    keep_alive = @parser.persistent?

    data = "OK"
    send_data("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: #{data.bytesize}\r\n#{KEEPALIVE if keep_alive}\r\n#{data}")

    if keep_alive
      post_init  # Reset the parser so the next request on this connection starts clean
    else
      close_connection_after_writing
    end
  end
end

host,port = "0.0.0.0", 8083
puts "Starting server on #{host}:#{port}, #{EM::set_descriptor_table_size(32768)} sockets"
EM.run do
  EM.start_server host, port, RequestHandler
  if ARGV.size > 0
    forks = ARGV[0].to_i
    puts "... forking #{forks} times => #{2**forks} instances"
    forks.times { fork }
  end
end
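
The response string in handle_http_request is easier to read when split into pieces. The following is only a sketch of the same idea, not part of the server above: build_response and KEEPALIVE_HEADER are hypothetical names introduced for illustration. It shows why Content-Length uses bytesize (the length must be counted in bytes, not characters) and where the optional Keep-Alive header goes.

```ruby
# Hypothetical helper, not part of server.rb: builds the same kind of
# response string that handle_http_request passes to send_data.
KEEPALIVE_HEADER = "Connection: Keep-Alive\r\n".freeze

def build_response(body, keep_alive)
  # Content-Length counts bytes, so use bytesize rather than length
  head = "HTTP/1.1 200 OK\r\n" \
         "Content-Type: text/html\r\n" \
         "Content-Length: #{body.bytesize}\r\n"
  head += KEEPALIVE_HEADER if keep_alive
  # An empty line separates the headers from the body
  "#{head}\r\n#{body}"
end
```

Inside the handler this could be used as send_data(build_response("OK", keep_alive)).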

Finally, we need an HTTP parser. I tried both writing my own in pure Ruby and using the EM::P::HeaderAndContentProtocol parser, but in the end I used the Thin parser since it was the fastest and is easy to integrate. This is the code for the parser:

# FILE: thin_http_parser.rb

require 'rubygems'
# Uncomment this line if 'thin_parser' is not found
# require 'thin' 
require 'thin_parser'
require 'stringio'

# Freeze some HTTP header names & values
HTTP_VERSION      = 'HTTP_VERSION'.freeze
HTTP_1_0          = 'HTTP/1.0'.freeze
CONTENT_LENGTH    = 'CONTENT_LENGTH'.freeze
CONNECTION        = 'HTTP_CONNECTION'.freeze
KEEP_ALIVE_REGEXP = /\bkeep-alive\b/i.freeze
CLOSE_REGEXP      = /\bclose\b/i.freeze

# Freeze some Rack header names
RACK_INPUT        = 'rack.input'.freeze

# A request sent by the client to the server.
class RequestParser
  INITIAL_BODY      = ''

  # Force external_encoding of request's body to ASCII_8BIT
  INITIAL_BODY.encode!(Encoding::ASCII_8BIT) if INITIAL_BODY.respond_to?(:encode!)

  # CGI-like request environment variables
  attr_reader :env

  # Unparsed data of the request
  attr_reader :data

  # Request body
  attr_reader :body

  def initialize
    @parser   = Thin::HttpParser.new
    @data     = ''
    @nparsed  = 0
    @body     = StringIO.new(INITIAL_BODY.dup)
    @env      = {
      RACK_INPUT        => @body,
    }
  end

  # Parse a chunk of data into the request environment
  # Returns +true+ if the parsing is complete.
  def parse(data)
    if @parser.finished?  # Header finished, can only be some more body
      @body << data
    else                  # Parse more header using the Thin parser
      @data << data
      @nparsed = @parser.execute(@env, @data, @nparsed)
    end

    if finished?   # Check if header and body are complete
      @data = nil
      @body.rewind
      true         # Request is fully parsed
    else
      false        # Not finished, need more data
    end
  end

  # +true+ if headers and body are finished parsing
  def finished?
    @parser.finished? && @body.size >= content_length
  end

  # Expected size of the body
  def content_length
    @env[CONTENT_LENGTH].to_i
  end

  # Returns +true+ if the client expects the connection to be persistent.
  def persistent?
    # Clients and servers SHOULD NOT assume that a persistent connection
    # is maintained for HTTP versions less than 1.1 unless it is explicitly
    # signaled. (http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html)
    if @env[HTTP_VERSION] == HTTP_1_0
      @env[CONNECTION] =~ KEEP_ALIVE_REGEXP

    # HTTP/1.1 client intends to maintain a persistent connection unless
    # a Connection header including the connection-token "close" was sent
    # in the request
    else
      @env[CONNECTION].nil? || @env[CONNECTION] !~ CLOSE_REGEXP
    end
  end
end
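
The keep-alive decision in persistent? can be tried out without Thin or EventMachine. This is a minimal standalone sketch of the same rules; persistent_connection? is a hypothetical helper name, and the env hash only needs the two keys the parser code reads (HTTP_VERSION and HTTP_CONNECTION).

```ruby
# Hypothetical standalone version of the rules in RequestParser#persistent?
def persistent_connection?(env)
  connection = env['HTTP_CONNECTION']
  if env['HTTP_VERSION'] == 'HTTP/1.0'
    # HTTP/1.0: persistent only if the client explicitly asks for keep-alive
    !!(connection =~ /\bkeep-alive\b/i)
  else
    # HTTP/1.1: persistent unless the client sent "Connection: close"
    connection.nil? || connection !~ /\bclose\b/i
  end
end
```

For example, an HTTP/1.1 request without a Connection header counts as persistent, while an HTTP/1.0 request without one does not.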

The server is started by running

ruby server.rb

It is also possible to fork the process to increase performance by adding an integer at the end of the start line. The integer is the number of times the process forks, and the number of instances doubles with each fork. If I want to start 4 instances of the server, I run

ruby server.rb 2

There is not much point in running many more instances than you have cores on the machine, since it will only slow things down. Also be aware that "fork" doesn't work out of the box on Windows.
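
The reason "ruby server.rb 2" gives 4 instances is that every process, including each freshly forked child, continues the forks.times loop, so the process count doubles on each iteration. The sketch below is only an illustration for Unix-like systems and is not part of server.rb: count_instances is a hypothetical helper that forks the same way and counts the resulting processes through a pipe.

```ruby
# Hypothetical demo, not part of server.rb (requires a platform with fork).
# Forks the same way the server does and counts the resulting processes.
def count_instances(forks)
  reader, writer = IO.pipe
  root = Process.pid

  # Every process, parent or child, keeps running this loop,
  # so the number of processes doubles on each iteration.
  forks.times { fork }

  # Each process reports itself with one byte, then closes its write end
  writer.write("x")
  writer.close

  # Only the original process goes on to count; all others stop here
  exit!(0) unless Process.pid == root

  count = reader.read.size  # read returns once every writer has closed
  reader.close
  Process.waitall
  count
end
```

Calling count_instances(2) returns 4, which matches the 2**forks figure the server prints at startup.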

With this setup, I easily get 100,000+ requests per second using ab with the keep-alive switch (-k) on a mid-range Core i5 processor.

To summarize my little hacking session: I really enjoy the simple yet powerful API that EventMachine offers. You can write compact yet well-performing servers with Ruby and EventMachine.

3 comments:

  1. Please post your whole ab command please. I've tried with below command and get:

    ab -k -c 50 -n 100000 http://127.0.0.1:8083

    Requests per second: 28541.31 [#/sec] (mean)

  2. really enjoyed reading your different articles. They are so informative and thanks man for code for the parser ....
    Aaron

  3. Excellent, Nils, great work.
    I get 20K rps with epoll without forking.

    But I cannot fork if I use epoll. And without epoll, I cannot have more than a few hundred file descriptors (otherwise it is sloooow), so 100,000 rps is not very useful.

    I don't think there is any solution, unfortunately. Again, great work.
