Sunday, October 16, 2011

Writing a minimalistic web server using EventMachine

I wanted a small, efficient web server that can handle a large number of HTTP requests and is easy to play around with. EventMachine is a good candidate, together with the HTTP parser provided by Thin. I didn't know much about Thin when I started this exercise, but it is also built on top of EventMachine and is gaining popularity according to this State of the stack article.

But anyway, if you want a simple bare-bones web server with no fluff, continue reading...

Before we start with the actual implementation, we need to install some Ruby gems. This is easily done by running the following commands in your shell:

gem install eventmachine
gem install thin

The server is a pretty standard EventMachine setup and looks like this:

# FILE: server.rb

require 'rubygems'
require 'eventmachine'
require_relative './thin_http_parser.rb'

# Freeze some HTTP header names & values
KEEPALIVE = "Connection: Keep-Alive\r\n".freeze

class RequestHandler < EM::Connection
  def post_init
    @parser = RequestParser.new
  end

  def receive_data(data)
    handle_http_request if @parser.parse(data)
  end

  def handle_http_request
    # Debug output: dump the parsed environment and the request body
    p [@parser.env, @parser.body.string]
    keep_alive = @parser.persistent?

    data = "OK"
    send_data("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: #{data.bytesize}\r\n#{KEEPALIVE if keep_alive}\r\n#{data}")

    if keep_alive
      # Reset the parser so the next request on this connection starts clean
      post_init
    else
      close_connection_after_writing
    end
  end
end

host,port = "0.0.0.0", 8083
puts "Starting server on #{host}:#{port}, #{EM::set_descriptor_table_size(32768)} sockets"
EM.run do
  EM.start_server host, port, RequestHandler
  if ARGV.size > 0
    forks = ARGV[0].to_i
    puts "... forking #{forks} times => #{2**forks} instances"
    forks.times { fork }
  end
end
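
The response string in handle_http_request is easy to get wrong by one "\r\n", so here is a small sketch of the same layout split into a helper. The name build_response and the CRLF constant are my own for illustration and are not part of server.rb; the sketch assumes the same fixed 200 OK status and text/html content type as above.

```ruby
# Hypothetical helper (not part of server.rb): assembles the same minimal
# HTTP/1.1 response that handle_http_request sends.
CRLF = "\r\n".freeze

def build_response(body, keep_alive)
  headers = [
    "HTTP/1.1 200 OK",
    "Content-Type: text/html",
    "Content-Length: #{body.bytesize}"  # bytes, not characters
  ]
  headers << "Connection: Keep-Alive" if keep_alive
  # Every header line ends with CRLF; an empty line separates headers and body.
  headers.join(CRLF) + CRLF + CRLF + body
end

puts build_response("OK", true).inspect
```

Using bytesize rather than length matters as soon as the body contains multi-byte characters, since Content-Length counts bytes.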

Finally, we need an HTTP parser. I tried both writing my own in pure Ruby and using the EM::P::HeaderAndContentProtocol parser, but in the end I used the Thin parser since it was the fastest and is easy to integrate. This is the code for the parser:

# FILE: thin_http_parser.rb

require 'rubygems'
# Uncomment this line if 'thin_parser' is not found
# require 'thin' 
require 'thin_parser'
require 'stringio'

# Freeze some HTTP header names & values
HTTP_VERSION      = 'HTTP_VERSION'.freeze
HTTP_1_0          = 'HTTP/1.0'.freeze
CONTENT_LENGTH    = 'CONTENT_LENGTH'.freeze
CONNECTION        = 'HTTP_CONNECTION'.freeze
KEEP_ALIVE_REGEXP = /\bkeep-alive\b/i.freeze
CLOSE_REGEXP      = /\bclose\b/i.freeze

# Freeze some Rack header names
RACK_INPUT        = 'rack.input'.freeze

# A request sent by the client to the server.
class RequestParser
  INITIAL_BODY      = ''

  # Force external_encoding of request's body to ASCII_8BIT
  INITIAL_BODY.encode!(Encoding::ASCII_8BIT) if INITIAL_BODY.respond_to?(:encode!)

  # CGI-like request environment variables
  attr_reader :env

  # Unparsed data of the request
  attr_reader :data

  # Request body
  attr_reader :body

  def initialize
    @parser   = Thin::HttpParser.new
    @data     = ''
    @nparsed  = 0
    @body     = StringIO.new(INITIAL_BODY.dup)
    @env      = {
      RACK_INPUT        => @body,
    }
  end

  # Parse a chunk of data into the request environment
  # Returns +true+ if the parsing is complete.
  def parse(data)
    if @parser.finished?  # Header finished, can only be some more body
      body << data
    else                  # Parse more header using the super parser
      @data << data
      @nparsed = @parser.execute(@env, @data, @nparsed)
    end

    if finished?   # Check if header and body are complete
      @data = nil
      @body.rewind
      true         # Request is fully parsed
    else
      false        # Not finished, need more data
    end
  end

  # +true+ if headers and body are finished parsing
  def finished?
    @parser.finished? && @body.size >= content_length
  end

  # Expected size of the body
  def content_length
    @env[CONTENT_LENGTH].to_i
  end

  # Returns +true+ if the client expect the connection to be persistent.
  def persistent?
    # Clients and servers SHOULD NOT assume that a persistent connection
    # is maintained for HTTP versions less than 1.1 unless it is explicitly
    # signaled. (http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html)
    if @env[HTTP_VERSION] == HTTP_1_0
      @env[CONNECTION] =~ KEEP_ALIVE_REGEXP

    # HTTP/1.1 client intends to maintain a persistent connection unless
    # a Connection header including the connection-token "close" was sent
    # in the request
    else
      @env[CONNECTION].nil? || @env[CONNECTION] !~ CLOSE_REGEXP
    end
  end
end
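
The important contract of parse is that it may be called many times with partial data and only returns true once the header and the whole body have arrived. The sketch below shows that contract in pure Ruby without Thin. ToyRequestBuffer is a made-up stand-in that only looks for the blank line ending the header and a Content-Length value; it is not a real HTTP parser and is much slower than the Thin one.

```ruby
# Toy stand-in for RequestParser (pure Ruby, no Thin). It only tracks where
# the header ends and how many body bytes have arrived so far.
class ToyRequestBuffer
  HEADER_END = "\r\n\r\n".freeze

  def initialize
    @buffer = String.new  # unfrozen, binary accumulation buffer
  end

  # Feed one chunk; returns true once header and body are complete.
  def parse(chunk)
    @buffer << chunk.b    # .b gives a binary copy, so indexes are byte offsets
    finished?
  end

  def finished?
    header_end = @buffer.index(HEADER_END)
    return false unless header_end  # header still incomplete

    header   = @buffer[0, header_end]
    # No Content-Length header means nil.to_i, i.e. an expected body of 0 bytes
    expected = header[/^Content-Length:\s*(\d+)/i, 1].to_i
    received = @buffer.bytesize - (header_end + HEADER_END.bytesize)
    received >= expected
  end
end

buffer = ToyRequestBuffer.new
p buffer.parse("POST / HTTP/1.1\r\nContent-Le")  # header split mid-line
p buffer.parse("ngth: 5\r\n\r\nhe")              # header done, body partial
p buffer.parse("llo")                            # body complete
```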

The server is started by running:

ruby server.rb

It is also possible to fork the process to increase performance by adding an integer to the end of the start line. The integer is the number of times the process forks, and each fork doubles the number of instances, so to start 4 instances of the server I run

ruby server.rb 2

There is little point in running many more instances than you have cores on the machine, since that will only slow things down. Also be aware that "fork" doesn't work out of the box on Windows.
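
Since the argument is a number of fork rounds and not a number of instances, a small sketch of the arithmetic may help. The helper names instances_for and fork_rounds_for are my own and not part of server.rb; Etc.nprocessors is in the Ruby standard library (2.2 and later).

```ruby
require 'etc'

# Each round of `fork` doubles the process count, so n rounds give 2**n instances.
def instances_for(fork_rounds)
  2**fork_rounds
end

# Largest number of rounds whose instance count does not exceed the core count.
def fork_rounds_for(cores)
  rounds = 0
  rounds += 1 while instances_for(rounds + 1) <= cores
  rounds
end

cores = Etc.nprocessors
puts "#{cores} cores -> ruby server.rb #{fork_rounds_for(cores)}"
```

On a machine whose core count is not a power of two this rounds down, which leaves some cores idle but never oversubscribes them.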

With this setup, I easily get 100,000+ requests per second using ab with the keep-alive switch (-k) on a mid-range Core i5 processor.

To summarize my little hack: I really enjoy the simple yet powerful API that EventMachine offers. You can write compact yet well-performing servers with Ruby and EventMachine.

Tuesday, June 30, 2009

Global Gaming Factory is buying The Pirate Bay and Peerialism

Now it is public: we (Peerialism) and The Pirate Bay are being bought by Global Gaming Factory.

http://www.aktietorget.se/NewsItem.aspx?ID=52051 (in Swedish)

Update: Our experience with Global Gaming Factory has so far not been that good, http://news.cnet.com/8301-1023_3-10314557-93.html

Sunday, January 4, 2009

Using Google Federated Login in your Rails Application

I'm in the process of building an Android application with a Ruby on Rails backend. On Android, one of the first things you need to do is to tie the phone to a Google account. So to make it easier for the end users, I thought I could maybe skip my own user account management and instead piggyback on the Google accounts.

After some initial research I found that Google recently released a single sign-on using OpenID. There are some sites that use the Google Federated Login, e.g. www.zoho.com, www.buxfer.com and http://www.plaxo.com/openid (look for the Google login button). However, it turned out that it wasn't really OpenID, but something that resembles OpenID. Even though it isn't pure OpenID, it does everything I want anyway, so I started to code the login using the "official" OpenID Authentication plugin.

From Google Federated Login you can currently only get the email of a user, but it was impossible to get the email using the OpenID Authentication plugin. After some detective work it was clear that Google Federated Login uses AX attributes, not the SReg attributes that are used by default in the plugin. So my solution is to patch the plugin with the code below. You can add the code to your session controller and it will work with the Google OpenID URI.

  #
  # If we want to get the GMail address for a user using Google Federated Login,
  # we need to work with AX attributes, not SReg attributes which are used by
  # default.
  #
  # To solve this Ax/SReg attribute problem we patch the OpenIdAuthentication
  # module to use AX attributes when talking to the Google OpenID server
  #
  # This patch is based on the source from github[1], January 4, 2009
  #
  # 1. http://github.com/rails/open_id_authentication/commits/master
  #
  module ::OpenIdAuthentication
    require 'openid/extensions/ax'

    private
    def add_simple_registration_fields(open_id_request, fields)
      if is_google_federated_login?(open_id_request)
        ax_request = OpenID::AX::FetchRequest.new
        # Only the email attribute is currently supported by google federated login
        email_attr = OpenID::AX::AttrInfo.new('http://schema.openid.net/contact/email', 'email', true)
        ax_request.add(email_attr)
        open_id_request.add_extension(ax_request)
      else
        sreg_request = OpenID::SReg::Request.new
        sreg_request.request_fields(Array(fields[:required]).map(&:to_s), true) if fields[:required]
        sreg_request.request_fields(Array(fields[:optional]).map(&:to_s), false) if fields[:optional]
        sreg_request.policy_url = fields[:policy_url] if fields[:policy_url]
        open_id_request.add_extension(sreg_request)
      end
    end

    def complete_open_id_authentication
      params_with_path = params.reject { |key, value| request.path_parameters[key] }
      params_with_path.delete(:format)
      open_id_response = timeout_protection_from_identity_server { open_id_consumer.complete(params_with_path, requested_url) }
      identity_url     = normalize_identifier(open_id_response.display_identifier) if open_id_response.display_identifier

      case open_id_response.status
      when OpenID::Consumer::SUCCESS
        if is_google_federated_login?(open_id_response)
          yield Result[:successful], params['openid.identity'], OpenID::AX::FetchResponse.from_success_response(open_id_response)
        else
          yield Result[:successful], identity_url, OpenID::SReg::Response.from_success_response(open_id_response)
        end
      when OpenID::Consumer::CANCEL
        yield Result[:canceled], identity_url, nil
      when OpenID::Consumer::FAILURE
        yield Result[:failed], identity_url, nil
      when OpenID::Consumer::SETUP_NEEDED
        yield Result[:setup_needed], open_id_response.setup_url, nil
      end
    end

    def is_google_federated_login?(request_response)
      return request_response.endpoint.server_url == "https://www.google.com/accounts/o8/ud"
    end
  end

In the create method (following the example given in the README for the plugin), I have currently hard-coded the OpenID URI to 'https://www.google.com/accounts/o8/id', but you could get it from a form as well. Note that there are two cases for getting the email, depending on whether a Google OpenID URI or a regular OpenID URI was used.

  def create
    openid_url = 'https://www.google.com/accounts/o8/id'
    authenticate_with_open_id(openid_url, {:required => [ 'email' ] }) do |result, identity_url, registration|
      case result.status
      when :missing
        failed_login "Sorry, the OpenID server couldn't be found"
      when :invalid
        failed_login "Sorry, but this does not appear to be a valid OpenID"
      when :canceled
        failed_login "OpenID verification was canceled"
      when :failed
        failed_login "Sorry, the OpenID verification failed"
      when :successful
        if registration.is_a?(OpenID::AX::FetchResponse)
          email = registration['http://schema.openid.net/contact/email']
        else
          email = registration['email']
        end
        # Find (or create user) based on identity_url
        # Note that email is not set when the user has selected 'always remember'
        # in the Google login page for subsequent logins
      end
    end
  end
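
Because the email can be missing (the 'always remember' case) and because AX attributes can in principle carry more than one value, it may be worth normalizing the lookup in one place. The sketch below is an assumption-laden illustration: extract_email is a hypothetical helper of mine, and I have only exercised it against plain Hash stand-ins, not against real OpenID::AX::FetchResponse or OpenID::SReg::Response objects.

```ruby
AX_EMAIL = 'http://schema.openid.net/contact/email'.freeze

# Hypothetical helper: accepts anything that responds to [] the way the
# registration object in the create method does (Hashes are used as stand-ins).
def extract_email(registration)
  return nil if registration.nil?

  # Try the AX attribute URI first, then fall back to the SReg field name.
  value = registration[AX_EMAIL] || registration['email']
  # Accept either a list of values or a single string; nil when nothing was sent.
  Array(value).first
end

p extract_email(AX_EMAIL => ['someone@example.com'])  # AX-style stand-in
p extract_email('email' => 'someone@example.com')     # SReg-style stand-in
p extract_email({})                                   # 'always remember' case
```

A nil result is then the signal to look the user up by identity_url alone.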

Update January 9: The code was updated to solve the Google 'always remember' problem

Update January 15: The most important technical issue in using the Google Federated Login API

Thursday, September 6, 2007

An indispensable bash function for Cygwin

Put the following code in your .bashrc file:
function open() {
   if [ "$1" = "" ]
   then
      rundll32 url.dll,FileProtocolHandler .
   else
      rundll32 url.dll,FileProtocolHandler `cygpath -w "$1"`
   fi
}
Now you can use the open command.
  • Type open http://www.franzens.org to fire up your web browser and point it to this site.
  • Type open some_document.doc to start Microsoft Word (or OpenOffice) with the specified document.
  • Type open some_document.txt to start Notepad with the specified document.
  • Type open to launch Explorer in the current directory.
  • Type open //some_network_path/foo/bar to launch Explorer in the specified directory.
I think you see the pattern above. I use the open command all the time and it is really useful.

Saturday, May 12, 2007

Visualize Models 1.1

Visualize Models is a small rake script that generates .png images of Ruby on Rails models (i.e. the database tables), displaying the table/column information. The associations between tables are based on the default Ruby on Rails naming convention (i.e. <table>_id columns). See image below for the Typo blog models:

Install as a plugin. From your rails application root, run
ruby script/plugin install svn://rubyforge.org//var/svn/visualizemodels/visualize_models
Run the plugin with:
rake visualize_models
This plugin depends on GraphViz, which you can find here.

If you're a Mac user, the Darwin port of GraphViz seems to work better. Just do:

 sudo port install graphviz
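
To give an idea of how the <table>_id naming convention can be turned into a graph, here is a rough sketch that emits GraphViz DOT text. It is not the plugin's actual code: the schema hash is invented input, infer_associations and to_dot are hypothetical names, and appending "s" is a naive stand-in for the Rails inflector.

```ruby
# Hypothetical input: table name => column names (not read from a real database).
# Naive pluralization (append "s") stands in for the Rails inflector.
def infer_associations(schema)
  schema.flat_map do |table, columns|
    columns.select { |column| column.end_with?('_id') }.map do |column|
      target = column.sub(/_id\z/, '') + 's'
      [table, target] if schema.key?(target)  # skip columns with no matching table
    end.compact
  end
end

# Render the edges as a GraphViz digraph that `dot -Tpng` can turn into an image.
def to_dot(edges)
  lines = edges.map { |from, to| "  #{from} -> #{to};" }
  (["digraph models {"] + lines + ["}"]).join("\n")
end

schema = {
  'users'    => %w[id name],
  'articles' => %w[id title user_id],
  'comments' => %w[id body article_id user_id]
}
puts to_dot(infer_associations(schema))
```

Saving the output to a file and running it through dot -Tpng gives a picture similar in spirit to what the rake task produces.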

This program has its roots in Annotate Models by Dave Thomas. See also the RubyForge project page for additional info.

In addition to Visualize Models there are two (that I'm aware of) other projects that you can use to visualize the database in Ruby On Rails:

If you know about any additional resources to visualize rails models, please drop me a mail at nils@<this domain>.

Update: The ActiveSupport::Inflector issue noted in the comments is now fixed in SVN. Visualize Models now works out of the box with Rails 2.2.2 and later.