acts_as_sphinx plugin

Posted on March 23, 2007

We can't imagine a web site without at least rudimentary search functionality. How frustrating it is when we shop for a particular product or look for an answer on a community web site and can't find exactly what we want. Search is paramount for any online business. If a customer can't find the product she's looking for, she will go somewhere else and you lose a sale.

Unfortunately, most databases either provide no full-text search capabilities or provide only minimal support, which is not enough in most cases.

Entering fixtures with TextMate

Posted on March 15, 2007

Probably the most boring task when writing Rails test cases is entering new fixtures. With a little help from TextMate this process can be made more pleasant.

  • Open Bundle Editor and create a new command.
  • Name it, for example, 'rails: new fixture'
  • Use this script as the command body:

Debugging Rails application

Posted on September 14, 2006

In this article I describe several ways to debug your Rails application with ruby-debug, starting with the simplest and proceeding to more sophisticated setups.

Simplest of all

The simplest and most obvious way is to start your application with the rdebug script:

    $ rdebug ./script/server webrick

This method doesn't work with lighttpd, since the latter starts several child Rails processes.

When you execute this command you end up at the debugger prompt, where you can set your breakpoints.

Debugging without rdebug script

First you need to activate the debugger in your development.rb file:

    ./config/environments/development.rb:

    ...
    require 'ruby-debug'
    Debugger.start

Now you can use the Kernel#debugger method to activate the debugger:

    def my_action
      debugger
      ...
    end

Remote debugging

If for some reason you want to connect to the debugger remotely, you can start it with remote debugging enabled:

    $ rdebug -s ./script/server webrick

and then connect to it with

    $ rdebug -c
    Connected.
    /usr/local/lib/ruby/1.8/webrick/server.rb:91: if svrs = IO.select(@listeners, nil, nil, 2.0)
    (rdb:1)

Once again, at this point you can add breakpoints and proceed.

Adding breakpoints directly in your code

What if you don't want to set up breakpoints from the debugger's command prompt every time and would rather use the Kernel#debugger method instead? In that case, when you start the debugger, tell it not to stop when a remote client connects by passing the -n option:

    $ rdebug -sn ./script/server webrick

Remote debugging without rdebug script

As always, activate the debugger in your development.rb file:

    ./config/environments/development.rb:

    ...
    require 'ruby-debug'
    Debugger.start_remote

Here you can control it with two options:

    ./config/environments/development.rb:

    ...
    require 'ruby-debug'
    Debugger.wait_connection = true
    Debugger.stop_on_connect = true
    Debugger.start_remote

  • Debugger.wait_connection makes the debugger wait until you connect to it remotely.

  • Debugger.stop_on_connect drops you to the debugger prompt as soon as you connect to it.

The method I use most of the time

All the methods described so far suffer from one thing: the debugger is always activated, which can slow down your application significantly. What if I want to activate it only in a particular place in my code? There are two ways to do that. But first, only require the debugger in your development.rb file without starting it:

    ./config/environments/development.rb:

    ...
    require 'ruby-debug'

  • Debugger.start takes a block:

    def my_action
      ...
      # at this point the debugger is still disabled
      Debugger.start do
        # here debugger is enabled
        # below goes the code you want to debug
        ...
      end
      # at this point the debugger is disabled
      ...
    end
    
  • Use Module#debug_method (available since version 0.4.2):

    class MyController < ApplicationController
      def my_action
        ...
      end
      debug_method :my_action
    end
    

    Now whenever execution reaches the my_action method, the debugger is activated.
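To make the idea of switching something on only around one method more concrete, here is a minimal plain-Ruby sketch. It does not use ruby-debug and is not how ruby-debug implements debug_method; `ScopedSwitch`, `with_switch`, and `switch_on_for` are hypothetical names invented for this illustration.

```ruby
# Hypothetical stand-in for a debugger that can be turned on and off.
module ScopedSwitch
  @enabled = false

  class << self
    attr_reader :enabled

    # Enable the switch only for the duration of the block and
    # restore the previous state even if the block raises.
    def with_switch
      previous = @enabled
      @enabled = true
      yield
    ensure
      @enabled = previous
    end
  end
end

class Module
  # Wrap an existing instance method so that its body runs inside
  # ScopedSwitch.with_switch (an assumption about how such a helper
  # could work, not ruby-debug's actual code).
  def switch_on_for(name)
    original = instance_method(name)
    define_method(name) do |*args, &block|
      ScopedSwitch.with_switch { original.bind(self).call(*args, &block) }
    end
  end
end

class Report
  def build
    ScopedSwitch.enabled   # report the state seen from inside the method
  end
  switch_on_for :build
end

puts ScopedSwitch.enabled   # state outside the wrapped method
puts Report.new.build       # state while the wrapped method runs
```

The wrapper restores the previous state in an ensure clause, so the switch is off again after the method returns, which is the property that keeps the rest of the application running at full speed.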

Rails vulnerability

Posted on August 10, 2006

Yesterday the Rails core team announced the release of Rails 1.1.5, which supposedly fixes a major security vulnerability. Unfortunately, they didn't disclose what the actual problem was. I don't know about you, but I find this inappropriate and very frustrating. I'm the sysadmin of a commercial web site and I must know what kind of problem I'm facing.

What is even more frustrating is that the 1.1.5 release introduced another huge security vulnerability. Just enter a URL consisting of the name of any standard Ruby library in your browser, and your Rails application will happily load that library. For example, to bring down any web site powered by Rails 1.1.5, just run this:

    # wget http://<your-website>/debug

several times. This URL makes Rails load the standard debug.rb library, which halts the dispatcher process while it waits for terminal input.

I hope that next time the Rails core team will be more open about security threats. An extra pair of eyes wouldn't hurt when evaluating a patch.

Below is the patch that fixes this hole in Rails 1.1.5:

    Index: actionpack/lib/action_controller/routing.rb
    ===================================================================
    --- actionpack/lib/action_controller/routing.rb (revision 4745)
    +++ actionpack/lib/action_controller/routing.rb (working copy)
    @@ -270,10 +270,11 @@
           protected
             def safe_load_paths #:nodoc:
               if defined?(RAILS_ROOT)
    +            extended_root = Regexp.escape(File.expand_path(RAILS_ROOT))
                 $LOAD_PATH.select do |base|
                   base = File.expand_path(base)
                   extended_root = File.expand_path(RAILS_ROOT)
    -              base.match(/\A#{Regexp.escape(extended_root)}\/*#{file_kinds(:lib) * '|'}/) || base =~ %r{rails-[\d.]+/builtin}
    +              base.match(/\A#{extended_root}\/*(#{file_kinds(:lib) * '|'})/) || base =~ %r{rails-[\d.]+/builtin}
                 end
               else
                 $LOAD_PATH
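The patch appears to hinge on grouping the alternation inside the regular expression. The following sketch reproduces just that difference in isolation; the root directory and the list of directory kinds are made-up stand-ins for RAILS_ROOT and file_kinds(:lib), whose real values are not shown here.

```ruby
# Hypothetical values standing in for RAILS_ROOT and file_kinds(:lib).
root  = Regexp.escape("/var/www/myapp")
kinds = %w[app/controllers components lib]

# Without parentheses the alternation is not grouped, so only the
# first alternative is anchored to the application root.
unpatched = /\A#{root}\/*#{kinds * '|'}/

# With parentheses every alternative must follow the root.
patched = /\A#{root}\/*(#{kinds * '|'})/

system_dir = "/usr/local/lib/ruby/1.8"
app_dir    = "/var/www/myapp/lib"

puts "unpatched, system dir: #{!!(system_dir =~ unpatched)}"
puts "patched, system dir:   #{!!(system_dir =~ patched)}"
puts "patched, app dir:      #{!!(app_dir =~ patched)}"
```

In the ungrouped pattern the bare alternative `lib` can match anywhere in the string, so a system library directory passes the filter; once grouped, only directories under the application root do.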

Asynchronous Email Delivery

Posted on July 13, 2006

It's quite common for an action to finish by sending a confirmation email to the client. Usually you don't want to put this logic directly in your model. Models deal with business rules and shouldn't care about emails or any other infrastructure nonsense (unless, of course, this is the business domain of your application). In Rails you define your business logic in terms of ActiveRecord::Base and the delivery of emails in terms of ActionMailer::Base. To decouple the two, Rails provides ActiveRecord observers. These glue objects listen for changes in the model and react accordingly (usually in an after_save method). If we need to perform a complex operation that involves updating several tables, it is better to wrap it in a database transaction. What I didn't pay attention to is that the observer's after_save method is also part of the transaction, meaning the transaction remains open until this method finishes.

This leads to a potential problem. What happens when your email server is under heavy load, or you have some kind of DNS problem and address resolution takes an unusually long time? Your transactions will stay open for a long time and some of them will eventually time out.
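One common way to keep slow mail delivery out of the transaction is to have the observer only enqueue the message and let something else deliver it later. The sketch below illustrates that idea in plain Ruby under stated assumptions: it is not necessarily the approach taken in this article, it does not use ActionMailer, and `MailQueue` is a hypothetical class invented for the example.

```ruby
require "thread"

# Hypothetical in-process mail queue; a real application would more
# likely persist pending messages so they survive a restart.
class MailQueue
  def initialize(&deliver)
    @queue   = Queue.new
    @deliver = deliver
    @worker  = Thread.new do
      # pop blocks until a message arrives; :stop ends the loop.
      while (message = @queue.pop) != :stop
        @deliver.call(message)
      end
    end
  end

  # Called where after_save would send mail: returns immediately, so
  # the surrounding transaction is not held open by SMTP or DNS delays.
  def enqueue(message)
    @queue << message
  end

  # Deliver whatever is still queued, then stop the worker thread.
  def shutdown
    @queue << :stop
    @worker.join
  end
end

sent  = []
queue = MailQueue.new do |message|
  sleep 0.05              # simulate a slow mail server
  sent << message
end

queue.enqueue("Order #1 confirmed")   # the only work done "in the transaction"
queue.shutdown
puts "delivered: #{sent.inspect}"
```

The trade-off of any such scheme is that a delivery failure can no longer roll back the transaction, so the worker needs its own error handling and retry policy.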

Tutorial on ruby-debug

Posted on July 11, 2006

Preface

Overcomplicated specifications lead to overcomplicated implementations. Lately I've been fixing issues with the ActionWebService framework, a soon-to-be-removed part of Ruby on Rails. I have to use SOAP in the web application I'm developing. My client needs to keep his product inventory in good shape, and he asked for some of the inventory management functionality to be implemented on handheld devices. There is a .Net environment available for this kind of device and it works very well. So I desperately needed a functional web service implementation for Rails. That's why I volunteered to fix AWS. And that's when I found out that ruby-breakpoint just doesn't cut it. Don't get me wrong, soap4r is a fine piece of software, but SOAP is difficult and obscure, which leads to complicated libraries that implement it. The situation is even worse when such libraries have almost no documentation whatsoever.

So instead of just stopping at some point in your program to examine the environment (the facility offered by the ruby-breakpoint library), the ruby-debug extension offers a full-fledged debugger for Ruby. The main difference between ruby-debug and the standard debug.rb library is execution speed. The major problem with debug.rb is that it uses the Kernel#set_trace_func method, which requires a Binding object to be created for each hook invocation. That is fine for small scripts, but for real-world applications such as Rails ones, debug.rb is almost impossible to use. You just sit and watch the Ruby interpreter create an enormous number of Binding objects, only to destroy them in the next garbage collection cycle. Avoiding that cost is also the reason ruby-debug doesn't support watchpoints.
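To get a feel for how often a set_trace_func hook fires, the following small sketch counts the hook invocations triggered by a trivial loop. It uses only Kernel#set_trace_func from core Ruby; the counter names are made up for the example, and the exact counts depend on the Ruby version, so none are shown.

```ruby
hooks    = 0
bindings = 0

# The hook is handed a Binding (where one is available) each time it
# fires, which is the per-event cost discussed above.
tracer = proc do |event, file, line, id, binding, klass|
  hooks    += 1
  bindings += 1 if binding.is_a?(Binding)
end

set_trace_func(tracer)
total = 0
10.times { |i| total += i }
set_trace_func(nil)   # always switch tracing off again

puts "hook calls: #{hooks}, Binding objects received: #{bindings}"
```

Even this ten-iteration loop triggers many hook calls, which gives an idea of the overhead when a whole Rails request is traced this way.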

Faster debugger for Ruby

Posted on July 07, 2006

I've been watching a thread on ruby-talk about the possibility of a faster debugger for Ruby.

Now that Ruby 1.8.4 has a better C-based API for tracing code execution, it is possible to speed up debug.rb significantly. I've been playing with this idea for the last two days and came up with the ruby-debug extension:

Follow these easy steps:

  • Download extension ruby-debug-0.1.gem.
  • Install it with sudo gem install ruby-debug-0.1.gem
  • Use either

        $ rdebug _your-script_
    

    Or for your Rails application:

    1. Add to your config/environments/development.rb

       require 'ruby-debug'
      
    2. Use Kernel#debugger method to interrupt your application and invoke debugger

       def myaction
        ...
        debugger
        ...
       end
      

Enjoy.

Update. I've created a project on rubyforge.org for this extension and uploaded a bugfix version 0.1.2. Now you don't have to download the gem file from this site. Just use

    $ gem install ruby-debug

command to install it.

Apache2 with mod_fcgid

Posted on April 06, 2006

I've been running Apache 1.3.x with mod_fastcgi for almost a year and a half and it's been a very stable combination. I could restart apache with


  apachectl graceful

and the fcgi manager would send the USR1 signal to all fastcgi children.

This mechanism is completely broken with Apache2. When I restart apache, for some reason it sends the signal to only one of its fastcgi children; the rest remain in memory doing nothing. Unfortunately, it seems that mod_fastcgi is no longer actively maintained.

Welcome mod_fcgid! According to the mod_fcgid website:

mod_fcgid has a new process management strategy, which concentrates on reducing the number of fastcgi server, and kick out the corrupt fastcgi server as soon as possible.

The default parameters are pretty aggressive: fcgid kills children that have been idle for more than 5 minutes, and it always kills those that have lived for more than 1 hour. This is not exactly what I want for my long-running Rails dispatchers. I kept tweaking the configuration options until I came up with:


  
    AddHandler fcgid-script .fcgi
    SocketPath /var/lib/apache2/fcgid/sock
    IPCCommTimeout 120
    IPCConnectTimeout 10
    MaxProcessCount 40
    ProcessLifeTime 86400
    IdleTimeout 1800
    DefaultMaxClassProcessCount 8
    DefaultInitEnv RAILS_ENV production
  

Now I can freely use apache2ctl graceful to restart my application without worrying that some processes might be left behind wasting memory. This also lets me use Capistrano to automate my deployment process. Check it out, it rocks!

Rails 1.1 is out!

Posted on March 27, 2006

Very impressive list of new features!

Extend your Ruby with Rails

Posted on March 17, 2006

Isn't it nice when you can extend your programming language to the point where you almost speak a human language instead of some obscure one? Check this out:

$ ./script/console
  >> 30.minutes.ago
  => Wed Feb 15 16:02:08 EST 2006
  >> (1.hour.ago..30.minutes.ago).to_s
  => "Wed Feb 15 15:32:14 EST 2006..Wed Feb 15 16:02:14 EST 2006"
  >> (1.hour.ago..30.minutes.ago).to_s(:db)
  => "BETWEEN '2006-02-15 15:32:19' AND '2006-02-15 16:02:19'"
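Extensions like these are possible because Ruby lets you reopen core classes. The sketch below shows the mechanism in plain Ruby with simplified helpers; it is an illustration written for this post's example, not the Rails implementation, and it treats durations as plain seconds.

```ruby
# Reopen Integer to add time helpers (simplified, hypothetical
# versions written for this sketch, not the Rails code).
class Integer
  def minutes
    self * 60              # durations are plain seconds here
  end
  alias_method :minute, :minutes

  def hours
    self * 3600
  end
  alias_method :hour, :hours

  # Interpret the receiver as seconds and step back from `from`.
  def ago(from = Time.now)
    from - self
  end
end

now = Time.at(1_000_000)   # fixed reference time for the example
puts 30.minutes
puts 1.hour.ago(now) == now - 3600
puts 30.minutes.ago        # relative to the current time
```

Because every Integer gains these methods, this kind of extension is best kept to names that are unlikely to clash with other libraries.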

Rails and Response Streaming

Posted on March 17, 2006

During my usual server maintenance the other day I noticed that the Rails dispatch.fcgi processes were taking an unusual amount of memory. By unusual I mean that instead of about 25M of RAM per process, they occupied about 150M each. I started investigating.

The first thing I did was create a small plugin that logs the current RSS and VSZ memory figures and calculates the difference between these numbers and those from the previous request. Note that I've implemented it only for Linux. Here is the code:

  # Logs per-request memory usage of the current process (Linux only).
  module MemoryLogging
    def self.included(controller)
      controller.after_filter :log_mem_stat if RUBY_PLATFORM =~ /linux/
    end

    private

    def log_mem_stat
      # Collect the VmSize and VmRSS lines and collapse their whitespace.
      vm_info = File.readlines("/proc/self/status").grep(/^Vm(RSS|Size)/).map { |l|
        l.chomp.gsub(/[\t ]+/, ' ')
      }
      # /proc/self/status lists VmSize before VmRSS.
      size, rss = vm_info.map { |l| l.scan(/\d+/).first.to_i }
      info = "PID #{$$}, #{vm_info.join(', ')}"
      info << ", RSS Diff #{rss - $vm_rss} kB" if $vm_rss
      info << ", Size Diff #{size - $vm_size} kB" if $vm_size
      logger.info "Virtual Memory #{self.class.name}##{action_name} (#{info})"
      # Remember this request's numbers for the next diff.
      $vm_rss, $vm_size = rss, size
    end
  end
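The parsing step of the plugin can be tried on its own, without Rails or a Linux machine, by feeding it sample text. The following sketch does that; `parse_vm_status` is a hypothetical helper written for this illustration, and the sample numbers are made up.

```ruby
# Extract the VmSize and VmRSS values (in kB) from the text of
# /proc/<pid>/status and return them in a hash keyed by field name.
def parse_vm_status(text)
  text.each_line.each_with_object({}) do |line, stats|
    if line =~ /\A(VmSize|VmRSS):\s+(\d+)\s+kB/
      stats[$1] = $2.to_i
    end
  end
end

# Made-up sample in the format Linux uses for /proc/self/status.
sample = <<~STATUS
  Name:   ruby
  VmPeak:    31240 kB
  VmSize:    30112 kB
  VmHWM:     25800 kB
  VmRSS:     25600 kB
STATUS

stats = parse_vm_status(sample)
puts "Size #{stats['VmSize']} kB, RSS #{stats['VmRSS']} kB"
```

Keying the values by field name means the result does not depend on the order in which the kernel prints the lines.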

With this plugin installed, I let the application run for a while and then searched for actions that resulted in a significant jump in memory. It turned out that the problem lay in our reports controller. At the end of each day our client goes to the admin and prints all pending invoices for the past day, and there might be hundreds of them. And here is the problem: Rails doesn't use response streaming; it builds the whole response string in memory, and only when it's done does it send it to the browser.

Here comes the solution.

  def print_invoices
    orders = Orders.find :all, :conditions => ["created_on = ?", Date.today]
    render :text => lambda { |response, out| 
      out << render_template(<<-EOS)
        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
        <html>
          <head>
            <meta http-equiv="Content-Language" content="en-us" />
            <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
            <%= stylesheet_link_tag 'admin', :media => 'screen' %>
            <%= stylesheet_link_tag 'print', :media => 'print' %>
            <title>Order Invoices</title>
          </head>
          <body>
      EOS
      orders.each do |order|
        out << render_partial('order', order)
      end
      out << render_template(<<-EOS)
          </body>
        </html>
      EOS
    }
  end

Unfortunately, you can't use layouts when you stream the response this way, simply because when Rails renders a template with a layout, it first renders the template itself and only then renders the layout, substituting the @content_for_layout variable with the result of the first step.

Now, with this print_invoices action in place, our dispatchers are back to normal memory consumption.
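The essence of the technique is that the response body is a callable that writes to the output piece by piece instead of returning one large string. Here is a minimal sketch of that pattern outside Rails; `Invoice` and `streaming_body` are hypothetical names, and a StringIO stands in for the connection to the browser.

```ruby
require "stringio"

Invoice = Struct.new(:number, :total)   # hypothetical record type

# Build a response body as a callable that writes fragment by
# fragment, mirroring the lambda { |response, out| ... } shape.
def streaming_body(invoices)
  lambda do |_response, out|
    out << "<html><body>\n"
    invoices.each do |invoice|
      # Each fragment is written as soon as it is rendered, so only
      # one invoice's markup has to exist as a string at a time.
      out << "<p>Invoice #{invoice.number}: #{invoice.total}</p>\n"
    end
    out << "</body></html>\n"
  end
end

out  = StringIO.new   # stands in for the socket to the browser
body = streaming_body([Invoice.new(1, "10.00"), Invoice.new(2, "24.50")])
body.call(nil, out)
puts out.string
```

In this demo the StringIO still accumulates everything, but with a real socket as `out` each fragment can leave the process as soon as it is written.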