After my previous post, How a Ruby Method Becomes a Rails Action, I got quite a few emails asking about the best way to read the Rails source code. Here's one from Peter, a long-time reader of the blog:
You have mentioned that you enjoy just reading the Rails source code. I am interested in going through the Rails source code but the code base is so large, I'm not quite sure where to start. Would you have a suggestion for someone like me, just learning Rails, to tackle this?
It just seems so overwhelming that I'm not quite sure where to start and how to proceed.
While I don't think there's one best way to read the Rails codebase, I've found a few techniques that are much more effective than the common strategy many employ: clone and open the Rails repository, start reading, get overwhelmed, and give up.
I hope they help you understand the functionality you're exploring efficiently, without getting bogged down in details that are irrelevant to the feature you're trying to make sense of.
But first, why should you read the Rails codebase, and why is it important?
As developers, we spend the bulk of our time reading source code. You read code because you have to add new features or fix bugs. You also read code to understand the underlying system well enough to change it. Sometimes, like me, you may even read code just for pleasure.
Yet, reading code is never taught in school, even though it's a very effective technique to learn programming and become a better software developer, in my humble opinion.
If you know how to effectively read the source code (not just of Rails, but any system in general), you gain a deeper understanding of the system at a level that you just can't get as a user of that system.
Just as reading high-quality writing improves your vocabulary, refines your taste, and makes you a better writer, reading high-quality source code teaches you new patterns and practices, data structures, algorithms, coding styles, and even a new domain-specific language (DSL).
What's more, you get to learn from the best. As an example, check out this PR from Jeremy. I'm sure you'll learn a thing or two that you didn't know before.
Still not convinced? Let's try an example from a different industry. Imagine that you're a salesperson pitching something to a prospect. You have never done this before, so you have no reference point. All you have done so far is read the features of the thing you're selling and memorize the sales script.
After you've done the initial spiel (an elaborate or glib speech or story, typically one used by a salesperson), the prospect comes up with an objection that you can't counter. As expected, you're stunned and have no idea how to react.
Now consider this: You're an experienced salesman who's done hundreds of these sales calls and encountered thousands of objections. By the time the prospect raises their concern, you already know how to address it. You handle the objection gracefully and make the sale, not despite, but because of the objection.
The same principle applies to programming. When you're building software you run into various problems, many of which you may have never seen before. However, if you've read and learned from a ton of source code, you start to build mental models for common causes of bugs, performance issues, and best practices. You learn and remember how some gem author ran into this problem and how they resolved it, and so on.
All this to say: Reading code is important.
With that prelude out of the way, let's look at two concrete techniques you can use to read the Rails source code.
The core idea behind both techniques is to focus on one feature or method at a time, and not just passively read the source, but actively step through the specific method you're interested in with a debugger.
This keeps the scope small, and lets you inspect the local variables, follow the conditional path, and learn exactly what's happening behind the scenes.
Bundle Open Gem
Let's start with a concrete example. Imagine you want to understand how background jobs work in Rails. You've read the Rails guides and skimmed through a few tutorials, and understand the basic usage:
# A background job to remind users
class SendReminderJob < ApplicationJob
  def perform(user)
    # send a reminder to the user
  end
end

# Somewhere else in your app...
SendReminderJob.perform_later(user)
Plain and simple.
However, you want to go a level deeper to understand exactly how Rails accomplishes this. Specifically, you want to understand what happens when you call the perform_later method.
One way is to just open the Rails codebase, do a global search for the perform_later method, find the method definition, and try to make sense of the code.
However, a better approach is to put a breakpoint and step through the method's control flow.
But how can you put a breakpoint inside the Rails source code? You can't just open the Rails repository you cloned and put the breakpoint in it, since your application is not using that particular Rails project. It uses a bundled Rails project stored somewhere under the Ruby installation.
So how can you open that specific Rails installation, add a breakpoint in it, and step through the exact Rails method your app is using?
Let's ask DHH:
Simply run bundle open activejob and Bundler will open the exact version of the ActiveJob framework your application is using.
Note that running bundle open rails won't take you to this code, as Rails itself is a collection of various sub-frameworks (or gems) like ActiveJob, ActiveRecord, and so on. You have to open the specific gem you're interested in.
Let's try this:
$ bundle open activejob
And voila! Bundler opens the gem in a new editor window.
This is the exact source code your app is using.
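If you'd rather see where that code lives on disk without opening an editor, RubyGems can report a gem's install directory. The following is a minimal sketch: gem_source_dir is a hypothetical helper name, and it assumes you run it with your app's bundle loaded (for example, in the Rails console), in which case it should resolve to the same version Bundler opens.

```ruby
# Hypothetical helper: returns the install directory of a gem,
# or nil when the gem is not available to the current Ruby or bundle.
def gem_source_dir(name)
  Gem::Specification.find_by_name(name).gem_dir
rescue Gem::LoadError
  # find_by_name raises a Gem::LoadError subclass when the gem is missing
  nil
end

dir = gem_source_dir("activejob")
puts dir ? "activejob lives in #{dir}" : "activejob is not installed for this Ruby"
```

Returning nil instead of raising keeps the helper convenient to call from a console when you're not sure a gem is part of the bundle.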
The next step is to find the perform_later method. So you'll do a global search for def perform_later and find two instances of the method.
Let's imagine we have no idea which perform_later method is actually getting called. So let's add breakpoints in both methods by editing the source files.
# lib/active_job/enqueuing.rb
def perform_later(...)
  debugger # 👈 add this line
  job = job_or_instantiate(...)
  enqueue_result = job.enqueue
  yield job if block_given?
  enqueue_result
end

# lib/active_job/configured_job.rb
def perform_later(...)
  debugger # 👈 add this line
  @job_class.new(...).enqueue @options
end
By the way, I am using the debug gem for debugging Ruby code, and highly recommend that you use it as well. Rails includes this gem out of the box, and for any other codebase, you can run bundle add debug to install it and add it to your Gemfile.
Next, I'll go back to my Rails app, open the Rails console, and enqueue a sample job.
$ bin/rails console
irb(main):001> GreetUserJob.perform_later
And just like magic, the Ruby interpreter will halt execution on one of the breakpoints you inserted:
Turns out, it's the perform_later method in the Enqueuing module that's getting called whenever we enqueue a background job.
Sweet, we learned something new!
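As a complement to breakpoints, Ruby itself can tell you which definition a call resolves to, via Method#owner and Method#source_location. The sketch below uses plain-Ruby stand-ins: DemoJob and this simplified Enqueuing::ClassMethods are hypothetical and are not the Active Job source. They only mimic the pattern of class methods coming from an extended module.

```ruby
# Hypothetical stand-in for a module that contributes class-level methods.
module Enqueuing
  module ClassMethods
    def perform_later(*args)
      "enqueued with #{args.inspect}"
    end
  end
end

# Hypothetical job class that picks up perform_later from the module.
class DemoJob
  extend Enqueuing::ClassMethods
end

m = DemoJob.method(:perform_later)

# Which module or class actually defines the method?
puts m.owner

# Where is it defined? Returns [file, line].
file, line = m.source_location
puts "#{file}:#{line}"
```

In a Rails console, the equivalent would be something like GreetUserJob.method(:perform_later).source_location, which should point at the file inside the bundled gem, so you know where to put the breakpoint before editing anything.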
At this point, you are inside the Rails codebase. You can step through the entire Rails source code as you would in your regular application.
- To step into a method, type s,
- To step out of the current method, type fin (short for finish),
- To continue, type c.
- To show the state, type i
How great is that?
And of course, you have access to all local variables as usual. Just type the name of the variable and the debugger will show you its value.
For the complete list of commands, check out the debug gem's documentation.
Try reading the perform_later method on your own. Explore what paths it's taking, what variables it's setting, what instances it's creating, and what value it's returning. I am sure you'll learn a ton.
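If you want a quick map of which methods run before you start stepping, Ruby's built-in TracePoint can record every Ruby-level method call. This is only a sketch under stated assumptions: DemoJob and its methods are hypothetical stand-ins rather than Active Job code, and a trace of a real Rails call would be far noisier.

```ruby
# Hypothetical job class with a small call chain to trace.
class DemoJob
  def self.perform_later(*args)
    new.enqueue(*args)
  end

  def enqueue(*args)
    serialize(args)
  end

  def serialize(args)
    args.map(&:to_s)
  end
end

calls = []

# :call fires for methods written in Ruby (not for C-implemented ones like Array#map).
trace = TracePoint.new(:call) { |tp| calls << tp.method_id }

# Tracing is active only for the duration of the block.
trace.enable { DemoJob.perform_later(1, 2) }

puts calls.inspect # => [:perform_later, :enqueue, :serialize]
```

The recorded order gives you a list of candidate methods to set breakpoints in, which is handy when a global search turns up more definitions than you expected.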
Let's move on to the second code-reading technique now.
Run and Follow the Tests
One of the best ways to understand any codebase is to read the tests, and Rails is no exception.
The obvious benefit of a test is to make sure that the code still works after making a change. However, another hidden benefit is that a test can help us get familiar with the codebase quickly. You can execute the code without launching the application in the browser or running the complete program.
What's more, a test allows you to run a specific feature in isolation, which helps you understand the relevant code without worrying about irrelevant details.
Rails puts very high importance on tests. The source code for the framework contains thousands and thousands of high-quality tests that thoroughly test the framework code that hundreds of people contribute to.
In this section, you'll learn how to:
- Run the entire Rails test suite
- Run the tests specific to a module
- Run all tests in a file
- Run a single test
First, make sure you have cloned the Rails repository on your computer.
$ git clone https://github.com/rails/rails.git
$ cd rails
Run the Entire Rails Test Suite
To run all the tests in the entire repository, you can run the following command from the rails directory.
$ bundle exec rake test
But we are not going to do that, as it will take a while. Our purpose is to understand the source of a particular feature, not to contribute to Rails (yet!), so we only need to know how to run a single test, or the tests in the file we are trying to understand.
So let's start by running all the tests in Action Pack, a core Rails framework. It contains the source for Rails controllers and routing, essential parts of any web application.
Running All Tests in a Module
We will switch to the Action Pack directory and run the same command to run all tests for this framework.
$ cd actionpack
$ bin/test # or, bundle exec rake test
# Running:
...
Finished in 6.537328s, 589.9964 runs/s, 2762.1377 assertions/s.
3857 runs, 18057 assertions, 0 failures, 0 errors, 0 skips
As you can see, it took about 6.5 seconds to run almost 4000 tests.
That said, often you won't even need to run all the tests in a sub-framework. Since you'll be working with a particular feature at any time, you only need to run the tests for that particular feature, which we'll explore next.
Running All Tests in a File
We can go further and run the tests in a specific file that we are trying to understand.
$ bin/test test/controller/request_forgery_protection_test.rb
Running 286 tests in parallel using 4 processes
Run options: --seed 29532
# Running:
..............................................................................................................................................................................................................................................................................................
Finished in 0.726885s, 393.4598 runs/s, 1349.5945 assertions/s.
286 runs, 981 assertions, 0 failures, 0 errors, 0 skips
You can go even further and run a specific test.
Running a Single Test
Running all tests in a file is great, but we can also run a single test, which is what we will be doing a lot, while reading the source.
To run a single test, run the same command, passing the line number of the test:
$ bin/test test/cases/queuing_test.rb:15 # run from the activejob directory
Using inline
Run options: --seed 16622
# Running:
.
Finished in 0.061077s, 16.3728 runs/s, 16.3728 assertions/s.
1 runs, 1 assertions, 0 failures, 0 errors, 0 skips
At this point, you might be wondering what running tests has to do with reading the codebase.
Running a test is only one part of the equation, and we won't run tests in isolation. To understand the source code, we will debug the tests as they run by putting breakpoints in the source, following the control flow, pausing the execution, and examining the state of the variables at any particular moment in time.
Again, let's imagine that I want to understand how ActiveJob enqueues a job to be executed in the background using the perform_later method.
First, I will require the debug gem in the test helper as follows:
# activejob/test/helper.rb
# frozen_string_literal: true
require "active_job"
require "debug" # add this line
Next, I'll insert a breakpoint in the perform_later method.
# lib/active_job/enqueuing.rb
def perform_later(...)
  debugger
  job = job_or_instantiate(...)
  enqueue_result = job.enqueue
  yield job if block_given?
  enqueue_result
end
Finally, I'll run a test that uses the perform_later method, such as the following one.
# activejob/test/cases/queuing_test.rb
class QueuingTest < ActiveSupport::TestCase
  test "run queued job" do
    HelloJob.perform_later
    assert_equal "David says hello", JobBuffer.last_value
  end
end
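You may wonder why the assertion can pass immediately after perform_later, without waiting for a worker. The earlier test output says "Using inline", which suggests the job runs right away in the same process. Here is a rough plain-Ruby sketch of that idea. Every class below is a simplified, hypothetical stand-in that borrows names from the test; it is not how Active Job is actually implemented.

```ruby
# Hypothetical in-memory buffer that jobs write to, so tests can inspect results.
module JobBuffer
  @values = []

  class << self
    def add(value)
      @values << value
    end

    def last_value
      @values.last
    end
  end
end

# Hypothetical base class that behaves like an "inline" queue:
# perform_later runs the job immediately instead of deferring it.
class InlineJob
  def self.perform_later(*args)
    new.perform(*args)
  end
end

class HelloJob < InlineJob
  def perform(greeter = "David")
    JobBuffer.add("#{greeter} says hello")
  end
end

HelloJob.perform_later
puts JobBuffer.last_value # => David says hello
```

When you step through the real perform_later with the debugger, keep an eye out for where the actual code hands the job to the adapter. That is the point this sketch collapses into a single line.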
I want to run the single test, so I'll use the following command:
$ bin/test test/cases/queuing_test.rb:15
And voila! The test halts at our breakpoint.
Once again, you are free to explore the flow of control, local variables, global state, and much more, to your heart's content! This is the part I find most enjoyable, to be honest. Just follow the code, wherever it takes you.
To reiterate what we learned in section one, I can hit n a few times to step forward, or s to step into a function. Pressing i shows me the available variables. At any point, you can type whereami to see your whereabouts in the codebase. To step out of a function, type fin and to continue, press c. Easy peasy.
Check out the following post to learn more about debugging:
By the way, if all this talk about ActiveJob has made you curious about how it actually works, you may find the following post helpful.
And also check the rails-internals tag on this blog, which contains all the posts where we try to make sense of a particular Rails feature by reading its source code.
A Few More Tips
Start your reading with small programs.
If you've never read any open-source codebase before, don't directly jump into Rails, as you'll be overwhelmed and quit soon.
Instead, start with a simple Ruby gem, such as rack, or mail. Personally, I am currently reading the source code of the solid_queue gem from 37signals and learning a ton. Once you find yourself comfortable reading smaller pieces of code, move on to the larger ones.
When you work on a feature that you don't quite understand, make a note to come back later and read how it's implemented. Run its tests. This gives you not only immediate feedback on how the code is supposed to work but also a sense of achievement and motivation.
Next, try making a few changes to the code to test your understanding. When you open a gem using bundle open gem, you can modify the source code, save, and execute it. Begin with small changes and gradually increase their scope. Active involvement with real code gives you far more confidence and comfort than you can gain by just reading it.
Finally, since Rails makes a ton of use of Ruby's metaprogramming techniques, it's good to know some basics. You don't have to be an expert, just enough to understand what you're reading and how it works.
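To make that concrete, here is a small sketch of one such technique, define_method, and why it matters when reading code. The Settings class is a hypothetical example, not something taken from Rails. The point is that a global search for def host? finds nothing, because the method is generated at load time.

```ruby
# Hypothetical class whose predicate methods are generated in a loop.
class Settings
  %w[host port].each do |name|
    attr_accessor name

    # Defines host? and port? without a literal "def" for either.
    define_method("#{name}?") do
      !instance_variable_get("@#{name}").nil?
    end
  end
end

s = Settings.new
s.host = "localhost"

puts s.host? # => true
puts s.port? # => false

# source_location still leads you to the define_method block that created it.
puts Settings.instance_method(:host?).source_location.inspect
```

When a search for a method definition comes up empty in a gem, asking Ruby for source_location like this is often the quickest way to find the code that generates it.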
For a cursory overview, check out my notes on Paolo Perrotta's classic, Metaprogramming Ruby 2.
That's a wrap. I hope you found this article helpful and you learned something new.
As always, if you have any questions or feedback, didn't understand something, or found a mistake, please leave a comment below or send me an email. I reply to all emails I get from developers, and I look forward to hearing from you.
If you'd like to receive future articles directly in your email, please subscribe to my blog. Your email is respected, never shared, rented, sold or spammed. If you're already a subscriber, thank you.