A higher-order function is a function that
- takes another function as a parameter,
- returns a function, or
- does both.
It's an important and powerful concept in functional programming, and Ruby has first-class support for higher-order functions through its delicious flavors: blocks, procs, and lambdas.
However, if you're a new programmer, it can be really confusing to figure out just when you might want to use higher-order functions, i.e. write functions that accept other functions as parameters.
Put another way: when do you use blocks, procs, or lambdas in Ruby?
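As a minimal sketch of the three cases in that definition (the names `apply_twice`, `multiplier`, and `compose` are made up for illustration; they aren't part of Ruby):

```ruby
# Takes a function: calls the given callable two times in a row.
def apply_twice(fn, value)
  fn.call(fn.call(value))
end

# Returns a function: builds a lambda that multiplies by `factor`.
def multiplier(factor)
  ->(n) { n * factor }
end

# Does both: takes two callables and returns a new one that runs g, then f.
def compose(f, g)
  ->(x) { f.call(g.call(x)) }
end

double = multiplier(2)
increment = ->(n) { n + 1 }

apply_twice(double, 3)  # 3 -> 6 -> 12

double_then_increment = compose(increment, double)
double_then_increment.call(5)  # 5 -> 10 -> 11
```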
After writing the above post, I got a few emails asking me to explain how anonymous functions are different from regular functions and when to use them.
In short, you know they're an important concept, and all the examples make sense, but you have no idea when you might use them in the real world.
For a long time, I banged my head against a brick wall trying to figure out how higher-order functions work, and I couldn't find a concise, clear explanation anywhere. So here's my attempt at explaining them with a very simple example.
Let's try to implement the ubiquitous map method in Ruby from first principles.
Map transforms each item in an array into something else. Given an array of items and a function, map applies that function to every item and returns a new array containing the transformed (mapped) elements.
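For a quick feel of that behavior before we build it ourselves, here's Ruby's built-in `map` on a small array of numbers (the sample values are just for illustration):

```ruby
numbers = [1, 2, 3]

# The block is the function that map applies to every item.
doubled = numbers.map { |n| n * 2 }

doubled  # => [2, 4, 6]
numbers  # => [1, 2, 3] (the original array is left unchanged)
```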
I hope that by the end of the post, you'll have a solid understanding of when you might need to write a function that takes other functions (blocks, procs, or lambdas in Ruby) as a parameter.
Imagine you have a list of email subscribers as plain old Ruby objects (instances of the Subscriber class), and you need to get a list of their email addresses.
We can implement this feature (without writing any fancy code) like this:
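If you want to run the examples in this post, you'll need some data to work with. Here's one possible setup. The post doesn't show the actual classes, so the `Struct` definitions and the sample values below are assumptions made purely for illustration.

```ruby
# Hypothetical stand-ins for the real classes.
Subscriber = Struct.new(:name, :email)
Product    = Struct.new(:name, :price)

subscribers = [
  Subscriber.new("Ada", "ada@example.com"),
  Subscriber.new("Grace", "grace@example.com")
]

products = [
  Product.new("Keyboard", 49.99),
  Product.new("Mouse", 19.99)
]
```

Any object that responds to `email` or `price` would work just as well.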
def collect_emails(subscribers)
  emails = []
  subscribers.each do |subscriber|
    emails << subscriber.email
  end
  emails
end

emails = collect_emails(subscribers)
Now imagine you also have a list of products and want to know the price of each item. You might write something like this:
def collect_prices(products)
  prices = []
  products.each do |product|
    prices << product.price
  end
  prices
end

prices = collect_prices(products)
If you look carefully, you'll notice that the two functions are very similar. In both cases, we perform the following operations:
- create an empty array,
- iterate over another list of items,
- create a new item by performing some operation on each item in that list,
- append the result of the previous operation to the new array, and finally
- return that array
There's only one real difference between them: the operation we are performing on each item in the list.
In the first example, we're calling the email method on the item.
email = subscriber.email
In the second example, we're extracting the price of a product.
price = product.price
Let's generalize the names of everything except the two lines of code that differ. We get the following functions.
def collect_emails(items)
  results = []
  items.each do |item|
    result = item.email # code that changes
    results << result
  end
  results
end

def collect_prices(items)
  results = []
  items.each do |item|
    result = item.price # code that changes
    results << result
  end
  results
end

emails = collect_emails(subscribers)
prices = collect_prices(products)
The code still works as expected.
Let's remove the duplication by extracting the part that's changing into a separate function that's stored in a variable. Specifically, we'll achieve this by extracting those chunks of code into Ruby lambdas or procs. These are anonymous functions.
# using lambda
email_collector = ->(subscriber) { subscriber.email }
price_collector = ->(product) { product.price }
# using proc
email_collector = Proc.new { |subscriber| subscriber.email }
price_collector = proc { |product| product.price }
We're simply storing the code that we want to execute later in a separate variable. Nothing fancy.
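Storing the code doesn't run it; nothing happens until you invoke it with `call`. As a rough sketch (the `Subscriber` struct here is a hypothetical stand-in for the real class), this shows the invocation, along with one difference between the two flavors worth knowing about: a lambda checks its argument count, while a proc doesn't.

```ruby
Subscriber = Struct.new(:email)  # hypothetical stand-in for the real class
subscriber = Subscriber.new("ada@example.com")

email_lambda = ->(s) { s.email }
email_proc   = proc { |s| s.email }

# Both run only when called.
from_lambda = email_lambda.call(subscriber)  # => "ada@example.com"
from_proc   = email_proc.call(subscriber)    # => "ada@example.com"

# A lambda is strict about its arguments, like a regular method.
strict_error =
  begin
    email_lambda.call(subscriber, :extra)
  rescue ArgumentError => e
    e.class
  end
strict_error  # => ArgumentError

# A proc silently ignores the extra argument.
lenient = email_proc.call(subscriber, :extra)  # => "ada@example.com"

email_lambda.lambda?  # => true
email_proc.lambda?    # => false
```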
Here are the resulting examples. I'll use the lambda, as it's my favorite.
def collect_emails(items)
  results = []
  email_collector = ->(subscriber) { subscriber.email } # code that changes
  items.each do |item|
    result = email_collector.call(item)
    results << result
  end
  results
end

def collect_prices(items)
  results = []
  price_collector = ->(product) { product.price } # code that changes
  items.each do |item|
    result = price_collector.call(item)
    results << result
  end
  results
end

emails = collect_emails(subscribers)
prices = collect_prices(products)
We're getting close to completely removing the duplicated code.
Let's extract the big chunk of code that's repeated in both functions. We'll do this by turning the {email/price}_collector variable, which is a lambda, into a parameter. I'll call the new parameter collector and pass it in from the code that calls our function.
def collect(items, collector)
  results = []
  items.each do |item|
    result = collector.call(item)
    results << result
  end
  results
end

email_collector = ->(subscriber) { subscriber.email }
emails = collect(subscribers, email_collector)

price_collector = ->(product) { product.price }
prices = collect(products, price_collector)
We can further simplify the usage by eliminating the temporary variables as follows:
emails = collect subscribers, ->(subscriber) { subscriber.email }
prices = collect products, ->(product) { product.price }
Let's use blocks, which are pretty. This also lets us eliminate the second parameter, collector. We can simply yield the item after checking that a block was provided.
The yield keyword calls the provided block and passes its arguments along to that block.
def collect(items)
  results = []
  items.each do |item|
    result = yield(item) if block_given?
    results << result
  end
  results
end

emails = collect(subscribers) { |subscriber| subscriber.email }
prices = collect(products) { |product| product.price }
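If you'd rather see the block as a named parameter, Ruby lets you capture it with `&`. The following is an alternative sketch, not a replacement for the yield version; I've named it `collect_with` (a made-up name) to keep the two apart. The same `&` also works in the other direction, turning a lambda you already have into a block.

```ruby
# Captures the block as a Proc object named `block`.
def collect_with(items, &block)
  results = []
  items.each do |item|
    results << block.call(item)
  end
  results
end

# Called with a literal block, just like the yield version.
squares = collect_with([1, 2, 3]) { |n| n * n }  # => [1, 4, 9]

# Called with an existing lambda, converted to a block by `&`.
shout = ->(word) { word.upcase }
shouted = collect_with(%w[hello world], &shout)  # => ["HELLO", "WORLD"]
```

Note that this sketch assumes a block is always given; without one, `block` is nil and `block.call` raises a NoMethodError.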
Congratulations, we've implemented a higher-order function called collect, also known as map.
def map(items)
  results = []
  items.each do |item|
    result = yield(item) if block_given?
    results << result
  end
  results
end
Ruby already implements the map and collect methods on arrays, so we can call them directly on subscribers and products.
emails = subscribers.collect { |subscriber| subscriber.email }
prices = products.map { |product| product.price }
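When the block does nothing but call a single method on each item, as it does here, Ruby offers a shorthand: `&:email` converts the symbol into a proc that calls `email` on its argument. A small sketch, with hypothetical `Struct` classes and sample data standing in for the real ones:

```ruby
# Hypothetical stand-ins for the real classes and data.
Subscriber = Struct.new(:name, :email)
Product    = Struct.new(:name, :price)

subscribers = [Subscriber.new("Ada", "ada@example.com")]
products    = [Product.new("Keyboard", 49.99)]

# Equivalent to subscribers.map { |subscriber| subscriber.email }
emails = subscribers.map(&:email)  # => ["ada@example.com"]

# Equivalent to products.map { |product| product.price }
prices = products.map(&:price)     # => [49.99]
```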
And that's how you can create a higher-order function that accepts another function (a block, proc, or lambda in Ruby) and invokes it at some later point in execution.
That's a wrap. I hope you liked this article and learned something new.
As always, if you have any questions or feedback, didn't understand something, or found a mistake, please leave a comment below or send me an email. I reply to all emails I get from developers, and I look forward to hearing from you.