When I used to program in C# (or even Java before that), one of the topics that always puzzled me was when to use which class. There are literally thousands and thousands of classes in the core language, framework, and the standard library. For example, here are five of the types that implement the IDictionary interface in C#.
Hashtable
SortedList
SortedList<TKey, TValue>
Dictionary<TKey, TValue>
ConcurrentDictionary<TKey, TValue>
Of course, there's an excellent use case for each, and I don't doubt the decisions of the language and framework creators (huge respect for Anders Hejlsberg, creator of C#). But as a programmer building run-of-the-mill CRUD web applications, having so many choices can be really daunting and confusing. When do you choose which type? What if you made a wrong choice?
In contrast, Ruby has a single Hash class to manage key-value pairs. It's a very flexible data structure. It can act as a data object, a dictionary, a hash table, a sorted list, and much more. Like almost everything in Ruby, it's a sharp knife that you can use to cut yourself, but can also use to cook great food; I mean, write good programs.
This post explores the Hash data structure in depth, and lists some of the obscure but useful operations in day-to-day programming.
What is a Hash?
A Hash is a general-purpose data structure for storing pairs of data. In programming terms, a Hash is a data type that associates (maps) values to keys.
Here's a simple Ruby hash that has two keys, :name and :price (both are Ruby symbols), with the values 'book' and 10 respectively.
product = { name: 'book', price: 10 }
An important thing to note with the above syntax is that it's actually a shorthand that was added in Ruby 1.9 to simplify using symbol keys.
Here's the original hash syntax for the same code above:
product = { :name => 'book', :price => 10 }
In this case, the keys are explicitly created as Ruby symbols, i.e. :name and :price.
In addition to symbols, Ruby hashes can have almost any object as the key, e.g. strings and integers. If you want to have any other values as keys, then you have to use the older syntax, which uses arrows.
# older syntax
data = { "product" => "iPhone", 600 => "Price (in dollars)" }
values = { 0 => 'first', 1 => 'second' } # just use arrays instead
You can also define user-defined objects as keys. But it's rare, and deserves a blog post of its own. Check out the documentation to learn more.
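As a rough sketch of how key lookup behaves: a hash finds keys by value (via hash and eql?), so a string and a symbol with the same text are different keys, while two equal Struct instances count as the same key. Point and the grid hash below are hypothetical names used only for illustration.

```ruby
# A symbol key and a string key with the same text are different keys.
product = { name: 'book' }
p product[:name]     # "book"
p product['name']    # nil

# Point is a hypothetical value object. Struct defines eql? and hash
# from its members, which is what a Hash needs to match keys by value.
Point = Struct.new(:x, :y)

grid = {
  Point.new(0, 0) => 'origin',
  Point.new(1, 0) => 'east'
}

# A new but equal Point finds the entry stored under the first one.
p grid[Point.new(0, 0)]   # "origin"
p grid[Point.new(5, 5)]   # nil, no such key
```

For your own classes, the same behavior would require defining both eql? and hash consistently, which is the part that deserves a longer write-up.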
A Hash is similar to an Array, but it stores key-value pairs. In an array, each element has a positional index. In a hash, a key can be almost any object, each key must be unique, and each key maps to a single value.
A different way to think of a Hash is as a chest of drawers, where each drawer has a unique label (key) on it and can store files (values). Whenever you need a file (value), just open the drawer with the proper label.
Now, two drawers can't have the same label, right? How would you know which drawer to open? Similarly, if you try to create a hash with duplicate keys, Ruby will warn you and only store the last key-value pair.
person = { name: 'Akshay', name: 'AK' }
# Output
warning: key :name is duplicated and overwritten on line 20
To retrieve the data, use the array notation. If the key exists, it will return the value, otherwise it will return nil.
person[:name]
=> "AK"
data["product"]
=> "iPhone"
data[600]
=> "Price (in dollars)"
Alternatively, you can use the fetch method on a hash, as follows:
person.fetch(:name)
The fetch method also lets you provide a default value, which is returned if the key doesn't exist.
product.fetch(:model) # raises KeyError (key not found: :model)
product.fetch(:model, 'iPhone') # "iPhone"
Prefer fetch over []
In addition to being able to provide a default value, another nice thing about fetch is that if a key doesn't exist, it will immediately let you know by raising a KeyError, instead of returning nil. This will prevent errors caused by an unexpected nil further down the chain.
person.fetch(:names)
# `fetch': key not found: :names (KeyError)
# Did you mean? :name
For example, if you use [] to access a non-existent key, you won't know it until you try to call a method on the nil object and it fails.
props = { width: '60', height: '40' }
shape = props[:shape] # nil
shape.print
# Output
main.rb:24:in `<main>': private method `print' called for nil:NilClass (NoMethodError)
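fetch also accepts a block, which is only evaluated when the key is missing, and it distinguishes a missing key from a key whose value is nil. Here is a small sketch using a hypothetical settings hash:

```ruby
# Hypothetical settings hash; :timeout is present but set to nil.
settings = { retries: 3, timeout: nil }

# The block runs only for a missing key and receives that key.
log_level = settings.fetch(:log_level) { |key| "default-#{key}" }
puts log_level                      # default-log_level

# The key exists, so fetch returns its stored nil, not the fallback.
p settings.fetch(:timeout, 30)      # nil

# With [] and ||, a stored nil and a missing key look the same.
p settings[:timeout] || 30          # 30

# Without a default, a missing key raises KeyError.
begin
  settings.fetch(:verbose)
rescue KeyError => e
  puts e.message                    # key not found: :verbose
end
```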
Delete a Key
To delete a key, simply call the delete method on the hash, passing the key. It returns the value that was stored under that key.
product = { name: 'iPhone', price: '500' }
product.delete(:price) # "500"
product # { :name => "iPhone" }
If you try to delete a non-existent key, Ruby won't complain; delete simply returns nil.
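delete also accepts a block, which is called only when the key is missing. The sketch below walks through the three cases on a sample product hash:

```ruby
product = { name: 'iPhone', price: '500' }

# Deleting a missing key returns nil and leaves the hash untouched.
p product.delete(:color)        # nil

# With a block, the block's result is returned for a missing key.
message = product.delete(:color) { |key| "no #{key} to delete" }
p message                       # "no color to delete"

# Deleting an existing key returns its value and removes the entry.
p product.delete(:price)        # "500"
p product.key?(:price)          # false
```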
Nested Hash
A hash can be nested. To access the inner values, you can use the brackets, or use the dig method.
product = { phone: { model: 'iPhone' } }
puts product[:phone][:model] # iPhone
puts product.dig(:phone, :model) # iPhone
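The two approaches differ when a key along the path is missing. This sketch, which adds a hypothetical catalog hash for the array case, shows dig returning nil where chained brackets raise:

```ruby
product = { phone: { model: 'iPhone' } }

# dig stops and returns nil as soon as a key along the path is missing.
p product.dig(:tablet, :model)      # nil

# Chained brackets call [] on nil instead, which raises NoMethodError.
begin
  product[:tablet][:model]
rescue NoMethodError => e
  puts "chained [] failed: #{e.class}"   # chained [] failed: NoMethodError
end

# dig also walks through arrays nested inside the hash.
catalog = { phones: [{ model: 'iPhone' }, { model: 'Pixel' }] }
p catalog.dig(:phones, 1, :model)   # "Pixel"
```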
Default Values
If a hash doesn't contain the key, it returns nil. However, you can set the default value for the hash using its default= setter. Alternatively, you can pass a block to Hash.new when creating the hash; the block is called with the hash and the missing key, and its result is returned.
person = { name: "Akshay", age: 31 }
person[:city] # nil
person.default = "-" # "-"
person[:city] # "-"
person = Hash.new { |hash, key| "Default value for #{key}" }
person[:city] # "Default value for city"
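Two default-value patterns come up often: a numeric default for counting, and a block that stores a fresh array per key for grouping. The sketch below uses made-up helper names (count_words and group_by_length); note that the block has to assign hash[key] itself if the default should be kept in the hash.

```ruby
# Hypothetical helper: counts words using a default of 0 for unseen keys.
def count_words(words)
  counts = Hash.new(0)
  words.each { |word| counts[word] += 1 }
  counts
end

# Hypothetical helper: groups words by length. The block assigns
# hash[key], so every key gets its own array that stays in the hash.
def group_by_length(words)
  groups = Hash.new { |hash, key| hash[key] = [] }
  words.each { |word| groups[word.length] << word }
  groups
end

words = %w[tea milk tea sugar milk tea]
p count_words(words)['tea']       # 3
p count_words(words)['coffee']    # 0, the default; nothing is stored
p group_by_length(words)[4]       # ["milk", "milk"]
```

Using a block rather than Hash.new([]) is a deliberate choice here: a single default object passed to Hash.new would be shared by every missing key.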
If you already have a bunch of variables, you can create a new hash with those variable names as keys, using the shorthand syntax added in Ruby 3.1:
width = '10px'
height = '20px'
border = 'rounded'
properties = { width:, height:, border: }
puts properties # {:width=>"10px", :height=>"20px", :border=>"rounded"}
Iterating Over a Hash
The Hash class includes the Enumerable module. Additionally, a hash preserves the insertion order of its entries. This effectively makes a hash act like a list (or an array). This is useful when you want to loop over a hash with each, each_key, each_pair, each_value, keys and values. Let's look at each method along with an example.
The each method lets you loop over the key-value pairs. If you provide a single argument to the block, the pair will be passed as an array.
properties = { width: '30px', height: '10px', color: 'green' }
properties.each do |prop|
p prop
end
# Output
[:width, "30px"]
[:height, "10px"]
[:color, "green"]
If you provide two arguments, the key-value pair will be spread over those two. If you want to be more expressive, use each_pair, which works the same.
properties = { width: '30px', height: '10px', color: 'green' }
properties.each do |prop, val|
puts "#{prop}: #{val}"
end
# OR
properties.each_pair do |prop, val|
puts "#{prop}: #{val}"
end
# Output
width: 30px
height: 10px
color: green
To access keys and values separately, use the keys, each_key, values, or each_value methods, which work as you'd expect.
properties = { width: '30px', height: '10px', color: 'green' }
p properties.keys
p properties.values
properties.each_key { |key| puts key }
properties.each_value { |value| puts value }
# Output
[:width, :height, :color]
["30px", "10px", "green"]
width
height
color
30px
10px
green
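Because Hash includes Enumerable, methods such as map, sort_by, and each_with_index work on key-value pairs as well. Here is a short sketch on a similar properties hash; to_css is a hypothetical helper name:

```ruby
# Hypothetical helper: turns a hash of CSS properties into one string.
def to_css(properties)
  properties.map { |prop, val| "#{prop}: #{val}" }.join('; ')
end

properties = { width: '30px', height: '10px', color: 'green' }

puts to_css(properties)   # width: 30px; height: 10px; color: green

# sort_by returns an array of [key, value] pairs; to_h turns it back
# into a hash, now ordered by key name.
sorted = properties.sort_by { |prop, _val| prop }.to_h
p sorted.keys             # [:color, :height, :width]

# each_with_index wraps the pair, so destructure it with parentheses.
properties.each_with_index do |(prop, val), index|
  puts "#{index}: #{prop} = #{val}"
end
```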
Passing Hash to Functions
You can pass a hash to a function (but you probably knew that).
def process_payment(payment)
print payment.keys # [:product, :price, :apple_care]
end
payment = {
product: 'iPhone 13 mini',
price: 800.00,
apple_care: false
}
process_payment(payment)
But did you know that the curly braces may be omitted when the last argument in a method call is a Hash?
def process_payment(user, payment)
puts user
p payment.keys
end
user = 'Akshay'
process_payment(user, product: 'iPhone 13 mini', price: 800.00, apple_care: false)
# Output
Akshay
[:product, :price, :apple_care]
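That said, just because you can, doesn't mean you should. Ruby 2.7 deprecated, and Ruby 3.0 removed, the automatic conversion of a trailing hash argument into keyword arguments. Omitting the braces for an ordinary positional parameter, as above, still works, but it blurs the line between a hash and keyword arguments. A better solution is to use the double-splat operator (**), which we'll see next.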
For a great example demonstrating why it was deprecated, please check out this comment on Hacker News.
Double Splat Operator
This probably deserves a separate blog post of its own, but you can use the double splat operator to 'unpack' hashes into other hashes.
properties = {
width: '30px',
height: '10px'
}
style = { **properties, border: 'none' }
p style
# Output
{:width=>"30px", :height=>"10px", :border=>"none"}
Additionally, you can also use it to capture all keyword arguments to a method (which can also be a simple hash).
def process_payment(user, **payment)
puts user
p payment.keys
end
user = 'Akshay'
process_payment(user, product: 'iPhone 13 mini', price: 800.00, apple_care: false)
# Output
Akshay
[:product, :price, :apple_care]
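The double splat also works in the other direction: it can spread an existing hash into a method's keyword arguments. The sketch below assumes Ruby 3.0 or later, where a hash is no longer converted to keywords automatically; describe_payment and its keyword signature are made up for illustration.

```ruby
# Hypothetical keyword-based method, not part of the examples above.
def describe_payment(user, product:, price:, apple_care: false)
  care = apple_care ? 'with' : 'without'
  "#{user} buys #{product} for #{price} #{care} AppleCare"
end

payment = { product: 'iPhone 13 mini', price: 800.00 }

# ** spreads the hash's symbol keys into keyword arguments.
puts describe_payment('Akshay', **payment)
# Akshay buys iPhone 13 mini for 800.0 without AppleCare

# Passing the hash itself is treated as a second positional argument,
# which on Ruby 3.0+ raises ArgumentError.
begin
  describe_payment('Akshay', payment)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```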
Useful Hash Methods
In this section, we'll take a look at some of the common but useful methods on Hash. For all the examples that follow, we'll use the following properties hash.
properties = {
width: '30px',
height: '10px',
color: 'green',
display: :flex,
options: nil
}
any?
Returns true if any key-value pair satisfies a given condition. Otherwise, returns false.
properties.any? { |key, value| value == :flex } # true
compact
Returns a copy of the hash with all nil-valued entries removed. The original hash is not modified. To modify the original hash, use compact!.
properties.compact
# {:width=>"30px", :height=>"10px", :color=>"green", :display=>:flex}
empty?
Returns true if there are no hash entries, false otherwise.
properties.empty? # false
h = {}
h.empty? # true
Hash.new.empty? # true
merge
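Returns a new hash formed by merging another hash into this one. The original hash is not modified; to change it in place, use merge!. You can simply provide the key: value pairs as arguments, too.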
properties.merge({ radius: '5px' })
# OR
properties.merge(radius: '5px')
# {:width=>"30px", :height=>"10px", :color=>"green", :display=>:flex, :options=>nil, :radius=>"5px"}
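When both hashes contain the same key, the value from the argument wins, unless you pass a block to decide. The following is a minimal sketch with hypothetical defaults and overrides hashes:

```ruby
defaults  = { width: '30px', height: '10px', padding: 2 }
overrides = { height: '20px', padding: 3 }

# Without a block, the value from the argument wins for duplicate keys.
merged = defaults.merge(overrides)
p merged[:height]     # "20px"

# With a block, Ruby calls it for each duplicate key to pick the value.
summed = defaults.merge(overrides) do |key, old_val, new_val|
  key == :padding ? old_val + new_val : new_val
end
p summed[:padding]    # 5

# merge returned new hashes; defaults itself is unchanged.
p defaults[:height]   # "10px"
```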
hash.eql? obj
Returns true only if the following conditions are true:
- obj is a Hash
- hash and obj have the same keys (order doesn't matter)
- For each key, hash[key].eql? obj[key]
This is different from equal? which returns true if and only if both values refer to the same object.
new_props = { width: '30px', height: '10px', color: 'green', display: :flex, options: nil }
# false, as they're different objects
properties.equal?(new_props)
# true, as their keys and values are the same
properties.eql?(new_props)
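To make the distinction concrete, here is a small sketch that also includes ==, which compares values with == rather than eql?. The hashes a, b, and c are made up for the example.

```ruby
a = { width: 1, height: 2 }
b = { height: 2, width: 1 }     # same entries, different order
c = { width: 1.0, height: 2 }   # Float 1.0 instead of Integer 1

p a == b        # true, order of entries is ignored
p a.eql?(b)     # true
p a == c        # true, because 1 == 1.0
p a.eql?(c)     # false, because 1.eql?(1.0) is false
p a.equal?(b)   # false, two separate objects
p a.equal?(a)   # true, the very same object
```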
except(*keys)
Returns a new Hash without the entries for the given keys.
properties.except(:display)
properties.except(:display, :options)
# Output
{:width=>"30px", :height=>"10px", :color=>"green", :options=>nil}
{:width=>"30px", :height=>"10px", :color=>"green"}
reject
This is similar to except, in the sense that it removes the keys from the hash. The main difference is that you can pass a block to it, which is executed for each key-value pair. All pairs satisfying the condition will be removed.
properties.reject { |key, value| value == :flex }
# {:width=>"30px", :height=>"10px", :color=>"green", :options=>nil}
As always, it won't change the original hash; to do that, use reject!.
filter and select
These methods (filter is an alias for select) keep the key-value pairs satisfying a given condition and return them in a new hash.
properties.filter { |key, value| value == :flex }
# {:display=>:flex}
fetch_values(*keys) OR hash.values_at(:k1, :k2)
Returns an array containing the values for the given keys.
Additionally, you can pass a block to fetch_values. It will be called for each missing key. The return value of the block is used for the key's value.
properties.fetch_values(:width, :height) # ["30px", "10px"]
properties.fetch_values(:height, :radius) { |key| key.to_s } # ["10px", "radius"]
properties.values_at(:width, :height) # ["30px", "10px"]
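The two methods differ when a key is missing and no block is given. A brief sketch, using a trimmed-down properties hash:

```ruby
properties = { width: '30px', height: '10px' }

# values_at fills in nil for keys that are not in the hash.
p properties.values_at(:width, :radius)    # ["30px", nil]

# fetch_values raises KeyError for a missing key when no block is given.
begin
  properties.fetch_values(:width, :radius)
rescue KeyError => e
  puts e.message                           # key not found: :radius
end
```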
has_key?, member?, include?, and key?
All of these methods check if the hash contains the given key.
properties.has_key? :width # true
properties.key? :shape # false
properties.member? :color # true
properties.include? :object # false
has_value?, value?
Check if the hash contains the given value.
properties.has_value? 'green' # true
properties.value? :flex # true
properties.value? :random # false
length or size
Returns the number of entries in the hash.
properties.length # 5
properties.size # 5
count
It returns the number of entries just like length and size, but it also takes a block and returns the count of entries satisfying the block condition.
properties.count # 5
properties.count { |k, v| v.to_s.include?('px') } # 2
slice(*keys)
Returns a new hash containing the entries for the given keys.
properties.slice(:width, :height)
# Output
{:width=>"30px", :height=>"10px"}
transform_values
Returns a new hash with the values transformed by a block that accepts each value. It doesn't modify the original hash. To change the original hash, use the transform_values! method.
data = { a: 100, b: 200 }
new_data = data.transform_values { |v| v * 2 }
new_data # {:a=>200, :b=>400}
data # {:a=>100, :b=>200}
flatten
Returns an array that is a 1-dimensional flattening of the hash.
properties.flatten
# [:width, "30px", :height, "10px", :color, "green", :display, :flex, :options, nil]
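By default, flatten only goes one level deep, so values that are arrays stay nested unless you pass a depth. Here is a small sketch with a hypothetical layout hash:

```ruby
layout = { size: ['30px', '10px'], color: 'green' }

# By default only the key-value pairs are flattened; array values stay nested.
p layout.flatten       # [:size, ["30px", "10px"], :color, "green"]

# A depth of 2 also flattens one level of array values.
p layout.flatten(2)    # [:size, "30px", "10px", :color, "green"]
```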
I'll stop now, but there are a whole lot of other operations you could perform on a hash. Check out the Hash and Enumerable documentation for a comprehensive reference.