It was the day we were moving. I was observing how the "Packers and Movers" professionals packed our furniture. For example, the king-size bed shown below had to fit into a space of about 6-7 inches inside a van. While I kept wondering how they'd manage this, they dismantled the bed. And in went the camel through the needle's eye very neatly.
That's when I realized the computing world is not very different from the real world. They dismantled the bed for transportation and then reassembled it at the destination. Similarly, in the computing world, we deconstruct objects or data structures into a format that enables easy storage or transfer, and reconstruct them whenever required. This is nothing but serialization.
In short, serialization is turning a complex, multi-dimensional object into a single flat string of bytes or characters. This can then be stored anywhere or sent across the web easily.
We will be delving into the following three topics in this series.
- Serialization in Ruby
- Serialization in Rails - for Storage
- Serialization in Rails - for Data Transfer via APIs
In this article, we will learn how serialization works in Ruby.
Serialization in Ruby
You may come across instances where you would need to save Ruby objects in a file or send data to another program across the web.
To pull this off, Ruby provides two kinds of mechanisms for serializing objects. They differ in the format (the rules) used for dismantling and reassembling:
- Binary
- Human-Readable
a. YAML
b. JSON
1. Binary Format
Ruby supports binary serialization through the Marshal module, which is built into the core language, so no require is needed.
The marshalling library transforms a Ruby object, or a collection of them, into a stream of bytes that we humans can't decipher but Ruby can. The Marshal.dump method converts the object to a byte stream, and the Marshal.load method (also available under the alias Marshal.restore) reconstructs the object.
Below is a class representation of the above real-life example in Ruby.
We'll create an object of the above class and serialize it using Marshal.
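The original listing is not reproduced here, so what follows is only a minimal sketch of the idea. The Bed class, its size and parts attributes, and the sample values are hypothetical stand-ins for the bed in the story. Only Marshal.dump and Marshal.load come from Ruby itself.

```ruby
# Hypothetical class standing in for the bed from the moving-day story.
class Bed
  attr_reader :size, :parts

  def initialize(size, parts)
    @size = size
    @parts = parts
  end

  # Value-based equality so a reassembled copy can be compared to the original.
  def ==(other)
    other.is_a?(Bed) && size == other.size && parts == other.parts
  end
end

bed = Bed.new("King", ["headboard", "frame", "slats", "mattress"])

# Dismantle: Marshal.dump returns a binary String.
packed = Marshal.dump(bed)
puts packed.inspect

# Reassemble: Marshal.load builds a brand-new object from the bytes.
restored = Marshal.load(packed)
puts restored == bed       # true  (same contents)
puts restored.equal?(bed)  # false (a different object in memory)
```

The last two lines show the point of the exercise: the reassembled bed has the same contents as the original, but it is a separate copy.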
As you can see, even though the encoded string looks like gibberish, the reconstructed object is the same as the original. This type of serialization can be used when we are not concerned with being able to read the encoded data.
Note that it took 0.19 ms overall.
2. Human-Readable Format
a. YAML (YAML Ain't Markup Language)
YAML is a human-readable serialization standard that uses indentation and dashes to represent object data.
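YAML supports serialization of objects of any class in Ruby. The YAML module in Ruby is an alias of the Psych module, which has been the default YAML parser since Ruby 1.9.3. The YAML.dump and YAML.load methods are used for encoding and decoding the objects. Note that on Ruby 3.1 and later (Psych 4), YAML.load rebuilds only basic types by default, so custom classes have to be allowed through the permitted_classes option or loaded with YAML.unsafe_load.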
Let's serialize the same object using YAML and benchmark it.
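As a rough sketch of the YAML round trip (without the timing), the example below uses the same hypothetical Bed class as a stand-in for the article's object. It assumes Ruby 2.6 or later, and it uses YAML.safe_load with permitted_classes rather than a bare YAML.load so that it behaves the same on older and newer Psych versions.

```ruby
require "yaml"

# Hypothetical class standing in for the bed from the moving-day story.
class Bed
  attr_reader :size, :parts

  def initialize(size, parts)
    @size = size
    @parts = parts
  end

  # Value-based equality so a reassembled copy can be compared to the original.
  def ==(other)
    other.is_a?(Bed) && size == other.size && parts == other.parts
  end
end

bed = Bed.new("King", ["headboard", "frame", "slats", "mattress"])

# Dismantle: YAML.dump returns readable text tagged with the class name.
yaml_text = YAML.dump(bed)
puts yaml_text

# Reassemble: explicitly allow Bed, since safe loading rejects unknown classes.
restored = YAML.safe_load(yaml_text, permitted_classes: [Bed])
puts restored == bed  # true
```

The printed text carries a `!ruby/object:Bed` tag followed by the instance variables as plain key-value lines, which is what makes the format readable.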
Notice how easy the serialized object is to read. But it took almost 56 ms.
b. JSON (JavaScript Object Notation)
JSON is also a human-readable data interchange format that needs no introduction. We are familiar with the JSON format for serialization as it has become a popular choice for data exchange on the web.
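Ruby ships with the JSON library (loaded with require "json"), which provides JSON.load and JSON.dump, similar to the methods above, along with to_json and JSON.parse for converting data to and from JSON. Unlike Marshal and YAML, JSON only knows basic types such as strings, numbers, arrays and hashes, so a custom class has to describe itself in those terms.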
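A short sketch of these methods on plain data may help before moving on to custom objects. The data hash is a made-up example. The calls themselves (to_json, JSON.generate, JSON.dump, JSON.parse, JSON.load) are part of Ruby's json library.

```ruby
require "json"

# Hypothetical sample data made of basic types only.
data = { "size" => "King", "parts" => ["headboard", "frame"] }

# Three ways to encode; for plain data they produce the same text.
text_a = data.to_json
text_b = JSON.generate(data)
text_c = JSON.dump(data)
puts text_a  # {"size":"King","parts":["headboard","frame"]}

# Decoding gives back hashes, arrays and strings.
parsed = JSON.parse(text_a)
loaded = JSON.load(text_a)

# Keys come back as strings unless symbols are requested.
symbolized = JSON.parse(text_a, symbolize_names: true)
puts symbolized[:size]  # King
```

JSON.parse is usually the safer choice for text that comes from outside your program, since JSON.load is intended for trusted input.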
Let's serialize the same object again using JSON and benchmark it.
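Since JSON has no notion of Ruby classes, the object must be converted to and from basic types by hand. Below is one possible sketch, again using the hypothetical Bed class. The from_json class method is a name chosen here for illustration, not part of the json library. Timing is left out.

```ruby
require "json"

# Hypothetical class standing in for the bed from the moving-day story.
class Bed
  attr_reader :size, :parts

  def initialize(size, parts)
    @size = size
    @parts = parts
  end

  # Describe the object as a plain Hash and let the json library encode it.
  def to_json(*args)
    { "size" => size, "parts" => parts }.to_json(*args)
  end

  # Illustrative helper (not a library method): rebuild a Bed from JSON text.
  def self.from_json(text)
    data = JSON.parse(text)
    new(data["size"], data["parts"])
  end

  # Value-based equality so a reassembled copy can be compared to the original.
  def ==(other)
    other.is_a?(Bed) && size == other.size && parts == other.parts
  end
end

bed = Bed.new("King", ["headboard", "frame", "slats", "mattress"])

json_text = bed.to_json
puts json_text  # {"size":"King","parts":["headboard","frame","slats","mattress"]}

restored = Bed.from_json(json_text)
puts restored == bed  # true
```

The encoded text carries no class name, so the receiving side has to know that it describes a Bed. This is the price of a format that any language can read.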
Notice that this is much faster than YAML: it took only 0.37 ms.
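The timings quoted in this article will vary with the machine, the Ruby version and the size of the data. To get a rough feel on your own setup, something like the sketch below could be used. It times a single round trip per format with a monotonic clock from Process.clock_gettime. The payload hash and the time_ms helper are made up for this example, and a plain Hash is used so that all three formats can handle it unchanged.

```ruby
require "json"
require "yaml"

# Hypothetical payload: basic types only, so every format can round-trip it.
payload = { "size" => "King", "parts" => ["headboard", "frame", "slats", "mattress"] }

# Illustrative helper: run a block and return [result, elapsed milliseconds].
def time_ms
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  [result, elapsed]
end

round_trips = {
  "Marshal" => -> { Marshal.load(Marshal.dump(payload)) },
  "YAML"    => -> { YAML.safe_load(YAML.dump(payload)) },
  "JSON"    => -> { JSON.parse(JSON.generate(payload)) }
}

# Time one dump-and-load cycle per format and keep the restored data.
results = round_trips.map do |name, round_trip|
  restored, ms = time_ms { round_trip.call }
  puts format("%-8s %8.3f ms", name, ms)
  [name, restored]
end.to_h
```

A single run is noisy, so for anything more serious it is worth repeating each round trip many times and averaging.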
Below is a comparison of the three formats.
Each serialization format has its own perks and uses. The choice of format mainly depends on what works best for your use case. As this blog summarizes, choose Marshal for speed, JSON for speed plus human-readability, and YAML for human-readability and small data sets.
I hope this article has shed some light on the Ruby serialization formats. We will see how these formats are leveraged by Rails for storage and data transfer in the coming parts.
Thank you for reading.