Rails is a developer-friendly web application framework that enables developers to do more with less code, but it isn’t always clear exactly what’s going on under the covers. One area where I’ve had a hard time was understanding how Rails parses the Request query parameters and the Form variables.
So what's a query parameter?
Query parameters are an optional set of key-value pairs that appear after the question mark in the URL. For example, name=contra is a query parameter in https://example.com/over/there?name=contra.
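As a small illustration, the query part of the example URL can be pulled apart with Ruby's standard uri library. This is only a sketch of the idea of key-value pairs; it is not how Rails or Rack read the query string.

```ruby
require "uri"

# Parse the example URL from the text and isolate the part after the "?".
uri = URI.parse("https://example.com/over/there?name=contra")
query = uri.query

# decode_www_form returns a flat list of [key, value] pairs.
pairs = URI.decode_www_form(query)

puts query
p pairs
```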
And what's a form variable?
When we submit a form on the web, the form sends data to the server as form variables. For example, player[game_attributes][name] and player[game_attributes][release_year] are form variables in the snippet below.
<p>
Game Name: <input name="player[game_attributes][name]">
Release Year: <input name="player[game_attributes][release_year]">
</p>
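To get a feel for what the server receives, here is a rough sketch of how those two fields could look once they are URL-encoded into a request body. The values "contra" and "1987" are made up for illustration, and the encoding is done with Ruby's standard uri library rather than by a browser.

```ruby
require "uri"

# Hypothetical values a user might type into the two inputs.
fields = {
  "player[game_attributes][name]" => "contra",
  "player[game_attributes][release_year]" => "1987"
}

# Square brackets are percent-encoded (%5B and %5D) on the wire.
body = URI.encode_www_form(fields)
puts body

# Decoding gives back the original field names as flat [name, value] pairs.
decoded = URI.decode_www_form(body)
p decoded
```

Note that at this stage the field names are still flat strings; the nesting only appears once the names are parsed.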
Parameters Parsing
Several server-side frameworks handle form data specially when the inputs are named in a certain way. For example, in Rails, if an input named user[name] is submitted as part of a form, the server parses it into a Hash such as { "user" => { "name" => value } }, where value is whatever was submitted for that input. This lets the server program read the input value as params["user"]["name"], where params is the variable through which the parsed Hash is accessible.
Rails uses Rack for parameter parsing, so the same rules apply in other Rack-based Ruby frameworks such as Sinatra and Padrino.
What's happening under the covers?
The function in Rack that does all the work of generating the magical params is parse_nested_query. The documentation is not very comprehensive, so we can experiment a bit to figure out what’s going on under the covers.
Here's what the method looks like:
def parse_nested_query(qs, d = '&;')
  params = {}
  (qs || '').split(/[#{d}] */n).each do |p|
    k, v = unescape(p).split('=', 2)
    normalize_params(params, k, v)
  end
  return params
end
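The first half of that method, the split-and-unescape loop, can be tried in isolation. The sketch below is a stand-in written with Ruby's standard uri library; sketch_pairs is a hypothetical name, and the nesting step done by normalize_params is deliberately left out.

```ruby
require "uri"

# Hypothetical helper: mirrors only the split-and-unescape loop,
# returning flat [key, value] pairs instead of a nested Hash.
def sketch_pairs(query_string, delimiters = "&;")
  (query_string || "").split(/[#{delimiters}] */).map do |pair|
    # Unescape first, then split on the first "=" only.
    key, value = URI.decode_www_form_component(pair).split("=", 2)
    [key, value]
  end
end

p sketch_pairs("name=contra&player%5Bgame%5D=mario")
```

Each pair produced here is what gets handed to normalize_params, which is where the square brackets in the key start to matter.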
Experimental Setup
Since Rack is written in Ruby, we can open irb and do the initial setup.
ayush-kiproshs-MacBook ~ $ irb
2.7.0 :001 > require "rack"
=> true
2.7.0 :002 > def parse(query_string)
2.7.0 :003 > Rack::Utils.parse_nested_query(query_string)
2.7.0 :004 > end
=> :parse
We require Rack and define a parse helper to make experimenting easier. With the setup ready, let's walk through the cases.
Case 1: When the fields are named without square brackets
If there are no square brackets, the value directly gets assigned to the parameter.
2.6.6 :013 > query_string = "game_1=contra&game_2=mario"
2.6.6 :014 > parse(query_string)
=> {"game_1"=>"contra", "game_2"=>"mario"}
Case 2: When using square brackets to name attributes
If part of the field name is enclosed in square brackets, the value is assigned to that key inside a nested Hash.
2.6.6 :015 > query_string = "player[game]=contra"
2.6.6 :016 > parse(query_string)
=> {"player"=>{"game"=>"contra"}}
Case 3: When the field name ends with empty square brackets
If the field name ends with empty square brackets, the parameter is treated as an array, and each value is appended to it.
2.7.0 :017 > query_string = "player[games][]=contra&player[games][]=mario"
2.7.0 :018 > parse(query_string)
=> {"player"=>{"games"=>["contra", "mario"]}}
Case 4: When parsing objects (combination of the above three cases)
From the cases above, we can see that the parser reads the query parameters from left to right in the query string and starts a new object each time it sees an attribute that the current object already has.
2.7.0 :021 > query_string = "player[games][][name]=contra&player[games][][release_year]=1987&player[games][][name]=mario&player[games][][release_year]=1983"
2.7.0 :022 > parse(query_string)
=> {"player"=>{"games"=>[{"name"=>"contra", "release_year"=>"1987"}, {"name"=>"mario", "release_year"=>"1983"}]}}
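The rules seen in the four cases can be captured in a short pure-Ruby sketch. This is a simplified approximation written for this article, not Rack's actual normalize_params; sketch_normalize and sketch_parse are hypothetical names, and edge cases that Rack handles (malformed brackets, type conflicts, depth limits) are ignored.

```ruby
require "uri"

# Simplified sketch of the nesting rules; NOT Rack's actual source.
def sketch_normalize(params, name, value)
  # Grab the first key, ignoring brackets around it: "[games][][name]" -> "games".
  match = name.match(/\A[\[\]]*([^\[\]]+)\]*/)
  return params if match.nil?

  key   = match[1]
  after = match.post_match

  if after.empty?
    # Case 1: plain key.
    params[key] = value
  elsif after == "[]"
    # Case 3: trailing empty brackets append to an array.
    (params[key] ||= []) << value
  elsif (child = after[/\A\[\]\[([^\[\]]+)\]\z/, 1])
    # Case 4: array of objects. Reuse the last object unless it
    # already has this attribute, in which case start a new one.
    list = (params[key] ||= [])
    if list.last.is_a?(Hash) && !list.last.key?(child)
      list.last[child] = value
    else
      list << { child => value }
    end
  else
    # Case 2: nested key, recurse into a child Hash.
    params[key] = sketch_normalize(params[key] || {}, after, value)
  end
  params
end

def sketch_parse(query_string)
  URI.decode_www_form(query_string).each_with_object({}) do |(name, value), params|
    sketch_normalize(params, name, value)
  end
end

p sketch_parse("player[games][][name]=contra&player[games][][release_year]=1987")
```

The branch for arrays of objects is the interesting one: whether a value joins the last object or opens a new one depends only on which attributes the last object already has.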
Case 5: Parameter corruption
We know from the above cases that a new object is created each time the parser encounters a repeated attribute. So if the fields of one object are not grouped together in the query string, the parser returns a corrupted params object, as the example below shows.
2.7.0 :023 > query_string = "player[games][][name]=contra&player[games][][name]=mario&player[games][][release_year]=1987&player[games][][release_year]=1983"
2.7.0 :024 > parse(query_string)
=> {"player"=>{"games"=>[{"name"=>"contra"}, {"name"=>"mario", "release_year"=>"1987"}, {"release_year"=>"1983"}]}}
Query parameter parsing is strictly order-dependent. Fortunately, browsers send the form parameters in the same order in which the controls appear in the source.
Refer to the W3C specification to learn about browser standards.
multipart/form-data The parts are sent to the processing agent in the same order the corresponding controls appear in the document stream. Part boundaries should not occur in any of the data; how this is done lies outside the scope of this specification.
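When a query string is assembled by hand rather than by a browser, the same ordering has to be preserved. As a minimal sketch, the hypothetical helper below (games_query is not part of Rails or Rack) emits all fields of one game before moving on to the next, which is the grouping shown in Case 4.

```ruby
require "uri"

# Hypothetical helper: writes every field of one game before the next game starts.
def games_query(games)
  games.flat_map do |game|
    game.map do |field, value|
      "player[games][][#{field}]=#{URI.encode_www_form_component(value)}"
    end
  end.join("&")
end

games = [
  { "name" => "contra", "release_year" => "1987" },
  { "name" => "mario",  "release_year" => "1983" }
]

query_string = games_query(games)
puts query_string
```

The field names are left unescaped here to keep the output readable; real request code would normally percent-encode the brackets as well.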