Ruby 3.2 enhances Regexp security with ReDoS protections

What is ReDoS?

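Regular expression Denial of Service (ReDoS) is a security vulnerability in which a crafted input makes a regular expression (regex) take an extremely long time to evaluate, typically because a backtracking regex engine tries a huge number of ways to match the input. An attacker exploits this to exhaust CPU time and make a system or network unavailable to its intended users.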

An example occurrence of a ReDoS

Imagine that a website has a form that accepts user input and uses a regex to validate it. The regex is meant to allow only alphanumeric characters. A simple pattern such as /^[a-zA-Z0-9]+$/ is safe, because each character can be matched in only one way, but a variant with a nested quantifier, such as /^([a-zA-Z0-9]+)*$/, is not.
An attacker could craft an input that consists of a long run of valid characters followed by one invalid character, such as this:
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!'.
When this string is passed through the nested-quantifier regex on a backtracking engine, the engine tries every way of splitting the run of a's between the inner and outer repetition before it can report a failure. The number of attempts grows exponentially with the length of the input, so validation can take a very long time, potentially causing the system to become unresponsive or crash. This would prevent legitimate users from accessing the website, effectively denying them service.
To prevent this attack, use regexes that cannot match the same text in many different ways, and avoid constructs that are known to cause heavy backtracking. It is also important to test regexes for efficiency and security, including against long inputs that almost match, before deploying them to the production environment.
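
As a rough illustration of the kind of pattern to look out for, the sketch below compares a nested-quantifier pattern with a flat pattern that accepts the same strings. Both constants and the sample inputs are hypothetical, and the inputs are kept short on purpose so the script finishes quickly on any Ruby version.

```ruby
# Hypothetical patterns for illustration only.
# NESTED quantifies a group that itself contains a quantifier, so a run of
# characters can be split between the inner and outer repetition in many ways.
NESTED = /\A([a-zA-Z0-9]+)*\z/

# FLAT accepts the same strings, but each character can be consumed in only
# one way, so there is nothing for the engine to backtrack over.
FLAT = /\A[a-zA-Z0-9]*\z/

samples = ["", "abc123", "abc 123", "aaaa!", "Z9"]

samples.each do |s|
  puts format("%-10p nested=%-5s flat=%s", s, NESTED.match?(s), FLAT.match?(s))
end
```

The two patterns agree on every input; the difference only shows up in how much work a backtracking engine does when a long input fails to match.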

How to prevent ReDoS in Ruby < 3.2.0?

To prevent ReDoS attacks in a Ruby on Rails application, there are several steps that you can take:

Focus on the pattern rather than on the matching method. match, =~, scan, and grep all run the same backtracking engine, so switching between them does not prevent ReDoS.
Avoid regexes with nested quantifiers such as (a+)*, and repetitions that can match the same text in more than one way, such as (a|a)* or a*a*. These can be very slow on inputs that almost match. A single unbounded repetition such as a* or a+ is not a problem by itself.
Do not rely on the /o option for protection. It only makes #{...} interpolation in a regex literal happen once; it does not change how the regex backtracks.
Limit the length of user input before matching, so that even a slow regex has a bounded worst case.
Test your regexes for efficiency and security before deploying them in a production environment. This helps you find potential ReDoS vulnerabilities and fix them before an attacker can exploit them.
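
One mitigation that works on any Ruby version is to cap the input length before the regex runs. The following is a minimal sketch; the method name, the constant names, and the 64-character limit are assumptions made for illustration, not part of any library.

```ruby
# Assumed limit for illustration; pick a value that fits your own field.
MAX_USERNAME_LENGTH = 64

# \A and \z anchor the whole string (^ and $ only anchor individual lines).
USERNAME_PATTERN = /\A[a-zA-Z0-9]+\z/

# Hypothetical validator: rejects oversized input before the regex is run,
# so the worst-case matching cost is bounded by MAX_USERNAME_LENGTH.
def valid_username?(input)
  return false unless input.is_a?(String)
  return false if input.length > MAX_USERNAME_LENGTH

  USERNAME_PATTERN.match?(input)
end

puts valid_username?("alice42")
puts valid_username?("a" * 10_000)
```

The length check is cheap and runs first, so an oversized payload never reaches the regex engine at all.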

Regexp improvements introduced in Ruby 3.2.0 to prevent ReDoS.

Ruby 3.2.0 introduced two improvements that significantly mitigate ReDoS.

Improved Regexp matching algorithm

Since Ruby 3.2, the Regexp matching algorithm uses a memoization technique that significantly improves its worst-case behavior:

This technique allows most regexp matches to complete in linear time.
That is, the time a match takes is roughly proportional to the length of the input string, instead of growing polynomially or exponentially as the input gets longer.
For the patterns it covers, this removes the excessive backtracking that ReDoS attacks depend on.
In Ruby 3.2, the optimization cannot be applied to every pattern: regexps that use advanced features such as back-references or look-around, or a huge fixed number of repetitions, do not benefit from it.
The optimization may also consume memory proportional to the input length for each match.
This should not cause any practical problems because the memory allocation is usually delayed, and a normal regexp match should consume at most 10 times as much memory as the input length.

Example:

With 3.1.0

  :001 > require 'benchmark'
  => true
  :002 > Benchmark.realtime { /^a*b?a*$/ =~ "a" * 50000 + "x" }
  => 24.171763999998802

With 3.2.0

  :001 > require 'benchmark'
  => true
  :002 > Benchmark.realtime { /^a*b?a*$/ =~ "a" * 50000 + "x" }
  => 0.007867999986046925
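
To check whether a given pattern is covered by the optimization, Ruby 3.2 also provides Regexp.linear_time?, to the best of my knowledge. The sketch below guards the call with respond_to? so that it also runs on older versions; the two sample patterns and their labels are chosen for illustration, and the answers can differ between Ruby versions.

```ruby
# Sample patterns for illustration: one plain repetition and one that uses a
# back-reference, a feature the memoization cannot handle.
patterns = {
  "plain repetition"    => /^a*b?a*$/,
  "with back-reference" => /^a*b?a*()\1$/
}

if Regexp.respond_to?(:linear_time?)
  patterns.each do |label, re|
    puts "#{label}: linear_time? = #{Regexp.linear_time?(re)}"
  end
else
  puts "Regexp.linear_time? is not available on Ruby #{RUBY_VERSION}"
end
```

A pattern that is not reported as linear-time is a good candidate for a timeout or an input length limit.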

Preventing ReDoS attacks with regexp timeouts

Ruby 3.2 also lets you specify a timeout for regexp matching, so that a match that runs too long is interrupted instead of blocking the process. There are two APIs for setting the timeout:

Regexp.timeout=

This is the process-global configuration of timeout for regexp matching. It allows you to specify a timeout that will apply to all regexp matches in your Ruby application. For example, you can use it like this:
```
  # Set a timeout of 1 second for regexp matching
  Regexp.timeout = 1.0

  regexp = Regexp.new("^[a-zA-Z0-9]+$")
  string = "abc123" # example input

  # Perform a regexp match (scan is a String method, not a Regexp method)
  string.scan(regexp)
```
In this example, the Regexp.timeout global configuration is set to 1 second. This means that any regexp match performed in the application will have a timeout of 1 second. If the match takes longer than 1 second to complete, it will raise a Regexp::TimeoutError.
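
Because the setting is process-wide, callers need to decide what a timed-out match means for them. The sketch below is one possible approach, not an established pattern: with_regexp_timeout and alphanumeric? are hypothetical helper names, and treating a timeout as "invalid input" is an assumption that may not suit every application.

```ruby
# Runs the block with a process-wide regexp timeout, then restores the
# previous setting. On Rubies without Regexp.timeout= it just runs the block.
def with_regexp_timeout(seconds)
  return yield unless Regexp.respond_to?(:timeout=)

  previous = Regexp.timeout
  Regexp.timeout = seconds
  begin
    yield
  ensure
    Regexp.timeout = previous
  end
end

# Hypothetical helper: treats a timed-out match as rejected input.
def alphanumeric?(input)
  /\A[a-zA-Z0-9]+\z/.match?(input)
rescue Regexp::TimeoutError
  false
end

puts with_regexp_timeout(1.0) { alphanumeric?("abc123") }
```

In a real application the global timeout would more likely be set once at boot, for example in an initializer, rather than around individual calls.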

timeout keyword of Regexp.new

This API allows you to specify a timeout for a specific regexp object. This is useful when you want to use different timeout settings for different regexps in your application. For example, you can use it like this:
```
  # Create a regexp with a timeout of 1 second
  regexp = Regexp.new("^[a-zA-Z0-9]+$", timeout: 1.0)
  string = "abc123" # example input

  # Perform a regexp match (scan is a String method, not a Regexp method)
  string.scan(regexp)
```
In this example, the timeout keyword is used with the Regexp.new method to specify a timeout of 1 second for the regexp object. This means that any match performed with this regexp object will have a timeout of 1 second. If the match takes longer than 1 second to complete, it will raise a Regexp::TimeoutError. A timeout set on a regexp object takes precedence over the global Regexp.timeout setting.
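
The per-object setting can be read back with Regexp#timeout, which makes it easy to see which regexps carry their own limit. This is a small sketch that assumes Ruby 3.2 or later; the variable names are illustrative, and the whole block is skipped on older versions.

```ruby
if Regexp.respond_to?(:timeout)
  # A regexp with its own timeout, and one that falls back to the global value.
  strict  = Regexp.new("^[a-zA-Z0-9]+$", timeout: 0.5)
  default = Regexp.new("^[a-zA-Z0-9]+$")

  puts "strict.timeout  = #{strict.timeout.inspect}"
  puts "default.timeout = #{default.timeout.inspect}"
  puts "Regexp.timeout  = #{Regexp.timeout.inspect}"
else
  puts "Regexp timeouts are not available on Ruby #{RUBY_VERSION}"
end
```

A regexp created without the keyword has no per-object timeout, so only the global Regexp.timeout (if any) applies to it.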
