Ruby 3.2.0 enhances Regexp performance and security with ReDoS protections

What is ReDoS? Regular expression Denial of Service (ReDoS) is a security vulnerability that can occur when a regular expression (regex) takes excessively long to evaluate against a long, specially crafted string. The attack is designed to make a system or network unavailable to its intended users. An example occurrence of ReDoS: Imagine that a website has a form that accepts user input and uses a regex to validate it. The regex is designed to allow only alphanumeric characters in the input, so it looks like this: /^[a-zA-Z0-9]+$/. An attacker could potentially craft a string of input that consists of
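As a rough illustration of the validation scenario in this excerpt, the sketch below wraps the quoted pattern in a helper and sets a global match timeout. The method name `valid_input?` and the one-second limit are assumptions made for the example, not something taken from the post; `Regexp.timeout=` is only available on Ruby 3.2 and later, so the call is guarded.

```ruby
# Pattern quoted in the excerpt: it is meant to allow only alphanumeric input.
ALPHANUMERIC = /^[a-zA-Z0-9]+$/

# Regexp.timeout= was introduced in Ruby 3.2; older Rubies skip this line.
# The 1.0 second limit is an arbitrary value chosen for this sketch.
Regexp.timeout = 1.0 if Regexp.respond_to?(:timeout=)

# Hypothetical helper: returns true when the input passes validation.
def valid_input?(input)
  ALPHANUMERIC.match?(input)
rescue RegexpError
  # On Ruby 3.2+, a match that exceeds the limit raises Regexp::TimeoutError,
  # a subclass of RegexpError. Treat a timed-out match as invalid input.
  false
end

puts valid_input?("abc123")
puts valid_input?("abc 123!")
```

Rejecting input on timeout is one possible design choice; an application might prefer to log the event or return an error to the caller instead.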

Regular Expressions - Greedy vs non-greedy

By default, regular expression matching is greedy, which means a pattern tries to match as many characters as possible in a given string. Let's look at an example using the HTML snippet <p>Hello</p><span>Awesome</span><p>World</p>. Our task is to extract the first p tag, i.e., the pattern match should return <p>Hello</p>. The immediate solution is to write the regex /<p>.*<\/p>/, but it would match the whole string. Greedy: The reason it matches the whole string is
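The behaviour described in this excerpt can be reproduced in a few lines of Ruby. This is a minimal sketch: the variable names are made up for the example, and the non-greedy form `.*?` is the standard lazy quantifier, shown here on the assumption that it is the fix the full post goes on to discuss.

```ruby
html = "<p>Hello</p><span>Awesome</span><p>World</p>"

# Greedy: .* consumes as much as it can, so the match runs to the last </p>.
greedy_match = html[/<p>.*<\/p>/]

# Non-greedy: .*? consumes as little as it can, so the match stops at the first </p>.
lazy_match = html[/<p>.*?<\/p>/]

puts greedy_match  # <p>Hello</p><span>Awesome</span><p>World</p>
puts lazy_match    # <p>Hello</p>
```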

Introduction to Regular Expressions (aka regex) (Part 1)

A regular expression is a pattern that describes a certain amount of text; it is a shorthand for specifying a search. It is used to find text that matches a pattern within a larger text, to replace the matching text, or to split the matching text into groups. The power of regular expressions for extracting specific text from documents lies in their ability to replace many lines of code with as little as one. Some terms used in regular expressions: Literal - A literal is a character we use in a search or matching expression. For example, to find
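The three uses named in this excerpt (finding, replacing, and splitting a match into groups) might look like the following in Ruby. The sample sentence and variable names are invented for the example.

```ruby
text = "Order 42 shipped on 2023-01-15"

# Find: extract the first run of digits.
first_number = text[/\d+/]

# Replace: swap the date for a placeholder.
masked = text.sub(/\d{4}-\d{2}-\d{2}/, "<date>")

# Groups: capture year, month, and day separately.
year, month, day = text.match(/(\d{4})-(\d{2})-(\d{2})/).captures

puts first_number  # 42
puts masked        # Order 42 shipped on <date>
puts [year, month, day].inspect  # ["2023", "01", "15"]
```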