Using regular expressions is security-sensitive. It has led in the past to the following vulnerabilities:

Evaluating regular expressions against input strings is potentially an extremely CPU-intensive task. Specially crafted regular expressions such as (a+)+s will take several seconds to evaluate the input string aaaaaaaaaaaaaaaaaaaaaaaaaaaaabs. The problem is that with every additional a character added to the input, the time required to evaluate the regex doubles. However, the equivalent regular expression, a+s (without grouping) is efficiently evaluated in milliseconds and scales linearly with the input size.

Evaluating such regular expressions opens the door to Regular expression Denial of Service (ReDoS) attacks. In the context of a web application, attackers can force the web server to spend all of its resources evaluating regular expressions thereby making the service inaccessible to genuine users.

This rule flags any execution of a hardcoded regular expression which has at least 3 characters and at least two instances of any of the following characters: *+{.

Example: (a+)*

Ask Yourself Whether

There is a risk if you answered yes to any of those questions.

Recommended Secure Coding Practices

Check whether your regular expression engine (the algorithm executing your regular expression) has any known vulnerabilities. Search for vulnerability reports mentioning the one engine you're are using.

Use if possible a library which is not vulnerable to Redos Attacks such as Google Re2.

Remember also that a ReDos attack is possible if a user-provided regular expression is executed. This rule won't detect this kind of injection.

Sensitive Code Example

import java.util.regex.Pattern;

class BasePattern {
  String regex = "(a+)+b"; // a regular expression
  String input; // a user input

  void foo(CharSequence htmlString) {
    input.matches(regex);  // Sensitive
    Pattern.compile(regex);  // Sensitive
    Pattern.compile(regex, Pattern.CASE_INSENSITIVE);  // Sensitive

    String replacement = "test";
    input.replaceAll(regex, replacement);  // Sensitive
    input.replaceFirst(regex, replacement);  // Sensitive

    if (!Pattern.matches(".*<script>(a+)+b", htmlString)) { // Sensitive
    }
  }
}

This also applies for bean validation, where regexp can be specified:

import java.io.Serializable;
import javax.validation.constraints.Pattern;
import javax.validation.constraints.Email;
import org.hibernate.validator.constraints.URL;

class BeansRegex implements Serializable {
  @Pattern(regexp=".+@(a+)+b")  // Sensitive
  private String email;

  @Email(regexp=".+@(a+)+b")  // Sensitive
  private String email2;

  @URL(regexp="(a+)+b.com") // Sensitive
  private String url;
  // ...
}

Exceptions

Calls to String.split(regex) and String.split(regex, limit) will not raise an exception despite their use of a regular expression. These methods are used most of the time to split on simple regular expressions which don't create any vulnerabilities.

Some corner-case regular expressions will not raise an issue even though they might be vulnerable. For example: (a|aa)+, (a|a?)+.

It is a good idea to test your regular expression if it has the same pattern on both side of a "|".

See

Deprecated

This rule is deprecated; use {rule:java:S5852}, {rule:javasecurity:S2631} instead.