Skip to content

Escape pattern before creating a regular expression from it#123

Closed
parndt wants to merge 1 commit intobbenno:masterfrom
parndt:bugfix/regexp-escaping
Closed

Escape pattern before creating a regular expression from it#123
parndt wants to merge 1 commit intobbenno:masterfrom
parndt:bugfix/regexp-escaping

Conversation

@parndt
Copy link

@parndt parndt commented Nov 20, 2024

By creating a regular expression from what might be user input we are
allowing the system to run arbitrary regular expressions, which could be
a security concern.

We can simply escape it to avoid potential problems. Benchmarking shows
that this runs comparatively quickly:

            original      1.822k (± 1.8%) i/s  (548.72 μs/i) -      9.150k in   5.022464s
             escaped      1.813k (± 5.4%) i/s  (551.64 μs/i) -      9.180k in   5.085108s

@parndt
Copy link
Author

parndt commented Nov 20, 2024

After opening and looking to add a test it occurred to me that maybe it's a necessary feature that the input is not escaped, as the caller can easily do this before calling search i.e. Languages.search(Regexp.escape(input), case_sensitive: false).. so I'm going to close this PR.

Thanks for the gem!

@parndt parndt closed this Nov 20, 2024
@parndt parndt deleted the bugfix/regexp-escaping branch November 20, 2024 01:24
@parndt
Copy link
Author

parndt commented Nov 20, 2024

@bbenno or we could do this:

   def search(pattern, case_sensitive: true)
      unless case_sensitive
        pattern = Regexp.escape(pattern) if pattern.is_a?(String)
        pattern = Regexp.new(pattern, Regexp::IGNORECASE).freeze
      end

      all.select { |l| l.name.match? pattern }
    end

Let me know if you want this patch and I'll reopen with tests.

@bbenno
Copy link
Owner

bbenno commented Nov 20, 2024

Hi @parndt,

Thank you for your contribution. Your initiative is highly appreciated.

I propose delineating the security concerns into two distinct categories: Regular Expression Denial of Service (ReDoS) prevention and input validation.

Regarding ReDoS prevention, the current design permits the passing of arbitrary regular expressions as arguments, which is an intentional feature rather than a flaw. Escaping the pattern argument would undermine this functionality. Therefore, ReDoS mitigation should be addressed at the application level, not within the library itself.

Concerning input validation, if pattern is a string, passing an unvalidated string from an untrusted source, which is then automatically converted to a regular expression, could lead to unexpected behaviour for the library user. To address this, I suggest modifying the function's interface to accept only regular expressions as arguments. This change would clarify that it is the application's responsibility to prepare the search pattern concerning case sensitivity, fixed encoding and security considerations like input validation.

def search(pattern)
  raise(ArgumentError, "Pattern must be a String") unless pattern.is_a?(Regexp)

  all.select { |l| l.name.match?(pattern) }
end

Any thoughts on this?

@parndt
Copy link
Author

parndt commented Nov 20, 2024

thanks @bbenno that is great feedback

Do you mean like this:

def search(pattern)
  raise ArgumentError, "Pattern must be a Regexp" unless pattern.is_a?(Regexp)

  all.select { |l| l.name.match?(pattern) }
end

I think this is probably a good idea. It was definitely surprising that the string turned into a Regexp, until I read the source code.

@bbenno
Copy link
Owner

bbenno commented Nov 21, 2024

OK, this will be included in the next version that will be released 🔜.

bbenno added a commit that referenced this pull request Nov 21, 2024
bbenno added a commit that referenced this pull request Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants