Ruby: port js/hardcoded-data-interpreted-as-code#9896

Merged
nickrolfe merged 4 commits into main from
nickrolfe/hardcoded_code
Aug 26, 2022

Conversation

@nickrolfe
Contributor

No description provided.

@github-actions
Contributor

QHelp previews:

ruby/ql/src/queries/security/cwe-506/HardcodedDataInterpretedAsCode.qhelp

Hard-coded data interpreted as code

Interpreting hard-coded data (such as string literals containing hexadecimal numbers) as code or as an import path is typical of malicious backdoor code that has been implanted into an otherwise trusted code base and is trying to hide its true purpose from casual readers or automated scanning tools.

Recommendation

Examine the code in question carefully to ascertain its provenance and its true purpose. If the code is benign, it should always be possible to rewrite it without relying on dynamically interpreting data as code, improving both clarity and safety.
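A minimal sketch of such a rewrite, assuming the benign intent was only to load the standard `json` library. The helper name `decode_hex` and the hex string are illustrative and do not come from the query or its tests:

```ruby
# Hypothetical helper: turns a hex string back into the bytes it encodes.
def decode_hex(hex)
  [hex].pack('H*')
end

# Obscured form: a reader cannot tell what would be loaded without decoding by hand.
obscured_name = decode_hex('6a736f6e')

# Clear form: the library name is written literally, so nothing is interpreted at run time.
require 'json'

parsed = JSON.parse('{"ok": true}')
```

Both forms name the same library, but only the literal `require` lets a reviewer or a scanner see the dependency directly.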

Example

As an example of malicious code using this obfuscation technique, consider the following simplified Ruby version of a snippet of backdoor code that was discovered in a dependency of the popular JavaScript event-stream npm package:

def e(r)
  [r].pack 'H*'
end

# BAD: hexadecimal constant decoded and interpreted as import path
require e("2e2f746573742f64617461")

While this shows only the first few lines of code, it already looks very suspicious since it takes a hard-coded string literal, hex-decodes it and then uses it as an import path. The only reason to do so is to hide the name of the file being imported.
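When auditing code like this, one low-risk approach is to decode the constant and print it rather than pass it to `require`. The following sketch reuses the helper from the example; the variable name `hidden_path` is an assumption for illustration:

```ruby
# Same decoding helper as in the example above.
def e(r)
  [r].pack 'H*'
end

# Decode the constant and inspect it instead of loading anything.
hidden_path = e("2e2f746573742f64617461")
puts hidden_path.inspect  # prints "./test/data"
```

The decoded value is an ordinary relative path, which confirms that the hex encoding serves only to conceal what is being imported.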

@nickrolfe nickrolfe force-pushed the nickrolfe/hardcoded_code branch from 885b525 to 6356b20 Compare July 26, 2022 15:05
@nickrolfe nickrolfe marked this pull request as ready for review July 27, 2022 10:54
@nickrolfe nickrolfe requested a review from a team as a code owner July 27, 2022 10:54
@aibaars
Contributor

aibaars commented Jul 28, 2022

@nickrolfe
Contributor Author

That's a useful example. The query won't flag that, for a few reasons:

  1. The string won't be considered a Source, given the current regexp.
  2. I assume we don't model flow through the call to Zlib::Inflate.inflate.
  3. The decompressed data is not actually executed. It's just written to a file.

I should at least fix 1.
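To illustrate point 1 only, here is a rough sketch of how a pattern restricted to hexadecimal digits would miss a literal made of arbitrary byte escapes. `HEX_ONLY` is a hypothetical stand-in and is not the regexp the query actually uses; the byte string is likewise made up for illustration:

```ruby
# Hypothetical stand-in for a hex-only source pattern; the query's real regexp may differ.
HEX_ONLY = /\A(?:\h\h){4,}\z/

# A literal consisting solely of hexadecimal digits, as in the QHelp example.
hex_literal = "2e2f746573742f64617461"

# A literal made of arbitrary byte escapes (treated as binary so it can be matched safely).
byte_literal = "\x78\x9c\x4b\xcb\xcf\x07\x00\x02\x82\x01\x45".b

hex_match = HEX_ONLY.match?(hex_literal)    # true: only hex digits
byte_match = HEX_ONLY.match?(byte_literal)  # false: contains non-hex bytes
```

Under this assumption, widening the set of strings treated as sources would be needed before such a literal could be considered at all.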

alexrford previously approved these changes Aug 5, 2022
Contributor

@alexrford alexrford left a comment

LGTM overall - I agree that it would be good to extend the DefaultSource to cover sequences of arbitrary character codes as in the metasploit example.

Contributor

@alexrford alexrford left a comment

👍

@nickrolfe nickrolfe merged commit 898689f into main Aug 26, 2022
@nickrolfe nickrolfe deleted the nickrolfe/hardcoded_code branch August 26, 2022 12:49

4 participants