Added regex for new npm secret format by marmegh · Pull Request #588 · microsoft/sarif-pattern-matcher

marmegh · 2021-12-14T07:28:04Z

Changes

Added regex for new npm secret format

For significant contributions please make sure you have completed the following items:

ReleaseHistory.md updated for non-trivial changes
Added unit tests

eddynaka · 2021-12-14T15:42:57Z

 - SDK: Exposing `automationId`, `automationGuid`, and `postUri` in the
  `analyze` command.
  [#586](https://github.com/microsoft/sarif-pattern-matcher/pull/586)
+- NFD: Adding additional regex for 'SEC101/017.NpmAuthorToken'


NFD

what is NFD? should this be FNC? #Closed

Just a typo. Thank you

eddynaka · 2021-12-14T15:43:44Z

  $SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
  $SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
  $SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
+  $SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})


[a-zA-Z0-9]

since you added ?i, you can change this to:
0-9a-z (we normally add numbers first and letters last) #Closed

eddynaka · 2021-12-14T15:44:08Z

  $SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
  $SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
  $SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
+  $SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})


(?:npm)

you don't need to create a group.
#Closed

eddynaka · 2021-12-14T15:44:50Z

  $SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
  $SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
  $SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
+  $SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})


(?:[^s]|^)

instead of ^s, you could just use ^n, because anything that isn't the 'n' could be a match. #Closed

with that, we would have:
(?:[^n]|^)

eddynaka · 2021-12-14T15:46:34Z

  $SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
  $SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
  $SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
+  $SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})


we normally add the end border as well:
(?:[^0-9a-z]|$) #Closed
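
To see how the four suggestions in this thread combine (digits-first character class, no extra group around `npm`, a `[^n]` start border, and an end border), here is a minimal Python sketch. It is an illustration only, not the regex that was finally committed. Python's `re` does not accept a mid-pattern `(?i)`, so the sketch uses the scoped form `(?i:...)`; the name `find_tokens` is hypothetical.

```python
import re

# Sketch of the pattern with the review feedback applied (an assumption,
# not the committed definition of the rule).
IDENTIFIABLE_NPM_TOKEN = re.compile(
    r"(?:[^n]|^)"                         # start border: any char except 'n', or start of input
    r"(?P<secret>npm_(?i:[0-9a-z]{36}))"  # literal prefix, then 36 case-insensitive alphanumerics
    r"(?i:[^0-9a-z]|$)"                   # end border: a non-alphanumeric char, or end of input
)


def find_tokens(text: str) -> list[str]:
    """Return every substring captured by the 'secret' group."""
    return [m.group("secret") for m in IDENTIFIABLE_NPM_TOKEN.finditer(text)]
```

The end border is what rejects a 37-character alphanumeric run after `npm_`; without it, the first 36 characters would still be reported as a token.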

michaelcfanning · 2021-12-14T16:01:19Z

@@ -165,6 +165,13 @@
          "MessageArguments": { "secretKind": "NPM API key" },


API key

why does this say 'API key' but the rule name is author token? #Closed

michaelcfanning · 2021-12-14T16:08:09Z

+        {
+          "Id": "SEC101/017",
+          "Name": "DoNotExposePlaintextSecrets/NpmAuthorToken",
+          "ContentsRegex": "$SEC101/017.NpmAuthorTokenV2",


NpmAuthorTokenV2

we should add a checksum validation for this secret kind, that's the substance of the token change (moving to v2).

We need to separate this into a brand new rule, as we are likely to handle the 'identifiable' secrets differently (or someone needs to propose a convention for marking these as identifiable).

For now, try allocating a new rule id, SEC101/050 it appears? We should call the rule 'IdentifiableNpmAuthorToken' or 'NpmIdentifiableAuthorToken'. The former would sort all identifiable secret types together, the latter would keep the secret type bundled with the platform.

We could consider adding a new property denoting these as identifiable, in addition to making it clear in the rule name. This would be valuable if we special-case behaviors in the engine itself (e.g., the engine could automatically elevate all 'identifiable' secrets to errors in the static analysis phase).

That's a future change, though; no need to worry about it now. Do handle the checksum, though. I just noticed we don't do this in the GH PAT either, so you'll need to follow up with me offline on what to do.

#Resolved

Hello, we do the checksum in the internal version.

I've split this out into a separate rule and added the supporting unit test and functional test files. Let me know what you think. I tried to organize the commits for easy reversal/changes.


eddynaka · 2021-12-15T09:27:43Z

+
+namespace Microsoft.CodeAnalysis.Sarif.PatternMatcher.Plugins.Security
+{
+    public class IdentifiableNpmAuthorTokenValidator : DynamicValidatorBase


IdentifiableNpmAuthorTokenValidator

create a helper that has:

 - uri
 - checkinformation
 - dynamic validation logic
 - internal classes

change this validator + npmtokenvalidator to use that helper

this will reduce the code duplication #Closed
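
A rough sketch of the shape being asked for, written in Python for brevity: the URI and the dynamic validation logic live in one helper, and both validators delegate to it. Everything here is illustrative. The URI is a placeholder, the status-code mapping is an assumption, and the returned state names are stand-ins rather than the project's actual `ValidationState` handling.

```python
from typing import Callable, Mapping

# Placeholder endpoint for illustration only; the real URI is not shown in this thread.
CHECK_URI = "https://registry.example.invalid/whoami"

# Injected HTTP call: (uri, headers) -> status code. Keeps the sketch offline and testable.
HttpGet = Callable[[str, Mapping[str, str]], int]


class NpmAuthorTokenHelper:
    """Single home for the URI and the dynamic validation logic."""

    @staticmethod
    def validate_token(secret: str, http_get: HttpGet) -> str:
        status = http_get(CHECK_URI, {"Authorization": f"Bearer {secret}"})
        if status == 200:
            return "Authorized"
        if status in (401, 403):
            return "Unauthorized"
        return "Unknown"  # assumed fallback for any other response


class NpmAuthorTokenValidator:
    def __init__(self, http_get: HttpGet):
        self._http_get = http_get

    def is_valid_dynamic(self, secret: str) -> str:
        return NpmAuthorTokenHelper.validate_token(secret, self._http_get)


class IdentifiableNpmAuthorTokenValidator:
    def __init__(self, http_get: HttpGet):
        self._http_get = http_get

    def is_valid_dynamic(self, secret: str) -> str:
        return NpmAuthorTokenHelper.validate_token(secret, self._http_get)
```

Because both classes go through the same helper, one set of mocked HTTP responses can exercise both validators.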

eddynaka · 2021-12-15T09:28:16Z


+  <ItemGroup>
+    <None Remove="TestData\SecurePlaintextSecrets\ExpectedOutputs\SEC101_050.IdentifiableNpmAuthorToken.sarif" />
+    <None Remove="TestData\SecurePlaintextSecrets\Inputs\SEC101_050.IdentifiableNpmAuthorToken.ps1" />


you can delete this, since we automatically embed everything under TestData #Closed

eddynaka · 2021-12-15T09:30:20Z

+
+namespace Microsoft.CodeAnalysis.Sarif.PatternMatcher.Plugins.Security.Validators
+{
+    public class IdentifiableNpmAuthorTokenValidatorTests


IdentifiableNpmAuthorTokenValidatorTests

once you create the helper, you might be able to merge the test logic for both rules. #Closed

eddynaka · 2021-12-15T09:30:31Z

+{
+    public class IdentifiableNpmAuthorTokenValidatorTests
+    {
+


you can remove this empty line #Closed

eddynaka · 2021-12-15T09:30:44Z

+
+namespace Microsoft.CodeAnalysis.Sarif.PatternMatcher.Plugins.Security.Validators
+{
+    public class IdentifiableNpmAuthorTokenValidatorTests


add the summary (check other test validators, pls) #Closed

eddynaka · 2021-12-15T09:31:10Z

+                Content = new StringContent(publishResponseJson)
+            };
+
+            var ValidEmptyContentResponse = new HttpResponseMessage(HttpStatusCode.OK)


ValidEmptyContentResponse

verify casing. for local variables we normally start with lower case #Closed

eddynaka · 2021-12-15T09:31:43Z

+                ValidationState currentState = identifiableNpmAuthorTokenValidator.IsValidDynamic(ref fingerprint,
+                                                                                      ref message,
+                                                                                      keyValuePairs,
+                                                                                      ref resultLevelKind);


review indentation. we normally indent based on the first parameter #Closed

eddynaka · 2021-12-15T09:32:43Z

The change looks good!
Just a few refactors to prevent code duplication and some style nits.

In reply to: 994572554

eddynaka · 2021-12-15T09:33:39Z

+
+        protected override IEnumerable<ValidationResult> IsValidStaticHelper(IDictionary<string, FlexMatch> groups)
+        {
+            FlexMatch secret = groups["secret"];


secret

to reduce false-positives, we should validate the checksum:
https://github.blog/changelog/2021-09-23-npm-has-a-new-access-token-format/ #Closed

Check sec101/102 or sec101/006 (internal version)
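
For readers unfamiliar with this kind of check, here is a minimal Python sketch of a static checksum validation. It assumes the layout discussed in this PR: the `npm_` prefix, 30 random characters, and a 6-character base62 encoding of a CRC32 over the random part. The alphabet order, the use of `zlib.crc32`, and the function names are assumptions made for illustration, so the sketch may not agree with real tokens.

```python
import zlib

# Assumed alphabet order; the encoder used by the project may order characters differently.
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(value: int, minimum_length: int = 6) -> str:
    """Encode a non-negative integer in base62, left-padded with the zero digit."""
    digits = []
    while value > 0:
        value, remainder = divmod(value, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits)).rjust(minimum_length, BASE62[0])


def expected_checksum(random_part: str) -> str:
    """CRC32 of the random part, rendered as 6 base62 characters."""
    return base62_encode(zlib.crc32(random_part.encode("utf-8")))


def has_valid_checksum(token: str) -> bool:
    """True when the trailing 6 characters match the checksum of the 30 before them."""
    if len(token) != 40 or not token.startswith("npm_"):
        return False
    random_part, checksum = token[4:34], token[34:]
    return expected_checksum(random_part) == checksum
```

A 32-bit value always fits in six base62 digits because 62^6 is larger than 2^32, which is why the padded length is fixed at 6.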


eddynaka · 2021-12-17T12:05:47Z

+        {
+            FlexMatch secret = groups["secret"];
+
+            if (groups.TryGetNonEmptyValue("checksum", out FlexMatch checksum))


if (groups.TryGetNonEmptyValue("checksum", out FlexMatch checksum))

since this is required, you can follow what we did for the secret. If we don't have that key, it will throw an exception because that is not expected. #Closed

eddynaka · 2021-12-17T12:07:46Z

+                                                       client);
+        }
+
+        private static string Base62EncodeUint32(uint value, int minimumLength = 6)


Base62EncodeUint32

Should we move this to the CRC class, make it static, and add more tests to complete its coverage? What are your thoughts? #Resolved

This kind of code should exist in Microsoft.Security.Utilities. That package should have API that accepts an arbitrary alphabet (such as a base62 encoding scheme) and provides encoding and decoding capabilities.

We will need this API for many classes of secret.

Looks like that functionality is still pending. May I proceed with Eddy's suggestion and ensure that a task to incorporate this is added to the backlog?

We're now using the CustomAlphabetEncoder class to achieve this.
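
As an illustration of the kind of API described in this thread (encoding and decoding over an arbitrary alphabet), here is a small Python sketch. The class name `AlphabetEncoder` and its methods are hypothetical; this is not the `CustomAlphabetEncoder` from Microsoft.Security.Utilities, whose signatures are not shown in this PR.

```python
class AlphabetEncoder:
    """Positional encoder over a caller-supplied alphabet (hypothetical sketch)."""

    def __init__(self, alphabet: str):
        if len(alphabet) < 2 or len(set(alphabet)) != len(alphabet):
            raise ValueError("alphabet needs at least two distinct characters")
        self._alphabet = alphabet
        self._index = {ch: i for i, ch in enumerate(alphabet)}

    def encode(self, value: int, minimum_length: int = 1) -> str:
        """Encode a non-negative integer, left-padded with the alphabet's zero digit."""
        if value < 0:
            raise ValueError("value must be non-negative")
        base = len(self._alphabet)
        digits = []
        while value:
            value, remainder = divmod(value, base)
            digits.append(self._alphabet[remainder])
        return "".join(reversed(digits)).rjust(minimum_length, self._alphabet[0])

    def decode(self, text: str) -> int:
        """Inverse of encode; raises KeyError on characters outside the alphabet."""
        base = len(self._alphabet)
        value = 0
        for ch in text:
            value = value * base + self._index[ch]
        return value
```

Passing the alphabet in, rather than hard-coding base62, is what would let the same code serve other secret formats that use a different character set or ordering.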

eddynaka · 2022-01-12T11:17:07Z

@@ -74,73 +39,9 @@ protected override ValidationState IsValidDynamicHelper(ref Fingerprint fingerpr
                                                                ref ResultLevelKind resultLevelKind)
        {
            string secret = fingerprint.Secret;


string secret = fingerprint.Secret;

you can remove this. #Closed

eddynaka · 2022-01-12T11:18:07Z

+            // Validate checksum to avoid false positives.
+            string randomPart = secret.Value.String.Substring(4, 30);
+            uint checksumValue = Crc32.Calculate(randomPart);
+            var encoder = new CustomAlphabetEncoder();


var encoder = new CustomAlphabetEncoder();

should we create only one encoder and re-use it? #Closed

Thanks, this also helped me catch some other cleanup items.

eddynaka · 2022-01-12T11:19:06Z


-            [JsonProperty("total")]
-            public int Total { get; set; }
+            return NpmAuthorTokenHelper.ValidateTokens(ref fingerprint, ref message, ref resultLevelKind, client);


ref fingerprint, ref message, ref resultLevelKind, client

following the identifiablenpm, can you wrap the parameters? #Closed

eddynaka · 2022-01-12T11:21:57Z

+    public class IdentifiableNpmAuthorTokenValidatorTests
+    {
+        [Fact]
+        public void IdentifiableNpmAuthorTokenValidatorTests_MockHttpTests()


Tests

nit: remove Tests from this.
The idea that we follow:
ClassThatWeAreTesting_Something #Closed

eddynaka · 2022-01-12T11:23:48Z

@@ -0,0 +1,3 @@
+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
+
+"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"


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

Here, you created a test that passes the checksum.
Can you also add a new one that does not pass?
Also, add a comment. Something like:

# This is a pattern that matches the regex but the checksum is invalid npm_abcd....

#Closed

eddynaka · 2022-01-12T22:50:21Z

+{
+    public class IdentifiableNpmAuthorTokenValidator : DynamicValidatorBase
+    {
+        private readonly CustomAlphabetEncoder encoder = new CustomAlphabetEncoder();


new CustomAlphabetEncoder();

can you move this to the constructor of this class?
we try to move initializations to the constructor.

Good note, please provide the rationale for the standard. The reason is that it's harder to understand initialization behavior when assignments are inlined like this: instance variables can be declared anywhere in the class. You really notice the problem with this pattern when debugging a constructor whose initializations are scattered around: the debugger hops from place to place. With an explicit constructor, everything is sequential.

eddynaka

michaelcfanning

Added regex for new npm secret format

ba5507a

marmegh requested review from eddynaka and michaelcfanning as code owners December 14, 2021 07:28

Added note to release history

7dec680

eddynaka reviewed Dec 14, 2021

michaelcfanning reviewed Dec 14, 2021

marmegh added 4 commits December 14, 2021 09:55

fixed typo in release history

d7d3f74

updated new regex string

9b0de4b

Updated secretKind for Npm Auth Tokens

6b2a0d8

Splitting IdentifiableNpmAuthorToken out to a separate rule and adding supporting tests

080cb03

eddynaka reviewed Dec 15, 2021

marmegh added 6 commits December 15, 2021 23:02

Move shared npm dynamic validation logic to NpmAuthorTokenHelper

7c2893e

remove empty line

08c6790

Added summary comment for test class

8fa34b6

corrected variable casing

8ffc9b2

correct indentation

3b6bf24

removed unneeded reference from csproj

38e9c78

marmegh added 4 commits December 15, 2021 23:48

variable casing fixes

e865e96

Moved NpmAuthorToken test cases to be shared between both validator tests

3087813

Added checksum to IdentifiableNpmAuthorToken regex

08206f1

Added checksum validation

cc568ed

eddynaka reviewed Dec 17, 2021

marmegh added 4 commits January 11, 2022 16:11

Add the Microsoft.Security.Utilities nuget package

3e02b0b

Update to use the CustomAlphabetEncoder class to encode the npm checksum

3dcb1ef

Merge branch 'main' into users/marmegh/npmUpdate

ea9b63b

Updated comment and removed unnecessary blank lines.

ce0495c

eddynaka reviewed Jan 12, 2022

PR feedback code cleanup

ad08b7b

eddynaka reviewed Jan 12, 2022

eddynaka approved these changes Jan 12, 2022

michaelcfanning approved these changes Jan 14, 2022

michaelcfanning merged commit d35e876 into main Jan 14, 2022

eddynaka deleted the users/marmegh/npmUpdate branch March 8, 2022 00:33

