Skip to content

Added regex for new npm secret format#588

Merged
michaelcfanning merged 21 commits into
mainfrom
users/marmegh/npmUpdate
Jan 14, 2022
Merged

Added regex for new npm secret format#588
michaelcfanning merged 21 commits into
mainfrom
users/marmegh/npmUpdate

Conversation

@marmegh

@marmegh marmegh commented Dec 14, 2021

Copy link
Copy Markdown
Collaborator

Changes

Added regex for new npm secret format

For significant contributions please make sure you have completed the following items:

  • ReleaseHistory.md updated for non-trivial changes
  • Added unit tests

Comment thread Src/ReleaseHistory.md Outdated
- SDK: Exposing `automationId`, `automationGuid`, and `postUri` in the
`analyze` command.
[#586](https://github.com/microsoft/sarif-pattern-matcher/pull/586)
- NFD: Adding additional regex for 'SEC101/017.NpmAuthorToken'

@eddynaka eddynaka Dec 14, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NFD

what is NFD? should this be FNC? #Closed

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a typo. Thank you

$SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
$SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
$SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
$SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})

@eddynaka eddynaka Dec 14, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[a-zA-Z0-9]

since you added ?i, you can change this to:
0-9a-z (we normally add numbers first and letters last) #Closed

$SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
$SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
$SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
$SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})

@eddynaka eddynaka Dec 14, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(?:npm)

you don't need to create a group.
#Closed

$SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
$SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
$SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
$SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})

@eddynaka eddynaka Dec 14, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(?:[^s]|^)

instead of ^s, you could just use ^n, because anything that isn't the 'n' could be a match. #Closed

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with that, we would have:
(?:[^n]|^)

$SEC101/015.AkamaiCredentials=(?si)https:\/\/(?P<host>[\w\-\.]+)\.akamaiapis\.net.{0,150}(?:(?:client_token.{0,10}(?:[^a]|^)(?P<id>akab[\w\-]+).{0,50})|(?:access_token.{0,10}(?:[^\w\-]|^)(?P<resource>akab[\w\-]+).{0,200})|(?:(?:client_secret).{0,10}(?:[^0-9a-z\/\+]|^)(?P<secret>[0-9a-z\/\+]{43}=))){3}
$SEC101/016.StripeApiKey=(?:[^rs]|^)(?P<secret>(?:r|s)k_(?:live|test)_(?i)[0-9a-z]{24,99})(?:[^0-9a-z]|$)
$SEC101/017.NpmAuthorToken=(?i)npm.{0,100}[^0-9a-z](?-i)(?P<secret>[0-9a-z]{8}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{4}-[0-9a-z]{12})[^0-9a-z]
$SEC101/017.NpmAuthorTokenV2=(?:[^s]|^)(?P<secret>(?:npm)_(?i)[a-zA-Z0-9]{36})

@eddynaka eddynaka Dec 14, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we normally add the end border as well:
(?:[^0-9a-z]|$) #Closed

@@ -165,6 +165,13 @@
"MessageArguments": { "secretKind": "NPM API key" },

@michaelcfanning michaelcfanning Dec 14, 2021

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API key

why does this say 'API key' but the rule name is author token? #Closed

{
"Id": "SEC101/017",
"Name": "DoNotExposePlaintextSecrets/NpmAuthorToken",
"ContentsRegex": "$SEC101/017.NpmAuthorTokenV2",

@michaelcfanning michaelcfanning Dec 14, 2021

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NpmAuthorTokenV2

we should add a checksum validation for this secret kind, that's the substance of the token change (moving to v2).

We need to separate this into a brand new rule, as we are likely to handle the 'identifiable' secrets differently (or someone needs to propose a convention for marking these as identifiable).

For now, try allocating a new rule id, SEC101/050 it appears? We should call the rule 'IdentifiableNpmAuthorToken' or 'NpmIdentifiableAuthorToken'. The former would sort all identifiable secret types together, the latter would keep the secret type bundled with the platform.

We could consider adding a new property denoting these as identifiable in adding to making it clear in the rule name. This would be valuable if we special-case behaviors in the engine itself (e.g., the engine could automatically elevate all 'identifiable' secrets to errors in the static analysis phase.

A future change, though, no need to worry about it now. Let's do handle the checksum. I just noticed we don't do this in the GH PAT either, so you'll need to follow up with me offline on what to do.

#Resolved

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, we do the checksum in the internal version.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've split this out into a separate rule and added the supporting unit test and functional test files. Let me know what you think. I tried to organize the commits for easy reversal/changes.


namespace Microsoft.CodeAnalysis.Sarif.PatternMatcher.Plugins.Security
{
public class IdentifiableNpmAuthorTokenValidator : DynamicValidatorBase

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IdentifiableNpmAuthorTokenValidator

create a helper that has:

  • uri
  • checkinformation
  • dynamic validation logic
  • internal classes

change this validator + npmtokenvalidator to use that helper

this will reduce the code duplication #Closed


<ItemGroup>
<None Remove="TestData\SecurePlaintextSecrets\ExpectedOutputs\SEC101_050.IdentifiableNpmAuthorToken.sarif" />
<None Remove="TestData\SecurePlaintextSecrets\Inputs\SEC101_050.IdentifiableNpmAuthorToken.ps1" />

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can delete this, since we automatically embed everything under TestData #Closed


namespace Microsoft.CodeAnalysis.Sarif.PatternMatcher.Plugins.Security.Validators
{
public class IdentifiableNpmAuthorTokenValidatorTests

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IdentifiableNpmAuthorTokenValidatorTests

once u create the helper, u might be able to merge the test logic for both rules. #Closed

{
public class IdentifiableNpmAuthorTokenValidatorTests
{

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove this empty line #Closed


namespace Microsoft.CodeAnalysis.Sarif.PatternMatcher.Plugins.Security.Validators
{
public class IdentifiableNpmAuthorTokenValidatorTests

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the summary (check other test validators, pls) #Closed

Content = new StringContent(publishResponseJson)
};

var ValidEmptyContentResponse = new HttpResponseMessage(HttpStatusCode.OK)

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValidEmptyContentResponse

verify casing. for local variables we normally start with lower case #Closed

ValidationState currentState = identifiableNpmAuthorTokenValidator.IsValidDynamic(ref fingerprint,
ref message,
keyValuePairs,
ref resultLevelKind);

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review indentation. we normally indent based on the first parameter #Closed

@eddynaka

eddynaka commented Dec 15, 2021

Copy link
Copy Markdown
Collaborator

The change looks good!
Just a few refactors to prevent code duplication and some style nits.


In reply to: 994572554


In reply to: 994572554


protected override IEnumerable<ValidationResult> IsValidStaticHelper(IDictionary<string, FlexMatch> groups)
{
FlexMatch secret = groups["secret"];

@eddynaka eddynaka Dec 15, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

secret

to reduce false-positives, we should validate the checksum:
https://github.blog/changelog/2021-09-23-npm-has-a-new-access-token-format/ #Closed

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check sec101/102 or sec101/006 (internal version)

{
FlexMatch secret = groups["secret"];

if (groups.TryGetNonEmptyValue("checksum", out FlexMatch checksum))

@eddynaka eddynaka Dec 17, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (groups.TryGetNonEmptyValue("checksum", out FlexMatch checksum))

since this is required, you can follow what we did for the secret. If we don't have that key, it will throw an exception because that is not expected. #Closed

client);
}

private static string Base62EncodeUint32(uint value, int minimumLength = 6)

@eddynaka eddynaka Dec 17, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base62EncodeUint32

should we move this to crc class, make it static and add more tests to complete its coverage? what are ur thoughts? #Resolved

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of code should exist in Microsoft.Security.Utilities. That package should have API that accepts an arbitrary alphabet (such as a base62 encoding scheme) and provides encoding and decoding capabilities.

We will need this API for many classes of secret.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like that functionality is still pending. May I proceed with Eddy's suggestion and ensure that a task to incorporate this is added to the backlog?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're now using the CustomAlphabetEncoder class to achieve this.

@@ -74,73 +39,9 @@ protected override ValidationState IsValidDynamicHelper(ref Fingerprint fingerpr
ref ResultLevelKind resultLevelKind)
{
string secret = fingerprint.Secret;

@eddynaka eddynaka Jan 12, 2022

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string secret = fingerprint.Secret;

you can remove this. #Closed

// Validate checksum to avoid false positives.
string randomPart = secret.Value.String.Substring(4, 30);
uint checksumValue = Crc32.Calculate(randomPart);
var encoder = new CustomAlphabetEncoder();

@eddynaka eddynaka Jan 12, 2022

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var encoder = new CustomAlphabetEncoder();

should we create only one encoder and re-use it? #Closed

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this also helped me catch some other cleanup items.


[JsonProperty("total")]
public int Total { get; set; }
return NpmAuthorTokenHelper.ValidateTokens(ref fingerprint, ref message, ref resultLevelKind, client);

@eddynaka eddynaka Jan 12, 2022

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref fingerprint, ref message, ref resultLevelKind, client

following the identifiablenpm, can you wrap the parameters? #Closed

public class IdentifiableNpmAuthorTokenValidatorTests
{
[Fact]
public void IdentifiableNpmAuthorTokenValidatorTests_MockHttpTests()

@eddynaka eddynaka Jan 12, 2022

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests

nit: remove Tests from this.
The idea that we follow:
ClassThatWeAreTesting_Something #Closed

@@ -0,0 +1,3 @@
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

"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" No newline at end of file

@eddynaka eddynaka Jan 12, 2022

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

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

here, you created a test that pass the checksum.
Can you also add a new one that do not pass?
Also, add a comment. Something like:

# This is a pattern that matches the regex but the checksum is invalid
npm_abcd....

#Closed

{
public class IdentifiableNpmAuthorTokenValidator : DynamicValidatorBase
{
private readonly CustomAlphabetEncoder encoder = new CustomAlphabetEncoder();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new CustomAlphabetEncoder();

can you move this to the constructor of this class?
we try to move initializations to the constructor.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good note, please provide the rationale for the standard. 'The reason is' that it's harder to understand initialization behavior when assignments are inlined like this: instance variables can be declared anywhere in the class. You really notice the problem with this pattern when debugging a ctor with initializations of this kind that are scattered around: the debugger hopes from place to place. With an explicit constructor, everything is sequential.

@eddynaka eddynaka left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@michaelcfanning michaelcfanning left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@michaelcfanning michaelcfanning merged commit d35e876 into main Jan 14, 2022
@eddynaka eddynaka deleted the users/marmegh/npmUpdate branch March 8, 2022 00:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants