[Enhancement] PowerShell - Optimize Entropy Calculation, Add Normalized Entropy, Add Pipeline Benchmark#16707
Merged
Pinging @elastic/sec-windows-platform (Team:Security-Windows Platform)
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
mauri870
reviewed
Jan 5, 2026
mauri870
approved these changes
Jan 5, 2026
Member
mauri870
left a comment
LGTM, but I'm not very proficient with PowerShell. The code looks fine, but it needs a deeper look from the Windows team.
    double normalizedEntropy = 0.0;
    if (length > 1) {
        double maxEntropy = Math.log((double) length) * invLog2; // max bits if every character is unique
Contributor
I think the normalized entropy calculation looks good 👍
A few notes for posterity:
- For the line `double maxEntropy = Math.log((double) length) * invLog2; // max bits if every character is unique`, I think it makes sense to use `length` here. Typical normalized entropy calculations (like that for R/Posterior ref) would use something akin to `seenCount` instead of `length`. However, that approach expects the input to be more akin to categories, where `a` and `a` are equivalent regardless of their position in the script block. In our case, I think we want the position to matter as well, so each value is by definition unique, making `length` the correct number to use here (as is correctly done in the code).
- The pre-output check `else if (normalizedEntropy > 1.0) normalizedEntropy = 1.0;` is, I think, technically not necessary, as this should not occur. However, I think we should keep this check, as it could catch floating-point rounding issues without impacting the integrity of the data result (the code is correct as is).
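To make the two notes above concrete, here is a minimal standalone Java sketch of the calculation as discussed in this review. It is not the pipeline's actual Painless script: the class name `NormalizedEntropy`, the method names, the `HashMap`-based counting, counting by UTF-16 `char`, and the lower-bound clamp at 0.0 are all assumptions made for illustration. Only the `maxEntropy` line and the upper clamp are taken from the snippet quoted above.

```java
import java.util.HashMap;
import java.util.Map;

public class NormalizedEntropy {
    // Multiplying a natural log by 1/ln(2) converts it to log base 2.
    private static final double INV_LOG2 = 1.0 / Math.log(2.0);

    /** Shannon entropy of the character distribution of s, in bits. */
    static double entropyBits(String s) {
        int length = s.length();
        if (length == 0) {
            return 0.0;
        }
        // Assumption: characters are counted as UTF-16 code units.
        Map<Character, Integer> counts = new HashMap<>();
        for (int i = 0; i < length; i++) {
            counts.merge(s.charAt(i), 1, Integer::sum);
        }
        double entropy = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / length;
            entropy -= p * Math.log(p) * INV_LOG2;
        }
        return entropy;
    }

    /** Entropy divided by log2(length), clamped to the range [0, 1]. */
    static double normalizedEntropy(String s) {
        int length = s.length();
        double normalizedEntropy = 0.0;
        if (length > 1) {
            // Max bits if every character is unique (uses length, not the distinct-symbol count).
            double maxEntropy = Math.log((double) length) * INV_LOG2;
            normalizedEntropy = entropyBits(s) / maxEntropy;
            // Guard against floating-point rounding; the lower bound is an assumed counterpart
            // to the upper-bound check quoted in the review.
            if (normalizedEntropy < 0.0) normalizedEntropy = 0.0;
            else if (normalizedEntropy > 1.0) normalizedEntropy = 1.0;
        }
        return normalizedEntropy;
    }

    public static void main(String[] args) {
        System.out.println(normalizedEntropy("aaaa")); // one repeated symbol
        System.out.println(normalizedEntropy("aabb")); // two symbols, four characters
        System.out.println(normalizedEntropy("abcd")); // every character unique
    }
}
```

For `"aabb"` the entropy is 1 bit. Dividing by log2(length) = 2 gives roughly 0.5, whereas dividing by log2 of the distinct-symbol count (2) would give 1.0. That difference is the `length` versus `seenCount` choice described in the first note.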
nfritts
approved these changes
Jan 23, 2026
🚀 Benchmarks report
To see the full report comment with
💚 Build Succeeded
History
cc @w0rk3r
gogochan
approved these changes
Jan 23, 2026
Package windows - 3.4.0 containing this change is available at https://epr.elastic.co/package/windows/3.4.0/
jakubgalecki0
pushed a commit
to jakubgalecki0/integrations
that referenced
this pull request
Feb 19, 2026
…ed Entropy, Add Pipeline Benchmark (elastic#16707) * [Enhancement] PowerShell - Optimize Entropy Calculation, Add Normalized Entropy, Add Pipeline Benchmark * Update test-powershell-operational-events.json-expected.json * Update changelog.yml * rename benchmark file
Proposed commit message
Summary
Related issue:
This PR:
powershell.file.script_block_entropy_normalized = entropy_bits / log2(script_block_length) (0–1).
Old pipeline:
Improved pipeline:
Complete benchmark output
Old:
Improved:
Checklist
changelog.yml file.