packages/windows/data_stream/powershell_operations: don't split tokens on hyphen #1931
efd6 merged 1 commit into elastic:master
I guess we don't support analyzers defined in fields files:
kibana_1 | {"type":"log","@timestamp":"2021-10-18T00:00:08+00:00","tags":["error","plugins","fleet"],"pid":1228,"message":"Error: Error installing windows 1.2.4: illegal_argument_exception: [illegal_argument_exception] Reason: composable template [logs-windows.powershell_operational] template after composition with component templates [logs-windows.powershell_operational@custom, .fleet_component_template-1] is invalid\n at ensureInstalledPackage (/usr/share/kibana/x-pack/plugins/fleet/server/services/epm/packages/install.js:193:11)\n at runMicrotasks (<anonymous>)\n at processTicksAndRejections (internal/process/task_queues.js:95:5)\n at async Promise.all (index 0)\n at PackagePolicyService.create (/usr/share/kibana/x-pack/plugins/fleet/server/services/package_policy.js:133:33)\n at createPackagePolicyHandler (/usr/share/kibana/x-pack/plugins/fleet/server/routes/package_policy/handlers.js:109:27)\n at Router.handle (/usr/share/kibana/src/core/server/http/router/router.js:163:30)\n at handler (/usr/share/kibana/src/core/server/http/router/router.js:124:50)\n at exports.Manager.execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/toolkit.js:60:28)\n at Object.internals.handler (/usr/share/kibana/node_modules/@hapi/hapi/lib/handler.js:46:20)\n at exports.execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/handler.js:31:20)\n at Request._lifecycle (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:370:32)\n at Request._execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:279:9)"}
Did you try using the ingest pipeline to approach this problem?
I haven't, but I think a keyword analyzer here, combined with lowercasing and splitting on this pattern in the ingest pipeline, should work. Nope.
@ruflin Do you think we need to support analyzers or is there any workaround available?
Analyzers do appear to work, given that we allow settings declared in the data stream manifest.yml (source: https://github.com/elastic/package-spec/blob/a0687c0dc7a9da3fc540bdc7c5df2d7d84ae6713/versions/1/data_stream/manifest.spec.yml#L174-L176).
I tested this and it passed the system test.
diff --git a/packages/windows/data_stream/powershell_operational/fields/fields.yml b/packages/windows/data_stream/powershell_operational/fields/fields.yml
index 2049ba44..ae35dff3 100644
--- a/packages/windows/data_stream/powershell_operational/fields/fields.yml
+++ b/packages/windows/data_stream/powershell_operational/fields/fields.yml
@@ -105,10 +105,7 @@
       example: "50d2dbda-7361-4926-a94d-d9eadfdb43fa"
     - name: script_block_text
       type: text
-      analyzer:
-        powershell:
-          type: pattern
-          pattern: "[\\W&&[^-]]+"
+      analyzer: powershell_script_analyzer
       description: >
         Text of the executed script block.
diff --git a/packages/windows/data_stream/powershell_operational/manifest.yml b/packages/windows/data_stream/powershell_operational/manifest.yml
index 08b887b3..8eca400c 100644
--- a/packages/windows/data_stream/powershell_operational/manifest.yml
+++ b/packages/windows/data_stream/powershell_operational/manifest.yml
@@ -1,5 +1,13 @@
 type: logs
 title: Windows Powershell/Operational logs
+elasticsearch:
+  index_template:
+    settings:
+      analysis:
+        analyzer:
+          powershell_script_analyzer:
+            type: pattern
+            pattern: '[\W&&[^-]]+'
 streams:
   - input: winlog
     template_path: winlog.yml.hbs
The logs-windows.powershell_operational@settings component template is created as
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
I think we would want the same analyzer applied to these as well. Right?
The other question is whether the search analyzer should also be provided.
I think it does need a
{
  "tokens" : [
    {
      "token" : "invoke-webrequest",
      "start_offset" : 1,
      "end_offset" : 18,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "-uri",
      "start_offset" : 19,
      "end_offset" : 23,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "https",
      "start_offset" : 25,
      "end_offset" : 30,
      "type" : "word",
      "position" : 2
    },
    {
      "token" : "aka",
      "start_offset" : 33,
      "end_offset" : 36,
      "type" : "word",
      "position" : 3
    },
    {
      "token" : "ms",
      "start_offset" : 37,
      "end_offset" : 39,
      "type" : "word",
      "position" : 4
    },
    {
      "token" : "pscore6-docs",
      "start_offset" : 40,
      "end_offset" : 52,
      "type" : "word",
      "position" : 5
    },
    {
      "token" : "links",
      "start_offset" : 55,
      "end_offset" : 60,
      "type" : "word",
      "position" : 6
    },
    {
      "token" : "href",
      "start_offset" : 61,
      "end_offset" : 65,
      "type" : "word",
      "position" : 7
    }
  ]
}

{
  "tokens" : [
    {
      "token" : "invoke",
      "start_offset" : 0,
      "end_offset" : 6,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "webrequest",
      "start_offset" : 7,
      "end_offset" : 17,
      "type" : "<ALPHANUM>",
      "position" : 1
    }
  ]
}
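For anyone who wants to sanity-check the two token lists above without a cluster, here is a rough Python sketch. It is not how Elasticsearch implements the analyzers. Assumptions: Python's `re` has no character-class intersection, so the Java pattern `[\W&&[^-]]+` is rewritten as the equivalent `[^\w-]+` with ASCII-only `\w`; the `\W+` pattern is only a stand-in for a hyphen-splitting analyzer; and the sample input is a guess reconstructed from the offsets, since the thread does not show the analyzed text.

```python
import re

# Equivalent of the Java pattern [\W&&[^-]]+ (non-word characters except hyphen).
KEEP_HYPHEN = re.compile(r"[^\w-]+", re.ASCII)
# Stand-in for an analyzer that also splits on hyphen (assumption, not the
# real algorithm behind the "<ALPHANUM>" tokens above).
SPLIT_HYPHEN = re.compile(r"\W+", re.ASCII)


def analyze(text, separator):
    """Split text on separator matches; return (token, start, end) tuples, lowercased."""
    tokens = []
    start = 0
    for sep in separator.finditer(text):
        if sep.start() > start:  # skip empty tokens, e.g. before a leading "("
            tokens.append((text[start:sep.start()].lower(), start, sep.start()))
        start = sep.end()
    if start < len(text):  # trailing token after the last separator
        tokens.append((text[start:].lower(), start, len(text)))
    return tokens


# Hypothetical input chosen to be consistent with the offsets in the first list.
SAMPLE = "(Invoke-WebRequest -Uri 'https://aka.ms/pscore6-docs').Links.Href"

if __name__ == "__main__":
    for token in analyze(SAMPLE, KEEP_HYPHEN):
        print(token)
    print(analyze("Invoke-WebRequest", SPLIT_HYPHEN))
```

With the assumed input, the hyphen-preserving pattern yields the same tokens and offsets as the first list, and the hyphen-splitting stand-in yields `invoke` and `webrequest` as in the second.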
…s on hyphen Co-authored-by: Andrew Kroh <andrew.kroh@elastic.co>
What does this PR do?
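The change replaces the simple tokenizer with a custom pattern analyzer, powershell_script_analyzer, that splits on runs of non-word characters other than hyphen, so hyphenated tokens such as invoke-webrequest and -uri are indexed whole.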
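One way to check the pattern by hand, assuming access to a test cluster, is to send it to the Elasticsearch `_analyze` API as an inline tokenizer. The sketch below only builds the request body; the helper name `build_analyze_request` is made up for illustration, and the `lowercase` filter is added on the assumption that it mirrors the pattern analyzer's default lowercasing.

```python
import json


def build_analyze_request(text):
    """Build a body for POST /_analyze using the PR's split pattern inline."""
    return {
        # Same Java regex as in manifest.yml: non-word characters except hyphen.
        "tokenizer": {"type": "pattern", "pattern": r"[\W&&[^-]]+"},
        # Assumed to match the pattern analyzer's default lowercase behavior.
        "filter": ["lowercase"],
        "text": text,
    }


if __name__ == "__main__":
    # Paste the printed JSON into a POST /_analyze request.
    print(json.dumps(build_analyze_request("Invoke-WebRequest -Uri"), indent=2))
```

Once the package is installed, passing `"analyzer": "powershell_script_analyzer"` to `_analyze` on the backing index should exercise the real setting instead of this inline copy.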