[O11y][Nginx] Rally benchmark nginx.error#8762
Conversation
🚀 Benchmarks reportTo see the full report comment with |
🌐 Coverage report
|
|
Hi @aliabbas-elastic, please update your branch with the latest contents from main branch. There was an important PR merged updating the CI pipelines. Thanks! |
… into nginx_benchmark_error
| "created": "{{ generate "event.created" | date "2006-01-02T15:04:05.000Z" }}", | ||
| "dataset": "nginx.error", | ||
| "ingested": "{{ generate "event.ingested" | date "2006-01-02T15:04:05.000Z" }}", | ||
| "timezone": "{{ $eventTimezone }}" |
There was a problem hiding this comment.
How will this map to the timestamp in the message? I expect this to have the effect as there are multiple time zones, that events will be ingested in different hours. See also #8777 (comment)
There was a problem hiding this comment.
@ruflin I saw the error logs. They don't contain the timezone in the log timestamp so the timezone would be set according to the elastic-agent's timezone. Please refer the sample log
2019/11/05 14:50:44 [error] 54053#0: *3 open() "/usr/local/Cellar/nginx/1.10.2_1/html/adsasd" failed (2: No such file or directory), client: 127.0.0.1, server: localhost, request: "GET /pysio HTTP/1.1", host: "localhost:8080"
Currently we can set the timezone to a static value(+00:00). Also saw your comment of setting timezone variable which is a good thing to implement. Let me know if this is the correct thing to do right now.
There was a problem hiding this comment.
all the timestamps generated in the log appear to be in UTC ('Z')? perhaps a static +0000 would be best here.
packages/nginx/_dev/benchmark/rally/error-benchmark/template.ndjson
Outdated
Show resolved
Hide resolved
packages/nginx/_dev/benchmark/rally/error-benchmark/template.ndjson
Outdated
Show resolved
Hide resolved
|
💚 CLA has been signed |
e8c94cc to
51a224d
Compare
| "created": "{{ generate "event.created" | date "2006-01-02T15:04:05.000Z" }}", | ||
| "dataset": "nginx.error", | ||
| "ingested": "{{ generate "event.ingested" | date "2006-01-02T15:04:05.000Z" }}", | ||
| "timezone": "{{ $eventTimezone }}" |
There was a problem hiding this comment.
all the timestamps generated in the log appear to be in UTC ('Z')? perhaps a static +0000 would be best here.
ruflin
left a comment
There was a problem hiding this comment.
I suggest to get this in and follow up with @aspacca on the timestamp part.
| "event": { | ||
| "agent_id_status": "verified", | ||
| "dataset": "nginx.error", | ||
| "timezone": "+00:00" |
There was a problem hiding this comment.
@ruflin Wanted to share some points here as I am not able to use the config $timestamp.Format "MST" here. Firstly, I looked into the documents generated. Here is the one such.
Sample Event
{
"@timestamp": "2024-01-11T12:26:38.253Z",
"agent": {
"ephemeral_id": "palethief",
"id": "ef5e274d-4b53-45e6-943a-a5bcf1a6f523",
"name": "denimwhimsey",
"type": "filebeat",
"version": "8.8.0"
},
"data_stream": {
"dataset": "nginx.error",
"namespace": "ep",
"type": "logs"
},
"ecs": {
"version": "8.5.1"
},
"elastic_agent": {
"id": "palethief",
"snapshot": false,
"version": "8.8.0"
},
"event": {
"agent_id_status": "verified",
"dataset": "nginx.error",
"timezone": "IST"
},
"host": {
"architecture": "x86_64",
"containerized": false,
"hostname": "docker-fleet-agent",
"id": "66392b0697b84641af8006d87aeb89f1",
"ip": [
"19.203.220.253"
],
"mac": [
"02-42-AC-12-00-07"
],
"name": "docker-fleet-agent",
"os": {
"codename": "focal",
"family": "debian",
"kernel": "5.15.49-linuxkit",
"name": "Ubuntu",
"platform": "ubuntu",
"type": "linux",
"version": "20.04.5 LTS (Focal Fossa)"
}
},
"input": {
"type": "log"
},
"log": {
"file": {
"path": "/var/log/nginx/error.log"
},
"offset": 0
},
"message": "2024/01/11 12:26:38 [notice] 47858#85131: *33397 twister antelope cougar saver mark salmon gorilla thorn fang fly voice wasp carver elk spirit viper carp venomking",
"tags": [
"nginx-error"
]
}Unknown time-zone ID: IST
However the timezone with value +00:00 are working absolutely fine. Any suggestions here if anything is wrong in the config? Attaching the date processor as well which is failing.
- date:
if: ctx.event.timezone != null
field: nginx.error.time
target_field: '@timestamp'
formats:
- yyyy/MM/dd H:m:s
timezone: '{{ event.timezone }}'
on_failure:
- append:
field: error.message
value: '{{ _ingest.on_failure_message }}'
cc:- @aspacca
There was a problem hiding this comment.
Hi @aliabbas-elastic Thanks for playing with it. To be honest to dig more into the error, I need to play around with it. I'm hoping for some help / guidance from @aspacca on this one. Based on the error, it seems IST as a time zone id is not supported :-( Unfortunately the docs do not state which time zone abbrevations are supported: https://www.elastic.co/guide/en/elasticsearch/reference/current/date-processor.html Here is the code that Beats uses to add the timezone: https://github.com/elastic/beats/blob/main/libbeat/processors/add_locale/add_locale.go#L88
Around the timestamp and timezone, my mental model is as following: As long as the timestamp in the message and the timezone are aligned, Elasticsearch should do the right thing. Meaning if you generate data on your laptop and are based in Syndey, your timestamp is local to your time in sydney so will be the time zone that is added. If this is persisted, it will work on any other machine that will replay this data no matter the time zone this machine is in. It also means, if you look at your local Kibana instance and you are ingesting data right now, it will show up un "Now". Does this make sense?
There was a problem hiding this comment.
@ruflin As per suggestion from @aspacca $timestamp.Format "-07:00" works fine with this and I am getting the following reponse in timezone. Reference:- https://gosamples.dev/date-time-format-cheatsheet/#:~:text=trailing%20zeros%20omitted)-,Time%20zone%20format,-Time%20zone%20format
"timezone": "+05:30"
Updated in the PR as well
There was a problem hiding this comment.
Did a quick test on my end with this PR and works as expected!
There was a problem hiding this comment.
@aspacca Would be nice to "document" somewhere how to do it as I expect we will need it in quite a few places.
There was a problem hiding this comment.
@ruflin i agree
there's a few topics already we should document in some kind of knowledge base.
where do you suggest to place such content? I'm keen to have it on the corpus generator repo, and link that in elastic-package, but open to suggestion
There was a problem hiding this comment.
I'm ok having for now in the corpus tool. I think eventually we will need to have a simple guide on how to get started in elastic-package but for details still can refer to the corpus tool.
Most important is that we have docs.
💚 Build Succeeded
cc @aliabbas-elastic |
| }, | ||
| "offset": 0 | ||
| }, | ||
| "message": "{{$timestamp.Format "2006/01/02 15:04:05"}} [{{ $logLevel }}] {{ $pid }}#{{ $threadId }}: *{{ $connectionId }} {{generate "message"}}", |
There was a problem hiding this comment.
Reviewing this I stumbled over the {{generate "message"}} part in the UI as it generates a random message. I was a bit surprised by this but something we can tune as a follow up.
There was a problem hiding this comment.
I agree with your point that it generates random messages that doesn't look realistic. But I referred from this PR in which the same field is being generated.
There was a problem hiding this comment.
We can always follow up on this one. Lets keep it as is for now.
| "offset": 0 | ||
| }, | ||
| "message": "{{$timestamp.Format "2006/01/02 15:04:05"}} [{{ $logLevel }}] {{ $pid }}#{{ $threadId }}: *{{ $connectionId }} {{generate "message"}}", | ||
| "tags": [ |
There was a problem hiding this comment.
Is this tag shipped by the agent normally?
There was a problem hiding this comment.
This is basically the user provided value or the default value of the data stream that remains static in all the events if kept unchanged throughout the period integration is configured. It can be set by user while configuring the integration.
ruflin
left a comment
There was a problem hiding this comment.
I'm good getting this PR in and the follow up with more fine tuning.
In elastic#8762 error log generation was added. The message was a random message generated. This changes it to have a partially more realistic message generated based on some actual logs. The enum values for the different fields are hardcoded. It would be nice to have for host, ip and paths some partially random generation that could be used as function with cardinality.
In #8762 error log generation was added. The message was a random message generated. This changes it to have a partially more realistic message generated based on some actual logs. The enum values for the different fields are hardcoded. It would be nice to have for host, ip and paths some partially random generation that could be used as function with cardinality.
Proposed commit message
errordata stream ofNginxChecklist
How to test this PR locally
Run this command from package root
elastic-package benchmark rally --benchmark error-benchmark -vRelated issues
Screenshots