Skip to content

Wrong epoch timestamp after Daylight Savings Time #3195

@pranavmarla

Description

@pranavmarla

Describe the bug

Summary

Essentially, when Fluentd is told to parse a UTC timestamp present in incoming logs and convert it to the equivalent epoch timestamp, but Fluentd is installed in a location where the local time zone's offset from UTC changes (eg. due to Daylight Savings Time) then, every time the offset changes, Fluentd will generate the wrong epoch timestamp for all subsequent logs. Currently, the only way to fix this is to restart Fluentd after the offset changes, so that it "sees" the new UTC offset, every time the offset changes.

I believe this is effectively the same issue as #2806, opened back in Jan.

Details

I have Fluentd agents which receive logs, process them (including extracting the preexisting UTC timestamp in the log, generating the equivalent epoch timestamp, and setting that as the official log timestamp) and then forward them on. The host machines (where the Fluentd agents are installed) are located in the Eastern time zone (ET).

On Nov 1, Daylight Savings Time (DST) ended -- i.e. the local time zone on the hosts changed from Eastern Daylight Time (EDT) (UTC - 4) to Eastern Standard Time (EST) (UTC - 5). After that, I noticed that the logs coming from Fluentd suddenly had the wrong timestamp -- specifically, they were 1 hour in the future.

As a short term fix, I was able to resolve this issue by simply restarting all the agents (till then, they had been running continuously since Sep). However, based on my testing, it looks like there is a bug in the way Fluentd generates its timestamp and, when DST begins again on March 14 2021, I expect a similar issue to occur, except that this time the timestamp will be 1 hour in the past -- thus, a long term fix needs to be made to Fluentd's code.

Steps to Reproduce

Summary

Another way of restating the problem is that Fluentd will ONLY generate the correct epoch timestamp when:

  • Fluentd thinks DST is currently in effect on the host, and the preexisting UTC timestamp in the log is also from a date when DST would have been in effect on the host
  • Fluentd thinks DST is not currently in effect on the host, and the preexisting UTC timestamp in the log is also from a date when DST would not have been in effect on the host

Thus, we can clearly demonstrate this bug by:

  • Manually generating two input logs, one containing a pre-Nov UTC timestamp (when DST would have been in effect) and the other containing a post-Nov UTC timestamp (when DST would not have been in effect), to supply to Fluentd for processing.
  • Manually changing the date on the underlying host to a pre-Nov date (when DST would have been in effect) and a post-Nov date (when DST would not have been in effect) -- followed, in each case, by restarting Fluentd afterwards so that it's aware of the new DST/non-DST state of the underlying host.
  • Noticing that, when the two are not in sync, the epoch timestamp generated by Fluentd will be off by 1 hour.

Details

Fluentd Config

To demonstrate this bug, I've created a simple Fluentd config that accepts input logs via HTTP, parses the UTC timestamp contained within, and prints the processed logs to stdout.
Note that, for clarity, I print each log to stdout twice -- once with an epoch timestamp, and once with a regular Fluentd timestamp (i.e. a well-formatted timestamp in the local time zone).

# Input
<source>
  @type http
  port 9880
  body_size_limit 32m

  # Parse preexisting UTC timestamp, present in the log
  <parse>
    @type json
    time_key time
    keep_time_key true
    time_type string
    time_format %Y-%m-%dT%H:%M:%S.%NZ
    utc true
  </parse>
</source>

# Throw away internal Fluentd logs
<label @FLUENT_LOG>
  <match fluent.**>
    @type null
  </match>
</label>

# Outputs
<match **>
  @type copy

  # Print log with epoch timestamp
  <store ignore_error>
    @type stdout
    <format>
      @type out_file
      time_type float
    </format>
  </store>

  # Print log with regular timestamp
  <store ignore_error>
    @type stdout
  </store>
</match>

Test Input

We will create two sample logs to send to Fluentd, that are identical except for the month of their UTC timestamp

  1. DST Log: Log with UTC timestamp set to Sep 8, 2020 at 8 pm (when DST would have been in effect on the host):
    2020-09-08T20:00:00.000000000Z
  2. Non-DST Log: Log with UTC timestamp set to Dec 8, 2020 at 8 pm (when DST would not have been in effect on the host):
    2020-12-08T20:00:00.000000000Z

Expected behaviour

Each UTC timestamp should ALWAYS be converted by Fluentd to the SAME equivalent epoch timestamp, regardless of the local time zone on the host! After all, by definition, epoch timestamps are supposed to be a time zone-agnostic way of referring to a particular point in time.

  1. Sep 8 2020, 8 pm UTC -> 1599595200.00000000
  2. Dec 8 2020, 8 pm UTC -> 1607457600.00000000

Steps

Scenario 1: DST in effect on the host
  1. Ensure Fluentd is not running
> systemctl stop td-agent
  1. To simulate DST being in effect on the host, manually change the date to any pre-Nov date
> date --set="Sep 8 2020 16:14:00"
Tue Sep  8 16:14:00 EDT 2020

Note: By default, at least on my machine, chronyd notices that the date is wrong, and fixes it, within ~2 minutes!

  1. Confirm that the local time zone on the host is now EDT (UTC - 4)
> timedatectl
                      Local time: Tue 2020-09-08 16:14:01 EDT
                  Universal time: Tue 2020-09-08 20:14:01 UTC
                        RTC time: Tue 2020-12-08 21:14:31
                       Time zone: America/New_York (EDT, -0400)
       System clock synchronized: no
systemd-timesyncd.service active: yes
                 RTC in local TZ: no
  1. Restart Fluentd
> systemctl restart td-agent
  1. Tail Fluentd's log file
> tail -f /var/log/td-agent/td-agent.log
...
2020-09-08 16:14:06 -0400 [info]: #0 fluentd worker is now running worker=0
  1. Send DST log to Fluentd, and view the output (after processing) in Fluentd's log file
> curl -X POST -d '{"log":"Test log", "time":"2020-09-08T20:00:00.000000000Z"}' http://localhost:9880/test
> tail -f /var/log/td-agent/td-agent.log
...
1599595200.00000000                   test    {"log":"Test log","time":"2020-09-08T20:00:00.000000000Z"}
2020-09-08 16:00:00.000000000 -0400   test:   {"log":"Test log","time":"2020-09-08T20:00:00.000000000Z"}

Compare the expected epoch timestamp to the actual epoch timestamp:

Expected Epoch Timestamp: 1599595200.00000000
Actual Epoch Timestamp:   1599595200.00000000

Thus, since DST is in effect on the host, and this is a DST log, Fluentd generates the correct epoch timestamp.

  1. Send non-DST log to Fluentd, and view the output (after processing) in Fluentd's log file
> curl -X POST -d '{"log":"Test log", "time":"2020-12-08T20:00:00.000000000Z"}' http://localhost:9880/test
> tail -f /var/log/td-agent/td-agent.log
...
1607461200.00000000                   test    {"log":"Test log","time":"2020-12-08T20:00:00.000000000Z"}
2020-12-08 16:00:00.000000000 -0500   test:   {"log":"Test log","time":"2020-12-08T20:00:00.000000000Z"}

Compare the expected epoch timestamp to the actual epoch timestamp:

Expected Epoch Timestamp: 1607457600.00000000
Actual Epoch Timestamp:   1607461200.00000000 

Note that the actual epoch timestamp is off by 1 hour (in the future) -- i.e. it corresponds to Dec 8, 2020 at 9 pm UTC!

Thus, since DST is in effect on the host, but this is a non-DST log, Fluentd generates the wrong epoch timestamp!
This corresponds to the situation described at the beginning of this issue, which led me to discover this bug: After Nov 1, all new incoming logs were non-DST logs but, since my Fluentd agent had been running continuously since Sep, it apparently, wrongly, thought that DST was still in effect on the underlying host.

  1. Stop Fluentd
> systemctl stop td-agent
> tail -f /var/log/td-agent/td-agent.log
...
2020-09-08 16:14:27 -0400 [info]: Worker 0 finished with status 0

################

Scenario 2: DST not in effect on the host
  1. Ensure Fluentd is not running
> systemctl stop td-agent
  1. In my case, since DST is currently not in effect, I just confirm the host is set to the correct local time. In general, need to ensure the host date is set to any post-Nov (and pre-March) date
> date
Tue Dec  8 16:24:30 EST 2020
  1. Confirm that the local time zone on the host is EST (UTC - 5):
> timedatectl
                      Local time: Tue 2020-12-08 16:24:44 EST
                  Universal time: Tue 2020-12-08 21:24:44 UTC
                        RTC time: Tue 2020-12-08 21:24:44
                       Time zone: America/New_York (EST, -0500)
       System clock synchronized: no
systemd-timesyncd.service active: yes
                 RTC in local TZ: no
  1. Restart Fluentd
> systemctl restart td-agent
  1. Tail Fluentd's log file
> tail -f /var/log/td-agent/td-agent.log
...
2020-12-08 16:28:09 -0500 [info]: #0 fluentd worker is now running worker=0
  1. Send DST log to Fluentd, and view the output (after processing) in Fluentd's log file
> curl -X POST -d '{"log":"Test log", "time":"2020-09-08T20:00:00.000000000Z"}' http://localhost:9880/test
> tail -f /var/log/td-agent/td-agent.log
...
1599591600.00000000                   test    {"log":"Test log","time":"2020-09-08T20:00:00.000000000Z"}
2020-09-08 15:00:00.000000000 -0400   test:   {"log":"Test log","time":"2020-09-08T20:00:00.000000000Z"}

Compare the expected epoch timestamp to the actual epoch timestamp:

Expected Epoch Timestamp: 1599595200.00000000
Actual Epoch Timestamp:   1599591600.00000000

Note that the actual epoch timestamp is off by 1 hour (in the past) -- i.e. it corresponds to Sep 8, 2020 at 7 pm UTC!

Thus, since DST is not in effect on the host, but this is a DST log, Fluentd generates the wrong epoch timestamp!
This corresponds to what the situation will be after March 14, 2021: After March 14, all new incoming logs will be DST logs but, since my Fluentd agent will have been running continuously since Dec, it will wrongly think that DST is still not in effect on the underlying host.

  1. Send non-DST log to Fluentd, and view the output (after processing) in Fluentd's log file
> curl -X POST -d '{"log":"Test log", "time":"2020-12-08T20:00:00.000000000Z"}' http://localhost:9880/test
> tail -f /var/log/td-agent/td-agent.log
...
1607457600.00000000                   test    {"log":"Test log","time":"2020-12-08T20:00:00.000000000Z"}
2020-12-08 15:00:00.000000000 -0500   test:   {"log":"Test log","time":"2020-12-08T20:00:00.000000000Z"}

Compare the expected epoch timestamp to the actual epoch timestamp:

Expected Epoch Timestamp: 1607457600.00000000
Actual Epoch Timestamp:   1607457600.00000000 

Thus, since DST is not in effect on the host, and this is a non-DST log, Fluentd generates the correct epoch timestamp.

  1. Stop Fluentd
> systemctl stop td-agent
> tail -f /var/log/td-agent/td-agent.log
...
2020-12-08 16:35:47 -0500 [info]: Worker 0 finished with status 0

Environment

  • td-agent version: 4.0.1
  • Fluentd version: 1.11.2
  • Operating system:
> cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
  • Kernel version:
> uname -r
4.15.0-122-generic

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions