Skip to content

Keep field name for csv timestamp column (don't implicitly convert)#8440

Merged
stephanie-engel merged 1 commit intomasterfrom
se-csv-7288
Nov 20, 2020
Merged

Keep field name for csv timestamp column (don't implicitly convert)#8440
stephanie-engel merged 1 commit intomasterfrom
se-csv-7288

Conversation

@stephanie-engel
Copy link
Copy Markdown
Contributor

@stephanie-engel stephanie-engel commented Nov 19, 2020

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

Root Cause Analysis:
Within the csv parser, the field name for the csv_timestamp_column is, initially, converted from a string into an int. This causes the field name (for example: 202011131605) to be interpreted as a Unix format (instead of the Go reference time 200601021504).

The Fix:

  • Prevent implicit type conversion on the timestamp column, so that parseTimestamp can correctly parse the time as unix or UTC
  • Added a unit test to verify TimestampFormat: "200601021504"

closes #7288

Copy link
Copy Markdown
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @stephanie-engel, nice work! However, I think we also have to handle the cases where the column-type is given explicitly. This is the section above your change. How about handling both cases by adding

		if fieldName == p.TimestampColumn {
				recordFields[fieldName] = value
                                continue
                }

directly before line 208

			if len(p.ColumnTypes) > 0 {

This way you can leave the lines 241--250 unchanged.

@srebhan srebhan self-assigned this Nov 20, 2020
@stephanie-engel
Copy link
Copy Markdown
Contributor Author

Hey @stephanie-engel, nice work! However, I think we also have to handle the cases where the column-type is given explicitly. This is the section above your change. How about handling both cases by adding

		if fieldName == p.TimestampColumn {
				recordFields[fieldName] = value
                                continue
                }

directly before line 208

			if len(p.ColumnTypes) > 0 {

This way you can leave the lines 241--250 unchanged.

Thanks for the great feedback, @srebhan ! I just went ahead and made the changes you requested. I confirmed that the timestamp is still parsed correctly and the unit tests still pass 😄

Copy link
Copy Markdown
Contributor

@ssoroka ssoroka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, congratulations. 🥇

@stephanie-engel stephanie-engel merged commit 247230c into master Nov 20, 2020
@stephanie-engel stephanie-engel deleted the se-csv-7288 branch November 20, 2020 15:52
ssoroka pushed a commit that referenced this pull request Dec 1, 2020
@Hipska Hipska added the area/csv csv parser/serialiser related label Feb 15, 2022
arstercz pushed a commit to arstercz/telegraf that referenced this pull request Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/csv csv parser/serialiser related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

timestamp format parsing is confusing yyyyMMddHHmm for a unix timestamp

4 participants