Directory_monitor does not work with xml files #11012

@Hipska

Description

Relevant telegraf.conf

# Ingests files in a directory and then moves them to a target directory.
[[inputs.directory_monitor]]
  ## The directory to monitor and read files from.
  directory = "buf"
  
  ## The directory to move finished files to.
  finished_directory = "done_ok"
  
  ## The directory to move files to upon file error.
  ## If not provided, erroring files will stay in the monitored directory.
  error_directory = "done_err"
  
  ## The amount of time a file is allowed to sit in the directory before it is picked up.
  ## This time can generally be low but if you choose to have a very large file written to the directory and it's potentially slow,
  ## set this higher so that the plugin will wait until the file is fully copied to the directory.
  # directory_duration_threshold = "50ms"
  
  ## A list of the only file names to monitor, if necessary. Supports regex. If left blank, all files are ingested.
  files_to_monitor = ["_statsfile\\.xml"]
  
  ## A list of files to ignore, if necessary. Supports regex.
  # files_to_ignore = [".DS_Store"]
  
  ## Maximum lines of the file to process that have not yet been written by the
  ## output. For best throughput, set this to the size of the output's metric_buffer_limit.
  ## Warning: setting this number higher than the output's metric_buffer_limit can cause dropped metrics.
  # max_buffered_metrics = 1000
  
  ## The maximum amount of file paths to queue up for processing at once, before waiting until files are processed to find more files.
  ## Lowering this value will result in *slightly* less memory use, with a potential sacrifice in speed efficiency, if absolutely necessary.
  # file_queue_size = 100000
  
  ## Name a tag containing the name of the file the data was parsed from. Leave empty
  ## to disable. Be cautious when file name variation is high, as this can increase the
  ## cardinality significantly. Read more about cardinality here:
  ## https://docs.influxdata.com/influxdb/cloud/reference/glossary/#series-cardinality
  # file_tag = ""
  
  ## The dataformat to be read from the files.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  ## NOTE: We currently only support parsing newline-delimited JSON. See the format here: https://github.com/ndjson/ndjson-spec
  data_format = "xml"

  ## Print the internal XML document when in debug logging mode.
  ## This is especially useful when using the parser with non-XML formats like protocol-buffers
  ## to get an idea on the expression necessary to derive fields etc.
  xpath_print_document = true

Logs from Telegraf

2022-04-21T12:17:59Z I! Starting Telegraf 1.22.1
2022-04-21T12:17:59Z I! Loaded inputs: directory_monitor
2022-04-21T12:17:59Z I! Loaded aggregators: 
2022-04-21T12:17:59Z I! Loaded processors: converter parser strings
2022-04-21T12:17:59Z W! Outputs are not used in testing mode!
2022-04-21T12:17:59Z I! Tags enabled: 
2022-04-21T12:17:59Z D! [agent] Initializing plugins
2022-04-21T12:17:59Z D! [agent] Starting service inputs
2022-04-21T12:17:59Z D! [agent] Stopping service inputs
2022-04-21T12:17:59Z W! [inputs.directory_monitor] Exiting the Directory Monitor plugin. Waiting to quit until all current files are finished.
2022-04-21T12:17:59Z D! [parsers.xml::directory_monitor] XML document equivalent: "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
2022-04-21T12:17:59Z D! [parsers.xml::directory_monitor] Number of configs: 0
2022-04-21T12:17:59Z D! [parsers.xml::directory_monitor] XML document equivalent: "<?xml-stylesheet type=\"text/xsl\" href=\"MeasDataCollection.xsl\"?>"
2022-04-21T12:17:59Z D! [parsers.xml::directory_monitor] Number of configs: 0
2022-04-21T12:17:59Z E! [inputs.directory_monitor] Error while reading file: 'buf/xxxx_statsfile.xml.gz'. XML syntax error on line 1: unexpected EOF
2022-04-21T12:17:59Z D! [agent] Input channel closed
2022-04-21T12:17:59Z D! [agent] Processor channel closed
2022-04-21T12:17:59Z D! [agent] Processor channel closed
2022-04-21T12:17:59Z D! [agent] Processor channel closed
2022-04-21T12:17:59Z D! [agent] Stopped Successfully
2022-04-21T12:17:59Z E! [telegraf] Error running agent: input plugins recorded 1 errors

System info

Telegraf 1.22.1

Docker

No response

Steps to reproduce

  1. Run Telegraf with an XML file in the monitored directory

Expected behavior

The directory_monitor plugin reads the whole file and passes it to the parser as a single document.

Actual behavior

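directory_monitor reads the file line by line and passes each line to the XML parser separately. Each line is parsed as its own document, which fails with "XML syntax error on line 1: unexpected EOF" (see the logs above).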
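
To illustrate the difference between the two behaviors, here is a minimal Python sketch. It is not Telegraf code. It uses the standard library's xml.etree.ElementTree, and the element names (measCollecFile, measValue) and the function names are made up for this example; only the two header lines are taken from the debug log above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. The first two lines match what the debug log
# shows; the element names are assumptions for illustration only.
DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<?xml-stylesheet type="text/xsl" href="MeasDataCollection.xsl"?>\n'
    '<measCollecFile>\n'
    '  <measValue>42</measValue>\n'
    '</measCollecFile>\n'
)


def parse_line_by_line(text):
    """Treat every line as a separate XML document (the behavior reported here).

    Returns a list of (line_number, error_message_or_None) tuples.
    """
    results = []
    for number, line in enumerate(text.splitlines(), start=1):
        try:
            ET.fromstring(line.encode("utf-8"))
            results.append((number, None))
        except ET.ParseError as exc:
            results.append((number, str(exc)))
    return results


def parse_at_once(text):
    """Hand the whole file to the parser as one document (the expected behavior)."""
    return ET.fromstring(text.encode("utf-8"))


if __name__ == "__main__":
    for number, error in parse_line_by_line(DOC):
        print(number, error or "ok")
    root = parse_at_once(DOC)
    print(root.tag, root.find("measValue").text)
```

In this sketch the XML declaration and the stylesheet instruction cannot be parsed on their own because neither contains a root element, which matches the point where the plugin gives up in the logs. Parsing the whole text at once succeeds.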

Additional info

No response

Labels

area/xml
bug (unexpected problem or unintended behavior)
plugin/input (requests for new input plugins; issues/PRs that are related to input plugins)