Getting BufferChunkOverflowError from non-buffered copy plugin! #2928

@pranavmarla

Describe the bug

In my Fluentd config, I use the copy plugin to send the same logs to two Kafka clusters (each using the kafka2 plugin). In each kafka2 section, I set the max buffer chunk size (i.e. chunk_limit_size) to 600 KB (600,000 bytes).
(See Fluentd config below)

I am seeing multiple error messages like this:

{"time":"2020-04-02 09:54:14.883 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 465698bytes record is larger than buffer chunk limit size\"","worker_id":5}

{"time":"2020-04-02 11:11:31.630 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 35711bytes record is larger than buffer chunk limit size\"","worker_id":5}

There are two problems with these error messages:

  1. The BufferChunkOverflowError is being reported by the copy plugin which, as far as I know, is non-buffered!
  2. According to the error messages, the BufferChunkOverflowError is being triggered by log sizes (e.g. ~466 KB, ~36 KB) that are all much smaller than the actual chunk_limit_size set in my kafka2 sections: 600 KB!
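To make the second point concrete, here is a minimal Python sketch that pulls the record size out of the two error lines quoted above and compares it with the configured chunk_limit_size. The helper name `record_size` and the regular expression are illustrative assumptions, not part of Fluentd; the log lines and the 600,000-byte limit are taken from this report.

```python
import json
import re

# chunk_limit_size from both kafka2 <buffer> sections in this report.
CHUNK_LIMIT_SIZE = 600_000

# The two error lines quoted above, verbatim (raw strings keep the \" escapes).
LOG_LINES = [
    r'{"time":"2020-04-02 09:54:14.883 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 465698bytes record is larger than buffer chunk limit size\"","worker_id":5}',
    r'{"time":"2020-04-02 11:11:31.630 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 35711bytes record is larger than buffer chunk limit size\"","worker_id":5}',
]

# Hypothetical pattern matching the wording seen in these two messages.
RECORD_SIZE = re.compile(r"a (\d+)bytes record is larger than buffer chunk limit size")


def record_size(line):
    """Return the record size in bytes reported by one log line, or None."""
    message = json.loads(line)["message"]
    match = RECORD_SIZE.search(message)
    return int(match.group(1)) if match else None


sizes = [record_size(line) for line in LOG_LINES]
for size in sizes:
    verdict = "over" if size > CHUNK_LIMIT_SIZE else "under"
    print(f"{size} bytes is {verdict} the {CHUNK_LIMIT_SIZE}-byte limit")
```

Both reported sizes come out under the configured limit, which is why the error text looks inconsistent with the config.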

Expected behavior
What I expect is:

  1. Any BufferChunkOverflowError should only be generated by output plugins that actually have buffers, like the kafka2 plugin -- NOT the copy plugin!
  2. The BufferChunkOverflowError should only be triggered when the log size is greater than the configured chunk_limit_size!

Your Environment

  • td-agent version: 1.9.2
  • fluent-plugin-kafka version: 0.12.3
  • Operating system: Ubuntu 18.04.4 LTS (Bionic Beaver)
  • Kernel version: 4.15.0-88-generic

Your Configuration

...
<match **>

  @type copy
  @id publish_logs_to_outputs

  # Kafka cluster 1
  <store ignore_error>
    
    @type kafka2
    @id kafka_cluster_1

    default_topic xxx
    
    brokers xxx

    sasl_over_ssl true
    username xxx
    password xxx
    ssl_ca_cert xxx
    ssl_verify_hostname false
    
    <format>
      @type json
    </format>

    <buffer>
      @type file
      
      chunk_limit_size 600000
      total_limit_size 500g

      flush_mode interval
      flush_interval 10s
      flush_thread_count 12
    </buffer>

  </store>

  # Kafka cluster 2
  <store ignore_error>
    
    @type kafka2
    @id kafka_cluster_2

    default_topic xxx
    
    brokers xxx

    sasl_over_ssl true
    username xxx
    password xxx
    ssl_ca_cert xxx
    ssl_verify_hostname false
    
    <format>
      @type json
    </format>

    <buffer>
      @type file
      
      chunk_limit_size 600000
      total_limit_size 500g

      flush_mode interval
      flush_interval 10s
      flush_thread_count 12
    </buffer>

  </store>

</match>
...

Your Error Log
(copied from above)

I am seeing multiple error messages like this:

{"time":"2020-04-02 09:54:14.883 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 465698bytes record is larger than buffer chunk limit size\"","worker_id":5}

{"time":"2020-04-02 11:11:31.630 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 35711bytes record is larger than buffer chunk limit size\"","worker_id":5}

Additional context

To clarify, most of the messages I see are what I expect, and indicate that the kafka2 config (including chunk_limit_size) is working as expected.

For example, most of the messages look like this:

{"time":"2020-04-02 11:37:29.573 -0400","level":"warn","message":"[kafka_cluster_2] chunk bytes limit exceeds for an emitted event stream: 8293577bytes","worker_id":4}

There is no problem with these messages: they come from the buffered kafka2 plugin, and they are triggered by a log larger than chunk_limit_size (600 KB). They are expected, because I expect logs > 600 KB to be ignored by the kafka2 plugin.

My problem is that I ALSO sometimes see the unexpected error messages shown at the top, which come from the non-buffered copy plugin and complain about logs that are smaller than chunk_limit_size (600 KB).
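The distinction between the expected and unexpected messages can be sketched as a small Python classifier over the log lines quoted in this report. The function `classify` and both regular expressions are illustrative assumptions based only on the wording of these three messages; they are not a Fluentd API.

```python
import json
import re

# chunk_limit_size from both kafka2 <buffer> sections in this report.
CHUNK_LIMIT_SIZE = 600_000

# One expected warning and the two unexpected errors, verbatim from this report.
LOG_LINES = [
    r'{"time":"2020-04-02 11:37:29.573 -0400","level":"warn","message":"[kafka_cluster_2] chunk bytes limit exceeds for an emitted event stream: 8293577bytes","worker_id":4}',
    r'{"time":"2020-04-02 09:54:14.883 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 465698bytes record is larger than buffer chunk limit size\"","worker_id":5}',
    r'{"time":"2020-04-02 11:11:31.630 -0400","level":"error","message":"[publish_logs_to_outputs] ignore emit error error_class=Fluent::Plugin::Buffer::BufferChunkOverflowError error=\"a 35711bytes record is larger than buffer chunk limit size\"","worker_id":5}',
]

PLUGIN_ID = re.compile(r"^\[([^\]]+)\]")   # leading "[<@id>]" of the message
BYTE_COUNT = re.compile(r"(\d+)bytes")     # first "<n>bytes" in the message


def classify(line):
    """Summarise which plugin id logged the line and whether the size exceeds the limit."""
    entry = json.loads(line)
    message = entry["message"]
    size = int(BYTE_COUNT.search(message).group(1))
    return {
        "level": entry["level"],
        "plugin_id": PLUGIN_ID.match(message).group(1),
        "bytes": size,
        "over_limit": size > CHUNK_LIMIT_SIZE,
    }


results = [classify(line) for line in LOG_LINES]
for result in results:
    print(result)
```

Only the kafka_cluster_2 warning reports a size over the limit; the two lines logged under the copy plugin's id (publish_logs_to_outputs) report sizes under it.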
