Diskqueue garbles data containing multi-byte characters #26335

@ijokarumawak

I ran into the same issue that was previously reported on Discuss. That post was closed without any comments, so I investigated how the disk queue contributes to it.
The source appears to be the JSON parser implemented in the elastic/go-structform project, which the disk queue uses when decoding events.
Decoding the same data with Go's standard json.Unmarshal() does not garble it.
See the Go test linked below for further details.
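
For comparison, here is a minimal standard-library sketch of the baseline behavior described above. It is not the test from the Gist and does not touch the disk queue or go-structform. The `event` type and the `roundTrip` helper are hypothetical names made up for illustration. It only shows that a string containing multi-byte characters survives an `encoding/json` encode/decode cycle unchanged.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// event is a hypothetical stand-in for a queued event with one string field.
type event struct {
	Message string `json:"message"`
}

// roundTrip encodes msg inside an event with encoding/json and decodes it
// back, returning the decoded field value.
func roundTrip(msg string) (string, error) {
	encoded, err := json.Marshal(event{Message: msg})
	if err != nil {
		return "", err
	}
	var decoded event
	if err := json.Unmarshal(encoded, &decoded); err != nil {
		return "", err
	}
	return decoded.Message, nil
}

func main() {
	// Same field value as in the failing test output: JSON text that
	// contains three multi-byte (3 bytes each in UTF-8) characters.
	input := `{"name": "桃太郎"}`

	got, err := roundTrip(input)
	if err != nil {
		panic(err)
	}

	// Byte length and rune count differ for multi-byte input, which is
	// the kind of input that triggers the reported garbling.
	fmt.Println("bytes:", len(input), "runes:", utf8.RuneCountInString(input))
	fmt.Println("round trip intact:", got == input)
}
```

A decoder that handles multi-byte input correctly should return the field value unchanged, as the standard library does here. The failing test output below shows the disk queue path returning a different string.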

  • Steps to Reproduce:

I posted a Go test in my Gist that reproduces the issue:
https://gist.github.com/ijokarumawak/b5fc3d6106e8517ef2b07c8ce36e9fc4

Test result:

Running tool: /usr/local/go/bin/go test -timeout 30m -tags integration -run ^TestSerialize$ github.com/elastic/beats/v7/libbeat/publisher/queue/diskqueue

--- FAIL: TestSerialize (0.00s)
    /Users/koji/dev/elastic/beats/libbeat/publisher/queue/diskqueue/serialize_test.go:64: 
        	Error Trace:	serialize_test.go:64
        	Error:      	Not equal: 
        	            	expected: "{\"name\": \"桃太郎\"}"
        	            	actual  : "{\"name\": \"太郎\\\"}\"}"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1 +1 @@
        	            	-{"name": "桃太郎"}
        	            	+{"name": "太郎\"}"}
        	Test:       	TestSerialize
FAIL
FAIL	github.com/elastic/beats/v7/libbeat/publisher/queue/diskqueue	0.686s
FAIL
