specialize hexadecimal sequence decoding #489

Merged
dtolnay merged 1 commit into serde-rs:master from yjh0502:master on Sep 26, 2018

@yjh0502 yjh0502 commented Sep 26, 2018

serde_json is slower than expected when parsing long strings filled with hexadecimal escape sequences (\uXXXX). I discovered the problem while testing with the following benchmark case:

https://raw.githubusercontent.com/devinus/poison/master/bench/data/utf-8-escaped.json

serde_json is about two times slower than a moderately optimized C++ implementation on this input. Here is the performance profile of serde_json for the test case, captured with perf record:

# Overhead  Command      Shared Object       Symbol
# ........  ...........  ..................  ............................................................................
#
    25.81%  6_scheduler  librjs.so           [.] serde_json::read::parse_escape
    25.19%  6_scheduler  librjs.so           [.] serde_json::read::decode_hex_escape
    16.59%  6_scheduler  librjs.so           [.] serde_json::read::next_or_eof
    11.78%  6_scheduler  libc-2.27.so        [.] __memcpy_ssse3
    11.27%  6_scheduler  librjs.so           [.] <serde_json::read::StrRead<'a> as serde_json::read::Read<'a>>::parse_str
     4.46%  6_scheduler  librjs.so           [.] <alloc::raw_vec::RawVec<T, A>>::reserve
     3.08%  6_scheduler  librjs.so           [.] core::slice::<impl [T]>::copy_from_slice

This PR makes hex decoding faster by specializing decode_hex_escape, which cuts the time per iteration by about 32% for this case (roughly 1.46x throughput). Here is a benchmark result showing the number of iterations spent decoding the test case, the total time, and the time per iteration:

# without the patch
iterations=10000, 1.558 sec, 155.820 usec/iter
# with the patch
iterations=10000, 1.067 sec, 106.709 usec/iter
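
For readers unfamiliar with the idea, here is a minimal sketch of what specializing decode_hex_escape for a slice-backed reader can look like. It is not the code from this PR: the `HexRead` trait, the `SliceReader` struct, and the `hex_val` helper are hypothetical, simplified stand-ins, and errors are reduced to `Option`. The generic default pulls the four digits through a per-byte call, while the slice override does one length check and then indexes directly.

```rust
/// Map one ASCII hex digit to its value, or None if it is not a hex digit.
fn hex_val(b: u8) -> Option<u16> {
    match b {
        b'0'..=b'9' => Some((b - b'0') as u16),
        b'a'..=b'f' => Some((b - b'a' + 10) as u16),
        b'A'..=b'F' => Some((b - b'A' + 10) as u16),
        _ => None,
    }
}

/// Hypothetical, simplified reader trait (not serde_json's actual `Read`).
trait HexRead {
    /// Return the next input byte, or None at end of input.
    fn next_byte(&mut self) -> Option<u8>;

    /// Generic path: fetch the four digits one at a time via `next_byte`.
    fn decode_hex_escape(&mut self) -> Option<u16> {
        let mut n = 0u16;
        for _ in 0..4 {
            n = (n << 4) | hex_val(self.next_byte()?)?;
        }
        Some(n)
    }
}

/// Reader over an in-memory byte slice.
struct SliceReader<'a> {
    slice: &'a [u8],
    index: usize,
}

impl<'a> HexRead for SliceReader<'a> {
    fn next_byte(&mut self) -> Option<u8> {
        let b = *self.slice.get(self.index)?;
        self.index += 1;
        Some(b)
    }

    /// Specialized path: a single bounds check covers all four digits.
    /// On failure the index is left unchanged (a choice made for this sketch).
    fn decode_hex_escape(&mut self) -> Option<u16> {
        let digits = self.slice.get(self.index..self.index + 4)?;
        let mut n = 0u16;
        for &b in digits {
            n = (n << 4) | hex_val(b)?;
        }
        self.index += 4;
        Some(n)
    }
}
```

With this sketch, a `SliceReader` positioned just after `\u` in `\u00e9` yields `Some(0x00e9)` and advances its index by four; truncated input or a non-hex digit yields `None`.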

Here is the profile after the PR:

# Overhead  Command      Shared Object       Symbol
# ........  ...........  ..................  ..........................................................................................
#
    31.30%  4_scheduler  librjs.so           [.] serde_json::read::parse_escape
    20.64%  4_scheduler  libc-2.27.so        [.] __memcpy_ssse3
    16.11%  4_scheduler  librjs.so           [.] <serde_json::read::StrRead<'a> as serde_json::read::Read<'a>>::parse_str
    11.80%  4_scheduler  librjs.so           [.] serde_json::read::next_or_eof
     6.88%  4_scheduler  librjs.so           [.] <serde_json::read::SliceRead<'a> as serde_json::read::Read<'a>>::decode_hex_escape
     5.93%  4_scheduler  librjs.so           [.] <alloc::raw_vec::RawVec<T, A>>::reserve
     4.56%  4_scheduler  librjs.so           [.] core::slice::<impl [T]>::copy_from_slice

@dtolnay dtolnay left a comment

Thanks!

@dtolnay dtolnay merged commit 1ef3da9 into serde-rs:master Sep 26, 2018

dtolnay commented Sep 26, 2018

I published 1.0.31 containing this fix.

takumi-earth pushed a commit to earthlings-dev/json that referenced this pull request Jan 27, 2026
specialize hexadecimal sequence decoding