Skip to content

Conversation

@kamyuentse
Copy link
Contributor

@kamyuentse kamyuentse commented Mar 21, 2018

  1. Use vectorized compare based matching for header values. (More chars, fewer ranges?)
  2. Use vectorized shuffle based matching for URI. (Fewer chars, more ranges?)

Simple bench result

With AVX2

Bench with RUSTFLAGS="-C target-feature=+avx2,+bmi" cargo bench --features=nightly

test bench_httparse       ... bench:         259 ns/iter (+/- 3) = 2714 MB/s
test bench_httparse_short ... bench:          56 ns/iter (+/- 1) = 1214 MB/s
test bench_pico           ... bench:         402 ns/iter (+/- 6) = 1748 MB/s
test bench_pico_short     ... bench:          53 ns/iter (+/- 1) = 1283 MB/s

With SSE4.2

Bench with RUSTFLAGS="-C target-feature=+sse42,+bmi" cargo bench --features=nightly

test bench_httparse       ... bench:         358 ns/iter (+/- 3) = 1963 MB/s
test bench_httparse_short ... bench:          52 ns/iter (+/- 1) = 1307 MB/s
test bench_pico           ... bench:         426 ns/iter (+/- 5) = 1650 MB/s
test bench_pico_short     ... bench:          55 ns/iter (+/- 1) = 1236 MB/s

Without SIMD

Bench with cargo bench

test bench_httparse       ... bench:         468 ns/iter (+/- 6) = 1502 MB/s
test bench_httparse_short ... bench:          54 ns/iter (+/- 0) = 1259 MB/s
test bench_pico           ... bench:         405 ns/iter (+/- 7) = 1735 MB/s
test bench_pico_short     ... bench:          53 ns/iter (+/- 0) = 1283 MB/s

You may get a different mileage, depends on the specified platform.

Copy link
Owner

@seanmonstar seanmonstar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super impressive!

// ASCII codes to accept URI string.
// i.e. A-Z a-z 0-9 !#$%&'*+-._();:@=,/?[]~
// TODO: Make a stricter checking for URI string?
static URI_MAP: [bool; 256] = byte_map![
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just cause I'm curious, why is this different from just parsing as a token?

Copy link
Contributor Author

@kamyuentse kamyuentse Mar 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From RFC7230 Section 3.2.6, we can get the formalize description of token:

token  = 1*tchar

tchar  = "!" / "#" / "$" / "%" / "&" / "'" / "*"
            / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
            / DIGIT / ALPHA
            ; any VCHAR, except delimiters

Current is_token(b: u8) -> bool; is not satisfy this def, I think that should be changed.

For URI matching, depends on the safeguard requirement, the accepted charset may differ, IMO, we should be careful here, maybe we can provide a strict mode, which is an optional feature here? That's why I extract a URI_MAP here.

src/lib.rs Outdated
// TODO: Make a stricter checking for URI string?
static URI_MAP: [bool; 256] = byte_map![
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this indentation difference intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fault, I should check my editor config.


static HEADER_VALUE_MAP: [bool; 256] = byte_map![
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes! Just checked, this was previously allowing Backspace instead of Horizontal Tab 😢

Thanks for the fix!

}

#[cfg(feature = "nightly")]
#[cfg(all(any(target_arch = "x86", target_arch = "x86_64"), target_feature = "avx2"))]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a user has enabled the nightly feature, but their compile target doesn't support these? We'd need match_url_char_32 versions for those too, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this previous, we need to add an SSE4.2 version here? Is it reasonable?

I think the completed constaints should be:

  • nightly AND (x86 OR x86_64) AND AVX2 => enable AVX2 support;
  • nightly AND (x86 OR x86_64) AND SSE4.2 => enable SSE4.2 support;
  • otherwise, use the fallback implementation.

But this will cause a rather lengthy #[cfg(...)], I don't know how to simply it.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, its probably reasonable. Not too worried about complicated cfg attributes.

@seanmonstar
Copy link
Owner

Oh, not required, but it'd be neat to see also what the bench results are without nightly :D

@seanmonstar
Copy link
Owner

I'm so sorry I forgot about this!

Just a few quick questions:

  • I don't know the answer, so asking: do we need to check target_arch along with target_feature? Does target_feature not imply an arch has it?
  • I think the only thing left is probably some way to test the changes. Specifically, if we can test in CI the avx2, the sse42, and the fallback. I haven't investigated, does the Travis linux environment support these target features? If they are supported, that's simple! In order to test them each, then, we'd probably need cargo features for each case, and adjust the Travis matrix to test with each enabled individually...

@kamyuentse
Copy link
Contributor Author

@seanmonstar The target_feature seems will be reform? I do not check the RFC yet, but I think target_feature is enough. For the CI, I will modify the script once this PR ready. I plan to continue working on this as soon as SIMD stable for x86/x86_64.

@kamyuentse kamyuentse changed the title Initial SIMD support for x86/x86_64 with AVX2. [WIP] SIMD support for x86/x86_64 Apr 11, 2018
let ptr = buf.as_ptr();

#[allow(non_snake_case, overflowing_literals)]
unsafe {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you didn't catch it, there's an accidental indentation here

@seanmonstar
Copy link
Owner

I've taken what was started here and got it working and running tests in #40. Thanks so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants