Experiment: fuzzy completion based on simd #15837

glepnir · 2024-10-09T13:58:13Z

some experimental attempts. I might try some new algorithms. will make a benchmark when I am satisfied with it. It's pretty loose at the moment. but no relevant fuzzy completion tests are broken. I'm guessing there are still edge cases that aren't caught. more actual behavioral testing is needed. It will still fall back to the original fuzzy_match for unsupported platforms to avoid incompatibilities.

(don't expect too much and pay too much attention. It may not be accepted 😊)

yegappan

Do you have any performance data running fuzzy match over a large number of strings using this new function?

src/search.c

glepnir · 2024-10-12T05:29:27Z

Do you have any performance data running fuzzy match over a large number of strings using this new function?

In preparation. I will update here once the results are available. :) idea is that it will only try single bytes and if it contains multiple bytes it will still fall back to the original fuzzy match

ychin · 2024-10-18T20:20:02Z

src/search.c

+	if (match_mask)
+	{
+	    // find position
+	    int shift = __builtin_ctz(match_mask);


What's the compatibility of this function __builtin_ctz?

ychin · 2024-10-18T20:25:24Z

src/search.c

+#endif
+
+#ifdef PLATFORM_X86
+    #include <immintrin.h>  // For x86/x64 SIMD intrinsics


I think using immintrin.h is a bad idea, because it pulls in all intrinsics including those from SSE3 / SSE4 / AVX instruction sets. I'm assuming you are targeting SSE2 (that's the one guaranteed by x64), but the CPU you are targeting isn't guaranteed to support say SSE4 or AVX. The MSVC compiler allows you to use all kinds of instructions even if the compiler flag doesn't specify it (e.g. with -mavx2). On your machine it will work and then on some random person's old machine this code will just crash, which is not great.

I think it's better to just include the headers you need to be explicit what instruction sets you are pulling in. So in this case you probably want emmintrin.h (which only contains SSE2 instructions)

See https://stackoverflow.com/questions/11228855/header-files-for-x86-simd-intrinsics

Yes I've been on arm for a while. So a bit sloppy. I'll test on an x64

FWIW I looked at all the intrinsics you were using and they did seem like they are SSE2 only (from referencing Intel's docs), so I think any x64 would work.

My comment is just that ideally you should have a target hardware spec (just saying baseline SSE2 is fine too since it works on all x64 devices so your __x86_64__ ifdef check suffices) and make sure you stick to it. Including the all-inclusive immintrin.h makes it harder to stick to it since it allows newer intrinsics to be called whereas you want to the smaller headers that only include the specific instructions you want since it's easier to tell at a glance what instruction sets you are using.

I haven't done enough ARM NEON programming though. Are all instructions available to all __ARM_NEON platforms?

Edit: Nevermind the ARM NEON questions. I forgot it's all one instruction set so this isn't really an issue. I guess I'm traumatized having to write x86 SIMD code and needing to be very careful which instruction set version to pull from because some CPUs support some instruction sets but not others 😅. It's kind of a MSVC/x86 specific issue where you need to be quite careful with this.

I really appreciate your advice. I will check carefully when i back here and can i ping you for any improve review?

src/search.c

ychin · 2024-10-18T20:48:02Z

Haven't taken a close look at the actual SIMD implementations yet, but just some initial comments looking at this:

Can you specify what hardware targets you are targeting here? Seems like for x86 you are targeting any x64 chips (which support SSE2 at a minimum) but what's the ARM target? Is this supposed to work on all ARM chips (I'm a little less familiar with ARM NEON than x86 SSE/AVX)? Are we going to have situations where the program could crash because it's running on some older hardware that don't support certain instructions? Also see this comment from mine: Experiment: fuzzy completion based on simd #15837 (comment). If we use newer instructions than what older chips support we may need to switch dynamically instead of just using compiler flags.
Just to be clear, this works with ASCII only right? From initial glance I don't think this code works with anything other than that. To be fair, ASCII is the most common things we use but I just want to be explicit about this. The utilities etc are named in such a way (e.g. simd_tolower) that could imply it works across the board. I would imagine you need some kind of switch to first detect if the string is ASCII only and then dynamically pick which version to use. Or is the plan to extend this to arbitrary text in Unicode or other encodings (seems hard)?
- Speaking of which I am surprised this didn't break any fuzzy tests. I'm wondering if we should add more edge cases to fuzzy matching for Unicode texts.
I wonder how this would be tested. IMO it's probably a good idea to set up a C unit test that compile both versions and feed a list of test data to make sure they perform identically on the same hardware (and do the same for utility functions like simd_tolower too). I guess Vim script tests could work but it's a somewhat indirect way to test this since you are just testing end-to-end and could miss the smaller stuff.
As the other comment mentioned, having a consistent way to evaluate the performance of SIMD code would be useful here (I know you plan to add it later), since adding SIMD specializations add a lot more testing and maintenance burdens and it would be nice to have concrete benchmarking from realistic scenarios. What's the motivation for this change btw? Is fuzzy slow in real usage? Genuine question (since I don't use the feature much). It would be nice to get a feel of (1) how slow it currently is, and (2) how much performance advantage this actually brings (both in just calling the functions itself and also the end-to-end performance). Given that we are doing 16 chars at a time, theoretically we could get up to 16x the speed but usually it's smaller than that.

glepnir · 2024-10-19T05:51:25Z

Just to be clear, this works with ASCII only right?

This is how the initial work is planned, because other encodings are somewhat troublesome.

about 4..I am also looking forward to whether the benchmark here will be improved. wait for a while

glepnir marked this pull request as draft October 9, 2024 13:58

glepnir force-pushed the simd_fuzzy branch 2 times, most recently from f22ae29 to c6b82a5 Compare October 11, 2024 11:51

Experiment: fuzzy completion based on simd

b263d38

glepnir force-pushed the simd_fuzzy branch from c6b82a5 to b263d38 Compare October 11, 2024 11:59

yegappan requested changes Oct 11, 2024

View reviewed changes

src/search.c Show resolved Hide resolved

ychin reviewed Oct 18, 2024

View reviewed changes

src/search.c Show resolved Hide resolved

glepnir mentioned this pull request May 29, 2025

feat: add register completion #17354

Closed

glepnir closed this Jun 20, 2025

ychin mentioned this pull request Jun 22, 2025

feat(fuzzy): implement SIMD-accelerated Smith-Waterman with affine gaps #17581

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Experiment: fuzzy completion based on simd #15837

Experiment: fuzzy completion based on simd #15837

Uh oh!

glepnir commented Oct 9, 2024 •

edited

Loading

Uh oh!

yegappan left a comment

Uh oh!

Uh oh!

glepnir commented Oct 12, 2024 •

edited

Loading

Uh oh!

ychin Oct 18, 2024

Uh oh!

ychin Oct 18, 2024 •

edited

Loading

Uh oh!

glepnir Oct 19, 2024

Uh oh!

ychin Oct 21, 2024 •

edited

Loading

Uh oh!

glepnir Oct 22, 2024

Uh oh!

Uh oh!

ychin commented Oct 18, 2024 •

edited

Loading

Uh oh!

glepnir commented Oct 19, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Experiment: fuzzy completion based on simd #15837

Experiment: fuzzy completion based on simd #15837

Uh oh!

Conversation

glepnir commented Oct 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yegappan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

glepnir commented Oct 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ychin Oct 18, 2024

Choose a reason for hiding this comment

Uh oh!

ychin Oct 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

glepnir Oct 19, 2024

Choose a reason for hiding this comment

Uh oh!

ychin Oct 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

glepnir Oct 22, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ychin commented Oct 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

glepnir commented Oct 19, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

glepnir commented Oct 9, 2024 •

edited

Loading

glepnir commented Oct 12, 2024 •

edited

Loading

ychin Oct 18, 2024 •

edited

Loading

ychin Oct 21, 2024 •

edited

Loading

ychin commented Oct 18, 2024 •

edited

Loading