Feature/performance: This PR introduces a high number of performance improvements. #33

Closed
r-Larch wants to merge 42 commits into dmitry-brazhenko:main from r-Larch:feature/performance

r-Larch (Contributor) commented Mar 23, 2024

This PR delivers a suite of performance enhancements, primarily spurred by the discussion in issue #2.
I urgently needed these optimizations myself.

I changed a lot in this repository. I hope these changes are welcome and get merged.
Please tell me if anything does not comply with your guidelines or code style.

Key Highlights:

  • Significant performance upgrades have been implemented for .NET 8.0 due to its superior support for high-performance coding practices.
  • While .NET 6.0 and .NET Standard have also received optimizations, they do not support certain APIs necessary for achieving the same level of performance enhancement as .NET 8.0.

For detailed performance metrics, refer to the Performance Benchmarks.

Additionally, this PR introduces a high-performance, low-allocation TokenCount method (benchmarked below as CountTokens). This method is essential for my use case and, I believe, beneficial for others. Its main purpose is to count tokens before a prompt is sent to an LLM, so that the prompt does not exceed the context window size limit. It is also useful for splitting text into manageable chunks for RAG.
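The two use cases above can be sketched as follows. This is a minimal illustration, not code from this PR: `TokenBudget`, `FitsContextWindow`, `ChunkByTokens`, and `CountWords` are hypothetical names, and the token counter is passed in as a delegate so the sketch does not assume any particular SharpToken signature. `CountWords` is only a stand-in that counts whitespace-separated words; a real tokenizer's counting method would be supplied in its place.

```csharp
using System;
using System.Collections.Generic;

public static class TokenBudget
{
    // Hypothetical helper: true when the prompt fits within maxTokens.
    public static bool FitsContextWindow(string prompt, int maxTokens, Func<string, int> countTokens)
        => countTokens(prompt) <= maxTokens;

    // Hypothetical helper: greedily packs paragraphs into chunks whose summed
    // per-paragraph token counts stay at or below maxTokens.
    // Assumptions: the separator inserted when joining is not counted, and a
    // single paragraph larger than maxTokens ends up as its own oversized chunk.
    public static List<string> ChunkByTokens(
        IEnumerable<string> paragraphs, int maxTokens, Func<string, int> countTokens)
    {
        var chunks = new List<string>();
        var current = new List<string>();
        var currentTokens = 0;

        foreach (var paragraph in paragraphs)
        {
            var tokens = countTokens(paragraph);

            // Close the current chunk if adding this paragraph would exceed the budget.
            if (current.Count > 0 && currentTokens + tokens > maxTokens)
            {
                chunks.Add(string.Join("\n\n", current));
                current.Clear();
                currentTokens = 0;
            }

            current.Add(paragraph);
            currentTokens += tokens;
        }

        if (current.Count > 0)
        {
            chunks.Add(string.Join("\n\n", current));
        }

        return chunks;
    }

    // Stand-in counter for illustration only: counts whitespace-separated words,
    // which is NOT how a BPE tokenizer counts tokens.
    public static int CountWords(string text)
        => text.Split(new[] { ' ', '\n', '\r', '\t' }, StringSplitOptions.RemoveEmptyEntries).Length;
}
```

Because only a count is needed, a counting method that never materializes the token list is a natural fit for the delegate, which is where a low-allocation implementation pays off on large inputs.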

EDIT

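With my latest commits, I was able to improve speed further. In the comparison below, SharpToken is now the fastest of the three libraries on .NET 8.0 and .NET Framework, is roughly on par with TokenizerLib on .NET 6.0, and has the lowest allocations on every runtime.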

Performance Overview:

Code: CompareBenchmark.cs

After Optimization:

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.200
  [Host]               : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET 6.0             : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
  .NET 8.0             : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET Framework 4.7.1 : .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256

| Method        | Runtime        | Mean     | Error    | StdDev   | Gen0       | Gen1      | Allocated |
|-------------- |--------------- |---------:|---------:|---------:|-----------:|----------:|----------:|
| SharpToken    | .NET 8.0       | 100.4 ms |  1.95 ms |  1.91 ms |  2000.0000 |         - |  22.13 MB |
| SharpToken    | .NET 6.0       | 169.9 ms |  2.42 ms |  2.15 ms | 24333.3333 | 1000.0000 |  196.3 MB |
| SharpToken    | .NET Framework | 455.3 ms |  8.34 ms |  6.97 ms | 34000.0000 | 1000.0000 | 204.39 MB |
| TiktokenSharp | .NET 8.0       | 211.4 ms |  1.83 ms |  1.53 ms | 42000.0000 | 1000.0000 | 338.98 MB |
| TiktokenSharp | .NET 6.0       | 258.6 ms |  5.09 ms |  6.25 ms | 39000.0000 | 1000.0000 | 313.26 MB |
| TiktokenSharp | .NET Framework | 638.3 ms | 12.47 ms | 16.21 ms | 63000.0000 | 1000.0000 | 378.31 MB |
| TokenizerLib  | .NET 8.0       | 124.4 ms |  1.81 ms |  1.60 ms | 27250.0000 | 1000.0000 | 217.82 MB |
| TokenizerLib  | .NET 6.0       | 165.5 ms |  1.38 ms |  1.16 ms | 27000.0000 | 1000.0000 | 217.82 MB |
| TokenizerLib  | .NET Framework | 499.7 ms |  9.81 ms | 14.07 ms | 40000.0000 | 1000.0000 | 243.79 MB |

Before Optimization:

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3296/23H2/2023Update/SunValley3)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.200
  [Host]               : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET 6.0             : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
  .NET 8.0             : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET Framework 4.7.1 : .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256

| Method        | Runtime        | Mean     | Error    | StdDev   | Gen0       | Gen1      | Allocated |
|-------------- |--------------- |---------:|---------:|---------:|-----------:|----------:|----------:|
| SharpToken    | .NET 8.0       | 214.1 ms |  4.26 ms |  3.99 ms | 33000.0000 | 1000.0000 | 264.44 MB |
| SharpToken    | .NET 6.0       | 267.9 ms |  2.61 ms |  2.31 ms | 33000.0000 | 1000.0000 | 264.82 MB |
| SharpToken    | .NET Framework | 640.2 ms |  9.03 ms |  7.54 ms | 54000.0000 | 2000.0000 | 326.51 MB |
| TiktokenSharp | .NET 8.0       | 218.2 ms |  4.25 ms |  6.49 ms | 42000.0000 | 1000.0000 | 338.98 MB |
| TiktokenSharp | .NET 6.0       | 250.1 ms |  1.64 ms |  1.28 ms | 39000.0000 | 1000.0000 | 313.26 MB |
| TiktokenSharp | .NET Framework | 654.2 ms | 11.57 ms |  9.66 ms | 63000.0000 | 1000.0000 | 378.31 MB |
| TokenizerLib  | .NET 8.0       | 130.3 ms |  2.29 ms |  2.03 ms | 27200.0000 | 1000.0000 | 217.82 MB |
| TokenizerLib  | .NET 6.0       | 166.3 ms |  0.74 ms |  0.58 ms | 27000.0000 | 1000.0000 | 217.82 MB |
| TokenizerLib  | .NET Framework | 489.4 ms |  6.41 ms |  5.68 ms | 40000.0000 | 1000.0000 | 243.79 MB |

Older benchmarks

Before Optimization:

### feat(performance): add benchmark project

- version: 1.2.17 -- (baseline)

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3155/23H2/2023Update/SunValley3)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.200
  [Host]               : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET 6.0             : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
  .NET 8.0             : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET Framework 4.7.1 : .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256

| Method | Job                  | Runtime              | Mean     | Error     | StdDev    | Ratio | RatioSD | Gen0     | Gen1    | Allocated | Alloc Ratio |
|------- |--------------------- |--------------------- |---------:|----------:|----------:|------:|--------:|---------:|--------:|----------:|------------:|
| Encode | .NET 6.0             | .NET 6.0             | 1.732 ms | 0.0309 ms | 0.0274 ms |  0.49 |    0.01 | 191.4063 | 15.6250 |   1.53 MB |        0.62 |
| Encode | .NET 8.0             | .NET 8.0             | 1.387 ms | 0.0277 ms | 0.0406 ms |  0.39 |    0.02 | 191.4063 | 15.6250 |   1.53 MB |        0.62 |
| Encode | .NET Framework 4.7.1 | .NET Framework 4.7.1 | 3.595 ms | 0.0704 ms | 0.1287 ms |  1.00 |    0.00 | 406.2500 | 39.0625 |   2.46 MB |        1.00 |

After Optimization:

### feat(benchmark): add benchmark for large file token count

- version: 1.2.17 + improvements

BenchmarkDotNet v0.13.12, Windows 11 (10.0.22631.3155/23H2/2023Update/SunValley3)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 8.0.200
  [Host]               : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET 6.0             : .NET 6.0.16 (6.0.1623.17311), X64 RyuJIT AVX2
  .NET 8.0             : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  .NET Framework 4.7.1 : .NET Framework 4.8.1 (4.8.9181.0), X64 RyuJIT VectorSize=256

| Method                 | Job                  | Runtime              | Mean        | Error     | StdDev    | Ratio | RatioSD | Gen0      | Gen1     | Gen2     | Allocated | Alloc Ratio |
|----------------------- |--------------------- |--------------------- |------------:|----------:|----------:|------:|--------:|----------:|---------:|---------:|----------:|------------:|
| Encode                 | .NET 6.0             | .NET 6.0             |    973.9 us |  19.27 us |  31.66 us |  0.77 |    0.03 |   62.5000 |   0.9766 |        - |  524034 B |        0.82 |
| Encode                 | .NET 8.0             | .NET 8.0             |    552.3 us |  10.73 us |  11.48 us |  0.44 |    0.01 |    2.9297 |        - |        - |   27841 B |        0.04 |
| Encode                 | .NET Framework 4.7.1 | .NET Framework 4.7.1 |  1,252.7 us |  24.12 us |  23.69 us |  1.00 |    0.00 |  101.5625 |   1.9531 |        - |  640074 B |        1.00 |
|                        |                      |                      |             |           |           |       |         |           |          |          |           |             |
| CountTokens            | .NET 6.0             | .NET 6.0             |    801.2 us |  16.00 us |  19.65 us |  0.65 |    0.02 |   58.5938 |   0.9766 |        - |  496386 B |       0.806 |
| CountTokens            | .NET 8.0             | .NET 8.0             |    557.3 us |  10.89 us |  10.69 us |  0.45 |    0.02 |         - |        - |        - |    4161 B |       0.007 |
| CountTokens            | .NET Framework 4.7.1 | .NET Framework 4.7.1 |  1,235.5 us |  24.52 us |  31.01 us |  1.00 |    0.00 |   97.6563 |   1.9531 |        - |  615801 B |       1.000 |
|                        |                      |                      |             |           |           |       |         |           |          |          |           |             |
| CountTokens_LargeInput | .NET 6.0             | .NET 6.0             | 13,955.6 us | 258.91 us | 242.19 us |  0.64 |    0.01 |  781.2500 | 328.1250 | 125.0000 | 6122849 B |       0.806 |
| CountTokens_LargeInput | .NET 8.0             | .NET 8.0             |  5,781.9 us | 115.07 us | 127.91 us |  0.27 |    0.01 |         - |        - |        - |      75 B |       0.000 |
| CountTokens_LargeInput | .NET Framework 4.7.1 | .NET Framework 4.7.1 | 21,674.7 us | 374.37 us | 350.18 us |  1.00 |    0.00 | 1312.5000 | 500.0000 | 156.2500 | 7596775 B |       1.000 |

r-Larch added 30 commits March 22, 2024 20:20
dmitry-brazhenko (Owner) commented Mar 25, 2024

Hello @r-Larch

Thanks A LOT for this PR.

I will review it and merge it within 1-2 days.

dmitry-brazhenko (Owner) commented

I will edit it a little, resolve all conflicts, and merge after that.

I am trying to understand some of the changes.

r-Larch (Contributor, Author) left a comment

I've added some comments explaining the code decisions.

    if (searchValues is string[] { Length: 0 })
    {
        return new FoundMatch { Success = false };
    }
r-Larch (Contributor, Author) commented

This if statement is a fast path that avoids the heap allocation of searchValues.GetEnumerator() when allowSpecials is empty (not provided).

That is the default case, so the fast path will be taken most of the time.

I could have added a short comment here.

The same check could be added to the net6.0 and netstandard targets too.
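For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of such an early return. It is not the PR's implementation: the `SpecialTokenSearch` class, the `FindFirst` method, the `Value` and `Index` members, and the choice of `IEnumerable<string>` as the parameter type are all assumptions made for illustration. Only the `if` check and the `FoundMatch { Success = false }` result come from the snippet under review.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the result type; Value and Index are assumed members.
public struct FoundMatch
{
    public bool Success { get; set; }
    public string Value { get; set; }
    public int Index { get; set; }
}

public static class SpecialTokenSearch
{
    // Hypothetical method: returns the earliest occurrence of any search value in input.
    public static FoundMatch FindFirst(string input, IEnumerable<string> searchValues)
    {
        // Fast path: an empty array means there is nothing to search for,
        // so return before the foreach below asks for an enumerator.
        if (searchValues is string[] { Length: 0 })
        {
            return new FoundMatch { Success = false };
        }

        var best = new FoundMatch { Success = false, Index = -1 };
        foreach (var value in searchValues)
        {
            var index = input.IndexOf(value, StringComparison.Ordinal);
            if (index >= 0 && (!best.Success || index < best.Index))
            {
                best = new FoundMatch { Success = true, Value = value, Index = index };
            }
        }

        return best;
    }
}
```

The property pattern `string[] { Length: 0 }` performs the type test and the length check in one expression, so callers that pass a non-array collection simply fall through to the general loop.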

dmitry-brazhenko (Owner) commented

I am merging this change here: #36
Looks good to me :)

Thanks a lot for the contribution!

If you observe any issues, please let me know or open a new PR.
