Optimize Lexer::readName #1813

Merged
spawnia merged 4 commits into webonyx:master from kasparsklavins:patch-1 on Dec 15, 2025

@kasparsklavins
Contributor

Finding the name's length with strspn(...) and copying it out with a single substr(...) call is faster than looping through one character at a time. As names can only contain the ASCII characters [A-Za-z0-9_], it's safe to skip the UTF-8 checks.
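To make the comparison concrete, here is a minimal standalone sketch of the two approaches. The function and constant names (`readNameLoop`, `readNameSpan`, `NAME_CHARS`) are hypothetical and chosen for illustration; this is not the library's actual `Lexer::readName` code, only the general technique described above.

```php
<?php

// Hypothetical constant: the set of bytes allowed in a name.
const NAME_CHARS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_';

/** Baseline: inspect one byte at a time and append it to the result. */
function readNameLoop(string $body, int $start): string
{
    $value = '';
    $length = strlen($body);
    for ($i = $start; $i < $length; ++$i) {
        $char = $body[$i];
        if (strpos(NAME_CHARS, $char) === false) {
            break; // first byte that cannot be part of a name
        }
        $value .= $char;
    }

    return $value;
}

/** Span-based: measure the run of name bytes once, then copy it once. */
function readNameSpan(string $body, int $start): string
{
    $length = strspn($body, NAME_CHARS, $start);

    return substr($body, $start, $length);
}
```

Both functions should return the same string for any input and offset; the span-based version simply moves the scanning and copying into two native calls.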

Benchmark

A benchmark that tokenizes the introspection query shows promising results on my modest potato running PHP 8.3 with opcache enabled.

bench.php
<?php

use GraphQL\Language\Lexer;
use GraphQL\Language\Source;
use GraphQL\Language\Token;
use GraphQL\Type\Introspection;

require __DIR__ . '/vendor/autoload.php';

$source = new Source(Introspection::getIntrospectionQuery());

$run = function() use ($source) {
    $lexer = new Lexer($source);
    do {
        $token = $lexer->advance();
    } while ($token->kind !== Token::EOF);
};

$run(); // warmup
$run(); // warmup

$times = [];
for ($i = 0; $i < 100; ++$i) {
    $start = hrtime(true);
    $run();
    $times[] = hrtime(true) - $start;
}

// hrtime(true) returns nanoseconds; dividing by 1_000 reports microseconds.
var_dump([
    'min' => min($times) / 1_000,
    'max' => max($times) / 1_000,
    'avg' => array_sum($times) / count($times) / 1_000,
]);

| | Min | Max | Average |
|---|---|---|---|
| Before | 243.5μs | 658.6μs | 268.1μs |
| After | 168.5μs (-30%) | 492.2μs (-25%) | 188.4μs (-30%) |

Collaborator

@spawnia left a comment


Clever, nice find!

kasparsklavins and others added 2 commits December 12, 2025 18:29
Co-authored-by: Benedikt Franke <benedikt@franke.tech>

Copilot AI left a comment


Pull request overview

This PR optimizes the Lexer::readName method by replacing a character-by-character loop with PHP's native strspn() and substr() functions. Since GraphQL names are restricted to ASCII characters matching [_A-Za-z][_0-9A-Za-z]*, it's safe to use byte-level string operations without UTF-8 character handling. In the author's benchmark, the change cuts average tokenization time for the introspection query by about 30%.

Key changes:

  • Replaced character-by-character loop in readName() with strspn() for counting valid name characters and substr() for extracting the name value
  • Simplified cursor advancement to a single moveStringCursor() call
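Why the byte-level scan stays correct on UTF-8 input can be checked with a small sketch. In UTF-8, every byte of a multi-byte sequence has its high bit set (value 0x80 or above), so it can never match an ASCII-only mask, and within the matched run the byte count equals the character count. The names below (`ASCII_NAME_CHARS`, `nameLengthAt`) are hypothetical and not part of the library.

```php
<?php

// Hypothetical constant: ASCII-only mask of bytes allowed in a name.
const ASCII_NAME_CHARS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_';

/** Length in bytes of the run of name characters starting at $byteOffset. */
function nameLengthAt(string $body, int $byteOffset): int
{
    return strspn($body, ASCII_NAME_CHARS, $byteOffset);
}

/** True if every byte of $text lies outside the ASCII range (>= 0x80). */
function allBytesAreNonAscii(string $text): bool
{
    for ($i = 0, $n = strlen($text); $i < $n; ++$i) {
        if (ord($text[$i]) < 0x80) {
            return false;
        }
    }

    return true;
}

// A name followed directly by a two-byte character (U+00E9).
$body = "name_1\u{00E9}rest";
$length = nameLengthAt($body, 0);
$name = substr($body, 0, $length);

// Sample 2-, 3- and 4-byte sequences: none of their bytes can match the mask.
$multiByteSample = "\u{00E9}\u{4E2D}\u{1F600}";
$sampleIsNonAscii = allBytesAreNonAscii($multiByteSample);
```

Because the scan stops at the first byte of the multi-byte character, `substr()` never cuts a character in half, and the cursor can advance by the same amount in bytes and in characters.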

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| src/Language/Lexer.php | Optimized readName() method to use strspn() and substr() instead of character loop |
| CHANGELOG.md | Documented the Lexer name tokenization optimization |


@spawnia spawnia merged commit 59fd4a8 into webonyx:master Dec 15, 2025
18 checks passed
@spawnia
Collaborator

spawnia commented Dec 15, 2025
