feat: move tree-sitter configuration to dedicated file #3700

Merged
amaanq merged 8 commits into tree-sitter:master from amaanq:config
Sep 30, 2024
Member

@amaanq amaanq commented Sep 29, 2024

Closes #3637

Problem

The current way to configure a tree-sitter grammar repo is by adding a tree-sitter section to the package.json. This is not ideal because it increases coupling to Node and leaves newcomers confused about why changing a package.json file influences CLI behavior. Adding more fields to the package.json also increases the complexity of a file that, ideally, should stick to Node-related configuration data only. Moving to a dedicated file makes it clear that all tree-sitter-related configuration lives in one place, and it also makes it easier for us to automatically derive more data to populate in other bindings files (license info, author, etc.).

Solution

We will now migrate to a dedicated tree-sitter.json configuration file. This file is created and populated when a user runs tree-sitter init, which now prompts the user for input much like npm init does. Additionally, if init is run and a package.json file is detected, the user is asked whether they'd like the CLI to automatically migrate their config to the new tree-sitter.json file. The package.json is parsed and the key fields for migration are extracted, including the author, version, description, name, and repository URL.
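
To make the migration step concrete, here is a minimal Python sketch (not the CLI's actual Rust implementation) of pulling those fields out of a parsed package.json. The function name `migrate_package_json` is hypothetical, and the output layout (`grammars`, `metadata`, `authors`, `links`) is an assumption about how a tree-sitter.json could be organized; check the real schema before relying on it.

```python
def migrate_package_json(package: dict) -> dict:
    """Build a tree-sitter.json-style dict from a parsed package.json.

    The output key layout is an assumption made for illustration.
    """
    # npm allows "author" as a plain string or as an object (name/email/url).
    author = package.get("author", "")
    if isinstance(author, str):
        author = {"name": author}

    # npm allows "repository" as a plain string or as an object with "url".
    repository = package.get("repository", "")
    if isinstance(repository, dict):
        repository = repository.get("url", "")

    # Assumed convention: the npm package is named "tree-sitter-<grammar>".
    name = package.get("name", "")
    if name.startswith("tree-sitter-"):
        name = name[len("tree-sitter-"):]

    # Carry over each entry of the old "tree-sitter" section, adding the name.
    grammars = [{"name": name, **entry} for entry in package.get("tree-sitter", [])]

    return {
        "grammars": grammars or [{"name": name}],
        "metadata": {
            "version": package.get("version", "0.1.0"),
            "description": package.get("description", ""),
            "authors": [author],
            "links": {"repository": repository},
        },
    }
```

Passing the result to `json.dumps(..., indent=2)` would give the text to write to tree-sitter.json. The sketch copies unknown keys from the old section unchanged rather than validating them.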

The loader now holds all the relevant data structures that represent the old package.json tree-sitter fields and new tree-sitter.json fields. Compatibility is retained for the old tree-sitter field in a package.json file until 0.25, at which point it will be removed.
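
One way such a compatibility fallback could look, sketched in Python under stated assumptions: the function name `load_grammar_config` is hypothetical, the precedence (new file first, then the old package.json section) is assumed rather than taken from the loader's code, and the `grammars` key is an assumed layout for tree-sitter.json.

```python
import json
from pathlib import Path


def load_grammar_config(repo_dir):
    """Return (source_file, grammar_entries) for a grammar repo.

    Assumed precedence: tree-sitter.json wins over the legacy
    "tree-sitter" section of package.json.
    """
    root = Path(repo_dir)

    # Preferred: the dedicated configuration file.
    new_file = root / "tree-sitter.json"
    if new_file.is_file():
        data = json.loads(new_file.read_text(encoding="utf-8"))
        return "tree-sitter.json", data.get("grammars", [])

    # Legacy fallback: the "tree-sitter" section of package.json.
    old_file = root / "package.json"
    if old_file.is_file():
        data = json.loads(old_file.read_text(encoding="utf-8"))
        if "tree-sitter" in data:
            return "package.json", data["tree-sitter"]

    # Neither source is present.
    return None, []
```

Returning the source file name alongside the entries lets a caller print a deprecation warning when the legacy path was taken.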

I'm putting this PR up now and not once I've finished the docs so that people can take a look and review it.

TODO

  • Doc updates

@amaanq amaanq force-pushed the config branch 7 times, most recently from f521aa8 to 6232f6b September 29, 2024 08:13
Member

@ObserverOfTime ObserverOfTime left a comment

This is what I had in mind:

  • If package.json exists, the configuration should be moved to tree-sitter.json (with a warning)
  • If tree-sitter.json (or another file specified via -c / --config) exists, it should read it
    • If any of the required fields are missing, it should exit with an error
  • If neither file exists, it should prompt the user for the basic info (with defaults):
    Parser name: foo (from cwd)
    CamelCase parser name: Foo
    Description: Foo grammar for tree-sitter
    Repository URL: https://github.com/tree-sitter/tree-sitter-foo
    TextMate Scope: source.foo
    File types (space-separated): .foo
    Version: 0.1.0
    License: MIT
    Author name:
    Author email:
    Author URL:
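
The flow proposed above can be sketched roughly as follows, in Python for brevity. Everything named here is hypothetical: `plan_init`, `default_answers`, and the `REQUIRED_FIELDS` tuple are illustrative, since the comment does not say which fields are required. The sketch also assumes that an existing tree-sitter.json takes precedence when both files are present, and that the parser name is the directory name with a `tree-sitter-` prefix stripped.

```python
from pathlib import Path

# Hypothetical choice of required fields; the proposal does not list them.
REQUIRED_FIELDS = ("name", "scope")


def plan_init(has_package_json, config):
    """Decide what init should do: "read", "migrate", or "prompt".

    `config` is the parsed tree-sitter.json as a dict, or None if absent.
    """
    if config is not None:
        missing = [field for field in REQUIRED_FIELDS if field not in config]
        if missing:
            # The proposal says to exit with an error in this case.
            raise ValueError("missing required fields: " + ", ".join(missing))
        return "read"
    if has_package_json:
        return "migrate"  # move the old configuration over, with a warning
    return "prompt"


def default_answers(cwd):
    """Derive the default prompt answers from the working directory name."""
    name = Path(cwd).name
    if name.startswith("tree-sitter-"):
        name = name[len("tree-sitter-"):]
    # foo -> Foo, c_sharp -> CSharp
    parts = name.replace("-", "_").split("_")
    camel = "".join(part[:1].upper() + part[1:] for part in parts)
    return {
        "name": name,
        "camelcase": camel,
        "description": f"{camel} grammar for tree-sitter",
        "repository": f"https://github.com/tree-sitter/tree-sitter-{name}",
        "scope": f"source.{name}",
        "file_types": [f".{name}"],
        "version": "0.1.0",
        "license": "MIT",
    }
```

For a directory named `tree-sitter-foo`, `default_answers` reproduces the defaults listed above (`foo`, `Foo`, `source.foo`, and so on); the author fields are left out because they have no default.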
    

Member

@ObserverOfTime ObserverOfTime left a comment

These template files also need to be updated:

  • __init__.py (description)
  • binding_test.go (stripped URL)
  • go.mod (stripped URL)

@amaanq amaanq merged commit ea3846a into tree-sitter:master Sep 30, 2024
@ObserverOfTime
Member

We forgot to add a PARSER_VERSION placeholder.

@amaanq
Member Author

amaanq commented Sep 30, 2024

done

@bm-w
Contributor

bm-w commented Oct 21, 2024

This PR broke my usage of tree-sitter-loader. The migration strategy only works when using the Tree-sitter CLI, and then only when the parser repo is the current directory. In my case I’m using tree-sitter-loader by itself, in which case there is no “migration”, but just the pre-existing fallback that parses a very reduced LanguageConfiguration from src/grammar.json.

And it turns out that even if I can manually select such a reduced LanguageConfiguration based on the name, there is no way to load a Language from it (Loader::language_by_id is private, and all the public methods don’t work because the reduced LanguageConfiguration lacks the required data). Edit: I just opened #3816 to remedy this.

jfly added a commit to jfly/tree-sitter that referenced this pull request Feb 9, 2026
A number of grammars in the wild have this character in their name:

- sogaiu/tree-sitter-janet-simple#7
- tree-sitter/tree-sitter-c-sharp#408
- tree-sitter/tree-sitter-embedded-template#45
- tree-sitter/tree-sitter-ql-dbscheme#7

I read through tree-sitter#3700 and
tree-sitter#3637, and it doesn't
look like there was much discussion about this name regex, so hopefully
this is an acceptable change?
WillLillis pushed a commit that referenced this pull request Feb 10, 2026
github-actions bot pushed a commit that referenced this pull request Feb 10, 2026

(cherry picked from commit 7d3c321)
WillLillis pushed a commit that referenced this pull request Feb 10, 2026

(cherry picked from commit 7d3c321)