Skip to content

Add JSON schemas for pyrefly.toml and pyproject.toml validation#2100

Closed
Prathamesh-tech-eng wants to merge 8 commits intofacebook:mainfrom
Prathamesh-tech-eng:main
Closed

Add JSON schemas for pyrefly.toml and pyproject.toml validation#2100
Prathamesh-tech-eng wants to merge 8 commits intofacebook:mainfrom
Prathamesh-tech-eng:main

Conversation

@Prathamesh-tech-eng
Copy link
Contributor

Description:
This PR adds JSON schemas (pyrefly.json and pyproject-tool-pyrefly.json) to enable editor support (auto-completion, validation, and documentation) for Pyrefly configuration files in pyrefly.toml and pyproject.toml.

Fixes #536

Notes:

  • I've constructed this schema based on crates/pyrefly_config/src/config.rs and and crates/pyrefly_config/src/base.rs
  • I used kebab-case for keys (e.g., project-includes) as defined by the Serde structs.
  • I placed the schema in the schemas/ directory

@meta-cla
Copy link

meta-cla bot commented Jan 13, 2026

Hi @Prathamesh-tech-eng!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Copy link
Contributor

@connernilsen connernilsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Prathamesh-tech-eng, thanks for working on this!

I have a few recommendations for changes, and an additional followup to make sure this is integrated into Schemastore as well.

For the followup, can we:

  1. make a PR to schemastore to add our pyrefly.json there.
  2. update their pyproject.toml to have a tool entry for Pyrefly.
  3. Add some tests like the ones for pyproject.toml? Doesn't need to be a ton, but just some stuff to verify what we're doing is generally correct.

Comment on lines +237 to +260
"errors": {
"description": "Error configuration overrides for matched files.",
"type": "object",
"additionalProperties": {
"type": "boolean"
}
},
"replace-imports-with-any": {
"description": "Module glob patterns to replace with typing.Any for matched files.",
"type": "array",
"items": {
"type": "string"
}
},
"untyped-def-behavior": {
"description": "Override untyped function behavior for matched files.",
"type": "string",
"enum": ["check-and-infer-return-type", "check-and-infer-return-any", "skip-and-infer-return-any"]
},
"ignore-errors-in-generated-code": {
"description": "Override generated code error handling for matched files.",
"type": "boolean"
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is possible, so if you struggle to find a way to do this, don't worry about it.

Could we pull this out into a subschema and do two things with it?

  1. flatten the subschema into the top-level object
  2. reuse that sub-schema here

This would help us avoid redefining definitions in multiple places, which means we won't have to remember to update multiple things if there are changes to the config base object.

Again, not sure if flattening is possible, but it would be awesome if it works!

Comment on lines +191 to +193
"isolation-dir": true,
"extras": true
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, these aren't required fields, right? Just fields that are allowed with the buck type?

"required": ["command"],
"properties": {
"command": true,
"repo_root": true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repo root should be for all types, not just custom

- Extract configBase into reusable  subschema
- Flatten configBase properties into top-level schema
- Reuse configBase in sub-config via
- Fix build-system: move repo_root to all types, add not conditions
- Simplify pyproject-tool-pyrefly.json to reference pyrefly.json
- All tests passing
@Prathamesh-tech-eng
Copy link
Contributor Author

@connernilsen I've addressed all the feedback! Here is a summary of the changes:

  1. Extracted configBase to avoid repetition
    I created a $defs/configBase section containing all the shared properties (like errors, untyped-def-behavior, replace-imports-with-any, etc.).

  2. Flattened ConfigBase into the root
    All configBase properties are now duplicated in the main properties section so they're available at the root level. This ensures the schema works correctly with additionalProperties: false.

  3. Applied ConfigBase to sub-config
    The sub-config items now use allOf with { "$ref": "#/$defs/configBase" }. This ensures that any future changes to the base config automatically propagate to sub-configs without manual duplication.

  4. Fixed Build System Logic
    I cleaned up the validation logic:

Moved repo_root: It is now a top-level property inside build-system, accessible to both buck and custom types (addressing your comment about it being for all types).

Added strict not conditions:

If type: "buck", the schema explicitly forbids command.

If type: "custom", the schema requires command but forbids isolation-dir and extras.

Clarified optional fields: isolation-dir and extras are defined as properties but not required (answering your question about them).

  1. Simplified pyproject-tool-pyrefly.json
    I updated this to just reference pyrefly.json via $ref rather than duplicating 270+ lines. This aligns with the SchemaStore pattern you mentioned.

Once this is approved, I'll proceed with opening the PR to SchemaStore.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity for more than 2 weeks.

If you are still working on this this pull request, please add a comment or push new commits to keep it active. Otherwise, please unassign yourself and allow someone else to take over.

Thank you for your contributions!

@Prathamesh-tech-eng
Copy link
Contributor Author

Hi @connernilsen ,

Hope you're doing well. It's been a while since we last connected, so I just wanted to follow up and check if there have been any updates regarding the changes.

@github-actions github-actions bot removed the stale label Feb 1, 2026
@connernilsen
Copy link
Contributor

Hey @Prathamesh-tech-eng, sorry, I've had it open in my browser but haven't been able to spare the time to look into it yet. I'll sit down sometime this week and give a review

Copy link
Contributor

@connernilsen connernilsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Prathamesh-tech-eng, things are looking good!

Just a quick question, were you able to test this? If so, how? I was quickly trying to load it in to see if there were things I could try out with it, but wasn't sure where to start.

Also, not sure if you saw this, but there's apparently a Rust schema generator that might be worth checking out too I heard about recently. If you're interested in making this a bigger project, we could add that to this or a follow up PR and work on adding some script that runs in CI to see if the version in schemastore is the same that we have generated internally.

Copy link
Contributor

@connernilsen connernilsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry just a few more requests.

Also test instructions if you have just so I can validate some things

@Prathamesh-tech-eng
Copy link
Contributor Author

HI @connernilsen, Addressed all the feedback , here's a summary of the changes:

  • additionalProperties set to true
  • python-platform changed from enum to examples (non-exhaustive)
  • errors description clarified to note any valid error code works
  • Top-level schema now uses allOf with $ref to configBase (no duplication)
  • .gitignore updated to track test files and docs

To test : Run pip install jsonschema toml referencing then python schemas/validate_schemas.py. It validates both test TOML files against the schemas. You can also test in VS Code with the "Even Better TOML" extension.

I looked into schemars (https://github.com/GREsau/schemars), which can derive JSON Schema from Rust structs with #[derive(JsonSchema)]. That could auto-generate these schemas from ConfigBase/ConfigFile and we could add a CI check to detect drift between the generated schema and what's in schemastore. Happy to explore that as a follow-up PR if you think it's worth pursuing.

…ow additionalProperties, make python-platform non-exhaustive
- errors: accept both boolean and severity strings (ignore/info/warn/error)
- Add missing configBase fields: tensor-shapes, recursion-depth-limit, recursion-overflow-handler
- Add 'ty' to enabled-ignores enum
- Fix project-includes default to include **/*.ipynb
- Add baseline field (top-level)
- Remove non-existent repo_root from build-system
- Add ignore-if-build-system-missing and search-path-prefix to build-system
- Fix pyproject-tool-pyrefly.json  to use relative path
- Update test files to exercise new fields and severity strings
@Prathamesh-tech-eng
Copy link
Contributor Author

I have also Performed additional fixes found during cross-referencing the schema against the Rust config source of truth:

  • errors values: Now accepts both booleans and severity strings ("ignore", "info", "warn", "error") via oneOf, matching the Rust Severity enum
  • Missing configBase fields: Added tensor-shapes, recursion-depth-limit, recursion-overflow-handler (these were present in Rust's ConfigBase but missing from the schema; they also apply to sub-config items via $ref)
  • enabled-ignores: Added missing "ty" variant (for # ty: ignore directives)
  • project-includes default: Fixed from ["**/*.py*"] to ["**/*.py*", "**/*.ipynb"] to match the Rust default
  • baseline: Added missing top-level string field for suppressing existing errors
  • build-system: Removed non-existent repo_root field (computed at runtime, not a config option), added ignore-if-build-system-missing (boolean) and search-path-prefix (array) which were present in the Rust BuildSystem struct
  • pyproject-tool-pyrefly.json: Changed $ref from https://pyrefly.org/schemas/pyrefly.json to relative pyrefly.json so it resolves locally without depending on the URL being live
  • Test files: Updated to exercise severity string values (missing-import = "warn", invalid-argument = "error") and the newly added fields

@connernilsen
Copy link
Contributor

I looked into schemars (https://github.com/GREsau/schemars), which can derive JSON Schema from Rust structs with #[derive(JsonSchema)]. That could auto-generate these schemas from ConfigBase/ConfigFile and we could add a CI check to detect drift between the generated schema and what's in schemastore. Happy to explore that as a follow-up PR if you think it's worth pursuing.

Yeah, that would be awesome if you're interested! If not, then we can file that as a task and come back to it later!

Also, thanks for doing all of this, it looks awesome! I'm going to pull this in and start testing it, and I'll merge it and let you know if there are any followups.

@meta-codesync
Copy link

meta-codesync bot commented Feb 9, 2026

@connernilsen has imported this pull request. If you are a Meta employee, you can view this in D92721357.

@Prathamesh-tech-eng
Copy link
Contributor Author

Thanks! Yeah, I'm definitely interested. I'll start looking into schemars and see how it can be integrated to auto-generate the schemas from the Rust config structs. I'll open a follow-up PR once I have something working. Might ping you if I run into any blockers!

And glad the schemas look good, let me know if anything comes up during testing!

Copy link
Contributor

@grievejia grievejia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review automatically exported from Phabricator review in Meta.

@meta-codesync
Copy link

meta-codesync bot commented Feb 13, 2026

@connernilsen merged this pull request in 4fc48b4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Pyrefly config to schemastore

4 participants