Document Grain source provenance on files, connector URI sync, and upstream metadata lookup endpoints. Co-authored-by: Cursor <cursoragent@cursor.com>
📝 Walkthrough
This PR updates the OpenAPI specification to version 0.7.2, adding two new Data Connector endpoints for syncing connector URIs and retrieving source provenance metadata. The File schema is extended with a `source_metadata` field, and new Grain-specific provenance types are introduced. Multiple endpoint descriptions are refined for clarity around URI handling and TikTok charges.
Changes: Data Connector API and Provenance
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
spec/openapi.json (1)
4443-4494: ⚡ Quick win
Add the standard 500 response to this endpoint too.
The sibling Data Connector endpoints on Lines 4271-4322 and Lines 4361-4412 both document unexpected server failures, but this new lookup endpoint omits that contract. That leaves generated SDKs/docs with inconsistent error handling for the same resource family.
📄 Suggested addition
 "responses": {
   "200": {
     "description": "Source metadata for the URI",
     "content": {
       "application/json": {
         "schema": {
           "$ref": "#/components/schemas/SourceMetadataResponse"
         }
       }
     }
   },
   "400": {
     "description": "Bad request (e.g. URL source does not match the connector type)",
     "content": {
       "application/json": {
         "schema": {
           "$ref": "#/components/schemas/Error"
         }
       }
     }
   },
   "404": {
     "description": "Data connector not found",
     "content": {
       "application/json": {
         "schema": {
           "$ref": "#/components/schemas/Error"
         }
       }
     }
   },
+  "500": {
+    "description": "An unexpected error occurred on the server",
+    "content": {
+      "application/json": {
+        "schema": {
+          "$ref": "#/components/schemas/Error"
+        }
+      }
+    }
+  },
   "501": {
     "description": "Source metadata lookup is not implemented for this connector type",
     "content": {
       "application/json": {
         "schema": {
           "$ref": "#/components/schemas/Error"
         }
       }
     }
   },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@spec/openapi.json` around lines 4443 - 4494, Add a standard 500 response object to the responses block for the "Source metadata for the URI" endpoint so it matches the sibling Data Connector endpoints: include "500" with description "Unexpected server error" (or similar) and the same application/json content referencing the existing "`#/components/schemas/Error`" schema; update the responses object that currently contains 200, 400, 404, 501, 502 (the Source metadata endpoint using SourceMetadataResponse) to also include this 500 entry.
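The consistency concern in this nitpick (sibling endpoints documenting a 500 while the new one does not) can be checked mechanically. The following is a minimal sketch, not part of the PR: the helper name `operations_missing_response` and the toy spec are hypothetical, and the `post` methods in the toy spec are an assumption, since the HTTP methods of these endpoints are not shown in this conversation.

```python
from typing import List

# HTTP method keys that may appear in an OpenAPI 3.0 path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def operations_missing_response(
    spec: dict, status: str = "500", tag: str = "Data Connectors"
) -> List[str]:
    """Return "METHOD /path" for each operation with `tag` that lacks `status`."""
    missing = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if method not in HTTP_METHODS or not isinstance(op, dict):
                continue
            if tag not in op.get("tags", []):
                continue
            # Response codes are string keys in the JSON document.
            if status not in op.get("responses", {}):
                missing.append(f"{method.upper()} {path}")
    return sorted(missing)


# Toy spec for illustration only; the "post" methods are assumed.
toy_spec = {
    "paths": {
        "/data-connectors/{id}/sync": {
            "post": {
                "tags": ["Data Connectors"],
                "responses": {"200": {}, "500": {}},
            }
        },
        "/data-connectors/{id}/source-metadata": {
            "post": {
                "tags": ["Data Connectors"],
                "responses": {"200": {}, "400": {}, "404": {}, "501": {}},
            }
        },
    }
}

print(operations_missing_response(toy_spec))
```

Against the real file, the same function could be called on the result of `json.load` over `spec/openapi.json`; an empty list would mean every tagged operation documents the status code.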
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@spec/openapi.json`:
- Line 4330: Update the operation description string that currently begins
"Materialize a connector URI (e.g. `grain://recording/<id>`) into a Cloudglue
file..." to avoid promising that `source_metadata` is always populated; either
limit the guarantee to "newly materialized files" or explicitly note that
idempotent returns for pre-existing/legacy Grain imports may have
`source_metadata: null`. Edit the JSON "description" value to include that
clarification so callers know that `source_metadata` may be null for existing
files.
---
Nitpick comments:
In `@spec/openapi.json`:
- Around line 4443-4494: Add a standard 500 response object to the responses
block for the "Source metadata for the URI" endpoint so it matches the sibling
Data Connector endpoints: include "500" with description "Unexpected server
error" (or similar) and the same application/json content referencing the
existing "`#/components/schemas/Error`" schema; update the responses object that
currently contains 200, 400, 404, 501, 502 (the Source metadata endpoint using
SourceMetadataResponse) to also include this 500 entry.
"tags": ["Data Connectors"],
"summary": "Sync a data connector URI into a file",
"operationId": "syncDataConnectorFile",
"description": "Materialize a connector URI (e.g. `grain://recording/<id>`) into a Cloudglue file without starting a downstream job. Idempotent: syncing the same URI returns the existing file. For Grain, the file's `source_metadata` is populated from the recording.",
Don't guarantee source_metadata on every idempotent sync result.
Because this operation can return an already-existing file, older Grain imports can still come back with source_metadata: null per Line 8492. The description should scope that guarantee to newly materialized files or call out the legacy-null case explicitly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@spec/openapi.json` at line 4330, Update the operation description string that
currently begins "Materialize a connector URI (e.g. `grain://recording/<id>`)
into a Cloudglue file..." to avoid promising that `source_metadata` is always
populated; either limit the guarantee to "newly materialized files" or
explicitly note that idempotent returns for pre-existing/legacy Grain imports
may have `source_metadata: null`. Edit the JSON "description" value to include
that clarification so callers know that `source_metadata` may be null for
existing files.
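From a caller's point of view, the legacy-null case described in this comment means a sync result should not be assumed to carry provenance. A minimal client-side sketch, assuming file objects are plain dicts decoded from the JSON response; the helper name `partition_by_provenance` and the `id` values are hypothetical, and only the `source_metadata` key comes from the spec under review:

```python
from typing import List, Tuple


def partition_by_provenance(files: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split file objects into (with source_metadata, without source_metadata).

    A missing key and an explicit null are treated the same way, so files
    that predate the field land in the second list.
    """
    with_meta: List[dict] = []
    without_meta: List[dict] = []
    for file in files:
        if file.get("source_metadata") is not None:
            with_meta.append(file)
        else:
            without_meta.append(file)
    return with_meta, without_meta


# Illustrative data only; the ids and the metadata contents are made up.
synced = [
    {"id": "file-new", "source_metadata": {"source": "grain"}},
    {"id": "file-legacy", "source_metadata": None},
    {"id": "file-older"},
]
fresh, legacy = partition_by_provenance(synced)
print([f["id"] for f in fresh], [f["id"] for f in legacy])
```

Files in the second list could then be looked up through the separate source-metadata endpoint if provenance is needed, which is one reason the description should state the null case explicitly.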
Summary
- `/data-connectors/{id}/sync` to materialize a connector URI (e.g. `grain://recording/<id>`) into a Cloudglue file without starting a downstream job; idempotent for the same URI.
- `/data-connectors/{id}/source-metadata` to fetch upstream source metadata for a connector URI without creating a file (Grain supported; other types return 501).
- `source_metadata` on the File schema plus `GrainSourceMetadata`, `SourceMetadata`, and `SourceMetadataResponse` schemas for Grain recording provenance (participants, AI summary, action items, HubSpot links, etc., where Grain returns them).
Test plan
- `spec/openapi.json` parses as valid OpenAPI 3.0
Made with Cursor
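The status codes discussed for the source-metadata lookup (200, 400, 404, 501, plus server-side failures) suggest how a client might branch on the response. This is a rough sketch under assumptions: the function name `classify_lookup_status` and the outcome labels are hypothetical, and the meanings of 400, 404, and 501 are taken from the response descriptions quoted in the review above.

```python
# Outcome labels are illustrative; the status meanings follow the
# response descriptions quoted in the review.
LOOKUP_OUTCOMES = {
    200: "ok",
    400: "bad_request",  # e.g. URL source does not match the connector type
    404: "connector_not_found",
    501: "unsupported_connector_type",  # lookup not implemented for this type
}


def classify_lookup_status(status: int) -> str:
    """Map an HTTP status from the source-metadata lookup to a coarse outcome."""
    if status in LOOKUP_OUTCOMES:
        return LOOKUP_OUTCOMES[status]
    # Any other 5xx (including 500 and 502) is treated as a server-side failure.
    if 500 <= status <= 599:
        return "server_error"
    return "unexpected"


for code in (200, 404, 501, 502):
    print(code, classify_lookup_status(code))
```

Keeping 501 separate from the generic 5xx branch matters here because, per the summary, it signals an unsupported connector type rather than a transient failure, so retrying would not help.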