feat: describegpt add lookup support
#3170
- Reorganized the usage text into logical sections
- tag-vocab now supports only the regular CSV format; removed the text file format of key-value pairs separated by a colon
- qsvlite supports only local files for tag-vocab; feature-capable builds can use CKAN, the disk cache, and the dathere scheme
Pull request overview
This PR adds lookup support to the describegpt command, enabling the --tag-vocab option to load tag vocabulary from multiple sources including local files, remote URLs (HTTP/HTTPS), CKAN resources (ckan://), and dathere:// scheme resources. This extends the tag inference capability by allowing users to constrain tags to a predefined vocabulary that can be centrally managed and accessed remotely.
Key changes:
- Integration of the existing `lookup` module to handle remote resource fetching and caching for tag vocabularies
- Addition of new CLI options (`--cache-dir`, `--ckan-api`, `--ckan-token`) to configure remote resource access
- Comprehensive test coverage for both valid and error cases (invalid CSV format, missing files)
- Updated documentation with detailed examples for all supported tag vocabulary sources
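
To make the supported `--tag-vocab` sources concrete, here is a minimal sketch of how a source string might be classified by its scheme. It is not the PR's code: the `VocabSource` enum and `classify_vocab_source` function are hypothetical names introduced for illustration, and the actual `lookup` module may dispatch differently (for example, it may not match schemes case-insensitively).

```rust
/// Hypothetical classification of a `--tag-vocab` value (not taken from the PR).
#[derive(Debug, PartialEq, Eq)]
enum VocabSource<'a> {
    /// Anything without a recognized scheme is treated as a local path.
    LocalFile(&'a str),
    /// Full HTTP or HTTPS URL.
    Remote(&'a str),
    /// Resource identifier following the `ckan://` prefix.
    Ckan(&'a str),
    /// Resource path following the `dathere://` prefix.
    Dathere(&'a str),
}

/// Classify a source string by its scheme prefix.
/// Assumption: schemes are matched case-insensitively.
fn classify_vocab_source(spec: &str) -> VocabSource<'_> {
    // ASCII lowercasing keeps byte offsets identical, so slicing `spec`
    // by the prefix length below is safe.
    let lower = spec.to_ascii_lowercase();
    if lower.starts_with("http://") || lower.starts_with("https://") {
        VocabSource::Remote(spec)
    } else if lower.starts_with("ckan://") {
        VocabSource::Ckan(&spec["ckan://".len()..])
    } else if lower.starts_with("dathere://") {
        VocabSource::Dathere(&spec["dathere://".len()..])
    } else {
        VocabSource::LocalFile(spec)
    }
}
```

Under this sketch, a qsvlite-style build would accept only the `LocalFile` variant and reject the others with an error, while a feature-capable build would hand the remaining variants to the lookup and caching layer.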
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/cmd/describegpt.rs | Added lookup module integration for tag vocabulary loading, new CLI arguments for cache/CKAN configuration, CSV parsing logic for tag vocab files, and initialization logic for tag vocabulary cache settings. Also reorganized help text sections and moved --format option under Common options. |
| tests/test_describegpt.rs | Added three new test cases: valid tag vocabulary CSV usage, invalid CSV with missing column, and non-existent file error handling. |
| docs/Describegpt.md | Added comprehensive documentation for Tag Options including --tag-vocab, --cache-dir, --ckan-api, --ckan-token with examples demonstrating local files, remote URLs, and CKAN resource usage. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
resolves #3123