@jqnatividad

resolves #3123

- reorganized the usage text into logical sections
- --tag-vocab now supports only regular CSV format; the text file format with colon-separated key-value pairs was removed
- qsvlite supports only local files for --tag-vocab; feature-capable builds can also use CKAN, the disk cache, and the dathere scheme

Copilot AI left a comment

Pull request overview

This PR adds lookup support to the describegpt command, enabling the --tag-vocab option to load tag vocabulary from multiple sources including local files, remote URLs (HTTP/HTTPS), CKAN resources (ckan://), and dathere:// scheme resources. This extends the tag inference capability by allowing users to constrain tags to a predefined vocabulary that can be centrally managed and accessed remotely.

Key changes:

  • Integration of the existing lookup module to handle remote resource fetching and caching for tag vocabularies
  • Addition of new CLI options (--cache-dir, --ckan-api, --ckan-token) to configure remote resource access
  • Comprehensive test coverage for both valid and error cases (invalid CSV format, missing files)
  • Updated documentation with detailed examples for all supported tag vocabulary sources
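The source types listed in the overview can be told apart by their URL scheme. The following is a minimal sketch of that dispatch, not the code in src/cmd/describegpt.rs: the `VocabSource` enum and the `classify` and `check_allowed` functions are hypothetical names introduced here for illustration, and the `lite_build` flag stands in for the qsvlite restriction to local files mentioned in the PR description.

```rust
/// Hypothetical classification of a --tag-vocab argument (not qsv's own type).
#[derive(Debug, PartialEq)]
enum VocabSource<'a> {
    /// Anything without a recognized scheme is treated as a local path.
    Local(&'a str),
    /// A full http:// or https:// URL.
    Http(&'a str),
    /// The part after "ckan://".
    Ckan(&'a str),
    /// The part after "dathere://".
    Dathere(&'a str),
}

/// Decide which kind of source a spec refers to, matching schemes case-insensitively.
fn classify(spec: &str) -> VocabSource<'_> {
    let lower = spec.to_ascii_lowercase();
    if lower.starts_with("http://") || lower.starts_with("https://") {
        VocabSource::Http(spec)
    } else if lower.starts_with("ckan://") {
        // The prefix is pure ASCII, so slicing at its byte length is safe.
        VocabSource::Ckan(&spec["ckan://".len()..])
    } else if lower.starts_with("dathere://") {
        VocabSource::Dathere(&spec["dathere://".len()..])
    } else {
        VocabSource::Local(spec)
    }
}

/// Assumed policy: a lite build accepts only local files; other builds accept everything.
fn check_allowed(spec: &str, lite_build: bool) -> Result<VocabSource<'_>, String> {
    match classify(spec) {
        VocabSource::Local(path) => Ok(VocabSource::Local(path)),
        other if lite_build => Err(format!("{other:?} requires a feature-capable build")),
        other => Ok(other),
    }
}
```

Under these assumptions, `check_allowed("tags.csv", true)` succeeds, while a made-up spec such as `ckan://my-vocab` is rejected when `lite_build` is true and accepted otherwise. How the real implementation resolves, downloads, and caches each source is handled by the existing lookup module and is not modeled here.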

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/cmd/describegpt.rs | Added lookup module integration for tag vocabulary loading, new CLI arguments for cache/CKAN configuration, CSV parsing logic for tag vocab files, and initialization logic for tag vocabulary cache settings. Also reorganized help text sections and moved --format option under Common options. |
| tests/test_describegpt.rs | Added three new test cases: valid tag vocabulary CSV usage, invalid CSV with missing column, and non-existent file error handling. |
| docs/Describegpt.md | Added comprehensive documentation for Tag Options including --tag-vocab, --cache-dir, --ckan-api, --ckan-token with examples demonstrating local files, remote URLs, and CKAN resource usage. |

jqnatividad and others added 2 commits December 7, 2025 19:34
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@jqnatividad jqnatividad merged commit 85eeb08 into master Dec 8, 2025
17 checks passed
@jqnatividad jqnatividad deleted the 3123-describegpt-add-lookup-support branch December 8, 2025 01:17

Development

Successfully merging this pull request may close these issues.

describegpt: add lookup support for controlled tags vocabulary

2 participants