Releases: dathere/qsv
14.0.0
[14.0.0] - 2026-01-12 📦 "The qsv MCP for Everyone Release" 🎁
Building on our 13.0.0 "AI-native Agent" release last week, qsv 14.0.0 is dedicated to making AI integration seamless, reliable, and easy for everyone.
Previously, installing the qsv MCP Server required a full-fledged development environment and familiarity with command line tools and was not readily usable by non-developers.
This release transforms the qsv MCP Server from a powerful developer tool into a user-friendly, transparently integrated Claude Desktop data-wrangling agent with robust cross-platform support, automatic updates, and comprehensive testing infrastructure.
MCP Desktop Extension (Bundle) - One-Click Installation
The new MCP Desktop Extension provides a streamlined installation experience for Claude Desktop users:
- User-Friendly Package - Pre-configured bundle with automatic qsv binary detection; if no binary is found, it provides installation guidance
- Cross-Platform Support - Works seamlessly on macOS, Windows, and Linux
- Smart Data-wrangling - Its deep knowledge of qsv insulates the user from the nitty-gritty details of the comprehensive toolkit and its hundreds of options, while ensuring fast, effective operations
- Token Efficient - Despite this deep knowledge, the MCP server is still token efficient by including intelligent contextual guidance to help Claude make optimal decisions (USE WHEN, COMMON PATTERNS, ERROR PREVENTION, PERFORMANCE HINTS prompt guidance along with lazy-loading of full qsv --help text when more info is required)
- Security Enhanced - Raw Data is not sent to Claude, only statistical metadata
- Welcome Experience - Includes prompts and examples to get started quickly
- Seamlessly works with both Claude Code and the just launched Claude Cowork! Take qsv beyond data-wrangling chats and unlock even greater potential with an agentic qsv.
The Desktop Extension follows the official MCP Bundle (MCPB) manifest specification v0.3, ensuring compatibility with Claude Desktop and future MCP-compatible applications.
See the MCP documentation for installation instructions.
Breaking Changes
- MCP Skills: qsv-skill-gen binary removed - use qsv --update-mcp-skills instead (requires mcp feature flag)
Added
- feat: MCP Desktop Extension - user friendly installation of qsv MCP Server #3296
- feat: MCP Server: numerous QoL improvements to MCP Desktop Bundle #3298
- feat: MCP skills auto update #3292
- feat: MCP - add expert guidance, common patterns, MCP optimized descriptions & usage hints #3303
- feat: MCP skills generator now extracts performance hints (📇 indexed, 🤯 memory-intensive, 😣 proportional memory) from README.md command table
- feat: MCP Server automatically enables --stats-jsonl flag for stats command to create cache for smart commands
- feat: MCP enhanced tool descriptions with intelligent guidance - USE WHEN, COMMON PATTERNS, ERROR PREVENTION hints
- feat: MCP parameter enhancements with examples for common options (selection, delimiter, etc.)
- feat: MCP comprehensive pipeline tool description with workflows and limitations
- feat: MCP enhanced filesystem tools (list_files, set_working_dir, get_working_dir) with usage guidance
- feat: MCP add auto-detection of qsv binary path for Desktop Extension 5c09672e
- feat: MCP various Quality-of-Life UI/UX improvements b5b338f6
- feat: MCP enhance Desktop Extension with validation and fixes e2e20551
- feat: MCP add prompts for welcome message and examples 2672a74b
- feat: Claude Code GitHub App integration - PR review and issue assistance workflows #3312
- tests: MCP add CI test workflow for qsv MCP server 8732fee3
- docs: MCP add comprehensive Claude Code (CLI) documentation 97a88c4e
- docs: MCP add an MCP Server-specific CLAUDE.md e7e5f9e1
- docs: add qsv pro download badges to README and update description #3295
- docs: add alt text to all download badges cc1c3819
- docs: add mise alternate installation documentation #3304
- docs: MCP update skills markdown documentation #3308
- docs: add MCP Server environment variables section to ENVIRONMENT_VARIABLES.md & dotenv.template
Changed
- refactor: MCP Server - removed applydp command (datapusher+ specific, not needed for general use)
- refactor: MCP use qsv --update-mcp-skill instead of separate qsv-skill-gen binary 13380ba1
- refactor: MCP remove qsv-skill-gen binary, make it an option in qsv gated behind mcp feature flag 9c771ee6
- refactor: MCP more robust output processing - use temp output file and stdout intelligently #3291
- refactor: MCP qsv-skill-gen.rs to preserve positional docopt args when generating skills JSON file 9618a25c
- refactor: MCP make output/temp file processing smarter 207274c7
- refactor: MCP use directory type for filesystem config to clarify restricted access 9650fb41
- refactor: MCP added null checks before iterating arrays 2d0747ab
- refactor: MCP fixed TS output directory to account for prod and test builds b0b12a40
- refactor: MCP address all issues identified during Copilot review 27027e50
- refactor: MCP optimize tokens use - extract concise command descriptions from README #3307
- refactor: MCP fine-tune select guidance 37964123
- docs: with MCP fully implemented - update the logo to make the horse robotic 33f3b9f5
- docs: comprehensive STATS_DEFINITION.md update b443ccc4
- chore: address valid robustness issues in last Copilot review 55a5a300
- chore: delete CITATION.cff file and just depend on Zenodo integration which auto-assigns a DOI on release 9b981b8c
- deps: bump polars to 0.52.0 at py-1.37.1 tag 3bbad1ea
- deps: bump atoi_simd and calamine c7cd928f
- deps: bump data-encoding from 2.9.0 to 2.10.0 09bf3c33
- deps: bump unicase from 2.8.1 to 2.9.0 99f66a3b
- deps: bump csvlens to 15.1 and remove our patched fork d588e36e
- deps: use latest csvlens with marked row export fd706255
- deps: bump blake3 to 1.8.3 and remove our patched fork 05f0efbb
- deps: bump toml from 0.9.10+spec-1.1.0 to 0.9.11+spec-1.1.0 2330b1d2
- deps: bump zerocopy from 0.8.32 to 0.8.33 950564d1
- build(deps): bump serde_json from 1.0.148 to 1.0.149 #3290
- build(deps): bump @modelcontextprotocol/sdk from 1.25.1 to 1.25.2 #3293
- build(deps): bump indexmap from 2.12.1 to 2.13.0 #3294
- build(deps): bump libc from 0.2.179 to 0.2.180 #3299
- build(deps): bump zmij from 1.0.12 to 1.0.13 #3305
- build(deps): bump actions/checkout from 4 to 6 #3309
- build(deps): bump actions/setup-node from 4 to 6 #3310
- deps: bump nightly from 2025-10-24 to 2026-01-09; same as polars f77ea524
- bumped several indirect dependencies
- applied select clippy & Codacy suggestions
- applied several GH Copilot and Claude review suggestions
Fixed
- fix: stats use .get() instead of [] indexing to avoid panics on missing keys when using old stats cache file #3306
- fix: MCP force add tsconfig.json #3301
- fix: MCP correct manifest.json to match official spec v0.3 c783cf2c
- fix: MCP expand template variables in config paths 3177cfe1
- fix: MCP address Copilot review issues in package-mcpb.js ec37b7c7
- fix: MCP replace execSync with execFileSync for security reasons 5209c751
- fix: MCP add promise-based deduplication for metadata cache to prevent race conditions https...
13.0.0
[13.0.0] - 2026-01-06 🦾 "The Statistical Data-Wrangling Agent Release" 🤖
We welcome 2026 with qsv 13.0.0 - a major milestone that transforms qsv into an AI-native Agent!
This builds on the online AI-Chatbot for CKAN portals we released last September and the expanded describegpt command we released last month. In the coming months, we will continue our march towards even more AI/ML/Graph/FAIR and Data Librarian/Concierge/Advisor/Analyst capabilities across the datHere suite, as we embark on a strategic partnership with the Open Knowledge Foundation to Strengthen Open, FAIR, AI-Ready Data Infrastructure powered by CKAN.
This release introduces first-class support for AI agents through three major new capabilities:
MCP Server - Model Context Protocol Integration
qsv now ships with a built-in Model Context Protocol (MCP) Server enabling seamless integration with AI Chatbots starting with Claude Desktop.
- Local Data - Its "zero-copy" inspired approach allows you to wrangle very large datasets WITHOUT sending raw data; only statistical metadata is sent to Claude! This is not only good for security and privacy - it also overcomes Claude's upload size limit, saves tokens and improves performance!
- 22 MCP Tools: 20 common qsv commands as individual tools + 1 generic tool to access all other 46 commands + 1 pipeline tool
- Natural Language Interface: No need to remember command syntax
- Pipeline Support: Chain multiple operations together seamlessly
See the MCP documentation for detailed setup instructions.
Claude Agent SDK Helper Utilities
New Agent Skills infrastructure provides:
- qsv-skill-gen CLI - Generate skill definitions for AI agents
- Parses qsv USAGE text using qsv-docopt to generate JSON skill definitions. This allows quick update of Agent Skills as commands and options are added & modified.
- Shell-safe example generation with proper quoting
- Comprehensive documentation for integrating qsv into your own AI solutions!
moarstats - Massive Statistical Expansion
The moarstats command received substantial enhancements, adding 24+ MOAR statistical measures:
Advanced Univariate Statistics:
- Bimodality Coefficient - Detect multimodal distributions
- Normalized Entropy - Scaled information content measure (0-1)
- Atkinson Index - Inequality measure with configurable epsilon parameter
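To make two of these measures concrete, here is a minimal sketch using the textbook definitions of Normalized Entropy and the Atkinson Index. It is an assumption that moarstats follows the same formulas; the log base, the handling of epsilon = 1, and the treatment of zero or negative values may differ. The function names are hypothetical and are not part of qsv.

```python
import math
from collections import Counter


def normalized_entropy(values):
    """Shannon entropy scaled to 0-1 by dividing by log2(number of distinct values)."""
    counts = Counter(values)
    n = len(values)
    if len(counts) <= 1:
        return 0.0  # zero or one distinct value carries no information
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(len(counts))


def atkinson_index(values, epsilon=1.0):
    """Atkinson inequality index for strictly positive values (0 = perfect equality)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    n = len(values)
    mean = sum(values) / n
    if epsilon == 1.0:
        # Limit case: the equally-distributed equivalent is the geometric mean.
        ede = math.exp(sum(math.log(v) for v in values) / n)
    else:
        ede = (sum(v ** (1.0 - epsilon) for v in values) / n) ** (1.0 / (1.0 - epsilon))
    return 1.0 - ede / mean
```

A larger epsilon makes the index more sensitive to the low end of the distribution, which is why the parameter is worth exposing.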
Bivariate Statistics:
- Pearson's correlation - Linear correlation coefficient
- Spearman's rank correlation - Monotonic relationship measure
- Kendall's tau - Concordance-based correlation
- Covariance - Joint variability measure
- Mutual Information - Information-theoretic dependency
- Normalized Mutual Information - Scaled mutual information (0-1)
- Multi-dataset joins - --join-inputs for bivariate analysis ACROSS datasets
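For readers comparing these correlation measures, the sketch below computes Pearson's correlation and a simple O(n²) Kendall's tau. It assumes the tau-a variant (no tie correction); the notes do not say which variant moarstats implements, so treat this as an illustration of the concept only. Function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Linear correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Tied pairs count toward neither total in tau-a.
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Pearson measures linear association, while Kendall's tau only looks at whether pairs are ordered the same way, so it is less affected by outliers.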
XSD Type Mapping:
- Automatic inference of W3C XML Schema Definition (XSD) datatypes
- Smart XSD Gregorian date type inferencing with "quick" and "thorough" modes (#3259)
- Support for gYear, gMonth, gDay, gMonthDay, gYearMonth validation
See STATS_DEFINITIONS.md for a comprehensive list of the ~100 statistical metrics qsv compiles!
Breaking Changes
- lens: Default behavior changed to NOT stream from stdin (use explicit flag if needed)
- moarstats: Output now includes additional columns (xsd_type, bivariate stats)
Added
- feat: qsv MCP server #3269
- feat: MCP - expanded file selector for more supported tabular file formats; auto index for files larger than 10mb #3278
- feat: added Claude Agent Skills SDK support 🤖 #3264
- feat: moarstats add "xsd_type" column #3242
- feat: moarstats add Atkinson Index with configurable inequality aversion parameter, Normalized Entropy & Bimodal Coefficient #3243
- feat: moarstats add bivariate stats #3247
- feat: moarstats add normalized mutual info #3256
- feat: moarstats add --force and --jobs options #3253
- feat: moarstats add "xsd_subtype" Gregorian date data types inferencing with --xsd-gdate-scan having fast (default) and comprehensive modes #3259
- feat: qsvdp enable join command that moarstats uses #3252
- docs: added comprehensive stats documentation #3240
Changed
- refactor: describegpt - consolidate JSON response parsing; cache handling; and make DuckDB & Polars error handling more consistent #3241
- refactor: frequency reduce duplication introduced by --weight option #3236
- perf: frequency precompute other_prefix for performance 2dc75ee
- perf: frequency simplify apply_limits* helper functions f0b7f9c
- perf: pivotp convert directly to PlSmallStr for performance b7dbb3f
- refactor: MCP Server to optimize for Local Access to Files #3272
- refactor: MCP Server improvements #3274
- refactor: MCP Server remove examples from ci tests #3277
- refactor: MCP Server add LIFO converted cache #3280
- refactor: MCP Server moar refactoring after tests #3282
- perf: moarstats much faster bivariate calculation #3248
- perf: moarstats optimize non-streaming bivariate stats compilation #3250
- refactor: qsv Skills Agent #3267
- deps: polars bump to rev c241260 #3276
- build(deps): bump itoa from 1.0.16 to 1.0.17 by @dependabot[bot] in #3239
- build(deps): bump human-panic from 2.0.4 to 2.0.5 by @dependabot[bot] in #3234
- build(deps): bump human-panic from 2.0.5 to 2.0.6 by @dependabot[bot] in #3249
- build(deps): bump libc from 0.2.178 to 0.2.179 by @dependabot[bot] in #3265
- build(deps): bump redis from 1.0.1 to 1.0.2 by @dependabot[bot] in #3232
- build(deps): bump rfd from 0.16.0 to 0.17.0 by @dependabot[bot] in #3279
- build(deps): bump rfd from 0.17.0 to 0.17.1 by @dependabot[bot] in #3284
- build(deps): bump serde_json from 1.0.147 to 1.0.148 by @dependabot[bot] in #3238
- build(deps): bump serial_test from 3.2.0 to 3.3.0 by @dependabot[bot] in #3273
- build(deps): bump serial_test from 3.3.0 to 3.3.1 by @dependabot[bot] in #3275
- build(deps): bump tokio from 1.48.0 to 1.49.0 by @dependabot[bot] in #3266
- build(deps): bump url from 2.5.7 to 2.5.8 by @dependabot[bot] in #3286
- build(deps): numerous bumps zmij from 0.1.7 to 1.0.12
- bumped several indirect dependencies
- applied select clippy & Codacy suggestions
- applied several GH Copilot and Claude review suggestions
Fixed
- fix: refresh_cpu_all() -> refresh_cpu_list(sysinfo::CpuRefreshKind::nothing())… #3261
- fix: stats remove redundant check 0977ebf
- fix: moarstats correct kendall_tau formula cf16543
- fix: describegpt and util::run_qsv_cmd - add special case for sample as it expects output differently 6b6039f
- fix: CVE-2025-66414 security vulnerability GHSA-w48q-cv73-mx4w
- fix: RUSTSEC-2026-0001 (rkyv bump) c2d4937
- typo: Portugese → Portuguese
- typo: stats asummes → assumes
AI Contributors
- @jqnatividad collaborated with and orchestrated @Copilot, Claude Code, Cursor and Gemini using various models
Full Changelog: 12.0.0...13.0.0
12.0.0
[12.0.0] - 2025-12-24 🎄
Stuff your virtual stocking and jingle your data bells - qsv 12.0.0 slides down the chimney packed fuller than Santa’s sleigh! Unwrap delightful surprises like the shiny new moarstats command, gift-wrapped weighted statistics, and AI-powered FAIR metadata inferencing now speaking in multiple languages (no elf translation required). As the star on top, meet TOON - the brand new LLM-optimized, token-efficient format - ready to sleigh your AI projects all through 2026. Ho-ho-hold my data, this update’s a festive feast!
Special thanks to @kulnor for advocating, brainstorming & testing many of the new features below!
🌟 Major Features
NEW: moarstats Command
A powerful new command for "moar" advanced statistical analysis, providing statistics beyond what the stats command offers:
- Comprehensive Statistics: 50+ advanced statistical measures including:
  - Detailed outlier analysis (count, sum, average)
  - Winsorized and trimmed means (5%, 10%, 20%, 25%)
  - Multiple dispersion measures (IQR to range ratio, quartile coefficient of dispersion)
  - Distribution statistics (skewness, multiple kurtosis measures)
- Advanced Option (--advanced): Access computationally intensive statistics:
  - Gini coefficient for inequality measurement
  - Excess Kurtosis to measure "tailedness" of the distribution
  - Shannon Entropy for data diversity analysis
- Available on all binary variants for universal access
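As a rough illustration of what some of these measures compute, the sketch below implements trimmed and winsorized means plus the Gini coefficient from their standard definitions. It assumes the percentage is applied to each tail separately and that the cut count is rounded down; moarstats may define these differently. The function names are hypothetical.

```python
def _tail_count(n, proportion):
    """Number of values affected in each tail (assumption: rounded down)."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    return int(n * proportion)


def trimmed_mean(values, proportion):
    """Mean after dropping the lowest and highest `proportion` of values."""
    ordered = sorted(values)
    k = _tail_count(len(ordered), proportion)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, proportion):
    """Mean after clamping each tail to the nearest retained value."""
    ordered = sorted(values)
    n = len(ordered)
    k = _tail_count(n, proportion)
    if k:
        ordered[:k] = [ordered[k]] * k
        ordered[n - k:] = [ordered[n - k - 1]] * k
    return sum(ordered) / n


def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive sum")
    ranked_sum = sum((i + 1) * x for i, x in enumerate(ordered))
    return (2 * ranked_sum) / (n * total) - (n + 1) / n
```

Both robust means reduce the influence of extreme values: trimming discards them, while winsorizing keeps the sample size and replaces them with the nearest retained value.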
Enhanced describegpt Command
Major enhancements to AI-powered data description capabilities:
- ⛩️ Minijinja Template Engine Integration:
  - Custom prompt templating with full Minijinja and Minijinja-contrib filters
  - More powerful and flexible prompt customization
- Multilingual Support:
  - --language option for generating descriptions in any language/dialect
  - Languages: Spanish, Portuguese, Italian, Japanese, Hindi, Arabic
  - Dialects: Franglais, Taglish, Pennsylvania Dutch
  - Constructed Languages: Klingon, High Valyrian, Quenya
  - Personalities: Snoop Dogg, Hans Rosling, Christopher Walken
  - Personas: Gen Z Slang, Silly, Emoji-loving Santa
  - Automatic language detection in --prompt mode
  - SQL comments also generated in requested language
- Advanced Features:
  - --addl-columns option with detailed attribution and system metadata
  - --export-prompt <file> to save the default prompts to the specified file. This file can then be tailored and used with the --prompt-file <file> option.
  - Iterative, session-based SQL RAG with --prompt option
  - Sampling in prompt mode for better SQL generation
  - Lookup table and CKAN support for controlled vocabularies
  - Convenience values for --addl-cols-list (i.e., "everything", "everything!", "moar", "moar!")
Weighted Statistics Support
Comprehensive weighted statistics implementation across multiple commands:
- stats Command (--weight <column>):
  - Weighted mean, standard deviation, variance
  - Weighted MAD (Median Absolute Deviation) and percentiles
  - Weighted modes and antimodes
  - Weighted harmonic and geometric means
  - All weighted calculations handle non-finite values gracefully
- frequency Command (--weight <column>):
  - Weighted frequency distributions
  - Proper handling of weighted "Other" and "ALL UNIQUE" categories
  - Non-finite weights automatically skipped
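The idea behind weighted statistics can be sketched in a few lines. The example below computes a weighted mean and a weighted population variance while skipping rows whose weight is NaN or infinite, mirroring the "non-finite weights skipped" behavior described above. The variance convention (dividing by the sum of weights) is an assumption; the notes do not state which convention the stats command uses. The function name is hypothetical.

```python
import math


def weighted_mean_and_variance(values, weights):
    """Weighted mean and population variance, ignoring non-finite weights."""
    # Keep only rows whose weight is a finite number.
    pairs = [(v, w) for v, w in zip(values, weights) if math.isfinite(w)]
    total_weight = sum(w for _, w in pairs)
    if total_weight <= 0:
        raise ValueError("sum of usable weights must be positive")
    mean = sum(v * w for v, w in pairs) / total_weight
    # Assumption: variance is normalized by the total weight.
    variance = sum(w * (v - mean) ** 2 for v, w in pairs) / total_weight
    return mean, variance
```

With weights of 1 and 3 on the values 1 and 3, the mean moves from the unweighted 2.0 to 2.5, because the second observation counts three times as much.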
Token Object Oriented Notation (TOON) Format Support
- A compact, human-readable encoding of the JSON data model for LLM prompts
- Commands Supporting TOON:
  - describegpt --format TOON
  - frequency --toon
- Benefits: More readable than JSON, easier to parse than CSV for hierarchical data, and a terse, more token-efficient format targeted at LLMs
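To show why TOON is terse, here is a minimal sketch that encodes a list of uniform, flat records in what is assumed to be TOON's tabular array form (a header with the row count and field names, followed by one comma-separated row per record). It deliberately omits the quoting and escaping rules, so it rejects any value that would need them; consult the TOON specification for the real grammar. The output of qsv's own TOON support may differ, and the function name is hypothetical.

```python
def to_toon_table(name, rows):
    """Encode uniform flat dicts as a TOON-style tabular array (simplified)."""
    if not rows:
        raise ValueError("this sketch only handles non-empty arrays")
    fields = list(rows[0])
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = [header]
    for row in rows:
        if list(row) != fields:
            raise ValueError("all rows must share the same fields in the same order")
        lines.append("  " + ",".join(_cell(row[f]) for f in fields))
    return "\n".join(lines)


def _cell(value):
    """Render one primitive; refuse anything that would require quoting."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    text = str(value)
    if text == "" or text != text.strip() or any(ch in text for ch in ',:"\n'):
        raise ValueError("value needs quoting, which this sketch does not implement")
    return text
```

Because field names appear once in the header instead of once per record, the encoding uses fewer tokens than the equivalent JSON array of objects.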
stats Command Enhancements
- Percentile Improvements:
  - --percentile-list special values: "deciles" and "quintiles"
  - Percentile labels now include prefix before value (e.g., "p50: 42.5")
  - Validation of percentile-list on startup
- New Columns: Added n_counts for more detailed count information
- Performance Optimizations:
  - Optimized Stats struct layout
  - Eliminated redundant, unnecessary sorting
  - Removed redundant filtering for weighted stats functions
  - Microoptimizations throughout
transpose Command
- New --long Option: Transform data from wide to long format
  - Column selection support using select syntax
  - Streaming implementation per GitHub Copilot review suggestions
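For readers unfamiliar with the wide-to-long transformation, the sketch below shows the general idea on a small CSV, processing one input row at a time. The output column names ("field" and "value") and the choice of a single identifier column are assumptions for illustration; they are not taken from the transpose command's actual output.

```python
import csv
import io


def wide_to_long(csv_text, id_column):
    """Turn each non-id column of a wide CSV into its own (id, field, value) row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical header names chosen for this sketch.
    writer.writerow([id_column, "field", "value"])
    for row in reader:  # one input row at a time, so memory use stays flat
        for name in reader.fieldnames:
            if name != id_column:
                writer.writerow([row[id_column], name, row[name]])
    return out.getvalue()
```

A wide table with N value columns produces N output rows per input row, which is the shape most plotting and aggregation tools expect.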
diff Command
- Upgraded csv-diff from 0.1.1 to the faster 0.1.2, improving performance in optimal cases by up to 25% 🚀
lens Command
- Aligned --no-streaming-stdin behavior with csvlens upstream
📊 Output Format Changes
schema Command
- Updated $schema from Draft 7 to JSON Schema Draft 2020-12
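Downstream tooling can check which draft a generated schema declares before validating against it. The sketch below compares the $schema value against the standard meta-schema URIs for Draft 7 and Draft 2020-12; the exact string qsv writes is assumed to match the standard URI, and the function name is hypothetical.

```python
# Standard meta-schema identifiers published by json-schema.org.
DRAFT_2020_12 = "https://json-schema.org/draft/2020-12/schema"
DRAFT_07 = "http://json-schema.org/draft-07/schema#"


def declared_draft(schema):
    """Return '2020-12', 'draft-07', or 'unknown' for a parsed JSON Schema dict."""
    uri = str(schema.get("$schema", "")).rstrip("#")
    if uri == DRAFT_2020_12.rstrip("#"):
        return "2020-12"
    if uri == DRAFT_07.rstrip("#"):
        return "draft-07"
    return "unknown"
```

A pipeline could call this on a schema file produced by the schema command and refuse to proceed when its validator does not support the declared draft.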
⚡ Performance Improvements
suite-wide
- Replaced the already fast ryu float-to-string conversion crate with the even faster zmij crate (https://vitaut.net/posts/2025/faster-dtoa/)
stats Command
- Optimized Stats struct memory layout
- Eliminated redundant sorting operations
- Removed unnecessary clone operations
- Better handling of real-world data (assumes no infinity values)
frequency Command
- Microoptimizations for faster frequency computation
- Optimized top_n/bottom_n retrieval
🐛 Bug Fixes
frequency Command
- Fixed behavior when compiling weighted frequencies with ALL_UNIQUE
- Fixed issue where "Other (0),0,0,0" could appear in output
- Proper handling of non-finite weights (automatically skipped)
🏗️ Infrastructure & Quality
Testing
- Test suite expanded from 2,060 to 2,380 tests
- Comprehensive test coverage for all new features
- Weighted statistics thoroughly tested
- Advanced moarstats options validated
Code Quality
- Extensive GitHub Copilot review integration
- Multiple refactoring passes for code clarity
- Clippy suggestions incorporated throughout
- Better error handling and edge case management
FAIR Principles
- Added CITATION.cff (by @rzmk) for academic citation
- Added Zenodo DOI badge for dataset citation
- Enhanced FAIRification of qsv as a research tool
📚 Documentation Improvements
Statistical Documentation
- Comprehensive documentation for statistics produced by stats command (by @kulnor) WIP
- Enhanced usage text for stats, frequency, and moarstats
- Better examples throughout documentation
Command Documentation
- Updated describegpt with multilingual examples
- Added controlled tag vocabulary examples
- Enhanced TOON format documentation
- Better SQL RAG workflow documentation
Migration Notes
Breaking Changes
- schema command: $schema output changed from Draft 7 to Draft 2020-12
  - Most schemas should be compatible
  - Validation tools must support JSON Schema Draft 2020-12
- stats command: Output now includes percentile label prefixes
  - Example: "p50: 10" for the 50th percentile value instead of just the value "10"
  - May affect parsing scripts that expect raw numbers
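Scripts that parsed raw percentile numbers can be adapted with a small helper that accepts both the old and the new form. The sketch below handles only the single-value form shown in the example ("p50: 10"); how multiple percentiles are separated within one field is not specified here, so that part is left out. The function name is hypothetical.

```python
def parse_percentile(cell):
    """Parse 'p50: 42.5' into ('p50', 42.5); a bare '42.5' yields (None, 42.5)."""
    label, separator, value = cell.partition(":")
    label = label.strip()
    if separator and label.lower().startswith("p"):
        return label, float(value)
    # No label prefix: treat the whole cell as a raw number (pre-12.0.0 form).
    return None, float(cell)
```

Accepting both forms lets the same script read stats output produced before and after the upgrade.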
Added
- feat: describegpt add --add-cols and --addl-cols-list <list> options #3179
- feat: describegpt add --language option #3184
- feat: describegpt use minijinja engine for prompt processing #3188
- feat: describegpt add language autodetection in --prompt (chat) mode #3193
- feat: describegpt sampling in prompt mode for better SQL generation… #3198
- feat: describegpt add --prompt sessions for iterative SQL RAG refinement #3200
- feat: describegpt add TOON format support #3205
- feat: frequency add TOON format #3206
- feat: frequency add weighted frequencies #3218
- feat: add new moarstats command #3207
- feat: moarstats add even moar! Now with detailed outliers info! #3208
- feat: moarstats - add configurable ...
11.0.2
[11.0.2] - 2025-12-08
qsv 11.0.2 brings significant enhancements to larger-than-memory data processing, AI-powered metadata inferencing, JSON Schema inferencing & validation, and data viewing capabilities, along with important bug fixes and performance improvements.
All in preparation for at-scale, secure, interactive, "zero-copy" "Data Steward-in-the-Loop" FAIRification on the desktop in qsv pro.
🌟 Major Features
stats & frequency
- Larger than Memory Files: stats & frequency can now handle arbitrarily large files, even when "advanced" statistics are enabled with its new dynamic parallel chunk sizing algorithm! (example stats, frequency)
- N Counts: Added "n_counts" (n_negative, n_zero and n_positive) columns to stats output for more detailed count information for numeric fields.
describegpt
The describegpt command has received substantial improvements for AI-powered metadata inferencing:
- "Neuro-Procedural" Data Dictionaries: combines deterministically computed statistics and frequency distribution data with AI-inferred Human-Friendly Labels and Descriptions to compile an expanded Data Dictionary (not quite "neuro-symbolic" (YET!))
- Chat with your Data!: Improved DuckDB and Polars SQL guidance means more reliable transformations of your Natural Language queries to SQL - leading to fast, deterministic, reproducible, hallucination-free answers! (example, SQL result)
- Format Option: Replaced --json flag with --format option for more flexible output formatting
  - Supports multiple output formats - Markdown (default), TSV and JSON
  - Removed --jsonl option for cleaner API
- Controlled Tag Vocabulary: New tag vocabulary system for consistent categorization
  - --tag-vocab option to specify controlled vocabulary
  - Lookup support for tag vocabularies - retrieve a tag vocabulary from a local or remote CSV using http://, https://, dathere:// and ckan:// URL schemes.
- Enhanced Boolean Inference: --infer-boolean is now enabled by default for better data type detection
- Performance Metrics: Added elapsed time tracking to monitor processing duration
- Improved Prompt Templates: Updated default description prompt with PII/PHI alerts and better attribution metadata
schema & validate
Enhanced JSON Schema inference and validation capabilities:
- Strict Formats: New --strict-formats option for stricter JSON Schema format validation, enforcing JSON Schema format constraints for email, hostname & IP address (IPV4/IPV6) formats.
- Output Option: New --output option for specifying schema output destination
  - Polars schema now uses consistent naming conventions across commands
  - Updated joinp, pivotp, and sqlp commands to use new .pschema.json naming convention
- Configurable Email Validation: validate has numerous options to tweak email validation - taking advantage of schema's email format constraint inferencing.
sample time-series sampling
A new --timeseries sampling method with grouping (hourly, daily, weekly),
adaptive sampling (prefer business hours or weekends) with various aggregation (mean, sum, min, max)
within each interval with configurable starting points (first, last or random).
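The grouping-then-picking idea can be sketched as follows. This example buckets timestamped rows into hourly, daily or weekly intervals and keeps the first, last or a random row from each bucket. It leaves out the adaptive preferences and the aggregations mentioned above, and its parameter names are hypothetical rather than the sample command's actual options.

```python
import random
from datetime import datetime


def sample_timeseries(rows, interval="daily", pick="first", seed=None):
    """Keep one (timestamp, value) row per interval from ISO-8601 timestamped rows."""

    def bucket(ts):
        if interval == "hourly":
            return ts.replace(minute=0, second=0, microsecond=0)
        if interval == "daily":
            return ts.date()
        if interval == "weekly":
            iso = ts.isocalendar()
            return (iso[0], iso[1])  # ISO year and ISO week number
        raise ValueError(f"unknown interval: {interval}")

    groups = {}
    for stamp, value in rows:
        ts = datetime.fromisoformat(stamp)
        groups.setdefault(bucket(ts), []).append((ts, stamp, value))

    rng = random.Random(seed)
    sampled = []
    for key in sorted(groups):
        members = sorted(groups[key], key=lambda m: m[0])  # chronological order
        if pick == "first":
            chosen = members[0]
        elif pick == "last":
            chosen = members[-1]
        elif pick == "random":
            chosen = rng.choice(members)
        else:
            raise ValueError(f"unknown pick: {pick}")
        sampled.append((chosen[1], chosen[2]))
    return sampled
```

Sampling per interval keeps the shape of the series over time while cutting the row count, which plain random sampling does not guarantee.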
lens "real-time" Features
Enhanced CSV viewing capabilities with csvlens integration:
- Auto-Reload: New --auto-reload option to automatically reload file when it changes
  - Useful for monitoring live data files
- Streaming stdin: New --streaming-stdin option for real-time data viewing
  - Supports viewing data as it's being piped in
- Row Marking: Updated csvlens dependency with row marking feature
Breaking Changes
- describegpt: --json flag replaced with --format option
- describegpt: --jsonl option removed
- schema, joinp, pivotp, sqlp: Updated Polars schema naming conventions (existing workflows should work but output format may differ slightly)
Added
- Created Event Logo Archive with AI-generated seasonal/version logos
- describegpt: add controlled vocabulary support for tags #3122
- describegpt: add elapsed time #3168
- describegpt: add lookup support #3170
- excel: add --cell option #3133
- frequency: add dynamic parallel chunk sizing #3135
- lens: add --auto-reload option #3128
- lens: add --streaming-stdin option #3171
- sample: add timeseries sampling options #3130
- schema: infer addl JSON Schema predefined formats - email, ipv4, ipv6, hostname #3125
- schema: add --output option and standardize Polars Schema file name #3126
- stats: dynamic parallel chunk sizing with indexed files #3134
- stats: add n_negative, n_zero, n_positive count columns #3157
- validate: add email validation options #3148
- tests: add tests for https://100.dathere.com/lessons/4 by @rzmk in #3151
- Added Claude AI guidance for contributors
- Enhanced --version output with more comprehensive system metadata
Changed
- refactor: describegpt improve tags inferencing with Tag Vocabulary #3139
- feat: describegpt - major refactor #3143
- feat: describegpt improved Polars SQL processing #3147
- feat: describegpt replace --json option with --format option supporting 3 formats - markdown, json and TSV; remove --jsonl option #3167
- refactor: frequency & stats - parallel chunk sizing - allow forcing of cpu based chunking #3138
- Align partition stdin handling with split/stats pattern by @Copilot in #3162
- deps: use latest polars upstream with new SQL fixes and features (pola-rs/polars@e1be17f)
- build(deps): bump actions/setup-python from 6.0.0 to 6.1.0 by @dependabot[bot] in #3120
- build(deps): bump actix-web from 4.12.0 to 4.12.1 by @dependabot[bot] in #3127
- build(deps): bump flate2 from 1.1.5 to 1.1.7 by @dependabot[bot] in #3159
- build(deps): bump jsonschema from 0.37.1 to 0.37.2 by @dependabot[bot] in #3129
- build(deps): bump jsonschema from 0.37.2 to 0.37.3 by @dependabot[bot] in #3131
- build(deps): bump jsonschema from 0.37.3 to 0.37.4 by @dependabot[bot] in #3140
- build(deps): bump log from 0.4.28 to 0.4.29 by @dependabot[bot] in #3150
- build(deps): bump minijinja from 2.12.0 to 2.13.0 by @dependabot[bot] in #3142
- build(deps): bump minijinja-contrib from 2.12.0 to 2.13.0 by @dependabot[bot] in #3141
- build(deps): bump pyo3 from 0.27.1 to 0.27.2 by @dependabot[bot] in #3137
- build(deps): bump qsv-stats from 0.40.0 to 0.41.0 by @dependabot[bot] in #3136
- build(deps): bump qsv-stats from 0.41.0 to 0.42.0 by @dependabot[bot] in #3156
- build(deps): bump qsv-stats from 0.42.0 to 0.43.0 by @dependabot[bot] in #3169
- build(deps): bump rfd from 0.15.4 to 0.16.0 by @dependabot[bot] in #3121
- build(deps): bump uuid from 1.18.1 to 1.19.0 by @dependabot[bot] in #3146
- Improved qsvpy build process for Apple Silicon
- Updated GitHub Actions workflows for better reliability
- bumped several indirect dependencies
- applied select clippy & Codacy suggestions
- Improved dependency version management
- Better feature flag handling
Fixed
- fix: apply panic on empty selection #3165
- fix: more robust snappy and file extension detection #3166
- fix: partition add proper stdin handling regression introduced when --limit option was added #3161
- Fix broken layout of environment variable documentation by @tmtmtmtm in #3163
Removed
New Contributors
- @Copilot made their first contribution in #3162
*...
10.0.0
[10.0.0] - 2025-11-23
Highlights:
- Enhanced Data Dictionary: describegpt now features an expanded default prompt (v4.0) that generates more comprehensive data dictionaries.
- Parallel Search/Replace Operations: search, searchset, and replace commands now support parallel execution when working with indexed CSV files, delivering significant performance improvements for large datasets.
- Search/Replace Exact Match Options: Added --exact option to search, searchset, and replace commands for precise string matching without regex patterns.
- Enhanced SQL Capabilities: sqlp now supports arbitrary expressions in SQL JOIN constraints, named window references, and new SQL functions including row_number, rank, dense_rank, and array_to_string.
- Improved pivotp Performance: Updated to use Polars' new lazy pivot API with --maintain-order flag for predictable output ordering.
- Luau 0.701: Updated embedded Luau from 0.697 to 0.701 with additional pattern matching documentation and tests.
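Since row_number, rank and dense_rank are easy to confuse, the sketch below reproduces their standard SQL semantics over a single ascending ordering. It illustrates the general definitions only and says nothing about how sqlp or Polars implement them; the function name is hypothetical.

```python
def window_ranks(values):
    """Return (value, row_number, rank, dense_rank) tuples in ascending order."""
    results = []
    rank = dense_rank = 0
    previous = object()  # sentinel that never equals a real value
    for row_number, value in enumerate(sorted(values), start=1):
        if value != previous:
            rank = row_number  # rank jumps past ties, leaving gaps
            dense_rank += 1    # dense_rank increases by exactly one
            previous = value
        results.append((value, row_number, rank, dense_rank))
    return results
```

For the values 10, 20, 20, 30 the two tied rows share rank 2, the next row gets rank 4 but dense_rank 3, and row_number simply counts 1 through 4.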
Added
- search & searchset: add --exact option for literal string matching #3094
- search: parallel search when file is indexed #3096
- searchset: parallel execution when indexed #3097
- replace: add --exact option e73d9bf
- replace: parallel execution when indexed #3098
- sqlp: added support for arbitrary expressions in SQL JOIN constraints d47c44e & 0d2402b
- sqlp: added support for row_number, rank, and dense_rank SQL window functions #3115
- sqlp: added support for named window references #3118
- sqlp: added support for array_to_string list evaluation 64cbf34
- pivotp: added --maintain-order flag for predictable output ordering 02dca12
- describegpt: default-prompt-file v4.0 with expanded Data Dictionary generation 4db0d18
- luau: expanded documentation for string functions using pattern matching a7344e3 & 2dcc9a4
- util::mem_file_check: added platform adjustment factor 421be84
- benchmarks: v7.0 added search & searchset indexed parallel benchmarks 55df784
- benchmarks: v7.1.0 added replace_indexed_parallel benchmark 05c89d8
Changed
- describegpt: refactored for improved reliability 1433bf1 & b6190a4
- frequency: special rank of 0 now assigned to <ALL_UNIQUE> rows effa13b
- frequency: microoptimizations 775bb88 & 29ec7af
- search, searchset & replace: now parallelizable with an index, with significant performance improvements 45fc83d
- search: use faster, non-allocating par_sort_unstable_by_key for improved performance 5f50f23
- search: optimize --quick option 1fc1b85
- search: --preview-match option forces sequential search 017ca6f
- search, searchset & replace: sort chunks instead of raw data for better performance 5b58cb8
- searchset: microoptimizations for performance c4ce324
- replace: remove unneeded index rebuild logic cfdba60
- pivotp: refactored to adapt to Polars' new lazy pivot API #3102
- excel: microoptimize hot loop and formula retrieval f141c1b & 17780b5
- stats: cache repetitive expensive env_var access in hot path a6ad0ce
- stats: multiple microoptimizations 2f41c33 & 9bf43e5 & 00958a1
- validate: updated to jsonschema 0.37.x with improved error handling f45693d & c7ad5d2 & b9ea447
- luau: updated embedded Luau from 0.697 to 0.701 8885dce
- deps: bump polars to latest upstream with numerous SQL and LazyFrame improvements
- deps: bump jsonschema from 0.34 to 0.37.1
- deps: bump syn from 2.0.109 to 2.0.110 d207524
- deps: bump quick-xml from 0.38.3 to 0.38.4 11a5ae4
- deps: bump geosuggest-core from 0.8.1 to 0.8.2 baf3194
- deps: bump geosuggest-utils from 0.8.1 to 0.8.2 c5bcd1b
- deps: bump governor from 0.10.1 to 0.10.2 b0068ef
- deps: bump gzp from 2.0.1 to 2.0.2 2a0b901
- deps: bump indexmap from 2.12.0 to 2.12.1 afa9c1f
- deps: bump mlua from 0.11.4 to 0.11.5 49eedb9
- deps: bump signal-hook-registry from 1.4.6 to 1.4.7 5c2e705
- deps: bump calamine to 0.32 (removed git dependency) 449f162
- deps: bump cached to latest upstream (removed patched fork) 508d1ce
- deps: bump actions/checkout from 5 to 6 f76e009
- deps: removed hashbrown patched fork ad30460
- deps: removed grex patched fork 88cd3fc
- deps: updated Cargo.lock file multiple times with indirect dependency updates
- docs: updated rust-version requirement to 1.91 c288d4d
- docs: prebuilt binaries on Linux and Windows x86_64 are no longer compiled with target-cpu=native 5f892a1
- docs: expanded note about Illegal Instruction (SIGILL) faults and portable builds e4df784
- docs: describegpt update with expanded Data Dictionary example and link to defaults d722afd & cedcd41 & bba4f76
- applied select clippy lint suggestions
- bumped several indirect dependencies
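The indexed parallel search entries above (split the file into chunks, search them concurrently, then sort chunks rather than individual rows) can be pictured with a small sketch. This is only an illustration of the general idea, not qsv's implementation: the function names, the `chunk_size` parameter and the use of list slicing in place of a real CSV index are all assumptions, and Python threads are used purely to show the structure, not to demonstrate a speedup.

```python
import re
from concurrent.futures import ThreadPoolExecutor, as_completed

def search_chunk(start, lines, regex):
    """Return (chunk start offset, matches) for one chunk of lines."""
    matches = [(start + i, line) for i, line in enumerate(lines) if regex.search(line)]
    return start, matches

def parallel_search(lines, pattern, chunk_size=2, workers=4):
    """Search chunks concurrently, then restore input order by sorting chunks."""
    regex = re.compile(pattern)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(search_chunk, start, lines[start:start + chunk_size], regex)
            for start in range(0, len(lines), chunk_size)
        ]
        # as_completed yields chunks in completion order, not input order.
        chunk_results = [f.result() for f in as_completed(futures)]
    # Sort the (few) chunks by offset instead of sorting every matching row.
    chunk_results.sort(key=lambda result: result[0])
    return [match for _, matches in chunk_results for match in matches]

lines = ["apple", "banana", "cherry", "avocado", "grape"]
hits = parallel_search(lines, r"^a")
print(hits)
```

Matches inside a chunk are already in order, so ordering the chunks is enough to reproduce the sequential output.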
Fixed
- count: should still work with "broken" CSVs when polars feature is enabled #3104
- describegpt: more robust SQL escaping to prevent SQL injection e958329
- excel: formula retrieval bug on error b894515
- excel: reverted mistaken alloc optimization for trim path b37361a
- index: added check to confirm that only uncompressed CSV files can be indexed 1be485b
- sqlp: unnest workaround for test compatibility 54d079b
- sqlp: corrected array_to_string test 6c661ac
- docs: fixed typo QSV_MEMORY_HEADROOM_PCT -> QSV_FREEMEMORY_HEADROOM_PCT f15d03e
Removed
- deps: removed polars crates (polars-utils, polars-ops) that are no longer needed a7785f6
- publish: removed target-cpu=native as it causes SIGILL on GitHub Action Runners fd74f8f
Full Changelog: 9.1.0...10.0.0
9.1.0
[9.1.0] - 2025-11-03
FAIRification continues to be a focus, as we tweak key commands that enable us to FAIRify raw data at blazing speed:
- frequency received significant updates in this release, including several new options that make compiling frequency distribution tables easier.
- describegpt now uses the much faster BLAKE3 hash as a cache key (10-20x faster than SHA256) and supports passing complex prompts more easily through the file system.
- qsv-stats - the engine that powers both stats and frequency commands - has been further optimized with the 0.40.0 release, to compile summary statistics as fast as possible - even for very large files - often one to two orders of magnitude faster (10 to 100x faster) than typical Python-based tools.
- Polars has been upgraded to 0.52.0. This vectorized query engine allows us to support more tabular formats & analyze/query millions of rows in seconds in situ - all without loading the data into a database.
- the csv 1.4.0 crate has been tuned further to squeeze out even higher throughput - already ~2 million rows per second!1
These improvements prepare the ground for the upcoming MCP server on qsv pro, which will enable at-scale, configurable, interactive "Data Steward-in-the-loop", value-added FAIRification of privacy-sensitive files.
The qsv pro MCP server will handle not just CSVs but also other formats, including unstructured data - all processed locally on the desktop, without sending your raw data to the cloud.
It will produce AI-ready, standards-compliant metadata (starting with DCAT-US v3, Croissant and schema.org) - ideal context for AI applications and data governance efforts alike.
Added
- frequency: add --pretty-json option c67fd06
- frequency: add --rank-strategy option #3075
- frequency: add --null-text option #3082
Changed
- describegpt: explicitly use frequency's dense rank strategy dc3f270
- describegpt: allow --prompt to be loaded from a text file b11a10c
- describegpt: use much faster BLAKE3 hash for cache key
- frequency: change default rank-strategy from min (AKA "1224" ranking) to dense (AKA "1223" ranking)
- lens: bumped csvlens from 0.13.0 to 0.14.0
- lens: automatically set to monochrome mode when using --find option 8539869
- luau: bumped embedded Luau from 0.694 to 0.697 3e68e29
- stats: fingerprint hash now uses much-faster, parallelizable BLAKE3 instead of SHA256
- table: document that it also creates "aligned TSVs" and Fixed Width Format files aaa84b0
- tests: change default Python to 3.13
- docs: documented that Extended Input Support (🗄️) does .zip auto-decompression
- docs: documented Limited Extended Input Support (🗃️)
- use latest qsv-tuned csv crate with performance optimizations
- build(deps): bump flate2 from 1.1.4 to 1.1.5 by @dependabot[bot] in #3071
- build(deps): bump human-panic from 2.0.3 to 2.0.4 by @dependabot[bot] in #3077
- deps: bump Polars from 0.51.0 at py-1.35.0-beta.1 to 0.52.0 618edf0
- build(deps): bump qsv-stats from 0.39.1 to 0.40.0 by @dependabot[bot] in #3078
- build(deps): bump actions/upload-artifact from 4 to 5 by @dependabot[bot] in #3074
- applied several clippy lint suggestions
- bumped several indirect dependencies
- align nightly to 2025-10-24, the same nightly as Polars
- bumped MSRV to Rust 1.91
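The change of the default rank-strategy from min ("1224") to dense ("1223") is easiest to see on a small example. The sketch below is a plain-Python illustration of the two ranking schemes, not qsv code; the `rank_frequencies` helper and the sample data are made up for this purpose.

```python
from collections import Counter

def rank_frequencies(values, strategy="dense"):
    """Return [(value, count, rank)] sorted by descending count.

    strategy="min":   tied counts share the lowest rank and the next rank skips ("1224")
    strategy="dense": tied counts share a rank and the next rank does not skip ("1223")
    """
    counts = sorted(Counter(values).items(), key=lambda kv: (-kv[1], kv[0]))
    ranked = []
    prev_count = None
    rank = 0
    for position, (value, count) in enumerate(counts, start=1):
        if count != prev_count:
            # A new count starts a new rank: its position (min) or the next integer (dense).
            rank = position if strategy == "min" else rank + 1
            prev_count = count
        ranked.append((value, count, rank))
    return ranked

data = ["a"] * 5 + ["b"] * 3 + ["c"] * 3 + ["d"]
min_ranks = [rank for _, _, rank in rank_frequencies(data, "min")]
dense_ranks = [rank for _, _, rank in rank_frequencies(data, "dense")]
print(min_ranks)    # [1, 2, 2, 4]
print(dense_ranks)  # [1, 2, 2, 3]
```

With dense ranking the highest rank equals the number of distinct counts.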
Fixed
- describegpt: add SQL escaping to eliminate SQL injection attack vector; add .csv extension to --sql-output when Polars SQL query runs successfully ad52a35
- frequency: fix --select option always returning <ALL_UNIQUE> #3082
- fixed some publishing workflows
Removed
- Removed SHA256 and replaced it with the much faster, parallelizable BLAKE3 hash #3072 and #3080
- publish: removed maximize-build-space step in workflows as it was not working as advertised
- tests: removed target-cpu=native RUSTFLAG in CI tests to avoid intermittent SIGILL (Illegal Instruction) faults
Full Changelog: 8.1.1...9.1.0
8.1.1
[8.1.1] - 2025-10-22
Added
Changed
- deps: use latest version of qsv-tuned csv crate 7523e08
- deps: unpin zip from 4.6 and bump to 6 now that geosuggest uses it 957ad6d
- build(deps): bump dns-lookup from 3.0.0 to 3.0.1 by @dependabot[bot] in #3057
- build(deps): bump geosuggest-utils from 0.8.0 to 0.8.1 by @dependabot[bot] in #3058
- build(deps): bump geosuggest-core from 0.8.0 to 0.8.1 by @dependabot[bot] in #3059
- build(deps): bump memmap2 from 0.9.8 to 0.9.9 by @dependabot[bot] in #3060
- build(deps): bump pyo3 from 0.27.0 to 0.27.1 by @dependabot[bot] in #3061
- tweaked several publishing and test GH Actions workflows
- applied clippy::to_string_in_format_args lint suggestion
- bumped several indirect dependencies
Fixed
- use latest csvlens patched fork that fixes panic when using stdin input 34154e6
New Contributors
Full Changelog: 8.1.0...8.1.1
8.1.0
[8.1.0] - 2025-10-20
This minor release features:
- qsv on IBM Z mainframes (s390x)! - now that we have endianness detection, even adding a prebuilt binary for it.
- describegpt: Output Kind and Token Usage have been added to the output, making it easier to parse responses and track LLM costs.
- python: with the latest pyO3.rs 0.27 crate, we're setting the stage to drop support for Python 3.12 and below, targeting free-threaded Python exclusively starting with the 9.0 release. This should allow us to massively boost performance by parallelizing py workloads. It will also power the upcoming FAIRification commands.
- a tuned csv fork based on the just released csv 1.4 crate, increasing performance suite-wide.
Added
- describegpt: add Kind and Token Usage to output a21e117
- add big-endian handling for big-endian platforms (e.g. s390x-unknown-linux-gnu) #3045
- add s390x prebuilt binary (qsv now runs on IBM Z Mainframes!) a3f455c
Changed
- datefmt: Replace localzone crate with iana-time-zone crate #3048
- geoconvert: Improved with the latest geozero fixes needed for Datapusher+ processing of GeoJSON and SHP files.
- python: micro-optimize to remove unnecessary clone; use more idiomatic error_result handling - 777aa14
- docs: update badges with PowerPC Linux GNU, Windows ARM64 MSVC, remove macOS Intel by @rzmk in #3036
- deps: bump bitflags from 2.9.4 to 2.10.0 8d65c1b
- deps: bumped csv crate to 1.4 and reapplied qsv optimizations. For more info, see 4e2f2a0
- deps: bump csvs_convert patch fork 8aa398f
- deps: bump geozero to latest upstream with unreleased fixes - 0a9d1b3
- deps: bump polars to 0.51.0 at py-1.35.0-beta-1 tag
- deps: bump socket2 from 0.6.0 to 0.6.1
- deps: bump whatlang to 0.18 e80e9c0
- build(deps): bump actions/setup-python from 5.0.0 to 6.0.0 by @dependabot[bot] in #3030
- build(deps): bump actix-governor from 0.8.0 to 0.10.0 by @dependabot[bot] in #3046
- build(deps): bump gzp from 1.0.1 to 2.0.0 by @dependabot[bot] in #3033
- build(deps): bump github/codeql-action from 3 to 4 by @dependabot[bot] in #3034
- build(deps): bump flexi_logger from 0.31.4 to 0.31.5 by @dependabot[bot] in #3032
- build(deps): bump flexi_logger from 0.31.5 to 0.31.6 by @dependabot[bot] in #3035
- build(deps): bump flexi_logger from 0.31.6 to 0.31.7 by @dependabot[bot] in #3038
- build(deps): bump libc from 0.2.176 to 0.2.177 by @dependabot[bot] in #3040
- build(deps): bump pyo3 from 0.26.0 to 0.27.0 by @dependabot[bot] in #3055
- build(deps): bump qsv_docopt from 1.8.0 to 1.9.0 by @dependabot[bot] in #3041
- build(deps): bump regex from 1.11.3 to 1.12.1 by @dependabot[bot] in #3043
- build(deps): bump regex from 1.12.1 to 1.12.2 by @dependabot[bot] in #3050
- build(deps): bump reqwest from 0.12.23 to 0.12.24 by @dependabot[bot] in #3049
- build(deps): bump rust_decimal from 1.38.0 to 1.39.0 by @dependabot[bot] in #3047
- build(deps): bump simd-json from 0.16.0 to 0.17.0 by @dependabot[bot] in #3031
- build(deps): bump tikv-jemallocator from 0.6.0 to 0.6.1 by @dependabot[bot] in #3053
- build(deps): bump tokio from 1.47.1 to 1.48.0 by @dependabot[bot] in #3052
- applied select clippy lint suggestions
- updated indirect dependencies
Fixed
- headers: fix stdin handling without explicit "-" for stdin input #3039
Removed
- removed Python 3.10 prebuilts as pyo3 0.27 no longer supports it and Python 3.10 is no longer maintained
- deps: removed patched fork of time-rs now that 0.3.43 has been released fde03b3
Full Changelog: 8.0.0...8.1.0
8.0.0
[8.0.0] - 2025-10-06
(Image: FAIR data principles) 1
Findable, Accessible, Interoperable & Reusable (FAIR) Data is AI-Ready Data.
A week and a half after launching our "People's API" AI Chatbot and "AI-Ready" service, we fine-tune qsv further, as it powers the FAIRification engine that allows us to "open your data" (as a verb) - to infer and calculate AI-Ready, FAIR metadata at blazing speed even for large datasets.
This release features:
- describegpt fixes and improvements
- table can now produce "aligned" TSV and Fixed Width format files
- validate now has Extended Input Support in its RFC 4180 validation mode
- extdedup fixed to dedupe arbitrarily large csv or text files
- luau upgraded from 0.690 to 0.693
- PowerPC64 pre-built binaries - making it more convenient to use qsv on this "power"ful 😉 platform that's widely used in research (thanks to IBM-provided access to its native GitHub Action ppc64le runners! For the next release - qsv on IBM Z Mainframes!)
These changes set the stage for even more advanced, powerful, configurable FAIRification capabilities to
make ALL your Data AI-Ready, Useful, Usable & Used by Machines & Humans alike.
Added
- table: add leftendtab alignment option #3004
- table: add leftfwf (Fixed Width Format) alignment option 590c861
- validate: add Extended Input Support to RFC 4180 validation mode #3012
- added PowerPC64 LE Linux prebuilt
Changed
- describegpt: fine-tuned default LLM Prompt template (v3.1.0) 00e52a3 6b09b7e 5be7f2e
- luau: bump embedded Luau from 0.690 to 0.693 #3017
- schema: make Decimal Type Scale configurable for polars schema with QSV_POLARS_DECIMAL_SCALE env var - f20edd5
- updated optimized csv crate, adding non-allocating StringRecord::trim() and more inline()s 4a1c82a
- deps: bump calamine to 0.31.0 bd7a04c
- deps: Bump polars to 0.51.0 from 0.50.0 at py-1.33.1 tag #2995
- deps: bump polars to 0.51.0 at py-1.34.0-beta.4 tag at revision b973cac (latest upstream) #3022
- deps: bump polars to 0.51.0 at py-1.35.0 tag revision b973cac 4164875
- deps: replace tabwriter with renamed fork qsv-tabwriter #3010
- deps: use patched fork of whatlang-rs. Though our PR was merged, there is still no new release 6afff4f
- build(deps): bump base62 from 2.2.2 to 2.2.3 by @dependabot[bot] in #3003
- build(deps): bump bytemuck from 1.23.2 to 1.24.0 by @dependabot[bot] in #3026
- build(deps): bump chrono from 0.4.41 to 0.4.42 by @dependabot[bot] in #2974
- build(deps): bump fancy-regex from 0.16.1 to 0.16.2 by @dependabot[bot] in #3000
- build(deps): bump flate2 from 1.1.2 to 1.1.3 by @dependabot[bot] in #3027
- build(deps): bump flexi_logger from 0.31.2 to 0.31.3 by @dependabot[bot] in #3005
- build(deps): bump flexi_logger from 0.31.3 to 0.31.4 by @dependabot[bot] in #3008
- build(deps): bump indexmap from 2.11.0 to 2.11.1 by @dependabot[bot] in #2973
- build(deps): bump indexmap from 2.11.1 to 2.11.3 by @dependabot[bot] in #2993
- build(deps): bump indexmap from 2.11.3 to 2.11.4 by @dependabot[bot] in #2999
- build(deps): bump libc from 0.2.175 to 0.2.176 by @dependabot[bot] in #3009
- build(deps): bump mlua from 0.11.3 to 0.11.4 by @dependabot[bot] in #3021
- build(deps): bump regex from 1.11.2 to 1.11.3 by @dependabot[bot] in #3011
- build(deps): bump redis from 0.32.5 to 0.32.6 by @dependabot[bot] in #3016
- build(deps): bump qsv-stats from 0.38.0 to 0.39.0 by @dependabot[bot] in #3028
- build(deps): bump qsv-stats from 0.39.0 to 0.39.1 by @dependabot[bot] in #3029
- build(deps): bump redis from 0.32.6 to 0.32.7 by @dependabot[bot] in #3025
- build(deps): bump serde from 1.0.219 to 1.0.223 by @dependabot[bot] in #2983
- build(deps): bump serde from 1.0.223 to 1.0.224 by @dependabot[bot] in #2988
- build(deps): bump serde from 1.0.224 to 1.0.225 by @dependabot[bot] in #2994
- build(deps): bump serde from 1.0.225 to 1.0.226 by @dependabot[bot] in #3002
- build(deps): bump serde from 1.0.226 to 1.0.227 by @dependabot[bot] in #3014
- build(deps): bump serde from 1.0.227 to 1.0.228 by @dependabot[bot] in #3019
- build(deps): bump serde_json from 1.0.143 to 1.0.145 by @dependabot[bot] in #2981
- build(deps): bump semver from 1.0.26 to 1.0.27 by @dependabot[bot] in #2982
- build(deps): bump sysinfo from 0.37.0 to 0.37.1 by @dependabot[bot] in #3015
- build(deps): bump sysinfo from 0.37.1 to 0.37.2 by @dependabot[bot] in #3024
- build(deps): bump tempfile from 3.21.0 to 3.22.0 by @dependabot[bot] in #2975
- build(deps): bump tempfile from 3.22.0 to 3.23.0 by @dependabot[bot] in #3007
- build(deps): bump toml from 0.9.6 to 0.9.7 by @dependabot[bot] in #3001
- pin zip to 4.6, as zip 5 has features that are not widely adopted b231a23
- applied select clippy lint suggestions
- updated indirect dependencies
- bumped MSRV to Rust 1.90
Fixed
- describegpt: init cache vars even when --no-cache is used #2970
- describegpt: --base-url option being ignored #2977
- schema: delimiter detection #2998
- extdedup: really use memmapped ondisk hash table #3020
Removed
- removed powerpc64-le cross-compilation directive now that we have access to IBM-provided native PowerPC GH Action runner 9659bfc
- removed macOS on Intel (x86_64-apple-darwin) prebuilt binaries
Full Changelog: 7.1.0...8.0.0
1. SangyaPundir, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons https://commons.wikimedia.org/wiki/File:FAIR_data_principles.jpg
7.1.0
[7.1.0] - 2025-09-06
🇮🇹 csv,conf,v9 edition 🍝
Just in time for csv,conf,v9, we're Bologna-bound and will be talking all things qsv, CSV, open data, metadata standards, AI, POSE and CKAN! For this feature release, we polished describegpt a bit more for the occasion...
Towards the "People's API!"! Verso l'API del Popolo! (Answering People/Policymaker Interface)
🚀 Enhanced describegpt Command
- Configurable Frequency Limits: Make frequency distribution limit configurable for better control over data analysis
- Few-shot Learning: Add --fewshot-examples option to improve LLM response quality with contextual examples
- Advanced SQL Generation: Fine-tuned SQL generation guidance for better date handling and query optimization
- Conditional SQL Results: Implement conditional --sql-results format for more efficient "SQL RAG" processing - i.e. if the generated SQL query executes successfully - the results are saved to the specified file with a .csv extension. If a "SQL hallucination" fails, the file is saved with a .sql extension instead for the user to tweak and edit.
- TogetherAI Support: Add support for TogetherAI models endpoint, expanding LLM provider options
- Enhanced Error Handling: Improved SQL parsing error handling and more informative error messages
- Disk Cache by Default: The disk cache is now enabled by default for better performance
- TOML Configuration: Migrate from JSON to more readable TOML format for more easily modifiable prompt files (see https://github.com/dathere/qsv/blob/master/resources/describegpt_defaults.toml)
- Better Local LLM Support: --api-key can now be set to NONE for local LLM configurations that may not necessarily run on localhost (e.g. a shared Local LLM service running on the local network)
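The conditional results behavior described in the list above can be sketched as follows. This is a hypothetical illustration, not describegpt's code: `save_sql_outcome` and the `run_query` callable are invented names, and replacing the suffix of the given path (rather than appending to it) is an assumption.

```python
import tempfile
from pathlib import Path

def save_sql_outcome(results_path, sql_query, run_query):
    """Save query results as .csv on success, or the failing SQL as .sql.

    run_query is a hypothetical callable that returns CSV text or raises.
    """
    base = Path(results_path)
    try:
        csv_text = run_query(sql_query)
    except Exception:
        # The generated SQL did not run: keep it so the user can edit it.
        target = base.with_suffix(".sql")
        target.write_text(sql_query)
        return target
    target = base.with_suffix(".csv")
    target.write_text(csv_text)
    return target

def failing_runner(query):
    raise ValueError("syntax error")

with tempfile.TemporaryDirectory() as tmp:
    ok = save_sql_outcome(Path(tmp) / "answer", "SELECT 1 AS x", lambda q: "x\n1\n")
    bad = save_sql_outcome(Path(tmp) / "answer2", "SELEC oops", failing_runner)
    ok_name, bad_name = ok.name, bad.name
    bad_text = bad.read_text()

print(ok_name, bad_name)
```

A caller can then branch on the returned file's extension to decide whether to use the results or hand the SQL back for editing.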
partition Command Enhancements
- New --limit Option: Implement --limit option to set the maximum number of open files
- Streaming to Enhanced Batching Logic: Convert from streaming to a simplified, two-pass batched approach designed to partition on columns with high cardinality for very large datasets
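One possible reading of the batched approach with a cap on open files is sketched below. It is an assumption-laden illustration rather than the partition command's actual algorithm: the `partition_rows` helper is invented, rows are held in memory for brevity, and partition keys are assumed to be safe to use as file names.

```python
import csv
import tempfile
from pathlib import Path

def partition_rows(rows, key_index, out_dir, limit):
    """Split rows into one CSV per key, keeping at most `limit` files open."""
    out_dir = Path(out_dir)
    # First pass: collect the distinct partition keys in first-seen order.
    keys = list(dict.fromkeys(row[key_index] for row in rows))
    # Second phase: write the keys in batches no larger than `limit`.
    for start in range(0, len(keys), limit):
        batch = keys[start:start + limit]
        handles = {k: open(out_dir / f"{k}.csv", "w", newline="") for k in batch}
        try:
            writers = {k: csv.writer(h) for k, h in handles.items()}
            for row in rows:
                writer = writers.get(row[key_index])
                if writer is not None:
                    writer.writerow(row)
        finally:
            # Close this batch before opening files for the next one.
            for handle in handles.values():
                handle.close()
    return keys

rows = [["1", "nyc"], ["2", "sf"], ["3", "nyc"], ["4", "la"]]
with tempfile.TemporaryDirectory() as tmp:
    partition_rows(rows, key_index=1, out_dir=tmp, limit=2)
    written = {p.name: p.read_text().splitlines() for p in Path(tmp).iterdir()}

print(written)
```

The trade-off is one scan of the data per batch in exchange for a bounded number of open file handles, which matters when the partition column has many distinct values.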
Added
- describegpt: add configurable frequency limit #2950
- describegpt: migrate prompt file from JSON to easier-to-edit TOML format #2954
- describegpt: refactor default prompt file; add --fewshot-examples option #2955
- describegpt: add TogetherAI support for models endpoint #2965
- partition: add --limit option #2960
- added Windows ARM64 prebuilt binaries
Changed
- describegpt: enable disk cache by default #2951
- describegpt: Polars SQL generation tweaks #2958
- python: replace deprecated with_gil with attach #2949. This sets the stage for "free-threaded" Python 3.14 support when it's released in October 2025. Buh-bye GIL!
- deps: bump embedded Luau from 0.688 to 0.690 #2967
- deps: bump Polars to 0.50.0 at py-1.33.0 tag
- build(deps): bump actions/setup-python from 5.6.0 to 6.0.0 by @dependabot[bot] in #2962
- build(deps): bump actions/stale from 9 to 10 by @dependabot[bot] in #2963
- build(deps): bump log from 0.4.27 to 0.4.28 by @dependabot[bot] in #2961
- build(deps): bump mlua from 0.11.2 to 0.11.3 by @dependabot[bot] in #2948
- build(deps): bump pyo3 from 0.25.1 to 0.26.0 by @dependabot[bot] in #2946
- build(deps): bump uuid from 1.18.0 to 1.18.1 by @dependabot[bot] in #2956
- build(deps): bump zip from 4.5.0 to 4.6.0 by @dependabot[bot] in #2952
- applied select clippy lints
- updated indirect dependencies
Full Changelog: 7.0.1...7.1.0

