Skip to content

fix(models): handle binary file downloads in tool execution#190

Merged
jerann merged 3 commits into
mainfrom
fix/execute-binary-download-handling
Jun 3, 2026
Merged

fix(models): handle binary file downloads in tool execution#190
jerann merged 3 commits into
mainfrom
fix/execute-binary-download-handling

Conversation

@jerann

@jerann jerann commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Summary

  • File-download actions (e.g. googledrive_unified_download_file) serve raw binary with the file's own MIME type and a Content-Disposition header, so StackOneTool.execute() calling response.json() unconditionally raised UnicodeDecodeError on the first non-UTF-8 byte.
  • execute() now branches on Content-Type: JSON media types (application/json and +json suffixes) are parsed as before; any other type returns the file as {content: bytes, content_type, status_code, headers, file_name}, matching the StackOne generated SDKs' download response shape. file_name is parsed from Content-Disposition (including RFC 5987 filename*).
  • Existing JSON responses are unchanged; the fix applies to the RPC tool path transitively.

Note: content holds raw bytes and is not JSON-serializable — callers that re-serialize tool results (e.g. for an LLM) should handle that key. The sibling stackone-ai-node SDK has the same response.json() gap and warrants a matching fix.

🤖 Generated with Claude Code


Summary by cubic

Fixes StackOneTool.execute() failing on binary downloads by parsing JSON only when the Content-Type is JSON and returning bytes + metadata for other types. Improves filename parsing to honor RFC 5987 charsets. Adds README docs explaining the file-download return shape and the raw-bytes caveat.

  • Bug Fixes
    • Treat JSON only as application/json or +json types; otherwise return {content: bytes, content_type, status_code, headers, file_name}.
    • Parse file_name from Content-Disposition, including RFC 5987 filename*; honor declared charsets (UTF-8/ISO-8859-1), fall back to UTF-8 on unknown labels, and strip quotes on non-conformant values.
    • Applies to RPC actions (e.g. googledrive_unified_download_file) as well.
    • Added tests for binary bodies, missing Content-Type, JSON with charset, ISO-8859-1 and unknown filename* charsets, and the helper parsers.

Written for commit 10d7802. Summary will update on new commits.

Review in cubic

Download actions (e.g. googledrive_unified_download_file) serve raw binary
with the file's own MIME type and a Content-Disposition header, so
StackOneTool.execute() calling response.json() unconditionally raised
UnicodeDecodeError on the first non-UTF-8 byte.

Branch on Content-Type: parse JSON only for JSON media types (application/json
and +json suffixes); otherwise return the file as
{content: bytes, content_type, status_code, headers, file_name}, matching the
StackOne generated SDKs' download response shape. file_name is parsed from
Content-Disposition (incl. RFC 5987 filename*). Existing JSON behavior is
unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 3, 2026 07:27

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates StackOneTool.execute() to correctly handle file-download endpoints that return raw binary bodies by only JSON-parsing responses when the Content-Type indicates a JSON media type, and otherwise returning the raw bytes plus relevant metadata (including file_name parsed from Content-Disposition).

Changes:

  • Added Content-Type detection to parse JSON only for application/json / +json responses; otherwise return {content: bytes, content_type, status_code, headers, file_name}.
  • Added helper utilities for JSON media type detection and Content-Disposition filename extraction (including filename* support).
  • Added tests covering binary bodies, missing Content-Type, JSON with charset parameters, and the helper parsers.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
tests/test_tool_calling.py Adds regression and helper unit tests for binary download handling and header parsing.
stackone_ai/models.py Implements Content-Type-based branching in execute() and adds helper functions for JSON detection and filename extraction.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread stackone_ai/models.py Outdated
Comment on lines +82 to +84
extended = re.search(r"filename\*\s*=\s*[^']*'[^']*'([^;]+)", value, re.IGNORECASE)
if extended:
return unquote(extended.group(1).strip())

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 10d7802: declared charset is now honoured via unquote(..., encoding=charset) with a UTF-8 fallback on unknown charsets, and surrounding quotes are stripped off non-conformant quoted filename* values (.strip('"')).

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 2 files

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread stackone_ai/models.py Outdated
Add a "File Downloads" section covering the bytes-plus-metadata dict that
execute()/call() return for non-JSON responses, the Content-Type detection
rule, and the not-JSON-serializable caveat for LLM-facing re-serialization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 issues found across 1 file (changes from recent commits).

Requires human review: Auto-approval blocked by 1 unresolved issue from previous reviews.

Re-trigger cubic

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment thread stackone_ai/models.py Outdated
Comment on lines +82 to +85
extended = re.search(r"filename\*\s*=\s*[^']*'[^']*'([^;]+)", value, re.IGNORECASE)
if extended:
return unquote(extended.group(1).strip())
quoted = re.search(r'filename\s*=\s*"([^"]*)"', value, re.IGNORECASE)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 10d7802 — charset is parsed from the filename* value and passed to unquote(..., encoding=...), falling back to UTF-8 on an unknown charset label.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This s great catch by copilot but only in Python for parity we should optionally change this to the node too?

Comment on lines +505 to +507
# RFC 5987 extended form is percent-decoded and takes precedence.
("attachment; filename=\"fallback.txt\"; filename*=UTF-8''na%C3%AFve.txt", "naïve.txt"),
("attachment", None),

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 10d7802: ISO-8859-1 (non-UTF-8) charset, an unknown-charset fallback, and a quoted filename* case in test_filename_from_content_disposition.

The Content-Disposition filename* branch captured the charset in its
regex but discarded it, so unquote() always decoded with UTF-8. RFC 5987
also permits ISO-8859-1, so an ISO-8859-1 percent-encoded filename
decoded to mojibake despite the helper claiming RFC 5987 support.

- Decode the extended value with its declared charset; fall back to
  UTF-8 for an unknown or empty charset label (LookupError) instead of
  raising.
- Strip surrounding quotes off non-conformant quoted filename* values.
- Add tests for ISO-8859-1, an unknown charset, and a quoted filename*.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 issues found across 2 files (changes from recent commits).

Requires human review: This change modifies core execution logic to handle binary responses differently, altering the return shape for non-JSON tool results, which could impact existing callers that rely on JSON-deserialized responses.

Re-trigger cubic

@willleeney willleeney left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jerann jerann merged commit 8983c68 into main Jun 3, 2026
16 checks passed
@jerann jerann deleted the fix/execute-binary-download-handling branch June 3, 2026 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants