NUON: Support raw strings (SNUON) for token-efficient LLM serialization #17124

@andrewgazelka

Basics

  • I have done a basic search through the issue tracker to find similar or related issues.
  • I have made myself familiar with the available features of Nushell for the particular area this enhancement request touches.

Related problem

NUON is great for structured data, but it still escapes strings containing quotes or backslashes. This causes problems for LLM tooling.

Example - reading a TOML file:

# example.toml
name = "my-app"
version = "1.0.0"

When serialized to NUON:

open --raw example.toml | to nuon

Output:

"# example.toml
name = \"my-app\"
version = \"1.0.0\"
"

Why this is problematic for LLMs:

  • Models see tokens, not characters. Depending on the tokenizer, \" might be one token or two
  • Find-replace operations become fragile because token boundaries don't match what humans see as strings
  • When models see escaped strings in context, they learn to output escaped strings - causing a mismatch between input/output formats
  • Escaping makes string operations delicate and error-prone

This also affects human readability - escaped strings are harder to read and write.
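
As a rough illustration of the overhead, the sketch below escapes the example file the way the output above shows it (backslash-escaping double quotes and backslashes, leaving newlines literal) and counts the extra characters. The function name escape_double_quoted is hypothetical, and character count is only a proxy here: real token counts depend on the tokenizer.

```python
def escape_double_quoted(text: str) -> str:
    """Mimic the escaped output shown above: \\ and " get a backslash prefix."""
    # Escape backslashes first so the backslashes added for quotes are not doubled.
    body = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + body + '"'


example = '# example.toml\nname = "my-app"\nversion = "1.0.0"\n'
escaped = escape_double_quoted(example)

# Characters added by escaping alone (ignoring the two surrounding quotes).
extra = len(escaped) - 2 - len(example)
print(escaped)
print(extra)  # 4: one backslash per double quote in the file
```

Four characters is small for this file, but the overhead grows with every quote and backslash, which is common in source code, JSON, and Windows paths.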

Describe the solution you'd like

Support a "SNUON" (Simple NUON) mode that uses Nushell's raw string syntax (r#'...'#, with additional # characters when the content requires them) instead of escaping. When a string contains quotes or backslashes, it is emitted as a raw string. Otherwise, the output is identical to NUON.

The same example would become:

open --raw example.toml | to snuon

Output:

r##'# example.toml
name = "my-app"
version = "1.0.0"
'##

The string is exactly what you see between r##' and '## - no escaping needed.
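
One way to pick the delimiter, sketched in Python rather than in Nushell's own codebase: use one more # than the longest run of # in the text, so the closing delimiter can never appear inside the string. This rule is an assumption, not a confirmed design; it is more conservative than strictly necessary, but it does reproduce the r##'...'## output shown above. The name to_raw_string is hypothetical.

```python
import re


def to_raw_string(text: str) -> str:
    """Wrap text in a Nushell-style raw string literal without escaping it."""
    # Longest run of '#' anywhere in the text (0 if there is none).
    longest_run = max((len(run) for run in re.findall(r"#+", text)), default=0)
    # One more '#' than any run in the body, so a quote followed by that many
    # hashes (the closing delimiter) cannot occur inside the text.
    hashes = "#" * (longest_run + 1)
    return f"r{hashes}'{text}'{hashes}"


example = '# example.toml\nname = "my-app"\nversion = "1.0.0"\n'
print(to_raw_string(example))
```

Because the comment line in the file contains a single #, the sketch chooses two hashes for the delimiter, matching the example output.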

Benefits:

  • Token-efficient: no escape sequences to inflate token count
  • Model bias: the model sees unescaped strings in its input, so it is biased toward generating unescaped (correct) strings in its output
  • Human readable: what you see is what you get
  • Already valid Nushell: raw strings work directly in Nushell today

Could be implemented as:

  • A --raw flag on to nuon
  • A separate to snuon command
  • Automatic detection (use raw strings only when escaping would be needed)
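
The automatic-detection option could be sketched as a simple predicate. This is a minimal illustration, assuming that only double quotes and backslashes force escaping in a double-quoted string; the function name choose_encoding is hypothetical, and the real to nuon command may apply further rules (for example, emitting some strings without any quotes).

```python
def choose_encoding(text: str) -> str:
    """Return "raw" if a double-quoted literal would need escapes, else "quoted"."""
    # Assumption: '"' and '\' are the only characters that trigger escaping.
    needs_escaping = '"' in text or "\\" in text
    return "raw" if needs_escaping else "quoted"


samples = ["hello world", 'name = "my-app"', "C:\\temp", "it's fine"]
for sample in samples:
    print(f"{sample!r}: {choose_encoding(sample)}")
```

With this rule, strings that NUON already serializes without escapes keep their current form, so output only changes where escaping would otherwise appear.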

Describe alternatives you've considered

I considered using other formats such as JSON or YAML, but they have the same escaping problems (or worse).

Additional context and details

Full writeup with more examples: https://imandrew.pages.dev/thoughts/snuon

Related to improving Nushell's MCP experience for LLM tooling. See also #17122 and #17123.