encoding/json/jsontext: add errors to Token accessors for numbers #77666

### Proposal Details

This was proposed in https://github.com/go-json-experiment/json/pull/158, which also contains an implementation of this proposal.

## Background

I was using jsontext to implement an Unmarshaler for my type, but I found the current `jsontext.Token` APIs for number parsing confusing and easy to misuse.

See also https://github.com/golang/go/discussions/63397#discussioncomment-11942532

At first glance, it seems very natural to use `Token.Float()` in a custom unmarshaler. However, doing so is error-prone.

Specifically, I'm not satisfied with the following behavior, which cannot be disabled:
```golang
token, _ := jsontext.NewDecoder(bytes.NewReader([]byte("1e500"))).ReadToken()
token.Float() // returns 1.7976931348623157e+308
```
```golang
token, _ := jsontext.NewDecoder(bytes.NewReader([]byte(`"Infinity"`))).ReadToken()
token.Float() // returns +Inf
```
These behaviors are not standard and are not consistent with the json package itself. A new user of the jsontext package may not realize this and may end up implementing an unmarshaler with undesired behavior, e.g. when unmarshaling `{"total":1e1000,"used":1e500}` into
```golang
type Amount struct {
	Total float64 `json:"total"`
	Used  float64 `json:"used"`
}
```
`Total - Used` could silently evaluate to `0` if the unmarshaler uses `Token.Float()`, because both out-of-range values are clamped to the same maximum.
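To make the failure mode concrete, here is a minimal sketch that mimics the clamping with `strconv` alone. `saturatingFloat` is a hypothetical helper written for this illustration; it is not the actual `Token.Float()` implementation, only an approximation of the behavior shown above.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
)

// saturatingFloat is a hypothetical stand-in that approximates the clamping
// behavior described above: out-of-range numbers become ±math.MaxFloat64
// instead of producing an error.
func saturatingFloat(raw string) float64 {
	f, err := strconv.ParseFloat(raw, 64)
	if errors.Is(err, strconv.ErrRange) {
		// strconv reports overflow as ±Inf together with ErrRange.
		if math.IsInf(f, 1) {
			return math.MaxFloat64
		}
		if math.IsInf(f, -1) {
			return -math.MaxFloat64
		}
	}
	return f
}

func main() {
	total := saturatingFloat("1e1000")
	used := saturatingFloat("1e500")
	// Both inputs were clamped to the same value, so the difference is 0.
	fmt.Println(total - used) // 0
}
```

The two inputs differ by hundreds of orders of magnitude, yet no error is reported anywhere in this path.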

So I think `Token.Float()`, etc. are not suitable for custom unmarshalers.

## Proposed API Changes

Separate the APIs for decoding and encoding:
```mermaid
flowchart LR

Decoder -->|ReadToken| RawToken -->|jsontext.Raw| Token -->|WriteToken| Encoder
RawToken -->|ParseInt| int((int64)) -->|jsontext.Int| Token
```

**Added**:
```golang
// RawToken is similar to [Token], and is returned by [Decoder.ReadToken].
//
// Use [Raw] to convert it to [Token] for [Encoder.WriteToken].
type RawToken struct {...}

func (RawToken) ParseFloat(bits int) (float64, error)
func (RawToken) ParseInt(bits int) (int64, error)
func (RawToken) ParseUint(bits int) (uint64, error)

// Raw wraps a [RawToken] as [Token], for passing through the freshly decoded token to [Encoder].
func Raw(RawToken) Token

// Raw returns the [RawToken] embedded.
// It panics if the token is not created with [Raw].
func (Token) Raw() RawToken
```
The `ParseXxx` methods should work like `strconv.ParseXxx`. The only difference is that they only need to support the JSON number format, so the implementation may be more efficient. Compared with the current `Token.Float()`, etc., they will not silently exhibit any non-standard behavior:
- They will not parse `"Infinity"` as the float value Infinity (an error should be returned instead)
- They will not parse `1e500` as `math.MaxFloat64` or `math.MaxInt64` (an error should be returned instead)
- They will not truncate the fractional component when parsing an int (an error should be returned instead)
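The intended semantics can be sketched with `strconv` alone. The helpers below (`parseJSONFloat`, `parseJSONInt`, `isNumberStart`, `errNotNumber`) are hypothetical names for this illustration, not part of jsontext, and they assume the input is the raw text of a single token that the decoder has already validated.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// errNotNumber is a hypothetical error for tokens that are not JSON numbers.
var errNotNumber = errors.New("token is not a JSON number")

// isNumberStart reports whether raw begins like a JSON number ('-' or a digit).
func isNumberStart(raw string) bool {
	return raw != "" && (raw[0] == '-' || ('0' <= raw[0] && raw[0] <= '9'))
}

// parseJSONFloat sketches the proposed ParseFloat behavior.
// Assumption: raw is one token already validated by the decoder.
func parseJSONFloat(raw string, bits int) (float64, error) {
	if !isNumberStart(raw) {
		return 0, errNotNumber // e.g. the JSON string "Infinity"
	}
	// Out-of-range input such as 1e500 yields an error wrapping strconv.ErrRange.
	return strconv.ParseFloat(raw, bits)
}

// parseJSONInt sketches the proposed ParseInt behavior.
func parseJSONInt(raw string, bits int) (int64, error) {
	if !isNumberStart(raw) {
		return 0, errNotNumber
	}
	// Base 10 only: "1.5" and "1e500" are rejected as syntax errors.
	return strconv.ParseInt(raw, 10, bits)
}

func main() {
	for _, raw := range []string{"1.5", "1e500", `"Infinity"`} {
		f, err := parseJSONFloat(raw, 64)
		fmt.Println(raw, f, err)
	}
	_, err := parseJSONInt("1.5", 64)
	fmt.Println(err)
}
```

In every failing case the caller gets a non-nil error and can decide how to handle it, instead of receiving a silently adjusted value.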

**Changed**: `Decoder.ReadToken` should now return `RawToken` instead of `Token`:
```golang
// ReadToken reads the next [RawToken], advancing the read offset.
// The returned token is only valid until the next Peek, Read, or Skip call.
// It returns [io.EOF] if there are no more tokens.
func (d *Decoder) ReadToken() (RawToken, error)
```

**Behavior changes** to existing APIs:

When an Infinity, -Infinity, or NaN float token is passed to `Encoder.WriteToken`, it should now return an error rather than writing the JSON string `"Infinity"`, `"-Infinity"`, or `"NaN"`, none of which is standard.

The following methods are greatly simplified to act only as accessors for the values passed to the corresponding `Token` constructors, much like `reflect.Value.Float()`:

```golang
// Float returns the floating-point value for a JSON number.
// It panics if the token is not created with [Float].
func (Token) Float() float64
// Int returns the signed integer value for a JSON number.
// It panics if the token is not created with [Int].
func (Token) Int() int64
// Uint returns the unsigned integer value for a JSON number.
// It panics if the token is not created with [Uint].
func (Token) Uint() uint64
```
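As a rough illustration of the accessor-only semantics, here is a toy model. The `token` type and its fields are hypothetical and do not reflect the real `jsontext.Token` layout; only the "panic unless created by the matching constructor" rule is taken from the doc comments above.

```go
package main

import "fmt"

// kind records which constructor produced a token (hypothetical).
type kind uint8

const (
	kindFloat kind = iota + 1
	kindInt
)

// token is a toy stand-in for jsontext.Token, not its real representation.
type token struct {
	k kind
	f float64
	i int64
}

// Float and Int mirror the constructor names used in the proposal.
func Float(f float64) token { return token{k: kindFloat, f: f} }
func Int(i int64) token     { return token{k: kindInt, i: i} }

// Float returns the stored value and panics for any other kind of token.
func (t token) Float() float64 {
	if t.k != kindFloat {
		panic("Float called on a token not created with Float")
	}
	return t.f
}

// Int returns the stored value and panics for any other kind of token.
func (t token) Int() int64 {
	if t.k != kindInt {
		panic("Int called on a token not created with Int")
	}
	return t.i
}

func main() {
	fmt.Println(Int(42).Int()) // 42

	defer func() { fmt.Println("recovered:", recover()) }()
	_ = Float(123.456).Int() // panics: no implicit float-to-int conversion
}
```

Under this model the accessors never convert or clamp, so there is no hidden numeric behavior left to document.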

As a result of this change, only the passthrough use case becomes a little more complex: it needs `enc.WriteToken(jsontext.Raw(tok))` rather than `enc.WriteToken(tok)`. I think this use case is very rare.

Other currently supported patterns that are disabled by this proposal, such as `jsontext.Float(123.456).Int()`, should not have valid use cases, IMO.

The `RawToken` returned by the decoder does not have a `Float()` method, so it cannot be misused in this way.

Conversion between the JSON strings `"Infinity"`, `"-Infinity"`, `"NaN"` and Go floats is now exclusively an `encoding/json` feature (controlled by the `nonfinite` format option). The jsontext package will not perform such conversion. Users can easily write their own conversion function.
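A user-side conversion could look roughly like this. The function names `nonFiniteFromString` and `nonFiniteToString` are hypothetical; only the three string spellings come from the text above.

```go
package main

import (
	"fmt"
	"math"
)

// nonFiniteFromString maps the three special JSON strings to Go floats.
// The second result is false for any other string.
func nonFiniteFromString(s string) (float64, bool) {
	switch s {
	case "Infinity":
		return math.Inf(1), true
	case "-Infinity":
		return math.Inf(-1), true
	case "NaN":
		return math.NaN(), true
	}
	return 0, false
}

// nonFiniteToString is the inverse mapping.
// The second result is false for finite values.
func nonFiniteToString(f float64) (string, bool) {
	switch {
	case math.IsInf(f, 1):
		return "Infinity", true
	case math.IsInf(f, -1):
		return "-Infinity", true
	case math.IsNaN(f):
		return "NaN", true
	}
	return "", false
}

func main() {
	f, ok := nonFiniteFromString("-Infinity")
	fmt.Println(f, ok) // -Inf true

	s, ok := nonFiniteToString(math.NaN())
	fmt.Println(s, ok) // NaN true
}
```

An unmarshaler that wants this behavior would call such a helper only when the token is a JSON string, and fall back to number parsing otherwise.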

## Alternatives

### Use strconv

```golang
tok, _ := dec.ReadToken()
f, err := strconv.ParseFloat(tok.String(), 64)
```
It takes non-trivial effort to realize that the JSON number grammar is a subset of the Go float grammar, so this usage is very hard to discover.
It also means we cannot take advantage (in the future) of the fact that JSON floats are simpler to parse than Go floats.

### Add ParseFloat method to Token

`ParseXxx` is only useful for decoding. If it were added to the existing `Token` type, we would be forced to decide what to do with `jsontext.Float(123.456).ParseInt()`.

cc @dsnet 
