Description
What did you do?
Sent a query of the form `metricname[3d]` to Prometheus.
What did you expect to see?
`metricname` has ~5k labelsets, so by my math I expected a large data response (~2G).
What did you see instead? Under which circumstances?
Instead, I seemingly never get a response from Prometheus. More worryingly, I see a large increase in memory utilization, to the point that Prometheus stops scraping and eventually OOMs. After digging in, I found that the issue is entirely in how Prometheus marshals the response in the HTTP API. The test script below generates 3d worth of data (at a 15s interval) for 5k timeseries. Generating that data takes ~500ms and ~1.2G of RAM; JSON-marshaling it takes ~2m and consumes ~11G of RAM.
Issues
- json.Marshal uses significantly more memory (~11G) than either the original data (1.2G) or the output (2G)
- requests inside json.Marshal aren't cancelable, so once a request like this hits Prometheus it will either run to completion or kill the process
- in addition to the large memory footprint, the marshaling (in the example below) takes ~2m on my desktop -- which is excessive, especially considering that generating the data took ~500ms
Suggestions
Suggestion 1
In an effort to alleviate both problems, I suggest making the JSON marshaling stream the data to the wire. There's no need to make a full copy of it in memory first, especially in the API case where we literally just write to the response body. A terribly hacky example would be something like:
```go
enc := json.NewEncoder(w)
w.Write([]byte{'['})
for i, item := range m {
	if err := enc.Encode(item); err != nil {
		fmt.Println(err)
		return
	}
	if i < len(m)-1 {
		w.Write([]byte{','})
	}
}
w.Write([]byte{']'})
```
In this example we iterate over every entry in the response and marshal each one out. Each samplestream (in this example) still has to be in memory, but once it's written to the wire we no longer need it. In addition, the request becomes "cancelable" at each encode step (if the client disconnects, the Encode call returns a stream-closed error). Of course, the "correct" implementation of this would require a bit of type switching.
Suggestion 2
Change the marshaling of the various structs to be codegen'd. Most of them are partially there already, but some minor improvements would give a ~2x performance boost (mostly from copying less and reflecting less).
For both of these I'd be more than happy to implement it (it's not that bad), but I wanted to get some feedback prior to implementation.
Repro Script
```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"

	"github.com/prometheus/common/model"
)

func generateData() model.Matrix {
	NUM_TIMESERIES := 5000
	NUM_DATAPOINTS := 17280

	// Create the top-level matrix
	m := make(model.Matrix, 0)

	for i := 0; i < NUM_TIMESERIES; i++ {
		lset := map[model.LabelName]model.LabelValue{
			model.MetricNameLabel: model.LabelValue("timeseries_" + strconv.Itoa(i)),
		}

		now := model.Now()

		values := make([]model.SamplePair, NUM_DATAPOINTS)
		for x := NUM_DATAPOINTS; x > 0; x-- {
			values[x-1] = model.SamplePair{
				// Set the time back assuming a 15s interval
				Timestamp: now.Add(time.Second * -15 * time.Duration(x)),
				Value:     model.SampleValue(float64(x)),
			}
		}

		ss := &model.SampleStream{
			Metric: model.Metric(lset),
			Values: values,
		}

		m = append(m, ss)
	}
	return m
}

func main() {
	start := time.Now()
	m := generateData()
	took := time.Since(start)
	fmt.Println("done generating data took:", took)

	start = time.Now()
	json.Marshal(m)
	took = time.Since(start)
	fmt.Println("done marshaling took:", took)
}
```