Comparing changes

Choose two branches to see what’s changed or to start a new pull request.

base repository: pinecone-io/cli (base: v0.1.3)
head repository: pinecone-io/cli (compare: v0.2.0)
  • 6 commits
  • 65 files changed
  • 1 contributor

Commits on Nov 17, 2025

  1. Implement sdk.NewIndexConnection, clean up context.Context passing (#55)
    
    ## Problem
    There are a number of data plane features that need to be implemented in
    the CLI: index upsert and ingestion, query, fetch, list vectors, delete
    vectors, etc.
    
    In order to work with these resources via CLI, we need a consistent way
    of establishing an `IndexConnection` using index name and namespace.
    
    We're also not threading `context.Context` through the cobra command
    tree properly, which is important for properly timing out actions and
    network requests. Currently, we're passing a lot of
    `context.Background()` directly rather than using the `cmd.Context()`
    option for shared context.
    
    ## Solution
    Add `NewIndexConnection` to the `sdk` package to allow establishing a
    connection to an index by `pinecone.Client`, index `name`, and
    `namespace`. This encapsulates the logic for describing the index to
    grab the host, and then initializing an `IndexConnection`.
    
    Update `root.go` to add an explicit root parent `context.Context` to
    `Execute`. Use `signal.NotifyContext` to allow interrupt and termination
    signals to properly cancel commands. Add a global `--timeout` flag to
    allow users to control the overall timeout per command. Set the default
    `timeout=60s` for now.
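
    A minimal sketch of the signal-aware root context pattern described
    above (cobra flag wiring omitted; the 60s timeout mirrors the new
    default, and the simulated work stands in for a command handler):

    ```go
    package main

    import (
        "context"
        "fmt"
        "os"
        "os/signal"
        "syscall"
        "time"
    )

    func main() {
        // Root context cancelled by interrupt/termination signals.
        ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
        defer stop()

        // Per-command timeout layered on top, matching the 60s default.
        ctx, cancel := context.WithTimeout(ctx, 60*time.Second)
        defer cancel()

        // In the CLI, commands receive ctx via cmd.Context(); simulate work here.
        select {
        case <-time.After(50 * time.Millisecond):
            fmt.Println("done")
        case <-ctx.Done():
            fmt.Println("cancelled:", ctx.Err())
        }
    }
    ```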
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    CI - unit & integration tests
    
    Existing operations should continue working as expected. If you want to
    test passing `--timeout` it can be passed to any command using a
    duration format: `10s`, `1h`, `2m`, etc.
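
    The accepted values are standard Go duration strings; a quick
    illustration of what the flag will parse:

    ```go
    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        // time.ParseDuration is the standard parser for formats
        // like "10s", "2m", "1h".
        for _, s := range []string{"10s", "2m", "1h"} {
            d, err := time.ParseDuration(s)
            if err != nil {
                panic(err)
            }
            fmt.Println(s, "=", d.Seconds(), "seconds")
        }
    }
    ```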
    austin-denoble authored Nov 17, 2025 · commit 723dd6c

Commits on Nov 25, 2025

  1. Implement Vector Upsert, Query, Fetch, List, Delete, and Update (#54)

    ## Problem
    The CLI does not currently support dataplane operations when working
    with index resources. A core component of Pinecone is the ability to
    work with data inside of an index: `upsert`, `query`, `fetch`, `delete`,
    `update`, `list vectors`, and `describe index stats`. The CLI should
    offer flexibility in how users are able to pass their data (inline JSON,
    filepath, stdin, etc).
    
    ## Solution
    This PR implements new `vector` operations under the `index` command
    that allow you to work with data inside an index. The changes here
    represent an MVP of the functionality allowing users to interact with
    data inside of Pinecone. I've iterated a bunch on the ergonomics around
    getting data into Pinecone through the CLI, but am very open to
    suggestions around how these things can be improved going forward.
    
    NOTE: The main goal of the work here was to establish a solid base to
    work out from as we build up further features around data operations for
    a specific index. There are some gaps that I still need to work through
    around data presentation when not using JSON output, progress indicators
    for longer uploads, and better handling of output messages via
    stdout/stderr. Again, suggestions and feedback here would be very
    helpful!
    
    ### Index `vector` (dataplane) Commands
    - Add new `vector` sub-command under the `index` command. This command
    also includes aliases for `vectors`, `record` and `records`. This may be
    a bit overzealous for now, but I'm trying to take into account that
    folks may have suggestions around what name/resource to use for these
    commands.
    
    New commands:
    - `pc index describe-stats` - technically dataplane, but added to
    `index` since it returns a general index summary.
    - `pc index vector` (`upsert`, `update`, `delete`, `query`, `fetch`,
    `list`)
    
    Note: These operations should be up to date with `go-pinecone@v5.0.0`,
    so `fetch` and `update` both support new metadata operations, for
    example.
    
    ### New custom Cobra flag types
    
    There are several new flag types which are used with Cobra: `type
    JSONObject map[string]any`, `type Float32List []float32`, `type
    UInt32List []uint32`, and `type StringList []string`. The definitions
    and methods for these live in the `flags` package under utils. Each type
    has `Set()`, `String()`, and `Type()` operations to conform to the
    `pflag.Value` type: https://pkg.go.dev/github.com/spf13/pflag#Value.
    
    These flags handle processing inline, @file, or @- (stdin) input, while
    providing more informative typing in the CLI documentation and manpages.
    The data ingestion flow is driven by the changes mentioned below.
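
    As a rough sketch of the shape these types take (this `Float32List` is
    illustrative and only handles inline JSON; the real implementations
    also route file and stdin input through the ingestion flow):

    ```go
    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Float32List satisfies pflag.Value via Set, String, and Type.
    type Float32List []float32

    // Set decodes an inline JSON array; the CLI versions also accept
    // file and stdin sources here.
    func (f *Float32List) Set(s string) error {
        var vals []float32
        if err := json.Unmarshal([]byte(s), &vals); err != nil {
            return fmt.Errorf("invalid float list %q: %w", s, err)
        }
        *f = vals
        return nil
    }

    func (f *Float32List) String() string {
        b, _ := json.Marshal(*f)
        return string(b)
    }

    // Type is what shows up in help text and manpages.
    func (f *Float32List) Type() string { return "float32List" }

    func main() {
        var v Float32List
        if err := v.Set("[0.1,0.2,0.3]"); err != nil {
            panic(err)
        }
        fmt.Println(v.String(), v.Type())
    }
    ```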
    
    ### JSON Ingestion (inline, file, and stdin)
    
    Because the size of payloads when working with vectors may be large,
    even for individual flags like `--vector` where the array itself may
    have a dimension of 1000, we needed a way to handle larger payloads in
    different places seamlessly. `stdin` as an input for operations is also
    important ergonomically when working with a CLI, so I've tried to come
    up with something fairly robust, yet easy-to-use (I hope).
    
    #### New `inputpolicy` / `stdin` packages
    
    `inputpolicy` defines a `DefaultMaxJSONBytes` variable which sets an
    overall bound on the size of payloads we ingest. I've defaulted this to
    1 GiB for now, but the user can override this via the
    `PC_CLI_MAX_JSON_BYTES` env var. We can also change the default if
    that's desired. There's also `func ValidatePath()` which does some basic
    validation on paths provided to the CLI.
    
    `stdin` uses an `atomic.Bool` exposed via functions that prevent stdin
    from being used multiple times in the same process.
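
    A minimal sketch of that guard, assuming a hypothetical `claimStdin`
    helper (the real `stdin` package exposes similar functions):

    ```go
    package main

    import (
        "errors"
        "fmt"
        "sync/atomic"
    )

    // stdinUsed is flipped the first time stdin is claimed, so a second
    // flag in the same process cannot read it again.
    var stdinUsed atomic.Bool

    func claimStdin() error {
        if !stdinUsed.CompareAndSwap(false, true) {
            return errors.New("stdin already consumed by another flag")
        }
        return nil
    }

    func main() {
        fmt.Println(claimStdin()) // first claim succeeds: <nil>
        fmt.Println(claimStdin()) // second claim is rejected
    }
    ```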
    
    #### New `argsio` package
    
    `argsio` wraps `inputpolicy` and `stdin` and exposes functions for
    working with io. `OpenReader()` checks an incoming value for inline,
    @file, or @- (stdin) input and returns an `io.ReadCloser` bounded by
    the inputpolicy for reading the data. `ReadAll()` and `DecodeJSONArg()`
    allow returning source bytes (used by `upsert` to support JSON or
    JSONL) and decoding a value directly into a type.
    
    These functions are used across the new commands and the new flag types
    to process incoming data.
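
    The dispatch can be sketched roughly like this (hypothetical
    `openReader` for illustration; the real `OpenReader()` also wraps the
    reader in the inputpolicy size bound):

    ```go
    package main

    import (
        "fmt"
        "io"
        "os"
        "strings"
    )

    // openReader routes "@-" to stdin, "@path" to a file, and anything
    // else to an inline reader over the raw value.
    func openReader(arg string) (io.ReadCloser, error) {
        switch {
        case arg == "@-":
            return io.NopCloser(os.Stdin), nil
        case strings.HasPrefix(arg, "@"):
            return os.Open(arg[1:])
        default:
            return io.NopCloser(strings.NewReader(arg)), nil
        }
    }

    func main() {
        // Inline input: the value itself is the payload.
        r, _ := openReader(`{"id":"vector-1"}`)
        defer r.Close()
        b, _ := io.ReadAll(r)
        fmt.Println(string(b))
    }
    ```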
    
    ### Notes
    - I went with the prefixes for file (`@file.json`) and stdin (`@-`)
    input after playing around with some other options. I'm definitely open
    to going with something else, but the flexibility per-flag felt nice.
    - Currently, only `upsert --body` supports JSONL. If this makes sense to
    expand to other places later we can.
    - I've cleaned up our usage of `context.Context` across the command
    tree. Previously, we weren't using the top-level Cobra `cmd.Context()`
    across all commands and surfaces.
    - The `--body` implementation and approach for these commands feels like
    nice devex, and I'd like to expand that same functionality to other
    commands like `index create` when possible. The most important thing to
    call out here would be that if a value is passed in both a flag and the
    body, the flag value wins.
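
    The precedence rule can be illustrated with a small merge sketch
    (hypothetical `mergeParams`; the actual wiring lives in the command
    handlers):

    ```go
    package main

    import "fmt"

    // mergeParams: values from --body are taken first, then any explicit
    // flag values override them.
    func mergeParams(body, flags map[string]any) map[string]any {
        out := make(map[string]any, len(body)+len(flags))
        for k, v := range body {
            out[k] = v
        }
        for k, v := range flags { // flags win over body values
            out[k] = v
        }
        return out
    }

    func main() {
        body := map[string]any{"namespace": "from-body", "topK": 10}
        flags := map[string]any{"namespace": "from-flag"}
        merged := mergeParams(body, flags)
        fmt.Println(merged["namespace"], merged["topK"])
    }
    ```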
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    I need to add integration tests covering data plane operations in a
    future PR. I need to do a small overhaul to the current integration test
    harness to support organizing things by suite with proper setup,
    teardown, and sharing of external resources.
    
    For now I've done a bunch of manual testing. If you want to test things
    yourself, here are some example operations and small JSON/JSONL
    payloads to get you started:
    
    ```bash
    # create index
    pc index create --name test-index --dimension 5 --region us-east-1 --cloud aws
    
    # upsert to index from JSON / JSONL - test both file input & stdin
    pc index vector upsert --index-name test-index --namespace test-ns --body @./vectors_dim5x200.json
    cat ./vectors_dim5x200.json | pc index vector upsert --index-name test-index --namespace test-ns --body @-
    
    # describe stats
    pc index describe-stats
    
    pc index vector upsert --index-name test-index --namespace test-ns --body @./sparse_vectors_x200.json
    cat ./sparse_vectors_x200.json | pc index vector upsert --index-name test-index --namespace test-ns --body @-
    
    # list vectors in an index
    pc index vector list --index-name test-index --namespace test-ns
    
    # query vectors from an index by inline --vector, file, and stdin
    pc index vector query --index-name test-index --namespace test-ns --vector '[0.23,0.34,0.56,0.67,0.80]' --top-k 10
    pc index vector query \
      --index-name test-index \
      --namespace test-ns \
      --vector "$(jq '.vectors[0].values' ./vectors_dim5x200.json)" \
      --top-k 10
    cat ./vector.json | pc index vector query --index-name test-index --namespace test-ns --vector @- --top-k 10
    
    # query with a filter
    pc index vector query --index-name test-index --namespace test-ns --vector @./vector.json --top-k 10 --filter '{"genre":{"$eq":"sci-fi"}}'
    
    # fetch vectors from an index by ID and metadata filter
    pc index vector fetch --index-name test-index --namespace test-ns --ids '["123","456"]'
    pc index vector fetch --index-name test-index --namespace test-ns --filter '{"genre":{"$eq":"drama"}}' --pagination-token ajx_123879729374
    
    # update vector by ID and update vectors by metadata filter
    pc index vector update --index-name test-index --namespace test-ns --id vector-1 --metadata '{"genre":"sci-fi"}'
    pc index vector update --index-name test-index --namespace test-ns --filter '{"genre":{"$eq":"sci-fi"}}' --metadata '{"genre":"fantasy"}'
    ```
    
    
    
    [vectors_dim_5x200.json](https://github.com/user-attachments/files/23753641/vectors_dim_5x200.json)
    
    [sparse_vectors_x200.json](https://github.com/user-attachments/files/23753647/sparse_vectors_x200.json)
    
    JSONL pasted below because GitHub doesn't support attaching that filetype:
    `vectors`
    ```json
    {"id":"vector-1","values":[0.31834602,0.54113007,0.99697185,0.49015462,0.6359935],"metadata":{"country":"USA","genre":"drama","language":"Korean","title":"Echo Genesis","year":2023}}
    {"id":"vector-2","values":[0.28192687,0.99743,0.42680392,0.67175174,0.012253131],"metadata":{"country":"France","genre":"sci-fi","language":"English","title":"Apex Chronicles","year":2025}}
    {"id":"vector-3","values":[0.28798103,0.55216235,0.98476386,0.20921956,0.7093976],"metadata":{"country":"India","genre":"action","language":"Hindi","title":"Matrix Ascension","year":2019}}
    {"id":"vector-4","values":[0.67374235,0.30950361,0.14402653,0.542046,0.26878613],"metadata":{"country":"India","genre":"sci-fi","language":"Portuguese","title":"Prism Eclipse","year":2022}}
    ```
    
    `sparse vectors`
    ```json
    {"id":"vector-1","sparse_values":{"indices":[4291719715,304304608,3085284271,2487496436,878024967],"values":[0.5754068,0.4353752,0.72693384,0.6066574,0.22014463]},"metadata":{"country":"India","genre":"documentary","language":"German","title":"Pulse Genesis","year":2025}}
    {"id":"vector-2","sparse_values":{"indices":[810272037,2834973828,3337122720,4265751223,564456948],"values":[0.39986563,0.6304805,0.6471212,0.6413373,0.70074713]},"metadata":{"country":"South Korea","genre":"action","language":"Hindi","title":"Quantum Odyssey","year":2023}}
    {"id":"vector-3","sparse_values":{"indices":[1197244266,3189103624,2949683914,2981049090,1487074986],"values":[0.97382957,0.7537474,0.39413723,0.9917503,0.90063083]},"metadata":{"country":"USA","genre":"fantasy","language":"Portuguese","title":"Aether Awakening","year":2025}}
    {"id":"vector-4","sparse_values":{"indices":[2459373064,358526014,2222015023,1719227962,1499432083],"values":[0.7923365,0.42066705,0.14323704,0.2608238,0.9800974]},"metadata":{"country":"Japan","genre":"horror","language":"French","title":"Echo Awakening","year":2024}}
    ```
    
    ---
    - To see the specific tasks where the Asana app for GitHub is being
    used, see below:
      - https://app.asana.com/0/0/1210697636736731
      - https://app.asana.com/0/0/1212070545711467
      - https://app.asana.com/0/0/1212070545711470
      - https://app.asana.com/0/0/1212070545711472
    austin-denoble authored Nov 25, 2025 · commit 845bd1c

Commits on Nov 26, 2025

  1. Refactor ingestion for file/stdin (#56)

    ## Problem
    I got some feedback on the previous iteration of the index [data
    ingestion and management work](#54). For JSON flag inputs, instead of
    using the "@" prefix (`@file.json(l)` / `@-` for stdin), it may be
    easier from a consumption perspective to drop the "@" entirely.
    
    ## Solution
    Swap away from '@' prefixes for file and stdin inputs on JSON flags:
    check file suffixes instead (explicitly `.json` and `.jsonl` for now),
    read from stdin on `-`, and update the unit tests.
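
    The revised dispatch reduces to roughly this classification
    (hypothetical `classify` helper for illustration):

    ```go
    package main

    import (
        "fmt"
        "strings"
    )

    // classify: "-" means stdin, a .json/.jsonl suffix means file input,
    // and anything else is treated as inline JSON.
    func classify(arg string) string {
        switch {
        case arg == "-":
            return "stdin"
        case strings.HasSuffix(arg, ".json"), strings.HasSuffix(arg, ".jsonl"):
            return "file"
        default:
            return "inline"
        }
    }

    func main() {
        fmt.Println(classify("-"), classify("vectors.jsonl"), classify(`[0.1,0.2]`))
    }
    ```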
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    Same test flows as the previous PR (#54), but test stdin with `-` and
    files without any prefix. Other functionality should remain the same.
    austin-denoble authored Nov 26, 2025 · commit 8b2d473
  2. Rename describe-stats -> stats (#57)

    ## Problem
    `pc index describe-stats` is clunky as a command; rename it to
    `pc index stats`.
    
    ## Solution
    Rename the command and update documentation examples.
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    CI - unit + integration tests
    austin-denoble authored Nov 26, 2025 · commit 65859ab
  3. Finalize README for new vector operations (#58)

    ## Problem
    N/A - see title
    
    ## Solution
    Update README.md
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [ ] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [X] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    N/A
    austin-denoble authored Nov 26, 2025 · commit 63b72c1
  4. Clean up presenters pointer handling (#59)

    ## Problem
    While testing, I ran into instances where an empty response, such as
    from `query`, could result in a nil reference error. Many of our
    presenters for table and other non-JSON output don't do enough
    nil-checking on the inbound pointers.
    
    ## Solution
    Clean up the presenters package and individual presentational functions
    to better handle nils, add a small `PrintEmptyState` function with a
    test, and cover cases where we were accessing pointers without explicit
    checks.
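
    The defensive pattern looks roughly like this (the types and
    `printEmptyState` here are illustrative stand-ins, not the actual
    presenters code):

    ```go
    package main

    import "fmt"

    type usage struct{ ReadUnits uint32 }

    // queryResponse mimics an SDK response with pointer fields that may be nil.
    type queryResponse struct {
        Matches []string
        Usage   *usage
    }

    // printEmptyState stands in for the shared empty-state helper.
    func printEmptyState(label string) { fmt.Printf("No %s found.\n", label) }

    func present(resp *queryResponse) {
        // Guard the response and its contents before rendering.
        if resp == nil || len(resp.Matches) == 0 {
            printEmptyState("matches")
            return
        }
        for _, m := range resp.Matches {
            fmt.Println(m)
        }
        if resp.Usage != nil { // guard optional pointers before dereferencing
            fmt.Println("read units:", resp.Usage.ReadUnits)
        }
    }

    func main() {
        present(nil)
        present(&queryResponse{Matches: []string{"vector-1"}})
    }
    ```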
    
    ## Type of Change
    
    - [X] Bug fix (non-breaking change which fixes an issue)
    - [ ] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    CI - unit & integration tests
    austin-denoble authored Nov 26, 2025 · commit 132b30f