Comparing changes

Choose two branches to see what’s changed or to start a new pull request.

base repository: pinecone-io/cli (base: v0.1.3)
head repository: pinecone-io/cli (compare: v0.2.0)
  • 6 commits
  • 65 files changed
  • 1 contributor

Commits on Nov 17, 2025

  1. Implement sdk.NewIndexConnection, clean up context.Context passing (#55)
    
    ## Problem
    There are a number of data plane features that need to be implemented in
    the CLI: index upsert and ingestion, query, fetch, list vectors, delete
    vectors, etc.
    
    In order to work with these resources via CLI, we need a consistent way
    of establishing an `IndexConnection` using index name and namespace.
    
    We're also not threading `context.Context` through the cobra command
    tree properly, which is important for properly timing out actions and
    network requests. Currently, we're passing a lot of
    `context.Background()` directly rather than using the `cmd.Context()`
    option for shared context.
    
    ## Solution
    Add `NewIndexConnection` to the `sdk` package to allow establishing a
    connection to an index by `pinecone.Client`, index `name`, and
    `namespace`. This encapsulates the logic for describing the index to
    grab the host, and then initializing an `IndexConnection`.
    
    Update `root.go` to add an explicit root parent `context.Context` to
    `Execute`. Use `signal.NotifyContext` to allow interrupt and termination
    signals to properly cancel commands. Add a global `--timeout` flag to
    allow users to control the overall timeout per command. Set the default
    `timeout=60s` for now.
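
    A minimal sketch of the signal-aware root context pattern described
    above (cobra flag wiring omitted; the 60s timeout mirrors the new
    default, and the simulated work stands in for a command handler):

    ```go
    package main

    import (
        "context"
        "fmt"
        "os"
        "os/signal"
        "syscall"
        "time"
    )

    func main() {
        // Root context cancelled by interrupt/termination signals.
        ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
        defer stop()

        // Per-command timeout layered on top, matching the 60s default.
        ctx, cancel := context.WithTimeout(ctx, 60*time.Second)
        defer cancel()

        // In the CLI, commands receive ctx via cmd.Context(); simulate work here.
        select {
        case <-time.After(50 * time.Millisecond):
            fmt.Println("done")
        case <-ctx.Done():
            fmt.Println("cancelled:", ctx.Err())
        }
    }
    ```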
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    CI - unit & integration tests
    
    Existing operations should continue working as expected. If you want to
    test passing `--timeout` it can be passed to any command using a
    duration format: `10s`, `1h`, `2m`, etc.
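
    The accepted values are standard Go duration strings; a quick
    illustration of what the flag will parse:

    ```go
    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        // time.ParseDuration is the standard parser for formats
        // like "10s", "2m", "1h".
        for _, s := range []string{"10s", "2m", "1h"} {
            d, err := time.ParseDuration(s)
            if err != nil {
                panic(err)
            }
            fmt.Println(s, "=", d.Seconds(), "seconds")
        }
    }
    ```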
    austin-denoble authored Nov 17, 2025 · commit 723dd6c

Commits on Nov 25, 2025

  1. Implement Vector Upsert, Query, Fetch, List, Delete, and Update (#54)

    ## Problem
    The CLI does not currently support dataplane operations when working
    with index resources. A core component of Pinecone is the ability to
    work with data inside of an index: `upsert`, `query`, `fetch`, `delete`,
    `update`, `list vectors`, and `describe index stats`. The CLI should
    offer flexibility in how users are able to pass their data (inline JSON,
    filepath, stdin, etc).
    
    ## Solution
    This PR implements new `vector` operations under the `index` command
    that allow you to work with data inside an index. The changes here
    represent an MVP of the functionality allowing users to interact with
    data inside of Pinecone. I've iterated a bunch on the ergonomics around
    getting data into Pinecone through the CLI, but am very open to
    suggestions around how these things can be improved going forward.
    
    NOTE: The main goal of the work here was to establish a solid base to
    work out from as we build up further features around data operations for
    a specific index. There are some gaps that I still need to work through
    around data presentation when not using JSON output, progress indicators
    for longer uploads, and better handling of output messages via
    stdout/stderr. Again, suggestions and feedback here would be very
    helpful!
    
    ### Index `vector` (dataplane) Commands
    - Add new `vector` sub-command under the `index` command. This command
    also includes aliases for `vectors`, `record` and `records`. This may be
    a bit overzealous for now, but I'm trying to take into account that
    folks may have suggestions around what name/resource to use for these
    commands.
    
    New commands:
    - `pc index describe-stats` - technically dataplane, but added to
    `index` since it returns a general index summary.
    - `pc index vector` (`upsert`, `update`, `delete`, `query`, `fetch`,
    `list`)
    
    Note: These operations should be up to date with `go-pinecone@v5.0.0`,
    so `fetch` and `update` both support new metadata operations, for
    example.
    
    ### New custom Cobra flag types
    
    There are several new flag types which are used with Cobra: `type
    JSONObject map[string]any`, `type Float32List []float32`, `type
    UInt32List []uint32`, and `type StringList []string`. The definitions
    and methods for these live in the `flags` package under utils. Each type
    has `Set()`, `String()`, and `Type()` operations to conform to the
    `pflag.Value` type: https://pkg.go.dev/github.com/spf13/pflag#Value.
    
    These flags handle processing inline, @file, or @- (stdin) input, while
    providing more informative typing in the CLI documentation and manpages.
    The data ingestion flow is driven by the changes mentioned below.
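
    As a rough sketch of the shape these types take (this `Float32List` is
    illustrative and only handles inline JSON; the real implementations
    also route file and stdin input through the ingestion flow):

    ```go
    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Float32List satisfies pflag.Value via Set, String, and Type.
    type Float32List []float32

    // Set decodes an inline JSON array; the CLI versions also accept
    // file and stdin sources here.
    func (f *Float32List) Set(s string) error {
        var vals []float32
        if err := json.Unmarshal([]byte(s), &vals); err != nil {
            return fmt.Errorf("invalid float list %q: %w", s, err)
        }
        *f = vals
        return nil
    }

    func (f *Float32List) String() string {
        b, _ := json.Marshal(*f)
        return string(b)
    }

    // Type is what shows up in help text and manpages.
    func (f *Float32List) Type() string { return "float32List" }

    func main() {
        var v Float32List
        if err := v.Set("[0.1,0.2,0.3]"); err != nil {
            panic(err)
        }
        fmt.Println(v.String(), v.Type())
    }
    ```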
    
    ### JSON Ingestion (inline, file, and stdin)
    
    Because the size of payloads when working with vectors may be large,
    even for individual flags like `--vector` where the array itself may
    have a dimension of 1000, we needed a way to handle larger payloads in
    different places seamlessly. `stdin` as an input for operations is also
    important ergonomically when working with a CLI, so I've tried to come
    up with something fairly robust, yet easy-to-use (I hope).
    
    #### New `inputpolicy` / `stdin` packages
    
    `inputpolicy` defines a `DefaultMaxJSONBytes` variable which sets an
    overall bound on the size of payloads we ingest. I've defaulted this to
    1 GiB for now, but the user can override this via the
    `PC_CLI_MAX_JSON_BYTES` env var. We can also change the default if
    that's desired. There's also `func ValidatePath()` which does some basic
    validation on paths provided to the CLI.
    
    `stdin` uses an `atomic.Bool` exposed via functions that prevent stdin
    from being used multiple times in the same process.
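
    A minimal sketch of that guard, assuming a hypothetical `claimStdin`
    helper (the real `stdin` package exposes similar functions):

    ```go
    package main

    import (
        "errors"
        "fmt"
        "sync/atomic"
    )

    // stdinUsed is flipped the first time stdin is claimed, so a second
    // flag in the same process cannot read it again.
    var stdinUsed atomic.Bool

    func claimStdin() error {
        if !stdinUsed.CompareAndSwap(false, true) {
            return errors.New("stdin already consumed by another flag")
        }
        return nil
    }

    func main() {
        fmt.Println(claimStdin()) // first claim succeeds: <nil>
        fmt.Println(claimStdin()) // second claim is rejected
    }
    ```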
    
    #### New `argsio` package
    
    `argsio` wraps `inputpolicy` and `stdin` and exposes functions for
    working with io. `OpenReader()` checks an incoming value for inline,
    @file, or @- (stdin) input and returns an `io.ReadCloser` bounded by
    the inputpolicy for reading the data. `ReadAll()` and `DecodeJSONArg()`
    allow returning source bytes (used by `upsert` to support JSON or
    JSONL) and decoding a value directly into a type.
    
    These functions are used across the new commands and the new flag types
    to process incoming data.
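
    The dispatch can be sketched roughly like this (hypothetical
    `openReader` for illustration; the real `OpenReader()` also wraps the
    reader in the inputpolicy size bound):

    ```go
    package main

    import (
        "fmt"
        "io"
        "os"
        "strings"
    )

    // openReader routes "@-" to stdin, "@path" to a file, and anything
    // else to an inline reader over the raw value.
    func openReader(arg string) (io.ReadCloser, error) {
        switch {
        case arg == "@-":
            return io.NopCloser(os.Stdin), nil
        case strings.HasPrefix(arg, "@"):
            return os.Open(arg[1:])
        default:
            return io.NopCloser(strings.NewReader(arg)), nil
        }
    }

    func main() {
        // Inline input: the value itself is the payload.
        r, _ := openReader(`{"id":"vector-1"}`)
        defer r.Close()
        b, _ := io.ReadAll(r)
        fmt.Println(string(b))
    }
    ```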
    
    ### Notes
    - I went with the prefixes for file (`@file.json`) and stdin (`@-`)
    input after playing around with some other options. I'm definitely open
    to going with something else, but the flexibility per-flag felt nice.
    - Currently, only `upsert --body` supports JSONL. If this makes sense to
    expand to other places later we can.
    - I've cleaned up our usage of `context.Context` across the command
    tree. Previously, we weren't using the top-level Cobra `cmd.Context()`
    across all commands and surfaces.
    - The `--body` implementation and approach for these commands feels like
    nice devex, and I'd like to expand that same functionality to other
    commands like `index create` when possible. The most important thing to
    call out here would be that if a value is passed in both a flag and the
    body, the flag value wins.
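
    The precedence rule can be illustrated with a small merge sketch
    (hypothetical `mergeParams`; the actual wiring lives in the command
    handlers):

    ```go
    package main

    import "fmt"

    // mergeParams: values from --body are taken first, then any explicit
    // flag values override them.
    func mergeParams(body, flags map[string]any) map[string]any {
        out := make(map[string]any, len(body)+len(flags))
        for k, v := range body {
            out[k] = v
        }
        for k, v := range flags { // flags win over body values
            out[k] = v
        }
        return out
    }

    func main() {
        body := map[string]any{"namespace": "from-body", "topK": 10}
        flags := map[string]any{"namespace": "from-flag"}
        merged := mergeParams(body, flags)
        fmt.Println(merged["namespace"], merged["topK"])
    }
    ```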
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    I need to add integration tests covering data plane operations in a
    future PR. I need to do a small overhaul to the current integration test
    harness to support organizing things by suite with proper setup,
    teardown, and sharing of external resources.
    
    For now I've done a bunch of manual testing. If you want to test things
    yourself, here are some example operations and small JSON/JSONL
    payloads to get you started:
    
    ```bash
    # create index
    pc index create --name test-index --dimension 5 --region us-east-1 --cloud aws
    
    # upsert to index from JSON / JSONL - test both file input & stdin
    pc index vector upsert --index-name test-index --namespace test-ns --body @./vectors_dim5x200.json
    cat ./vectors_dim5x200.json | pc index vector upsert --index-name test-index --namespace test-ns --body @-
    
    # describe stats
    pc index describe-stats
    
    pc index vector upsert --index-name test-index --namespace test-ns --body @./sparse_vectors_x200.json
    cat ./sparse_vectors_x200.json | pc index vector upsert --index-name test-index --namespace test-ns --body @-
    
    # list vectors in an index
    pc index vector list --index-name test-index --namespace test-ns
    
    # query vectors from an index by inline --vector, file, and stdin
    pc index vector query --index-name test-index --namespace test-ns --vector '[0.23,0.34,0.56,0.67,0.80]' --top-k 10
    pc index vector query \
      --index-name test-index \
      --namespace test-ns \
      --vector "$(jq '.vectors[0].values' ./vectors_dim5x200.json)" \
      --top-k 10
    cat ./vector.json | pc index vector query --index-name test-index --namespace test-ns --vector @- --top-k 10
    
    # query with a filter
    pc index vector query --index-name test-index --namespace test-ns --vector @./vector.json --top-k 10 --filter '{"genre":{"$eq":"sci-fi"}}'
    
    # fetch vectors from an index by ID and metadata filter
    pc index vector fetch --index-name test-index --namespace test-ns --ids '["123","456"]'
    pc index vector fetch --index-name test-index --namespace test-ns --filter '{"genre":{"$eq":"drama"}}' --pagination-token ajx_123879729374
    
    # update vector by ID and update vectors by metadata filter
    pc index vector update --index-name test-index --namespace test-ns --id vector-1 --metadata '{"genre":"sci-fi"}'
    pc index vector update --index-name test-index --namespace test-ns --filter '{"genre":{"$eq":"sci-fi"}}' --metadata '{"genre":"fantasy"}'
    ```
    
    
    
    [vectors_dim_5x200.json](https://github.com/user-attachments/files/23753641/vectors_dim_5x200.json)
    
    [sparse_vectors_x200.json](https://github.com/user-attachments/files/23753647/sparse_vectors_x200.json)
    
    JSONL pasted below because GitHub doesn't support attaching that filetype:
    `vectors`
    ```json
    {"id":"vector-1","values":[0.31834602,0.54113007,0.99697185,0.49015462,0.6359935],"metadata":{"country":"USA","genre":"drama","language":"Korean","title":"Echo Genesis","year":2023}}
    {"id":"vector-2","values":[0.28192687,0.99743,0.42680392,0.67175174,0.012253131],"metadata":{"country":"France","genre":"sci-fi","language":"English","title":"Apex Chronicles","year":2025}}
    {"id":"vector-3","values":[0.28798103,0.55216235,0.98476386,0.20921956,0.7093976],"metadata":{"country":"India","genre":"action","language":"Hindi","title":"Matrix Ascension","year":2019}}
    {"id":"vector-4","values":[0.67374235,0.30950361,0.14402653,0.542046,0.26878613],"metadata":{"country":"India","genre":"sci-fi","language":"Portuguese","title":"Prism Eclipse","year":2022}}
    ```
    
    `sparse vectors`
    ```json
    {"id":"vector-1","sparse_values":{"indices":[4291719715,304304608,3085284271,2487496436,878024967],"values":[0.5754068,0.4353752,0.72693384,0.6066574,0.22014463]},"metadata":{"country":"India","genre":"documentary","language":"German","title":"Pulse Genesis","year":2025}}
    {"id":"vector-2","sparse_values":{"indices":[810272037,2834973828,3337122720,4265751223,564456948],"values":[0.39986563,0.6304805,0.6471212,0.6413373,0.70074713]},"metadata":{"country":"South Korea","genre":"action","language":"Hindi","title":"Quantum Odyssey","year":2023}}
    {"id":"vector-3","sparse_values":{"indices":[1197244266,3189103624,2949683914,2981049090,1487074986],"values":[0.97382957,0.7537474,0.39413723,0.9917503,0.90063083]},"metadata":{"country":"USA","genre":"fantasy","language":"Portuguese","title":"Aether Awakening","year":2025}}
    {"id":"vector-4","sparse_values":{"indices":[2459373064,358526014,2222015023,1719227962,1499432083],"values":[0.7923365,0.42066705,0.14323704,0.2608238,0.9800974]},"metadata":{"country":"Japan","genre":"horror","language":"French","title":"Echo Awakening","year":2024}}
    ```
    
    ---
    - To see the specific tasks where the Asana app for GitHub is being
    used, see below:
      - https://app.asana.com/0/0/1210697636736731
      - https://app.asana.com/0/0/1212070545711467
      - https://app.asana.com/0/0/1212070545711470
      - https://app.asana.com/0/0/1212070545711472
    austin-denoble authored Nov 25, 2025 · commit 845bd1c

Commits on Nov 26, 2025

  1. Refactor ingestion for file/stdin (#56)

    ## Problem
    I got some feedback on the previous iteration of the index [data
    ingestion and management work](#54). For JSON flag inputs, instead of
    using the "@" prefix (`@file.json(l)` / `@-` for stdin), it may be
    easier from a consumption perspective to drop the "@" entirely.
    
    ## Solution
    Swap away from '@' prefixes for file and stdin inputs on JSON flags:
    check file suffixes instead (explicitly `.json` and `.jsonl` for now),
    read from stdin on `-`, and update the unit tests.
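
    The revised dispatch reduces to roughly this classification
    (hypothetical `classify` helper for illustration):

    ```go
    package main

    import (
        "fmt"
        "strings"
    )

    // classify: "-" means stdin, a .json/.jsonl suffix means file input,
    // and anything else is treated as inline JSON.
    func classify(arg string) string {
        switch {
        case arg == "-":
            return "stdin"
        case strings.HasSuffix(arg, ".json"), strings.HasSuffix(arg, ".jsonl"):
            return "file"
        default:
            return "inline"
        }
    }

    func main() {
        fmt.Println(classify("-"), classify("vectors.jsonl"), classify(`[0.1,0.2]`))
    }
    ```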
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    Same test flows as the previous PR (#54), but test stdin with `-` and
    files without any prefix. Other functionality should remain the same.
    austin-denoble authored Nov 26, 2025 · commit 8b2d473
  2. Rename describe-stats -> stats (#57)

    ## Problem
    `pc index describe-stats` is clunky as a command; rename it to
    `pc index stats`.
    
    ## Solution
    Rename the command and update documentation examples.
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [X] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    CI - unit + integration tests
    austin-denoble authored Nov 26, 2025 · commit 65859ab
  3. Finalize README for new vector operations (#58)

    ## Problem
    N/A - see title
    
    ## Solution
    Update README.md
    
    ## Type of Change
    - [ ] Bug fix (non-breaking change which fixes an issue)
    - [ ] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [X] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    N/A
    austin-denoble authored Nov 26, 2025 · commit 63b72c1
  4. Clean up presenters pointer handling (#59)

    ## Problem
    While testing, I ran into instances where an empty response, such as
    from `query`, could result in a nil reference error. Many of our
    presenters for table and other non-JSON output don't do enough
    nil-checking on the inbound pointers.
    
    ## Solution
    Clean up the presenters package and individual presentational functions
    to better handle nils, add a small `PrintEmptyState` function with a
    test, and cover cases where we were accessing pointers without explicit
    checks.
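
    The defensive pattern looks roughly like this (the types and
    `printEmptyState` here are illustrative stand-ins, not the actual
    presenters code):

    ```go
    package main

    import "fmt"

    type usage struct{ ReadUnits uint32 }

    // queryResponse mimics an SDK response with pointer fields that may be nil.
    type queryResponse struct {
        Matches []string
        Usage   *usage
    }

    // printEmptyState stands in for the shared empty-state helper.
    func printEmptyState(label string) { fmt.Printf("No %s found.\n", label) }

    func present(resp *queryResponse) {
        // Guard the response and its contents before rendering.
        if resp == nil || len(resp.Matches) == 0 {
            printEmptyState("matches")
            return
        }
        for _, m := range resp.Matches {
            fmt.Println(m)
        }
        if resp.Usage != nil { // guard optional pointers before dereferencing
            fmt.Println("read units:", resp.Usage.ReadUnits)
        }
    }

    func main() {
        present(nil)
        present(&queryResponse{Matches: []string{"vector-1"}})
    }
    ```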
    
    ## Type of Change
    
    - [X] Bug fix (non-breaking change which fixes an issue)
    - [ ] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
    - [ ] This change requires a documentation update
    - [ ] Infrastructure change (CI configs, etc)
    - [ ] Non-code change (docs, etc)
    - [ ] None of the above: (explain here)
    
    ## Test Plan
    CI - unit & integration tests
    austin-denoble authored Nov 26, 2025 · commit 132b30f