Skip to content

validate FT.HYBRID vector input type#3756

Merged
ndyakov merged 3 commits into
redis:masterfrom
DengY11:fix/fthybrid-vector-validation
May 8, 2026
Merged

validate FT.HYBRID vector input type#3756
ndyakov merged 3 commits into
redis:masterfrom
DengY11:fix/fthybrid-vector-validation

Conversation

@DengY11

@DengY11 DengY11 commented Mar 29, 2026

Copy link
Copy Markdown
Contributor

See #3755

With this change:

  • FT.HYBRID only accepts VectorFP32
  • unsupported types like VectorValues and VectorRef now return a clear client-side error
  • added unit tests for valid and invalid vector inputs

Note

Medium Risk
Changes FT.HYBRID argument building to validate/normalize vector blobs client-side and to reject some previously-accepted Vector implementations (e.g. VectorValues/VectorRef), which could be a behavior-breaking change for callers.

Overview
FT.HYBRID now extracts and validates the raw vector blob via a new hybridVectorBlob helper, returning clear client-side errors for missing/empty vectors and explicitly rejecting unsupported Vector implementations (notably VectorValues and VectorRef) instead of sending malformed args.

Adds new blob-backed vector wrappers (VectorFloat16, VectorBFloat16, VectorFloat64, VectorInt8, VectorUint8) and unit tests covering accepted/rejected vector types and ensuring the command uses the raw blob in the generated args.

Reviewed by Cursor Bugbot for commit c973763. Bugbot is set up for automated code reviews on this repo. Configure here.

@jit-ci

jit-ci Bot commented Mar 29, 2026

Copy link
Copy Markdown

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

ndyakov
ndyakov previously approved these changes Apr 9, 2026

@ndyakov ndyakov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thank you @DengY11.

@ndyakov

ndyakov commented Apr 9, 2026

Copy link
Copy Markdown
Member

@DengY11 after a more detailed check, ft.hybrid should accept any vector type defined in the index, right? If that is the case, let's not limit the command and just return the error from the server. What do you think about that? Maybe it would be better to actually implement additional vector types for indexes in Redis - https://redis.io/docs/latest/develop/ai/search-and-query/vectors/#flat-index ?

@DengY11

DengY11 commented Apr 9, 2026

Copy link
Copy Markdown
Contributor Author

@DengY11 after a more detailed check, ft.hybrid should accept any vector type defined in the index, right? If that is the case, let's not limit the command and just return the error from the server. What do you think about that? Maybe it would be better to actually implement additional vector types for indexes in Redis - https://redis.io/docs/latest/develop/ai/search-and-query/vectors/#flat-index ?

Yes I don't think FT.HYBRID should only accept fp32. The other existing Vector types are more likely to be designed for vectorset command and do not map correctly to hybrid search query input. So instead of limiting it to fp32, it is better to improve/add proper vector input types for FT.HYBRID, matching the actual index vector encodings.
For example, support types like:

  • FLOAT16
  • BFLOAT16
  • FLOAT32
  • FLOAT64
  • INT8
  • UINT8

@ndyakov

ndyakov commented Apr 9, 2026

Copy link
Copy Markdown
Member

@DengY11 I do think we should add those as vector types and since the API is already in a released version, we cannot change the FTHybrid signature at this stage. Let's improve by adding the rest of the vector types and check if the type is supported by the command (either ft hybrid or vector set command) when executing. However, I still think it is more than OK to let the server return an error. This way if there is a new vector type we don't need to alter the code. Since this is an interface, the user can implement their own Vector and use it in the commands.

@DengY11

DengY11 commented Apr 10, 2026

Copy link
Copy Markdown
Contributor Author

@DengY11 I do think we should add those as vector types and since the API is already in a released version, we cannot change the FTHybrid signature at this stage. Let's improve by adding the rest of the vector types and check if the type is supported by the command (either ft hybrid or vector set command) when executing. However, I still think it is more than OK to let the server return an error. This way if there is a new vector type we don't need to alter the code. Since this is an interface, the user can implement their own Vector and use it in the commands.

I understand the point about keeping the interface open, but I still think there are 2 problems here.

  1. Returning server error is not always strong enough. Some wrong input can still be syntactically valid, so the server may not reject it, and then we get semantically wrong behavior instead of a clear failure.
    For example, if the index field is FLOAT16 DIM 8, Redis expects a blob with 8 * 2 = 16 bytes. If client sends another 16-byte blob with different encoding by mistake, the request can still look valid from server side: command syntax is valid, argument position is valid, blob length is also valid
    Then Redis may not reject it, and just interpret the bytes as FLOAT16 DIM 8. In that case it is not a clear server error, but a semantically wrong request.

  2. The Vector interface is also not very clear for users. They only see Value() []any, but dont really know what Value() is supposed to return for FT.HYBRID or other commands.

So I think maybe better to do one of these:

  1. document the Vector contract more clearly
  2. or add a more explicit API for FT.HYBRID input, without changing the current signature, for example by adding something like FTQueryVector and FTHybridWithQueryVector(...)

@ndyakov

ndyakov commented Apr 23, 2026

Copy link
Copy Markdown
Member

@DengY11 documenting the Vector contract and introducing a new one for FTHybrid makes sense. However, transitioning from the Vector to the FTQueryVector will be a tricky process as we cannot change the already released api. We can however deprecate it, but I am not sure if this won't be confusing as well.

I fully understand the concern that the server may accept wrong input if its format is correct. This however is not something the client can guard against (for example you can set any value and later if you want to read it and unmarshal it - it may not be the correct type).

How would sound documenting the Vector contract better, introducing the types that are currently missing and documenting their usage as well - some will be used for the vector sets, others for the fthybrid.

@DengY11

DengY11 commented May 4, 2026

Copy link
Copy Markdown
Contributor Author

@ndyakov Hi, I updated the PR to add the missing vector blob types, and also kept a legacy fallback for known types .For known types I kept the old behavior and still use Value()[1], so this should not break users who already implemented their own Vector.
I keep reject VectorValues and VectorRef for FT.HYBRID,these are invalid inputs and don't really map to hybrid search query blob. Also added unit tests for the new types and the legacy fallback.

@ndyakov ndyakov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @DengY11! Overall I do like this approach. What do you think about - instead having Val exported on the specific types - add it as part of the interface? Technically we are extending the interface, but if someone has implemented their own vector type this may be breaking. I do not think this is likely, but would like to take your input on it since you invested quite a lot of time on this one.

@DengY11

DengY11 commented May 7, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for checking this. I would prefer to keep vector interface unchanged for now. Even if custom implementations are probably rare, adding a new method to the interface still feels like a stronger breaking change.
So I think the current approach is a bit safer and better: handle the built-in blob types directly, and keep the old Value()[1] fallback for custom vectors. And maybe discuss a cleaner vector API separately in future issue.

@ndyakov

ndyakov commented May 8, 2026

Copy link
Copy Markdown
Member

@DengY11 I agree and accept this approach as a safer one to include now. Thank you for your work on this!

@ndyakov ndyakov merged commit 5e36de8 into redis:master May 8, 2026
39 checks passed
saschazepter pushed a commit to saschazepter/forgejo that referenced this pull request May 29, 2026
This PR contains the following updates:

| Package | Change | [Age](https://docs.renovatebot.com/merge-confidence/) | [Confidence](https://docs.renovatebot.com/merge-confidence/) |
|---|---|---|---|
| [github.com/redis/go-redis/v9](https://github.com/redis/go-redis) | `v9.19.0` → `v9.20.0` | ![age](https://developer.mend.io/api/mc/badges/age/go/github.com%2fredis%2fgo-redis%2fv9/v9.20.0?slim=true) | ![confidence](https://developer.mend.io/api/mc/badges/confidence/go/github.com%2fredis%2fgo-redis%2fv9/v9.19.0/v9.20.0?slim=true) |

---

### Release Notes

<details>
<summary>redis/go-redis (github.com/redis/go-redis/v9)</summary>

### [`v9.20.0`](https://github.com/redis/go-redis/releases/tag/v9.20.0): 9.20.0

[Compare Source](redis/go-redis@v9.19.0...v9.20.0)

#### 🚀 Highlights

##### Redis 8.8 Support

This release adds support for **Redis 8.8**. The README's supported-versions list now includes Redis 8.8 alongside 8.0/8.2/8.4, and CI exercises the `8.8` client-libs-test image across the full suite (Makefile, build workflow, doctests, run-tests action, and docker-compose).

Coverage for the new commands that ship in the 8.x line, rounded out in this release:

- **`AR*` array data type** ([#&#8203;3813](redis/go-redis#3813)) — new array data structure, exposed via the `ArrayCmdable` interface (see the experimental-features highlight below).
- **`INCREX`** ([#&#8203;3816](redis/go-redis#3816)) — atomic increment with expiration in a single round-trip.
- **`XNACK`** ([#&#8203;3790](redis/go-redis#3790)) — explicit negative-acknowledge of pending stream entries.
- **`XAUTOCLAIM` PEL deletes** ([#&#8203;3798](redis/go-redis#3798)) — `XAUTOCLAIM`/`XAUTOCLAIMJUSTID` now return the list of deleted message IDs from the pending entries list.
- **`TS.RANGE` multiple aggregators** ([#&#8203;3791](redis/go-redis#3791)) — `TS.RANGE`/`TS.REVRANGE`/`TS.MRANGE`/`TS.MREVRANGE` accept multiple aggregators in a single call.
- **`Z(UNION|INTER|DIFF)` `COUNT` aggregator** ([#&#8203;3802](redis/go-redis#3802)) — `COUNT` reducer for sorted-set set operations.
- **`JSON.SET FPHA`** ([#&#8203;3797](redis/go-redis#3797)) — new `FPHA` argument that specifies the floating-point type for homogeneous FP arrays.

CI image bump ([#&#8203;3814](redis/go-redis#3814)) by [@&#8203;ofekshenawa](https://github.com/ofekshenawa). Command coverage contributions by [@&#8203;cxljs](https://github.com/cxljs), [@&#8203;elena-kolevska](https://github.com/elena-kolevska), [@&#8203;Khukharr](https://github.com/Khukharr), [@&#8203;ndyakov](https://github.com/ndyakov), and [@&#8203;ofekshenawa](https://github.com/ofekshenawa).

##### Stable RESP3 for RediSearch (`UnstableResp3` deprecated)

`FT.SEARCH`, `FT.AGGREGATE`, `FT.INFO`, `FT.SPELLCHECK`, and `FT.SYNDUMP` now parse RESP3 (map) responses into the same typed result objects as RESP2 — `Val()` and `Result()` work uniformly on both protocols, no flag required. Previously, RESP3 search responses required `UnstableResp3: true` and were returned as opaque maps accessible only via `RawResult()` / `RawVal()`.

As a result, the `UnstableResp3` option is now a **no-op** across every options struct (`Options`, `ClusterOptions`, `UniversalOptions`, `FailoverOptions`, `RingOptions`) and has been marked `// Deprecated:`. The field is retained for backwards compatibility — existing code that sets `UnstableResp3: true` will continue to compile and behave identically — but it will be removed in a future release and new code should not set it. `RawResult()` / `RawVal()` continue to work for callers that prefer the raw RESP payload.

([#&#8203;3741](redis/go-redis#3741)) by [@&#8203;ndyakov](https://github.com/ndyakov)

##### Experimental Array Data Structure Commands

Adds an experimental `ArrayCmdable` interface with the `AR*` command family (`ARSet`, `ARGet`, `ARGetRange`, `ARMSet`, `ARMGet`, `ARDel`, `ARDelRange`, `ARScan`, `ARSeek`, `ARNext`, `ARLastItems`, `ARGrep`, `ARGrepWithValues`, `ARInfo`/`ARInfoFull`, and typed reducers `AROpSum`/`AROpMin`/`AROpMax`/`AROpAnd`/`AROpOr`/`AROpXor`/`AROpMatch`/`AROpUsed`) for working with Redis 8.8's new array data type. **API is experimental and may change in a future release.**

([#&#8203;3813](redis/go-redis#3813)) by [@&#8203;cxljs](https://github.com/cxljs)

#### ✨ New Features

- **RESP3 search parser**: First-class RESP3 parsing for `FT.SEARCH`/`FT.AGGREGATE`/`FT.INFO`/`FT.SPELLCHECK`/`FT.SYNDUMP` responses with backwards compatibility for RESP2 ([#&#8203;3741](redis/go-redis#3741)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **INCREX**: New `INCREX` command support — atomic increment with expiration ([#&#8203;3816](redis/go-redis#3816)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **XNACK**: Client support for the `XNACK` stream command for explicitly negative-acknowledging pending entries ([#&#8203;3790](redis/go-redis#3790)) by [@&#8203;elena-kolevska](https://github.com/elena-kolevska)
- **TS range multiple aggregators**: `TS.RANGE`/`TS.REVRANGE`/`TS.MRANGE`/`TS.MREVRANGE` now accept multiple aggregators in a single call ([#&#8203;3791](redis/go-redis#3791)) by [@&#8203;elena-kolevska](https://github.com/elena-kolevska)
- **`XAutoClaim` deleted IDs**: `XAUTOCLAIM`/`XAUTOCLAIMJUSTID` now return the list of deleted message IDs from the PEL ([#&#8203;3798](redis/go-redis#3798)) by [@&#8203;Khukharr](https://github.com/Khukharr)
- **`JSON.SET FPHA`**: `JSON.SET` accepts a new `FPHA` argument that specifies the floating-point type for homogeneous floating-point arrays ([#&#8203;3797](redis/go-redis#3797)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **Sorted-set union/intersection COUNT**: `ZUNION`/`ZINTER`/`ZDIFF` aggregator now supports `COUNT` ([#&#8203;3802](redis/go-redis#3802)) by [@&#8203;ofekshenawa](https://github.com/ofekshenawa)
- **`FT.HYBRID` vector validation**: Validates hybrid-search vector input types and adds proper typed vector parameters ([#&#8203;3756](redis/go-redis#3756)) by [@&#8203;DengY11](https://github.com/DengY11)
- **Cluster pool wait stats**: `ClusterClient.PoolStats()` now accumulates `WaitCount` and `WaitDurationNs` across all node pools (previously always zero) ([#&#8203;3809](redis/go-redis#3809)) by [@&#8203;LINKIWI](https://github.com/LINKIWI)

#### 🐛 Bug Fixes

- **TLS-only Cluster PubSub**: `CLUSTER SLOTS` port-0 entries now fall back to the origin endpoint's port, fixing `dial tcp <ip>:0: connection refused` on TLS-only clusters started with `--port 0 --tls-port <port>` (fixes [#&#8203;3726](redis/go-redis#3726)) ([#&#8203;3828](redis/go-redis#3828)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **Sharded PubSub reconnect routing**: `PubSub.conn()` now passes both regular (`c.channels`) and sharded (`c.schannels`) channels into the per-PubSub `newConn` closure. Previously, `ClusterClient.SSubscribe`-only PubSubs reconnected to a random node (because the routing closure saw an empty channel list), the `SSUBSCRIBE` was sent to the wrong shard, and the resulting `MOVED` reply was silently dropped ([#&#8203;3829](redis/go-redis#3829)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **ClusterClient `Watch` retry**: User errors returned from a `Watch` callback are no longer subjected to cluster-retry classification; transient cluster errors still retry, but a callback returning e.g. `net.ErrClosed` short-circuits immediately ([#&#8203;3821](redis/go-redis#3821)) by [@&#8203;obiyang](https://github.com/obiyang)
- **Sentinel concurrent-probe leak**: `MasterAddr`'s concurrent sentinel probe now closes the non-winning sentinel clients instead of leaking them ([#&#8203;3827](redis/go-redis#3827)) by [@&#8203;cxljs](https://github.com/cxljs)
- **Sentinel rediscovery loop on master-only setups**: `replicaAddrs` no longer tears down the cached sentinel client when the replica list is empty, eliminating a continuous rediscovery loop on master-only Sentinel deployments that flooded logs and added per-operation latency ([#&#8203;3795](redis/go-redis#3795)) by [@&#8203;shahyash2609](https://github.com/shahyash2609)
- **Pool `CloseConn` hooks**: `Pool.CloseConn` now triggers registered hooks, fixing a memory leak when connections are closed explicitly rather than via the normal removal path ([#&#8203;3818](redis/go-redis#3818)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **Dial TCP error redirection**: Wrapped `dial tcp` errors are now correctly classified as redirectable so cluster routing can recover from a single unreachable node ([#&#8203;3810](redis/go-redis#3810)) by [@&#8203;vladisa88](https://github.com/vladisa88)
- **Pool `Close` health checks**: `ConnPool.Close` now only runs health checks against idle connections, avoiding spurious activity on connections still in use ([#&#8203;3805](redis/go-redis#3805)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **VLinks return type**: Fixed the return type of `VLINKS`/`VLINKSWITHSCORES` vector-set replies ([#&#8203;3820](redis/go-redis#3820)) by [@&#8203;romanpovol](https://github.com/romanpovol)

#### 🧪 Testing & Infrastructure

- **Flaky tests**: Stabilized several flaky tests in the sentinel and pool suites ([#&#8203;3815](redis/go-redis#3815)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **Sentinel failover metric race**: Fixed a data race in the sentinel failover metric test ([#&#8203;3824](redis/go-redis#3824)) by [@&#8203;cxljs](https://github.com/cxljs)
- **`waitForSentinelClusterStable` post-conditions**: The sentinel test harness now waits for replicas to be fully connected (not just present in the count) and is robust to randomized spec ordering after failover specs, eliminating an intermittent `Expected master to equal slave` flake ([#&#8203;3830](redis/go-redis#3830)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **`govulncheck` workflow**: New scheduled GitHub Actions workflow runs `govulncheck` on every push, PR, and weekly, surfacing newly disclosed Go vulnerabilities even when no code changes ([#&#8203;3779](redis/go-redis#3779)) by [@&#8203;solardome](https://github.com/solardome)
- **CI Redis 8.8-rc1**: CI now exercises the 8.8-rc1 Redis image ([#&#8203;3814](redis/go-redis#3814)) by [@&#8203;ofekshenawa](https://github.com/ofekshenawa)

#### 🧰 Maintenance

- **`Cmd.Slot()` lookup refactor**: Caches the per-command `CommandInfo` and short-circuits keyless commands before the switch dispatch, removing redundant `Peek` calls ([#&#8203;3804](redis/go-redis#3804)) by [@&#8203;retr0-kernel](https://github.com/retr0-kernel)
- **stdlib `math/rand`**: Replaced `internal/rand` with `math/rand` from the standard library now that the minimum Go version is 1.24 ([#&#8203;3823](redis/go-redis#3823)) by [@&#8203;cxljs](https://github.com/cxljs)
- **ConnPool queue channel**: Removed the unused queue channel from `ConnPool`, trimming the pool's footprint ([#&#8203;3826](redis/go-redis#3826)) by [@&#8203;cxljs](https://github.com/cxljs)
- **Extra packages LICENSE**: Added a LICENSE file to each `extra/*` package ([#&#8203;3817](redis/go-redis#3817)) by [@&#8203;ndyakov](https://github.com/ndyakov)
- **README & CI image**: Documentation refresh and bumped the default CI image tag ([#&#8203;3822](redis/go-redis#3822)) by [@&#8203;ndyakov](https://github.com/ndyakov)

#### 👥 Contributors

We'd like to thank all the contributors who worked on this release!

[@&#8203;cxljs](https://github.com/cxljs), [@&#8203;DengY11](https://github.com/DengY11), [@&#8203;elena-kolevska](https://github.com/elena-kolevska), [@&#8203;Khukharr](https://github.com/Khukharr), [@&#8203;LINKIWI](https://github.com/LINKIWI), [@&#8203;ndyakov](https://github.com/ndyakov), [@&#8203;obiyang](https://github.com/obiyang), [@&#8203;ofekshenawa](https://github.com/ofekshenawa), [@&#8203;retr0-kernel](https://github.com/retr0-kernel), [@&#8203;romanpovol](https://github.com/romanpovol), [@&#8203;shahyash2609](https://github.com/shahyash2609), [@&#8203;solardome](https://github.com/solardome), [@&#8203;vladisa88](https://github.com/vladisa88)

***

**Full Changelog**: <redis/go-redis@v9.19.0...v9.20.0>

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - Between 12:00 AM and 03:59 AM (`* 0-3 * * *`)
- Automerge
  - Between 12:00 AM and 03:59 AM (`* 0-3 * * *`)

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xOTUuMSIsInVwZGF0ZWRJblZlciI6IjQzLjE5NS4xIiwidGFyZ2V0QnJhbmNoIjoiZm9yZ2VqbyIsImxhYmVscyI6WyJkZXBlbmRlbmN5LXVwZ3JhZGUiLCJ0ZXN0L25vdC1uZWVkZWQiXX0=-->

Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/12804
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.codeberg.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants