
Features/filter payload#104

Merged
generall merged 7 commits into qdrant:master from haicheviet:features/filter_payload
Oct 12, 2021

Conversation

@haicheviet
Contributor

@haicheviet haicheviet commented Sep 29, 2021

All Submissions:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you linted your code locally using the cargo fmt command prior to submission?
  3. Have you checked your code using the cargo clippy command?

New Feature [Filter Payload] Explain:

  • Link issue: Allow to include payload and vector into search result #50 (comment)
  • I've added payload filtering to three APIs: search [POST], scroll [POST], and get_point [POST]. The with_payload mechanism is similar to _source in Elasticsearch (minus wildcards, which I don't think are necessary right now) and accepts three input types:
    • Bool
    • List (require key)
    • Dict (Include and Exclude enum)
  • The new code is updated with backward compatibility
  • Add more tests for the new features
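To illustrate the three with_payload input types listed above, here is a minimal std-only sketch. The names `WithPayload`, `PayloadSelector`, `Payload`, and `select_payload` are hypothetical and chosen for illustration; they are not taken from the PR's code, and the real implementation deserializes the request with serde rather than building the enum by hand. Returning `None` for `false` is also an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a point's payload: key -> value rendered as text.
pub type Payload = BTreeMap<String, String>;

/// Hypothetical explicit selector, mirroring the "Dict" input type.
#[derive(Debug, Clone, PartialEq)]
pub enum PayloadSelector {
    /// Keep only the listed keys.
    Include(Vec<String>),
    /// Keep every key except the listed ones.
    Exclude(Vec<String>),
}

/// Hypothetical model of the three accepted `with_payload` shapes.
#[derive(Debug, Clone, PartialEq)]
pub enum WithPayload {
    /// `true` returns the whole payload, `false` returns none (assumption).
    Bool(bool),
    /// A plain list of keys, treated here the same as `Include`.
    Fields(Vec<String>),
    /// An include/exclude selector.
    Selector(PayloadSelector),
}

/// Copy the entries of `payload` whose key satisfies `keep`.
fn filtered(payload: &Payload, keep: impl Fn(&String) -> bool) -> Payload {
    payload
        .iter()
        .filter(|(k, _)| keep(*k))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

/// Apply a `with_payload` option to one payload.
pub fn select_payload(payload: &Payload, with_payload: &WithPayload) -> Option<Payload> {
    match with_payload {
        WithPayload::Bool(false) => None,
        WithPayload::Bool(true) => Some(payload.clone()),
        WithPayload::Fields(keys) | WithPayload::Selector(PayloadSelector::Include(keys)) => {
            Some(filtered(payload, |k| keys.contains(k)))
        }
        WithPayload::Selector(PayloadSelector::Exclude(keys)) => {
            Some(filtered(payload, |k| !keys.contains(k)))
        }
    }
}
```

With a payload holding the keys `city` and `price`, `WithPayload::Fields(vec!["city".to_string()])` would keep only `city`, while an `Exclude` selector with the same list would keep only `price`.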

Constraint

  • I had to update the Docker Rust version from 1.51 to 1.53 because retain on BTree causes errors in the build process on the older version. This is resolved in newer Rust versions (reference: Tracking Issue for {BTreeMap,BTreeSet}::retain rust-lang/rust#79025). If the new Rust version brings any constraint that needs more testing, I'm willing to help with that testing or to change the current method of filtering payload keys.
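For context on this constraint, the sketch below contrasts in-place filtering with `BTreeMap::retain` (stable since Rust 1.53) against a rebuild-based alternative that does not need `retain` and so should also compile on older toolchains. The function names are hypothetical and the map type is a simplified stand-in for the real payload type; this is one possible fallback, not the method used in the PR.

```rust
use std::collections::BTreeMap;

/// Filter in place. `BTreeMap::retain` is only available on stable Rust 1.53+.
pub fn keep_keys_in_place(payload: &mut BTreeMap<String, String>, keys: &[&str]) {
    payload.retain(|k, _| keys.contains(&k.as_str()));
}

/// Alternative without `retain`: consume the map and rebuild it from the
/// entries whose key is in `keys`.
pub fn keep_keys_rebuild(
    payload: BTreeMap<String, String>,
    keys: &[&str],
) -> BTreeMap<String, String> {
    payload
        .into_iter()
        .filter(|(k, _)| keys.contains(&k.as_str()))
        .collect()
}
```

Both functions produce the same set of entries; the rebuild variant trades one extra allocation of the map for compatibility with older compilers.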

@generall
Member

Hi @haicheviet, thanks for the PR! I will review it as soon as possible.

Member

@generall generall left a comment


Thanks for the great PR @haicheviet! There are just a few nits from my side. After those adjustments I will also need to generate a new OpenAPI schema and update the Python client.

#[derive(Debug, Deserialize, Serialize, JsonSchema, Clone)]
#[serde(deny_unknown_fields)]
#[serde(rename_all = "snake_case")]
pub struct FilterPayload {
Member


The name Filter overlaps with the search filter param. I would prefer to come up with a different name to avoid confusion.

Contributor Author

@haicheviet haicheviet Oct 11, 2021


I've changed the name to CustomPayload, but I'm not sure whether it will still cause confusion. If you have a better name for the struct, please recommend it.

@generall generall force-pushed the features/filter_payload branch from a9663be to e140933 Compare October 11, 2021 20:42
@generall generall force-pushed the features/filter_payload branch from e140933 to 3fbac53 Compare October 11, 2021 21:09
@generall
Member

Hi @haicheviet, thanks for the updates! I did some renaming and made minor fixes in the latest commit. I also had to rebase to prevent merge conflicts. I will merge it shortly.

@haicheviet
Contributor Author

Thanks @generall. If you need anything else, I'm happy to help.

@generall generall merged commit f55e5aa into qdrant:master Oct 12, 2021
JoanFM pushed a commit to jina-ai/qdrant that referenced this pull request Dec 15, 2021
* fix slow HNSW search + fix looping in mmap optimizer + make segment removal atomic

* update tutorial link

* add payload schema to collection info + indexing fixes

* Update Tokio to the latest version  (qdrant#36)

* update tokio in collection crate qdrant#18

* update tokio in main and storage crates qdrant#18

* Implementation of points scroll API qdrant#38 (qdrant#40)

* WIP: filtered points iterator qdrant#38

* add paginated filtered point request function qdrant#38

* add scroll api + openapi definitions qdrant#38

* fix openapi qdrant#38

* docs: add trean as a contributor for code (qdrant#42)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* Allow manual trigger docker build

* Avoid cloning ScoredPointOffset when peeking the top scores (qdrant#44)

* docs: add kgrech as a contributor for code (qdrant#47)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* Applied and enforced rust fmt code formatting tool (qdrant#48)

* Apply cargo fmt command

* Enabled cargo fmt on build

* Add link to the Documentation

* [CLIPPY] Fix File::write and option_env! (qdrant#49)

* Replace File::write to write_all to ensure the whole vector is written

* [CLIPPY] Replace option_env! with env!

* Avoid useless vector copy during scoring (qdrant#51)

* Avoid vector copy during scoring

* Fixing ptr_arg clippy rules for &[VectorElementType]

* [Clippy] Fix a range of warnings (qdrant#52)

* [CLIPPY] Fix the last portion of rules and enable CI check (qdrant#53)

* [CLIPPY] Fixed the warning for references of the user defined types

* [CLIPPY] Fix module naming issue

* [CLIPPY] Fix the last set of warnings and enable clippy check during CI

* Moved cargo fmt and cargo clippy into it's own action

* Actix update (qdrant#55)

* Updated actix to 4.0.0-beta.8

* Refactored search, scroll, update and collection operation APIs to be async

* Revert "Actix update (qdrant#55)" (qdrant#56)

This reverts commit 12e2508.

* Revert "Revert "Actix update (qdrant#55)" (qdrant#56)" (qdrant#57)

This reverts commit 53ddce3.

* Own search runtime out of the async scope

* rm unused cli.rs

* Update README.md

* fix clippy warnings

* fix fmt

* fix serde deserialisation issue in WAL

* disable debug test

* add extreme classification demo

* fix typo

* Small cosmetics (qdrant#66)

* some small cosmetics

* fixed "expect" linting

* fix formation

* docs: add kekonen as a contributor for code (qdrant#70)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* Update docker usage example for pre-built images (qdrant#71)

* docs: add vearutop as a contributor for doc (qdrant#73)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* fix new clippy suggestions

* [GRPC] Introduce GRPC API based on tonic (qdrant#76)

* Remove AtomicRefCell wrapper for condition checker (qdrant#84)

* docs: add galibey as a contributor for code (qdrant#86)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* Deadlock fix (qdrant#91)

* refactor: segment managers -> collection managers

* fix segments holder deadlock

* apply cargo fmt

* fix cargo clippy

* replace sequential segment locking with multiple try_lock attempts to prevent deadlocks

* skip moved points (double) processing for segments updater in apply_points_to_appendable

* Decouple searcher and updater from collection (qdrant#93)

* refactor api function param names

* handle limit=0 corner case in scroll api qdrant#90

* apply fmt

* dynamic arch (qdrant#79)

* dynamic arch

* fix fmt

* Fix point retrieve from copy-on-write proxy segment (qdrant#94)

* v0.3.6

* Exposed update_colletions operation over grpc (qdrant#89)

* raise 404 error instead of 400 in case if collection not found qdrant#99

* Push images to docker-hub (qdrant#102)

* [GRPC] Exposed get collections, update and delete collection RPCs (qdrant#96)

* [GRPC] Exposed get collections, update and delete collection RPCs

* Moved every collection operation into the separate rpc

* fix alias operation deadlock (qdrant#103) (qdrant#105)

* fix alias operation deadlock (qdrant#103)

* cargo fmt (qdrant#103)

* use custom openblas fork wit implemented dynamic arch flag (qdrant#106)

* use custom openblas fork wit implemented dynamic arch flag

* add comment

* use seeded random number generator in search graph tests

* Features/filter payload (qdrant#104)

* update more test

* update fmt

* reduce non usecode and update docker version

* update commend code

* update name filter

* renames and minor fixes

* fix linter

Co-authored-by: hai che <haiche@jobhop.com>
Co-authored-by: Andrey Vasnetsov <andrey@vasnetsov.com>
Co-authored-by: Andrey Vasnetsov <vasnetsov93@gmail.com>

* docs: add HaiCheViet as a contributor for code (qdrant#109)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* fix seeded rng in caused blinking test

* data consistency fixes and updates (qdrant#112)

* update segment version after completed update only

* more stable updates: check pre-existing points on update, fail recovery, WAL proper ack. check_unprocessed_points WIP

* switch to async channel

* perform update operations in a separate thread (qdrant#111)

* perform update operations in a separate thread

* ordered sending update signal

* locate a segment merging versioning bug

* rename id_mapper -> id_tracker

* per-record versioning

* clippy fixes

* cargo fmt

* rm limit of open files

* fail recovery test

* cargo fmt

* wait for worker stops befor dropping the runtime

* update OpenAPI schema

* default payload return is null + update openAPI

* upd version to 0.4.1

* add openblas patch to the root package

* Add various refactorings (qdrant#118)

* docs: add tranzystorek-io as a contributor for code (qdrant#120)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* Fix ndarray imports in segment benches (qdrant#121)

* Add more refactorings (qdrant#122)

* fix refactoring (qdrant#124)

* Upgrade rust version (qdrant#127)

* - update rust version in Dockerfile
- use rust edition 2021

* - update rust edition for libs

* remove *.lock from gitignore

* docs: add anveq as a contributor for code (qdrant#129)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* fix build of docker image - update version of libgfortran (qdrant#130)

* fix debian version in dockerfile (qdrant#131)

* replace incompatible hashlib (qdrant#132)

* Update api descriptions 37 (qdrant#125)

* upd API description in OpenAPI qdrant#37

* upd openapi

* Rustdoc and README for internal entities and processes (qdrant#123)

* extend comments for strorage crate

* update comments and readme for collection crate

* apply cargo fmt

* fix tests

* apply fmt

* fix new version of clippy

* add comments for segment entitites (qdrant#136)

* add comments for segment entitites

* fmt

* cargo fmt

* Multithread optimizer (qdrant#134)

* run optimizers on tokio thread pool for cpu-bound tasks

* [WIP] move check condition to another thread

* [WIP] optimizer iter live not long enough

* [WIP] change Box to Arc in optimizers vector

* add blocking handles management

* cargo fmt apply

* Update lib/collection/src/update_handler.rs

Co-authored-by: Andrey Vasnetsov <andrey@vasnetsov.com>

* [WIP] optimizer iter live not long enough

* [WIP] change Box to Arc in optimizers vector

* add blocking handles management

* fix code review issues

* use CollectionConfig available cpu value

* apply updated fmt

* [WIP] move count of optimization threads to OptimizersConfig

* optimization options

* fix formatting

* fix proto for optimizer config

* fmt

* Update config/config.yaml

related task: qdrant#30

Co-authored-by: Andrey Vasnetsov <andrey@vasnetsov.com>

* [GRPC] Expose upsert points API (qdrant#107)

* [GRPC] Expose upsert points API

* refactor PointSruct: use map instead of list + grpc tests

Co-authored-by: Andrey Vasnetsov <andrey@vasnetsov.com>

* Split collection update API into several endpoints (qdrant#126)

* split storage operation structures qdrant#32

* cargo fmt qdrant#32

* split collection update api into several endpoints qdrant#32

* cargo fmt qdrant#32

* fix tonic-related code with new structures

* upd alias structures

* use ytt teplate engine for OpenAPI Endpoint schema generation

* refactor: replace parking_lot with tokio mutex for WAL (qdrant#140)

* refactor: replace parking_lot with tokio mutex for WAL

* cargo fmt

* pre-build proto structures qdrant#138 (qdrant#141)

* Improve error reporting in collection_loader (qdrant#143)

* Fix benches compilation for TestRawScorerProducer::new (qdrant#142)

* docs: add agourlay as a contributor for code (qdrant#145)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>

* Ctrl-c with tonic (qdrant#146)

* wip

* use tokio signals and serve_with_shutdown

* keep it simple

Co-authored-by: Andrey Vasnetsov <andrey@vasnetsov.com>
Co-authored-by: trean <trean.mi@gmail.com>
Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Konstantin <kgrech@users.noreply.github.com>
Co-authored-by: Konstantin G <kgrech@mail.ru>
Co-authored-by: Daniil Naumetc <11177808+kekonen@users.noreply.github.com>
Co-authored-by: Viacheslav Poturaev <vearutop@users.noreply.github.com>
Co-authored-by: Alexander Galibey <48586936+galibey@users.noreply.github.com>
Co-authored-by: HaiCheViet <cheviethai123@gmail.com>
Co-authored-by: hai che <haiche@jobhop.com>
Co-authored-by: Andrey Vasnetsov <vasnetsov93@gmail.com>
Co-authored-by: Marcin Puc <5671049+tranzystorek-io@users.noreply.github.com>
Co-authored-by: Anton V <94402218+anveq@users.noreply.github.com>
Co-authored-by: Arnaud Gourlay <arnaud.gourlay@gmail.com>

2 participants