Skip to content

feat: add manifest namespace extended properties#6215

Open
wojiaodoubao wants to merge 1 commit into
lance-format:mainfrom
wojiaodoubao:manifest-extended-properties
Open

feat: add manifest namespace extended properties#6215
wojiaodoubao wants to merge 1 commit into
lance-format:mainfrom
wojiaodoubao:manifest-extended-properties

Conversation

@wojiaodoubao

Copy link
Copy Markdown
Contributor

This is a sub-task of #5741

@github-actions github-actions Bot added enhancement New feature or request A-python Python bindings A-java Java bindings + JNI labels Mar 17, 2026
@github-actions

Copy link
Copy Markdown
Contributor

PR Review: feat: add manifest namespace extended properties

Performance: query_manifest reads all columns (P1)

The new unified query_manifest() method removes the .project() call that the old query_manifest_for_table and query_manifest_for_namespace had. Previously, table lookups only read object_id + location, and namespace lookups only read object_id + metadata. Now every query reads all columns including all extended property columns.

As the number of extended properties grows, this becomes increasingly wasteful for simple existence checks and lookups. Consider keeping a projection — at minimum project the basic columns plus the extended columns, or re-add targeted projections for the two call sites.

Performance: Row-at-a-time slicing in query_manifest (P1)

for row_idx in 0..batch.num_rows() {
    let sliced_columns: Vec<Arc<dyn Array>> = batch
        .columns()
        .iter()
        .map(|col| col.slice(row_idx, 1))
        .collect();
    let row = RecordBatch::try_new(batch.schema(), sliced_columns)?;
    objects.push(parse_manifest_object(&row)?);
}

This creates a new RecordBatch per row. For queries that match many rows (e.g., listing all tables), this is O(rows × columns) allocations. Consider indexing into the batch directly in parse_manifest_object using a row index parameter instead of slicing.

Bug: Unused f32, f64 imports (P0 — clippy will fail)

use std::{
    collections::HashMap,
    f32, f64,
    hash::{DefaultHasher, Hash, Hasher},

These module imports (std::f32, std::f64) appear unused. This will likely trigger a clippy warning/error with -D warnings.

Bug: is_multiple_of requires nightly (P0)

if !s.len().is_multiple_of(2) {

usize::is_multiple_of is a nightly-only API (int_roundings feature). This won't compile on stable Rust. Use s.len() % 2 != 0 instead.

Design: Duplicated merge logic (P1)

The pattern of matching (extended_record, &request.properties) with 4 arms is copy-pasted across create_namespace_extended, declare_table_extended, and create_table_extended. Consider extracting a helper like fn resolve_extended_batch(...) to reduce duplication and the risk of the branches diverging.

Minor

  • build_metadata_json silently drops properties that fail to serialize (.ok()? on serde_json::to_string). HashMap<String, String> serialization cannot fail in practice, but the silent swallow is worth noting.
  • Comment typo: "Collect basic columns to excluded" → "to exclude"

🤖 Generated with Claude Code

@wojiaodoubao wojiaodoubao mentioned this pull request Mar 17, 2026
7 tasks
@codecov

codecov Bot commented Mar 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 50.44053% with 225 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance-namespace-impls/src/dir/manifest.rs 50.44% 199 Missing and 26 partials ⚠️

📢 Thoughts on this report? Let us know!

@wjones127 wjones127 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm feeling a little uncertain about the efficiency of this code, particularly around the handling of batches. Though I also think the bot is right about projecting many columns.

Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
Comment thread rust/lance-namespace-impls/src/dir/manifest.rs Outdated
@wojiaodoubao wojiaodoubao force-pushed the manifest-extended-properties branch 2 times, most recently from 2afc019 to 935c6b2 Compare March 28, 2026 08:24
@wojiaodoubao wojiaodoubao force-pushed the manifest-extended-properties branch from 935c6b2 to 9574665 Compare March 28, 2026 08:58
@wojiaodoubao

Copy link
Copy Markdown
Contributor Author

I’m very sorry for the long delay. I’ve been tied up with something urgent recently. I’ve updated the PR—please take a look when you have time. Thanks very much! @wjones127

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-java Java bindings + JNI A-python Python bindings enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants