graphql-alt: add event scanning #25097

Merged
henryachen merged 5 commits into hc/dvx-1290-read_txns from hc/dvx-1290-read_ev
Mar 4, 2026

Collaborator

@henryachen henryachen commented Jan 26, 2026

Description

  • Added scanEvents GraphQL query endpoint for scanning events using bloom filters
  • Reuse existing scan infrastructure (bloom querying, pagination) from
    Transaction Scanning
  • Added e2e tests covering filtering by sender, module, type, and pagination

Test plan

cargo nextest run -p sui-indexer-alt-graphql
cargo nextest run -p sui-indexer-alt-graphql -- schema_sdl
cargo nextest run -p sui-indexer-alt-graphql --features staging -- schema_sdl
cargo nextest run -p sui-indexer-alt-e2e-tests -- graphql/

Stack:

#24788
#24900
#25097 👈


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • gRPC:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • Indexing Framework:

Contributor

@amnn amnn left a comment

I'll wait to review this after the comments in the stack below are resolved -- just hit re-request review when that's done!

}

/// The transaction index bounds `[tx_lo, tx_hi)` within a checkpoint based on cursor positions.
pub(crate) fn cp_tx_bounds<C: ScanCursor>(

Contributor

Looking at its callsites, when we get here, we've already bloom-filtered for the candidate_cps. It's impossible for candidate_cps to contain checkpoints outside the cursor range. So at its core we're checking if the after cursor is a slice into the first checkpoint's txns, and the before cursor is a slice somewhere into the middle of the last checkpoint's txns. For most other checkpoints, we will return the full range 0..checkpoint_digests.len().

I think this was a bit obscured by the impl., but I don't think we need to modify the impl. Could we add some clarification to the docs with the intent of what this fn is doing?
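
To make the intent described in this comment concrete, here is a minimal Rust sketch. It is not the PR's implementation: `Pos` and the simplified signature are assumptions for illustration (the real function is generic over `ScanCursor` and takes the page), and both cursors are assumed to be exclusive.

```rust
use std::ops::Range;

/// Hypothetical cursor position: a checkpoint and a transaction index within it.
#[derive(Clone, Copy, Debug)]
struct Pos {
    cp: u64,
    tx_idx: usize,
}

/// Transaction index bounds `[tx_lo, tx_hi)` within checkpoint `cp`.
///
/// Assumes `cp` already lies inside the cursor range, so only the checkpoint
/// that holds a cursor needs trimming; every other checkpoint gets its full
/// range `0..tx_count`.
fn cp_tx_bounds(
    after: Option<Pos>,
    before: Option<Pos>,
    cp: u64,
    tx_count: usize,
) -> Range<usize> {
    // Trim the front only in the checkpoint that contains the `after` cursor.
    let lo = match after {
        Some(a) if a.cp == cp => (a.tx_idx + 1).min(tx_count),
        _ => 0,
    };
    // Trim the back only in the checkpoint that contains the `before` cursor.
    let hi = match before {
        Some(b) if b.cp == cp => b.tx_idx.min(tx_count),
        _ => tx_count,
    };
    // Never return an inverted range.
    lo..hi.max(lo)
}
```

With an after cursor at checkpoint 5 and a before cursor at checkpoint 7, only checkpoints 5 and 7 are trimmed; checkpoint 6 keeps its full range.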

Collaborator Author

Agreed, cp clamping here makes the function harder to reason about; this was simplified based on changes in transaction scans!

};
let bounds = directional_iter(
page,
cp_tx_bounds(page, cp_sequence_number, checkpoint_digests.len()),

Contributor

tbh i feel like cp_tx_bounds and directional_iter belong in the paginate module, rather than dispersed across cursor and scan

pipelines.insert("obj_versions".to_string());
};
- Query.[events] |pipelines, filters| {
+ Query.[events, eventsScan] |pipelines, filters| {

Contributor

eventsScan only depends on the cp_blooms + block tables, right? They have entirely different retentions?

henryachen added a commit that referenced this pull request Jan 30, 2026
## Description 

Adds two new indexer pipelines to support transaction scanning queries
via bloom filters:

cp_blooms concurrent pipeline: Per-checkpoint bloom filters
- Indexes function calls, affected objects, senders, and recipients for
  each checkpoint
- Folds the bloom filter if it is sparse, stopping at a certain minimum
  size or density
- Stored in cp_blooms table with one row per checkpoint
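
The folding step can be sketched roughly as follows. This is an illustrative guess at the technique, not the pipeline's code: the names `fold`, `density`, and `contains`, the byte-addressed layout, and the modulo lookup are all assumptions. The idea is that halving a filter by OR-ing its upper half onto its lower half keeps every set bit reachable, as long as lookups take the byte position modulo the current length.

```rust
/// Fraction of bits that are set in the filter.
fn density(bits: &[u8]) -> f64 {
    if bits.is_empty() {
        return 0.0;
    }
    let set: u32 = bits.iter().map(|b| b.count_ones()).sum();
    set as f64 / (bits.len() * 8) as f64
}

/// Repeatedly halve the filter by OR-ing the upper half onto the lower half.
/// Stops once halving would go below `min_len` bytes or push the density
/// above `max_density` (both thresholds are hypothetical parameters).
fn fold(mut bits: Vec<u8>, min_len: usize, max_density: f64) -> Vec<u8> {
    while bits.len() >= 2 * min_len.max(1) && bits.len() % 2 == 0 {
        let half = bits.len() / 2;
        let folded: Vec<u8> = (0..half).map(|i| bits[i] | bits[i + half]).collect();
        if density(&folded) > max_density {
            break;
        }
        bits = folded;
    }
    bits
}

/// Membership check against a possibly folded filter: the byte position is
/// computed for the unfolded size and reduced modulo the stored length.
fn contains(bits: &[u8], byte_pos: usize, mask: u8) -> bool {
    (bits[byte_pos % bits.len()] & mask) == mask
}
```

Folding trades a higher false-positive rate for less storage, which is why a density cap is a natural stopping condition for sparse checkpoints.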

cp_bloom_blocks sequential pipeline: Blocked bloom filters spanning 1000
checkpoints
- Splits each 256KB bloom into 128 separate smaller 2KB blocks
- Stored in cp_bloom_blocks table with ~128 rows per cp_block (sparse,
  only non-zero blocks)
- ORs bloom blocks on conflicts
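
As a rough illustration of the blocked layout, the sketch below models the table as an in-memory map. The sizes (256KB, 128 blocks, 1000 checkpoints) come from the description above; the `upsert` function, the `HashMap` standing in for the table, and the assumption that each checkpoint contributes one full-size bloom are hypothetical.

```rust
use std::collections::HashMap;

const BLOOM_BYTES: usize = 256 * 1024;
const NUM_BLOCKS: usize = 128;
const BLOCK_BYTES: usize = BLOOM_BYTES / NUM_BLOCKS; // 2KB per block
const CPS_PER_BLOCK: u64 = 1000;

/// Stand-in for the (cp_block_id, bloom_block_index) primary key.
type Key = (u64, u16);

/// Split one checkpoint's bloom into blocks and merge the non-zero ones into
/// the rows of the cp_block that the checkpoint belongs to.
fn upsert(table: &mut HashMap<Key, Vec<u8>>, cp: u64, bloom: &[u8]) {
    assert_eq!(bloom.len(), BLOOM_BYTES);
    let cp_block_id = cp / CPS_PER_BLOCK;
    for (idx, block) in bloom.chunks(BLOCK_BYTES).enumerate() {
        // Sparse storage: all-zero blocks never produce a row.
        if block.iter().all(|b| *b == 0) {
            continue;
        }
        let row = table
            .entry((cp_block_id, idx as u16))
            .or_insert_with(|| vec![0u8; BLOCK_BYTES]);
        // On conflict, OR the new bits into the existing row.
        for (dst, src) in row.iter_mut().zip(block) {
            *dst |= *src;
        }
    }
}
```

Because OR is commutative and idempotent, the merged row does not depend on the order in which checkpoints of the same cp_block arrive, and re-processing a checkpoint is harmless.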

Schema changes:
- cp_blooms: Per-checkpoint bloom filters
- cp_bloom_blocks: Blocked bloom filters with (cp_block_id,
  bloom_block_index) primary key

## Test plan
```
cargo nextest run -p sui-indexer-alt
cargo nextest run -p sui-indexer-alt-schema
```

## Stack:
#24788 👈 
#24900
#25097


---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] gRPC:
- [ ] JSON-RPC: 
- [x] GraphQL:   add bloom filter pipelines for scanning APIs
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] Indexing Framework:
@henryachen henryachen force-pushed the hc/dvx-1290-read_txns branch from ea20bd1 to 0173e16 Compare February 2, 2026 21:10
henryachen added a commit that referenced this pull request Feb 6, 2026
henryachen added a commit that referenced this pull request Feb 6, 2026
@henryachen henryachen force-pushed the hc/dvx-1290-read_txns branch from 496ead4 to 9d35a40 Compare February 11, 2026 22:36
@henryachen henryachen temporarily deployed to sui-typescript-aws-kms-test-env February 12, 2026 08:07 — with GitHub Actions Inactive
Contributor

@amnn amnn left a comment

Couple of small comments, but this is otherwise good to go as well!

let watermarks: &Arc<Watermarks> = ctx.data()?;
let available_range_key = AvailableRangeKey {
type_: "Query".to_string(),
field: Some("eventsScan".to_string()),

Contributor

Suggested change
- field: Some("eventsScan".to_string()),
+ field: Some("scanEvents".to_string()),

return Ok(Connection::new(false, false));
};

let Some(cp_bounds) = checkpoint_bounds(

Contributor

I mentioned on the transaction scanning PR that we should be failing earlier if we aren't supplied with the appropriate checkpoint bounds -- practically what that means is that the scan_range > limits.max_scan_limit check below gets moved above this line, and looks something like this:

match filter {
    // Assume the max scan limit is >= 1
    EventFilter { at_checkpoint: Some(_), .. } => { /* nop */ },

    EventFilter {
        after_checkpoint: Some(after),
        before_checkpoint: Some(before),
        ..
    } => {
        let after: u64 = after.into();
        let before: u64 = before.into();
        // Number of checkpoints strictly between the two exclusive bounds.
        let requested = before.saturating_sub(after).saturating_sub(1);

        if requested > limits.max_scan_limit {
            return Err(bad_user_input(ScanError::LimitExceeded {
                requested,
                max: limits.max_scan_limit,
            }));
        }
    }

    // New error case: complain that the query is missing checkpoint bounds.
    _ => { /* ... */ }
}

You still need to run things through checkpoint_bounds, for consistency reasons, but this makes the nature of the constraint less dynamic and therefore clearer for the user.
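
For readers who want to try the arithmetic, here is a self-contained version of the suggested check. It is a sketch under assumptions: the plain `u64` fields, the `MissingBounds` variant, and the `check_scan_range` name are hypothetical stand-ins, and the real code wraps the error and uses the crate's own filter types.

```rust
#[derive(Debug, PartialEq)]
enum ScanError {
    LimitExceeded { requested: u64, max: u64 },
    /// Hypothetical variant for queries without checkpoint bounds.
    MissingBounds,
}

/// Simplified stand-in for the GraphQL filter input.
#[derive(Default)]
struct EventFilter {
    at_checkpoint: Option<u64>,
    after_checkpoint: Option<u64>,
    before_checkpoint: Option<u64>,
}

/// Reject a scan before doing any work if its bounds are missing or too wide.
fn check_scan_range(filter: &EventFilter, max_scan_limit: u64) -> Result<(), ScanError> {
    match filter {
        // A single checkpoint always fits, assuming the limit is >= 1.
        EventFilter { at_checkpoint: Some(_), .. } => Ok(()),

        EventFilter {
            after_checkpoint: Some(after),
            before_checkpoint: Some(before),
            ..
        } => {
            // Both bounds are exclusive, so the scan covers before - after - 1 checkpoints.
            let requested = before.saturating_sub(*after).saturating_sub(1);
            if requested > max_scan_limit {
                Err(ScanError::LimitExceeded { requested, max: max_scan_limit })
            } else {
                Ok(())
            }
        }

        // Neither a point nor a fully bounded range was supplied.
        _ => Err(ScanError::MissingBounds),
    }
}
```

Because the check depends only on the filter and the configured limit, the same query is accepted or rejected regardless of how much data the indexer currently holds.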

.collect()
}

pub(super) type EventsBySequenceNumbers = BTreeMap<EventCursor, Event>;

Contributor

nit: types to the top. Also, should bloom/mod.rs be moved to just bloom.rs now that it's the only module here?

Comment on lines +365 to +368
Query.[scanEvents] |pipelines, _filters| {
pipelines.insert("cp_blooms".to_string());
pipelines.insert("cp_bloom_blocks".to_string());
};

Contributor

nit: Maybe just merge this with scanTransactions below?

last: Option<u64>,
before: Option<CEvent>,
filter: Option<EventFilter>,
) -> Result<Option<Connection<String, Event>>, RpcError<ScanError>> {

Contributor

Option<Result<...>>

@henryachen henryachen temporarily deployed to sui-typescript-aws-kms-test-env March 4, 2026 18:11 — with GitHub Actions Inactive
@henryachen henryachen merged commit 9c831b9 into hc/dvx-1290-read_txns Mar 4, 2026
51 of 55 checks passed
@henryachen henryachen deleted the hc/dvx-1290-read_ev branch March 4, 2026 18:53
henryachen added a commit that referenced this pull request Mar 5, 2026
## Description 

- Added scanEvents GraphQL query endpoint for scanning events using bloom
  filters
- Reuse existing scan infrastructure (bloom querying, pagination) from
  Transaction Scanning
- Added e2e tests covering filtering by sender, module, type, and
  pagination

## Test plan 

```
cargo nextest run -p sui-indexer-alt-graphql
cargo nextest run -p sui-indexer-alt-graphql -- schema_sdl
cargo nextest run -p sui-indexer-alt-graphql --features staging -- schema_sdl
cargo nextest run -p sui-indexer-alt-e2e-tests -- graphql/
```

## Stack:
#24788 
#24900 
#25097 👈 

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] gRPC:
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] Indexing Framework:
henryachen added a commit that referenced this pull request Mar 5, 2026
## Description 

Adds a new transactionsScan query that scans checkpoints using bloom
filters to find transactions matching filter criteria.

Changes

GraphQL (sui-indexer-alt-graphql)
- New transactionsScan query endpoint
- maxScanLimit service config to limit checkpoints scanned per query
- Transaction filter and scan logic in scan.rs

Schema (sui-indexer-alt-schema)
- `bloom_contains` SQL function for checking probe membership in bloom
  filters

Reader (sui-indexer-alt-reader)
- `cp_blooms` loader for batch-loading bloom filter data


## Test plan 

```
cargo nextest run -p sui-indexer-alt-graphql graphql_scan_limit_tests
cargo nextest run -p sui-indexer-alt-graphql
cargo nextest run -p sui-indexer-alt
cargo nextest run -p sui-indexer-alt-schema
```

Query Plan:
```
Limit  (cost=1911.69..1358704.33 rows=62 width=8) (actual time=14.454..14.836 rows=62 loops=1)
  Buffers: shared hit=7287
  CTE block_bit_probes
    ->  ProjectSet  (cost=0.00..8.20 rows=1635 width=18) (actual time=0.004..0.431 rows=1635 loops=1)
          ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
  ->  Nested Loop  (cost=1903.49..1358696.13 rows=62 width=8) (actual time=14.453..14.830 rows=62 loops=1)
        Buffers: shared hit=7287
        ->  Limit  (cost=1903.06..1903.22 rows=62 width=24) (actual time=14.411..14.414 rows=1 loops=1)
              Buffers: shared hit=6976
              CTE block_lookup
                ->  Nested Loop Left Join  (cost=41.30..1723.38 rows=200 width=131) (actual time=0.654..2.899 rows=327 loops=1)
                      Buffers: shared hit=1348
                      ->  HashAggregate  (cost=40.88..42.88 rows=200 width=10) (actual time=0.620..0.682 rows=327 loops=1)
                            Group Key: block_bit_probes_1.cp_block_index, block_bit_probes_1.bloom_idx
                            Batches: 1  Memory Usage: 61kB
                            ->  CTE Scan on block_bit_probes block_bit_probes_1  (cost=0.00..32.70 rows=1635 width=10) (actual time=0.000..0.159 rows=1635 loops=1)
                      ->  Index Scan using cp_bloom_blocks_pkey on cp_bloom_blocks bb  (cost=0.42..8.40 rows=1 width=131) (actual time=0.006..0.006 rows=1 loops=327)
                            Index Cond: ((cp_block_index = block_bit_probes_1.cp_block_index) AND (bloom_block_index = block_bit_probes_1.bloom_idx))
                            Buffers: shared hit=1343
              ->  Sort  (cost=179.69..180.17 rows=192 width=24) (actual time=14.410..14.412 rows=1 loops=1)
                    Sort Key: ((block_bit_probes.cp_block_index * '1000'::bigint))
                    Sort Method: quicksort  Memory: 25kB
                    Buffers: shared hit=6976
                    ->  Hash Right Anti Join  (cost=48.29..173.01 rows=192 width=24) (actual time=14.392..14.400 rows=5 loops=1)
                          Hash Cond: (p.cp_block_index = block_bit_probes.cp_block_index)
                          Buffers: shared hit=6976
                          ->  Hash Join  (cost=7.00..129.78 rows=8 width=8) (actual time=3.224..12.355 rows=1011 loops=1)
                                Hash Cond: ((p.cp_block_index = bl.cp_block_index) AND (p.bloom_idx = bl.bloom_idx))
                                Join Filter: ((bl.bloom_filter IS NULL) OR (p.bit_mask <> get_byte(bl.bloom_filter, (p.byte_pos % length(bl.bloom_filter)))))
                                Rows Removed by Join Filter: 624
                                Buffers: shared hit=6976
                                ->  CTE Scan on block_bit_probes p  (cost=0.00..32.70 rows=1635 width=18) (actual time=0.000..0.211 rows=1635 loops=1)
                                ->  Hash  (cost=4.00..4.00 rows=200 width=42) (actual time=3.142..3.142 rows=327 loops=1)
                                      Buckets: 1024  Batches: 1  Memory Usage: 93kB
                                      Buffers: shared hit=1348
                                      ->  CTE Scan on block_lookup bl  (cost=0.00..4.00 rows=200 width=42) (actual time=0.656..3.060 rows=327 loops=1)
                                            Buffers: shared hit=1348
                          ->  Hash  (cost=38.79..38.79 rows=200 width=8) (actual time=1.773..1.774 rows=327 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 21kB
                                ->  HashAggregate  (cost=36.79..38.79 rows=200 width=8) (actual time=1.668..1.717 rows=327 loops=1)
                                      Group Key: block_bit_probes.cp_block_index
                                      Batches: 1  Memory Usage: 61kB
                                      ->  CTE Scan on block_bit_probes  (cost=0.00..32.70 rows=1635 width=8) (actual time=0.007..1.162 rows=1635 loops=1)
        ->  Index Scan using cp_blooms_pkey on cp_blooms cb  (cost=0.43..21883.74 rows=1 width=8) (actual time=0.040..0.408 rows=62 loops=1)
              Index Cond: ((cp_sequence_number >= ((block_bit_probes.cp_block_index * '1000'::bigint))) AND (cp_sequence_number <= ((((block_bit_probes.cp_block_index * '1000'::bigint) + '1000'::bigint) - 1))))
              Filter: (((get_byte(bloom_filter, (208 % length(bloom_filter))) & 8) = 8) AND ((get_byte(bloom_filter, (5988 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8084 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (15059 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8330 % length(bloom_filter))) & 64) = 64) AND ((get_byte(bloom_filter, (10313 % length(bloom_filter))) & 4) = 4))
              Rows Removed by Filter: 318
              Buffers: shared hit=311
Planning:
  Buffers: shared hit=78
Planning Time: 2.181 ms
Execution Time: 15.060 ms
```
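
The `Filter` line on `cp_blooms` in the plan above tests each probe with `get_byte(bloom_filter, pos % length(bloom_filter)) & mask = mask`. A Rust analogue of that per-probe check might look like the sketch below. It mirrors only what the plan shows; the `Probe` struct and this function's signature are assumptions, not the actual `bloom_contains` SQL function.

```rust
/// One probe: a byte position (computed for the unfolded filter size) and the
/// bit mask expected to be set in that byte.
struct Probe {
    byte_pos: usize,
    bit_mask: u8,
}

/// Returns true only if every probe's bits are set, i.e. the key may be present.
/// A single missing bit proves the key is absent from the checkpoint.
fn bloom_contains(bloom_filter: &[u8], probes: &[Probe]) -> bool {
    if bloom_filter.is_empty() {
        return false;
    }
    probes.iter().all(|p| {
        // Reduce the position modulo the stored length, as the plan's filter does.
        let byte = bloom_filter[p.byte_pos % bloom_filter.len()];
        (byte & p.bit_mask) == p.bit_mask
    })
}
```

A `true` result still has to be confirmed against the checkpoint's contents, since bloom filters can report false positives but never false negatives.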

## Stack:
#24788 
#24900 👈 
#25097
---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] gRPC:
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] Indexing Framework:
henryachen added a commit that referenced this pull request Mar 5, 2026
@henryachen henryachen mentioned this pull request Mar 5, 2026
16 tasks
mystenmark pushed a commit that referenced this pull request Mar 5, 2026
## Description 

Adds a new transactionsScan query that scans checkpoints using bloom
filters to find transactions matching filter criteria.

  Changes

  GraphQL (sui-indexer-alt-graphql)
  - New transactionsScan query endpoint
  - maxScanLimit service config to limit checkpoints scanned per query
  - Transaction filter and scan logic in scan.rs
  
  Schema (sui-indexer-alt-schema)
- `bloom_contains` SQL function for checking probe membership in bloom
filters

  Reader (sui-indexer-alt-reader)
  - `cp_blooms` loader for batch-loading bloom filter data


## Test plan 

```
cargo nextest run -p sui-indexer-alt-graphql graphql_scan_limit_tests
cargo nextest run -p sui-indexer-alt-graphql
cargo nextest run -p sui-indexer-alt
cargo nextest run -p sui-indexer-alt-schema
```

Query Plan:
```
Limit  (cost=1911.69..1358704.33 rows=62 width=8) (actual time=14.454..14.836 rows=62 loops=1)
  Buffers: shared hit=7287
  CTE block_bit_probes
    ->  ProjectSet  (cost=0.00..8.20 rows=1635 width=18) (actual time=0.004..0.431 rows=1635 loops=1)
          ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
  ->  Nested Loop  (cost=1903.49..1358696.13 rows=62 width=8) (actual time=14.453..14.830 rows=62 loops=1)
        Buffers: shared hit=7287
        ->  Limit  (cost=1903.06..1903.22 rows=62 width=24) (actual time=14.411..14.414 rows=1 loops=1)
              Buffers: shared hit=6976
              CTE block_lookup
                ->  Nested Loop Left Join  (cost=41.30..1723.38 rows=200 width=131) (actual time=0.654..2.899 rows=327 loops=1)
                      Buffers: shared hit=1348
                      ->  HashAggregate  (cost=40.88..42.88 rows=200 width=10) (actual time=0.620..0.682 rows=327 loops=1)
                            Group Key: block_bit_probes_1.cp_block_index, block_bit_probes_1.bloom_idx
                            Batches: 1  Memory Usage: 61kB
                            ->  CTE Scan on block_bit_probes block_bit_probes_1  (cost=0.00..32.70 rows=1635 width=10) (actual time=0.000..0.159 rows=1635 loops=1)
                      ->  Index Scan using cp_bloom_blocks_pkey on cp_bloom_blocks bb  (cost=0.42..8.40 rows=1 width=131) (actual time=0.006..0.006 rows=1 loops=327)
                            Index Cond: ((cp_block_index = block_bit_probes_1.cp_block_index) AND (bloom_block_index = block_bit_probes_1.bloom_idx))
                            Buffers: shared hit=1343
              ->  Sort  (cost=179.69..180.17 rows=192 width=24) (actual time=14.410..14.412 rows=1 loops=1)
                    Sort Key: ((block_bit_probes.cp_block_index * '1000'::bigint))
                    Sort Method: quicksort  Memory: 25kB
                    Buffers: shared hit=6976
                    ->  Hash Right Anti Join  (cost=48.29..173.01 rows=192 width=24) (actual time=14.392..14.400 rows=5 loops=1)
                          Hash Cond: (p.cp_block_index = block_bit_probes.cp_block_index)
                          Buffers: shared hit=6976
                          ->  Hash Join  (cost=7.00..129.78 rows=8 width=8) (actual time=3.224..12.355 rows=1011 loops=1)
                                Hash Cond: ((p.cp_block_index = bl.cp_block_index) AND (p.bloom_idx = bl.bloom_idx))
                                Join Filter: ((bl.bloom_filter IS NULL) OR (p.bit_mask <> get_byte(bl.bloom_filter, (p.byte_pos % length(bl.bloom_filter)))))
                                Rows Removed by Join Filter: 624
                                Buffers: shared hit=6976
                                ->  CTE Scan on block_bit_probes p  (cost=0.00..32.70 rows=1635 width=18) (actual time=0.000..0.211 rows=1635 loops=1)
                                ->  Hash  (cost=4.00..4.00 rows=200 width=42) (actual time=3.142..3.142 rows=327 loops=1)
                                      Buckets: 1024  Batches: 1  Memory Usage: 93kB
                                      Buffers: shared hit=1348
                                      ->  CTE Scan on block_lookup bl  (cost=0.00..4.00 rows=200 width=42) (actual time=0.656..3.060 rows=327 loops=1)
                                            Buffers: shared hit=1348
                          ->  Hash  (cost=38.79..38.79 rows=200 width=8) (actual time=1.773..1.774 rows=327 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 21kB
                                ->  HashAggregate  (cost=36.79..38.79 rows=200 width=8) (actual time=1.668..1.717 rows=327 loops=1)
                                      Group Key: block_bit_probes.cp_block_index
                                      Batches: 1  Memory Usage: 61kB
                                      ->  CTE Scan on block_bit_probes  (cost=0.00..32.70 rows=1635 width=8) (actual time=0.007..1.162 rows=1635 loops=1)
        ->  Index Scan using cp_blooms_pkey on cp_blooms cb  (cost=0.43..21883.74 rows=1 width=8) (actual time=0.040..0.408 rows=62 loops=1)
              Index Cond: ((cp_sequence_number >= ((block_bit_probes.cp_block_index * '1000'::bigint))) AND (cp_sequence_number <= ((((block_bit_probes.cp_block_index * '1000'::bigint) + '1000'::bigint) - 1))))
              Filter: (((get_byte(bloom_filter, (208 % length(bloom_filter))) & 8) = 8) AND ((get_byte(bloom_filter, (5988 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8084 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (15059 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8330 % length(bloom_filter))) & 64) = 64) AND ((get_byte(bloom_filter, (10313 % length(bloom_filter))) & 4) = 4))
              Rows Removed by Filter: 318
              Buffers: shared hit=311
Planning:
  Buffers: shared hit=78
Planning Time: 2.181 ms
Execution Time: 15.060 ms
```

## Stack:
#24788
#24900 👈
#25097

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] gRPC:
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] Indexing Framework:

jessiemongeon1 pushed a commit to jessiemongeon1/sui that referenced this pull request on Mar 5, 2026

## Description 

Adds a new transactionsScan query that scans checkpoints using bloom
filters to find transactions matching filter criteria.

Changes

GraphQL (sui-indexer-alt-graphql)
- New `transactionsScan` query endpoint
- `maxScanLimit` service config to limit checkpoints scanned per query
- Transaction filter and scan logic in `scan.rs`

Schema (sui-indexer-alt-schema)
- `bloom_contains` SQL function for checking probe membership in bloom filters

Reader (sui-indexer-alt-reader)
- `cp_blooms` loader for batch-loading bloom filter data

## Test plan 

```
cargo nextest run -p sui-indexer-alt-graphql graphql_scan_limit_tests
cargo nextest run -p sui-indexer-alt-graphql
cargo nextest run -p sui-indexer-alt
cargo nextest run -p sui-indexer-alt-schema
```

Query Plan:
```
Limit  (cost=1911.69..1358704.33 rows=62 width=8) (actual time=14.454..14.836 rows=62 loops=1)
  Buffers: shared hit=7287
  CTE block_bit_probes
    ->  ProjectSet  (cost=0.00..8.20 rows=1635 width=18) (actual time=0.004..0.431 rows=1635 loops=1)
          ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
  ->  Nested Loop  (cost=1903.49..1358696.13 rows=62 width=8) (actual time=14.453..14.830 rows=62 loops=1)
        Buffers: shared hit=7287
        ->  Limit  (cost=1903.06..1903.22 rows=62 width=24) (actual time=14.411..14.414 rows=1 loops=1)
              Buffers: shared hit=6976
              CTE block_lookup
                ->  Nested Loop Left Join  (cost=41.30..1723.38 rows=200 width=131) (actual time=0.654..2.899 rows=327 loops=1)
                      Buffers: shared hit=1348
                      ->  HashAggregate  (cost=40.88..42.88 rows=200 width=10) (actual time=0.620..0.682 rows=327 loops=1)
                            Group Key: block_bit_probes_1.cp_block_index, block_bit_probes_1.bloom_idx
                            Batches: 1  Memory Usage: 61kB
                            ->  CTE Scan on block_bit_probes block_bit_probes_1  (cost=0.00..32.70 rows=1635 width=10) (actual time=0.000..0.159 rows=1635 loops=1)
                      ->  Index Scan using cp_bloom_blocks_pkey on cp_bloom_blocks bb  (cost=0.42..8.40 rows=1 width=131) (actual time=0.006..0.006 rows=1 loops=327)
                            Index Cond: ((cp_block_index = block_bit_probes_1.cp_block_index) AND (bloom_block_index = block_bit_probes_1.bloom_idx))
                            Buffers: shared hit=1343
              ->  Sort  (cost=179.69..180.17 rows=192 width=24) (actual time=14.410..14.412 rows=1 loops=1)
                    Sort Key: ((block_bit_probes.cp_block_index * '1000'::bigint))
                    Sort Method: quicksort  Memory: 25kB
                    Buffers: shared hit=6976
                    ->  Hash Right Anti Join  (cost=48.29..173.01 rows=192 width=24) (actual time=14.392..14.400 rows=5 loops=1)
                          Hash Cond: (p.cp_block_index = block_bit_probes.cp_block_index)
                          Buffers: shared hit=6976
                          ->  Hash Join  (cost=7.00..129.78 rows=8 width=8) (actual time=3.224..12.355 rows=1011 loops=1)
                                Hash Cond: ((p.cp_block_index = bl.cp_block_index) AND (p.bloom_idx = bl.bloom_idx))
                                Join Filter: ((bl.bloom_filter IS NULL) OR (p.bit_mask <> get_byte(bl.bloom_filter, (p.byte_pos % length(bl.bloom_filter)))))
                                Rows Removed by Join Filter: 624
                                Buffers: shared hit=6976
                                ->  CTE Scan on block_bit_probes p  (cost=0.00..32.70 rows=1635 width=18) (actual time=0.000..0.211 rows=1635 loops=1)
                                ->  Hash  (cost=4.00..4.00 rows=200 width=42) (actual time=3.142..3.142 rows=327 loops=1)
                                      Buckets: 1024  Batches: 1  Memory Usage: 93kB
                                      Buffers: shared hit=1348
                                      ->  CTE Scan on block_lookup bl  (cost=0.00..4.00 rows=200 width=42) (actual time=0.656..3.060 rows=327 loops=1)
                                            Buffers: shared hit=1348
                          ->  Hash  (cost=38.79..38.79 rows=200 width=8) (actual time=1.773..1.774 rows=327 loops=1)
                                Buckets: 1024  Batches: 1  Memory Usage: 21kB
                                ->  HashAggregate  (cost=36.79..38.79 rows=200 width=8) (actual time=1.668..1.717 rows=327 loops=1)
                                      Group Key: block_bit_probes.cp_block_index
                                      Batches: 1  Memory Usage: 61kB
                                      ->  CTE Scan on block_bit_probes  (cost=0.00..32.70 rows=1635 width=8) (actual time=0.007..1.162 rows=1635 loops=1)
        ->  Index Scan using cp_blooms_pkey on cp_blooms cb  (cost=0.43..21883.74 rows=1 width=8) (actual time=0.040..0.408 rows=62 loops=1)
              Index Cond: ((cp_sequence_number >= ((block_bit_probes.cp_block_index * '1000'::bigint))) AND (cp_sequence_number <= ((((block_bit_probes.cp_block_index * '1000'::bigint) + '1000'::bigint) - 1))))
              Filter: (((get_byte(bloom_filter, (208 % length(bloom_filter))) & 8) = 8) AND ((get_byte(bloom_filter, (5988 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8084 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (15059 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8330 % length(bloom_filter))) & 64) = 64) AND ((get_byte(bloom_filter, (10313 % length(bloom_filter))) & 4) = 4))
              Rows Removed by Filter: 318
              Buffers: shared hit=311
Planning:
  Buffers: shared hit=78
Planning Time: 2.181 ms
Execution Time: 15.060 ms
```
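
The `Index Cond` on `cp_blooms_pkey` in the plan above maps a block index to an inclusive checkpoint range: `cp_block_index * 1000` through `cp_block_index * 1000 + 1000 - 1`. The following Rust sketch spells out that arithmetic. The constant and function names are assumptions made for illustration, and the block size of 1000 is read off the plan rather than from the PR's source.

```rust
use std::ops::RangeInclusive;

/// Checkpoints per block, as implied by the `* 1000` in the plan's Index Cond.
/// Assumed name; the PR may define this differently.
pub const CP_BLOCK_SIZE: u64 = 1000;

/// Inclusive range of checkpoint sequence numbers covered by a block.
pub fn block_checkpoint_range(cp_block_index: u64) -> RangeInclusive<u64> {
    let start = cp_block_index * CP_BLOCK_SIZE;
    start..=(start + CP_BLOCK_SIZE - 1)
}

/// Index of the block that contains a given checkpoint.
pub fn block_of_checkpoint(cp_sequence_number: u64) -> u64 {
    cp_sequence_number / CP_BLOCK_SIZE
}
```

Under this reading, a block that survives the block-level bloom check expands into at most 1000 candidate checkpoints, each of which is then tested against its own per-checkpoint filter.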

## Stack:
MystenLabs#24788
MystenLabs#24900 👈
MystenLabs#25097

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] gRPC:
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] Indexing Framework: