graphql-alt: add event scanning #25097
Conversation
amnn
left a comment
I'll wait to review this after the comments in the stack below are resolved -- just hit re-request review when that's done!
```rust
}

/// The transaction index bounds `[tx_lo, tx_hi)` within a checkpoint based on cursor positions.
pub(crate) fn cp_tx_bounds<C: ScanCursor>(
```
Looking at its callsites, by the time we get here we've already bloom-filtered for `candidate_cps`, so it's impossible for `candidate_cps` to contain checkpoints outside the cursor range. At its core, this function checks whether the `after` cursor slices into the first checkpoint's transactions, and whether the `before` cursor slices somewhere into the middle of the last checkpoint's transactions. For every other checkpoint, we return the full `0..checkpoint_digests.len()` range.
I think this was a bit obscured by the implementation. I don't think we need to modify the impl, but could we add some clarification to the docs about the intent of this function?
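To make the intent above concrete, here is a minimal sketch of the described behavior. The function name, cursor conventions, and signature are illustrative assumptions, not the actual implementation:

```rust
// Hypothetical sketch: given a checkpoint's transaction count and optional
// cursor positions, compute the `[tx_lo, tx_hi)` bounds described above.
// The `after` cursor only narrows the first checkpoint; the `before` cursor
// only narrows the last; all other checkpoints get their full range.
fn tx_bounds_in_checkpoint(
    num_txs: usize,
    is_first_cp: bool,
    is_last_cp: bool,
    after_tx: Option<usize>,  // exclusive lower cursor (first checkpoint only)
    before_tx: Option<usize>, // exclusive upper cursor (last checkpoint only)
) -> std::ops::Range<usize> {
    let lo = match (is_first_cp, after_tx) {
        (true, Some(a)) => (a + 1).min(num_txs),
        _ => 0,
    };
    let hi = match (is_last_cp, before_tx) {
        (true, Some(b)) => b.min(num_txs),
        _ => num_txs,
    };
    lo..hi.max(lo)
}
```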
Agreed, the checkpoint clamping here makes the function harder to reason about. This has been simplified based on changes in the transaction scans PR!
```rust
};
let bounds = directional_iter(
    page,
    cp_tx_bounds(page, cp_sequence_number, checkpoint_digests.len()),
```
tbh I feel like `cp_tx_bounds` and `directional_iter` belong in the `paginate` module, rather than dispersed across `cursor` and `scan`.
```diff
     pipelines.insert("obj_versions".to_string());
 };
-Query.[events] |pipelines, filters| {
+Query.[events, eventsScan] |pipelines, filters| {
```
`eventsScan` only depends on the `cp_blooms` + block tables, right? They have entirely different retentions?
## Description
Adds two new indexer pipelines to support transaction scanning queries via bloom filters:

`cp_blooms` concurrent pipeline: Per-checkpoint bloom filters
- Indexes function calls, affected objects, senders, and recipients for each checkpoint
- Folds the bloom filter if it is sparse, down to a certain minimum size or density
- Stored in the `cp_blooms` table with one row per checkpoint

`cp_bloom_blocks` sequential pipeline: Blocked bloom filters spanning 1000 checkpoints
- Splits each 256KB bloom into 128 separate smaller 2KB blocks
- Stored in the `cp_bloom_blocks` table with ~128 rows per cp_block (sparse, only non-zero blocks)
- ORs bloom blocks on conflict

Schema changes:
- `cp_blooms`: Per-checkpoint bloom filters
- `cp_bloom_blocks`: Blocked bloom filters with (cp_block_id, bloom_block_index) primary key

## Test plan
```
cargo nextest run -p sui-indexer-alt
cargo nextest run -p sui-indexer-alt-schema
```

## Stack:
#24788 👈
#24900
#25097

---

## Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.
- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] gRPC:
- [ ] JSON-RPC:
- [x] GraphQL: add bloom filter pipelines for scanning APIs
- [ ] CLI:
- [ ] Rust SDK:
- [ ] Indexing Framework:
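The two transformations described above (folding a sparse bloom filter, and splitting a large bloom into fixed-size blocks that are OR-merged on conflict) can be sketched as follows. This is an illustrative sketch only; the sizes, function names, and folding condition here are assumptions based on this description, not the actual pipeline code:

```rust
// 2KB blocks, per the description above (assumed constant name).
const BLOCK_SIZE: usize = 2 * 1024;

/// Fold a bloom filter in half (OR the halves together) while it stays above
/// `min_len` bytes. Folding preserves membership under `pos % len` probing.
fn fold_bloom(mut bits: Vec<u8>, min_len: usize) -> Vec<u8> {
    while bits.len() / 2 >= min_len && bits.len() % 2 == 0 {
        let half = bits.len() / 2;
        for i in 0..half {
            bits[i] |= bits[half + i];
        }
        bits.truncate(half);
    }
    bits
}

/// Split a bloom into (block_index, block) pairs, skipping all-zero blocks
/// (the "sparse, only non-zero blocks" storage described above).
fn to_blocks(bits: &[u8]) -> Vec<(usize, Vec<u8>)> {
    bits.chunks(BLOCK_SIZE)
        .enumerate()
        .filter(|(_, chunk)| chunk.iter().any(|&b| b != 0))
        .map(|(i, chunk)| (i, chunk.to_vec()))
        .collect()
}

/// OR-merge an incoming block into an existing one (the "on conflict" path).
fn merge_blocks(existing: &mut [u8], incoming: &[u8]) {
    for (e, i) in existing.iter_mut().zip(incoming) {
        *e |= i;
    }
}
```

OR-merging is safe for bloom filters because setting extra bits can only add false positives, never false negatives.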
amnn
left a comment
Couple of small comments, but this is otherwise good to go as well!
```rust
let watermarks: &Arc<Watermarks> = ctx.data()?;
let available_range_key = AvailableRangeKey {
    type_: "Query".to_string(),
    field: Some("eventsScan".to_string()),
```
```diff
-    field: Some("eventsScan".to_string()),
+    field: Some("scanEvents".to_string()),
```
```rust
    return Ok(Connection::new(false, false));
};

let Some(cp_bounds) = checkpoint_bounds(
```
I mentioned on the transaction scanning PR that we should be failing earlier if we aren't supplied with the appropriate checkpoint bounds -- practically, that means the `scan_range > limits.max_scan_limit` check below gets moved above this line, and looks something like this:

```rust
match filter {
    // Assume the max scan limit is >= 1
    EventFilter { at_checkpoint: Some(_), .. } => { /* nop */ },
    EventFilter {
        after_checkpoint: Some(after),
        before_checkpoint: Some(before),
        ..
    } => {
        let after: u64 = after.into();
        let before: u64 = before.into();
        let requested = before.saturating_sub(after).saturating_sub(1);
        if requested > limits.max_scan_limit {
            return Err(bad_user_input(ScanError::LimitExceeded {
                requested,
                max: limits.max_scan_limit,
            }));
        }
    }
    // New error case: complain that the query is missing checkpoint bounds.
    _ => { /* ... */ }
}
```

You still need to run things through `checkpoint_bounds`, for consistency reasons, but this makes the nature of the constraint less dynamic and therefore clearer for the user.
```rust
    .collect()
}

pub(super) type EventsBySequenceNumbers = BTreeMap<EventCursor, Event>;
```
nit: types to the top. Also, should `bloom/mod.rs` be moved to just `bloom.rs` now that it's the only module here?
```rust
Query.[scanEvents] |pipelines, _filters| {
    pipelines.insert("cp_blooms".to_string());
    pipelines.insert("cp_bloom_blocks".to_string());
};
```
nit: Maybe just merge this with scanTransactions below?
```rust
    last: Option<u64>,
    before: Option<CEvent>,
    filter: Option<EventFilter>,
) -> Result<Option<Connection<String, Event>>, RpcError<ScanError>> {
```
## Description
- Added a `scanEvents` GraphQL query endpoint for scanning events using bloom filters
- Reuses existing scan infrastructure (bloom querying, pagination) from transaction scanning
- Added e2e tests covering filtering by sender, module, and type, plus pagination

## Test plan
```
cargo nextest run -p sui-indexer-alt-graphql
cargo nextest run -p sui-indexer-alt-graphql -- schema_sdl
cargo nextest run -p sui-indexer-alt-graphql --features staging -- schema_sdl
cargo nextest run -p sui-indexer-alt-e2e-tests -- graphql/
```

## Stack:
#24788
#24900
#25097 👈

---

## Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.
- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] gRPC:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] Indexing Framework:
## Description
Adds a new transactionsScan query that scans checkpoints using bloom
filters to find transactions matching filter criteria.
Changes
GraphQL (sui-indexer-alt-graphql)
- New transactionsScan query endpoint
- maxScanLimit service config to limit checkpoints scanned per query
- Transaction filter and scan logic in scan.rs
Schema (sui-indexer-alt-schema)
- `bloom_contains` SQL function for checking probe membership in bloom
filters
Reader (sui-indexer-alt-reader)
- `cp_blooms` loader for batch-loading bloom filter data
## Test plan
```
cargo nextest run -p sui-indexer-alt-graphql graphql_scan_limit_tests
cargo nextest run -p sui-indexer-alt-graphql
cargo nextest run -p sui-indexer-alt
cargo nextest run -p sui-indexer-alt-schema
```
Query Plan:
```
Limit (cost=1911.69..1358704.33 rows=62 width=8) (actual time=14.454..14.836 rows=62 loops=1)
Buffers: shared hit=7287
CTE block_bit_probes
-> ProjectSet (cost=0.00..8.20 rows=1635 width=18) (actual time=0.004..0.431 rows=1635 loops=1)
-> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
-> Nested Loop (cost=1903.49..1358696.13 rows=62 width=8) (actual time=14.453..14.830 rows=62 loops=1)
Buffers: shared hit=7287
-> Limit (cost=1903.06..1903.22 rows=62 width=24) (actual time=14.411..14.414 rows=1 loops=1)
Buffers: shared hit=6976
CTE block_lookup
-> Nested Loop Left Join (cost=41.30..1723.38 rows=200 width=131) (actual time=0.654..2.899 rows=327 loops=1)
Buffers: shared hit=1348
-> HashAggregate (cost=40.88..42.88 rows=200 width=10) (actual time=0.620..0.682 rows=327 loops=1)
Group Key: block_bit_probes_1.cp_block_index, block_bit_probes_1.bloom_idx
Batches: 1 Memory Usage: 61kB
-> CTE Scan on block_bit_probes block_bit_probes_1 (cost=0.00..32.70 rows=1635 width=10) (actual time=0.000..0.159 rows=1635 loops=1)
-> Index Scan using cp_bloom_blocks_pkey on cp_bloom_blocks bb (cost=0.42..8.40 rows=1 width=131) (actual time=0.006..0.006 rows=1 loops=327)
Index Cond: ((cp_block_index = block_bit_probes_1.cp_block_index) AND (bloom_block_index = block_bit_probes_1.bloom_idx))
Buffers: shared hit=1343
-> Sort (cost=179.69..180.17 rows=192 width=24) (actual time=14.410..14.412 rows=1 loops=1)
Sort Key: ((block_bit_probes.cp_block_index * '1000'::bigint))
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=6976
-> Hash Right Anti Join (cost=48.29..173.01 rows=192 width=24) (actual time=14.392..14.400 rows=5 loops=1)
Hash Cond: (p.cp_block_index = block_bit_probes.cp_block_index)
Buffers: shared hit=6976
-> Hash Join (cost=7.00..129.78 rows=8 width=8) (actual time=3.224..12.355 rows=1011 loops=1)
Hash Cond: ((p.cp_block_index = bl.cp_block_index) AND (p.bloom_idx = bl.bloom_idx))
Join Filter: ((bl.bloom_filter IS NULL) OR (p.bit_mask <> get_byte(bl.bloom_filter, (p.byte_pos % length(bl.bloom_filter)))))
Rows Removed by Join Filter: 624
Buffers: shared hit=6976
-> CTE Scan on block_bit_probes p (cost=0.00..32.70 rows=1635 width=18) (actual time=0.000..0.211 rows=1635 loops=1)
-> Hash (cost=4.00..4.00 rows=200 width=42) (actual time=3.142..3.142 rows=327 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 93kB
Buffers: shared hit=1348
-> CTE Scan on block_lookup bl (cost=0.00..4.00 rows=200 width=42) (actual time=0.656..3.060 rows=327 loops=1)
Buffers: shared hit=1348
-> Hash (cost=38.79..38.79 rows=200 width=8) (actual time=1.773..1.774 rows=327 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 21kB
-> HashAggregate (cost=36.79..38.79 rows=200 width=8) (actual time=1.668..1.717 rows=327 loops=1)
Group Key: block_bit_probes.cp_block_index
Batches: 1 Memory Usage: 61kB
-> CTE Scan on block_bit_probes (cost=0.00..32.70 rows=1635 width=8) (actual time=0.007..1.162 rows=1635 loops=1)
-> Index Scan using cp_blooms_pkey on cp_blooms cb (cost=0.43..21883.74 rows=1 width=8) (actual time=0.040..0.408 rows=62 loops=1)
Index Cond: ((cp_sequence_number >= ((block_bit_probes.cp_block_index * '1000'::bigint))) AND (cp_sequence_number <= ((((block_bit_probes.cp_block_index * '1000'::bigint) + '1000'::bigint) - 1))))
Filter: (((get_byte(bloom_filter, (208 % length(bloom_filter))) & 8) = 8) AND ((get_byte(bloom_filter, (5988 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8084 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (15059 % length(bloom_filter))) & 32) = 32) AND ((get_byte(bloom_filter, (8330 % length(bloom_filter))) & 64) = 64) AND ((get_byte(bloom_filter, (10313 % length(bloom_filter))) & 4) = 4))
Rows Removed by Filter: 318
Buffers: shared hit=311
Planning:
Buffers: shared hit=78
Planning Time: 2.181 ms
Execution Time: 15.060 ms
```
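The `Filter` line in the plan above reduces each bloom probe to a byte fetch plus a bit mask: `get_byte(bloom_filter, pos % length(bloom_filter)) & mask = mask`. A minimal Rust equivalent of that membership check (illustrative only; the function name and probe representation are assumptions, not the `bloom_contains` SQL function itself):

```rust
/// Returns true if every (byte_pos, bit_mask) probe is set in the bloom
/// filter, i.e. the element *may* be present. Any clear bit rules it out.
/// The `% bloom.len()` wrap mirrors the fold-aware probing in the SQL filter.
fn bloom_may_contain(bloom: &[u8], probes: &[(usize, u8)]) -> bool {
    probes.iter().all(|&(byte_pos, bit_mask)| {
        bloom[byte_pos % bloom.len()] & bit_mask == bit_mask
    })
}
```

Because probes wrap with `% length`, the same probe positions work against both a full-size bloom and any folded version of it.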
## Stack:
#24788
#24900 👈
#25097
---
## Release notes
Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.
For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.
- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] gRPC:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] Indexing Framework:
## Description
Transaction Scanning

## Test plan

## Stack:
#24788
#24900
#25097 👈

---

## Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.