Skip to content

Commit 546143e

Browse files
jackye1995claude
andauthored
feat: improve caching (#45)
In general, ensure that: 1. we use the same session across datasets and cache session 2. we fix dataset version to read so each version is cached --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 28001ea commit 546143e

21 files changed

Lines changed: 715 additions & 328 deletions

docs/src/.pages

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,5 @@ nav:
22
- Welcome: index.md
33
- Install: install.md
44
- Config: config.md
5+
- Performance: performance.md
56
- Operations: operations

docs/src/config.md

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,17 @@ CREATE TABLE lance.myschema.users (id BIGINT, name VARCHAR);
6868
| `lance.single_level_ns` | Enable single-level mode with virtual `default` schema | `false` |
6969
| `lance.parent` | Parent namespace prefix (levels separated by `$`) | - |
7070

71+
## Read and Write Settings
72+
73+
Control batch sizes for vectorized operations:
74+
75+
| Property | Description | Default |
76+
|----------|-------------|---------|
77+
| `lance.read_batch_size` | Rows per batch during vectorized reads | `8192` |
78+
| `lance.write_batch_size` | Rows to batch before writing to Arrow | `10000` |
79+
| `lance.max_rows_per_file` | Maximum rows per Lance file | `1000000` |
80+
| `lance.max_rows_per_group` | Maximum rows per row group | `100000` |
81+
7182
## Examples
7283

7384
### Directory Namespace

docs/src/performance.md

Lines changed: 134 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,134 @@
1+
# Performance Tuning
2+
3+
This guide covers performance tuning for Lance Trino operations in large-scale analytics scenarios.
4+
5+
## Understanding Lance's Default Optimization
6+
7+
Lance is **optimized by default for random access patterns** - fast point lookups, vector searches, and selective column reads.
8+
These defaults work well for ML/AI workloads where you frequently access individual records or small batches.
9+
10+
For **large-scale batch ETL and scan-heavy OLAP operations** (writing millions of rows, full table scans, bulk exports),
11+
you can tune Lance's environment variables and connector properties to better utilize available resources.
12+
13+
## Caching
14+
15+
Lance Trino uses a multi-level caching strategy to minimize redundant I/O and improve query performance.
16+
17+
### How Caching Works
18+
19+
Lance Trino implements two levels of caching:
20+
21+
1. **Session Cache** - Contains index and metadata caches:
22+
23+
- **Index Cache**: Caches opened vector indices, fragment reuse indices, and index metadata
24+
- **Metadata Cache**: Caches manifests, transactions, deletion files, row ID indices, and file metadata
25+
26+
2. **Dataset Cache** - Caches opened datasets by `(userIdentity, tablePath, version)` key. Since a dataset at a specific version is immutable, this ensures:
27+
28+
- Each dataset is opened only once per worker
29+
- All workers read the same version for snapshot isolation
30+
- Schema and fragment metadata are reused from the cached dataset
31+
32+
### Cache Configuration
33+
34+
Configure caching behavior via connector properties in your catalog file:
35+
36+
```properties
37+
# Session cache settings
38+
lance.cache.session.max_entries=100 # Maximum cached sessions (default: 100)
39+
lance.cache.session.ttl_minutes=60 # Session cache TTL in minutes (default: 60)
40+
lance.cache.session.index_cache_size_bytes=6442450944 # Index cache size: 6GB
41+
lance.cache.session.metadata_cache_size_bytes=1073741824 # Metadata cache size: 1GB
42+
43+
# Dataset cache settings
44+
lance.cache.dataset.max_entries=100 # Maximum cached datasets (default: 100)
45+
lance.cache.dataset.ttl_minutes=30 # Dataset cache TTL in minutes (default: 30)
46+
```
47+
48+
| Property | Description | Default |
49+
|----------|-------------|---------|
50+
| `lance.cache.session.max_entries` | Maximum number of cached sessions | `100` |
51+
| `lance.cache.session.ttl_minutes` | Session cache TTL in minutes | `60` |
52+
| `lance.cache.session.index_cache_size_bytes` | Index cache size in bytes | Lance default (6GB) |
53+
| `lance.cache.session.metadata_cache_size_bytes` | Metadata cache size in bytes | Lance default (1GB) |
54+
| `lance.cache.dataset.max_entries` | Maximum number of cached datasets | `100` |
55+
| `lance.cache.dataset.ttl_minutes` | Dataset cache TTL in minutes | `30` |
56+
57+
The index cache stores vector indices which can be large but provide significant speedup for vector search queries.
58+
Increase this if you frequently query tables with vector indices.
59+
60+
The metadata cache stores manifests, file metadata, and other dataset metadata.
61+
Each column's metadata can be around 40MB, so increase this if your tables have many columns.
62+
63+
## Lance Environment Variables
64+
65+
Lance uses environment variables for low-level I/O tuning. Set these on your Trino coordinator and worker nodes.
66+
67+
### Read Performance
68+
69+
#### I/O Threads
70+
71+
Set via environment variable `LANCE_IO_THREADS` (default: 64).
72+
73+
Controls the number of I/O threads used for parallel reads from storage.
74+
For large scans, increasing this to match your CPU core count enables more concurrent S3 requests.
75+
76+
```bash
77+
export LANCE_IO_THREADS=128
78+
```
79+
80+
### Write Performance
81+
82+
#### Upload Concurrency
83+
84+
Set via environment variable `LANCE_UPLOAD_CONCURRENCY` (default: 10).
85+
86+
Controls the number of concurrent multipart upload streams to S3.
87+
Increasing this to match your CPU core count can improve throughput.
88+
89+
```bash
90+
export LANCE_UPLOAD_CONCURRENCY=32
91+
```
92+
93+
#### Upload Part Size
94+
95+
Set via environment variable `LANCE_INITIAL_UPLOAD_SIZE` (default: 5MB).
96+
97+
Controls the initial part size for S3 multipart uploads.
98+
Larger part sizes reduce the number of API calls and can improve throughput for large writes.
99+
However, larger part sizes use more memory and may increase latency for small writes.
100+
Use the default for interactive workloads.
101+
102+
!!!note
103+
Lance automatically increments the multipart upload size by 5MB every 100 uploads,
104+
so large file writes progressively use increasingly large upload parts.
105+
There is no configuration for a fixed upload size.
106+
107+
```bash
108+
export LANCE_INITIAL_UPLOAD_SIZE=33554432 # 32MB
109+
```
110+
111+
### Environment Variables Summary
112+
113+
| Variable | Description | Default |
114+
|----------|-------------|---------|
115+
| `LANCE_IO_THREADS` | Number of I/O threads for parallel reads | `64` |
116+
| `LANCE_UPLOAD_CONCURRENCY` | Number of concurrent S3 upload streams | `10` |
117+
| `LANCE_INITIAL_UPLOAD_SIZE` | Initial S3 multipart upload part size (bytes) | `5242880` (5MB) |
118+
119+
## Index-Aware Split Planning
120+
121+
Lance Trino optimizes split planning based on index availability. When a table has indexes on filtered columns, larger splits are used because index lookups are efficient.
122+
123+
```properties
124+
# Rows per split when btree index is used (default: 100M)
125+
lance.index.btree.rows_per_split=100000000
126+
127+
# Rows per split when bitmap index is used (default: 10M)
128+
lance.index.bitmap.rows_per_split=10000000
129+
```
130+
131+
| Property | Description | Default |
132+
|----------|-------------|---------|
133+
| `lance.index.btree.rows_per_split` | Row count threshold for btree-indexed splits | `100000000` (100M) |
134+
| `lance.index.bitmap.rows_per_split` | Row count threshold for bitmap-indexed splits | `10000000` (10M) |

plugin/trino-lance/pom.xml

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -108,7 +108,7 @@
108108
<dependency>
109109
<groupId>org.lance</groupId>
110110
<artifactId>lance-core</artifactId>
111-
<version>2.0.0</version>
111+
<version>3.0.0-beta.4</version>
112112
<exclusions>
113113
<exclusion>
114114
<groupId>org.junit.jupiter</groupId>
@@ -418,7 +418,15 @@
418418
<ignoredUnusedDeclaredDependencies combine.children="append">
419419
<!-- Required at compile time for substrait Expression class hierarchy -->
420420
<ignoredUnusedDeclaredDependency>com.google.protobuf:protobuf-java</ignoredUnusedDeclaredDependency>
421+
<!-- Currently unused but kept for potential future use -->
422+
<ignoredUnusedDeclaredDependency>io.trino:trino-cache</ignoredUnusedDeclaredDependency>
423+
<!-- Used for @SuppressModernizer annotation -->
424+
<ignoredUnusedDeclaredDependency>org.gaul:modernizer-maven-annotations</ignoredUnusedDeclaredDependency>
421425
</ignoredUnusedDeclaredDependencies>
426+
<ignoredUsedUndeclaredDependencies>
427+
<!-- Provided transitively by the parent pom -->
428+
<ignoredUsedUndeclaredDependency>org.gaul:modernizer-maven-annotations</ignoredUsedUndeclaredDependency>
429+
</ignoredUsedUndeclaredDependencies>
422430
</configuration>
423431
</plugin>
424432
<plugin>

plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceArrowToPageScanner.java

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -94,9 +94,11 @@ public LanceArrowToPageScanner(
9494
ScannerFactory scannerFactory,
9595
Map<String, String> storageOptions,
9696
Optional<ByteBuffer> substraitFilter,
97-
OptionalLong limit)
97+
OptionalLong limit,
98+
String userIdentity,
99+
Long datasetVersion)
98100
{
99-
this(allocator, path, columns, List.of(), scannerFactory, storageOptions, substraitFilter, limit);
101+
this(allocator, path, columns, List.of(), scannerFactory, storageOptions, substraitFilter, limit, userIdentity, datasetVersion);
100102
}
101103

102104
public LanceArrowToPageScanner(
@@ -107,7 +109,9 @@ public LanceArrowToPageScanner(
107109
ScannerFactory scannerFactory,
108110
Map<String, String> storageOptions,
109111
Optional<ByteBuffer> substraitFilter,
110-
OptionalLong limit)
112+
OptionalLong limit,
113+
String userIdentity,
114+
Long datasetVersion)
111115
{
112116
this.allocator = requireNonNull(allocator, "allocator is null");
113117
requireNonNull(columns, "columns is null");
@@ -148,7 +152,7 @@ public LanceArrowToPageScanner(
148152
}
149153
}
150154

151-
lanceScanner = scannerFactory.open(path, allocator, projectionColumns, storageOptions, substraitFilter, limit);
155+
lanceScanner = scannerFactory.open(path, allocator, projectionColumns, storageOptions, substraitFilter, limit, userIdentity, datasetVersion);
152156
this.arrowReader = lanceScanner.scanBatches();
153157
try {
154158
this.vectorSchemaRoot = arrowReader.getVectorSchemaRoot();

plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceBasePageSource.java

Lines changed: 6 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -19,13 +19,11 @@
1919
import io.trino.spi.connector.ConnectorPageSource;
2020
import org.apache.arrow.memory.BufferAllocator;
2121
import org.apache.arrow.memory.RootAllocator;
22-
import org.apache.arrow.util.VisibleForTesting;
2322

2423
import java.util.List;
2524
import java.util.Map;
2625
import java.util.concurrent.atomic.AtomicBoolean;
2726
import java.util.concurrent.atomic.AtomicLong;
28-
import java.util.stream.Collectors;
2927

3028
import static com.google.common.base.Preconditions.checkState;
3129
import static com.google.common.collect.ImmutableList.toImmutableList;
@@ -46,12 +44,12 @@ public abstract class LanceBasePageSource
4644
protected final BufferAllocator bufferAllocator;
4745
protected final PageBuilder pageBuilder;
4846

49-
public LanceBasePageSource(LanceTableHandle tableHandle, List<LanceColumnHandle> columns, ScannerFactory scannerFactory, Map<String, String> storageOptions)
47+
public LanceBasePageSource(LanceTableHandle tableHandle, List<LanceColumnHandle> columns, ScannerFactory scannerFactory, Map<String, String> storageOptions, String userIdentity)
5048
{
51-
this(tableHandle, columns, List.of(), scannerFactory, storageOptions);
49+
this(tableHandle, columns, List.of(), scannerFactory, storageOptions, userIdentity);
5250
}
5351

54-
public LanceBasePageSource(LanceTableHandle tableHandle, List<LanceColumnHandle> columns, List<String> filterProjectionColumns, ScannerFactory scannerFactory, Map<String, String> storageOptions)
52+
public LanceBasePageSource(LanceTableHandle tableHandle, List<LanceColumnHandle> columns, List<String> filterProjectionColumns, ScannerFactory scannerFactory, Map<String, String> storageOptions, String userIdentity)
5553
{
5654
this.tableHandle = tableHandle;
5755
this.bufferAllocator = allocator.newChildAllocator(tableHandle.getTableName(), 1024, Long.MAX_VALUE);
@@ -66,7 +64,9 @@ public LanceBasePageSource(LanceTableHandle tableHandle, List<LanceColumnHandle>
6664
scannerFactory,
6765
storageOptions,
6866
tableHandle.getSubstraitFilterBuffer(),
69-
tableHandle.getLimit());
67+
tableHandle.getLimit(),
68+
userIdentity,
69+
tableHandle.getDatasetVersion());
7070
}
7171
catch (RuntimeException e) {
7272
// Handle concurrent modification errors (e.g., fragment not found due to concurrent update)
@@ -101,13 +101,6 @@ private static boolean isConcurrentModificationError(RuntimeException e)
101101
return false;
102102
}
103103

104-
@VisibleForTesting
105-
public static List<LanceColumnHandle> toColumnHandles(LanceTableHandle tableHandle, Map<String, String> storageOptions)
106-
{
107-
return LanceDatasetCache.getColumnHandles(tableHandle.getTablePath(), storageOptions).values().stream()
108-
.map(c -> (LanceColumnHandle) c).collect(Collectors.toList());
109-
}
110-
111104
@Override
112105
public long getCompletedBytes()
113106
{

plugin/trino-lance/src/main/java/io/trino/plugin/lance/LanceConfig.java

Lines changed: 89 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -178,7 +178,7 @@ public long getBtreeIndexedRowsPerSplit()
178178
return btreeIndexedRowsPerSplit;
179179
}
180180

181-
@Config("lance.btree_indexed_rows_per_split")
181+
@Config("lance.index.btree.rows_per_split")
182182
@ConfigDescription("Row count threshold for grouping btree-indexed fragments per split (default 100M)")
183183
public LanceConfig setBtreeIndexedRowsPerSplit(long btreeIndexedRowsPerSplit)
184184
{
@@ -191,11 +191,98 @@ public long getBitmapIndexedRowsPerSplit()
191191
return bitmapIndexedRowsPerSplit;
192192
}
193193

194-
@Config("lance.bitmap_indexed_rows_per_split")
194+
@Config("lance.index.bitmap.rows_per_split")
195195
@ConfigDescription("Row count threshold for grouping bitmap-indexed fragments per split (default 10M)")
196196
public LanceConfig setBitmapIndexedRowsPerSplit(long bitmapIndexedRowsPerSplit)
197197
{
198198
this.bitmapIndexedRowsPerSplit = bitmapIndexedRowsPerSplit;
199199
return this;
200200
}
201+
202+
// ===== Cache Configuration =====
203+
204+
private int cacheSessionMaxEntries = 100;
205+
private int cacheSessionTtlMinutes = 60;
206+
private Long cacheSessionIndexCacheSizeBytes; // null = use Lance default
207+
private Long cacheSessionMetadataCacheSizeBytes; // null = use Lance default
208+
private int cacheDatasetMaxEntries = 100;
209+
private int cacheDatasetTtlMinutes = 30;
210+
211+
public int getCacheSessionMaxEntries()
212+
{
213+
return cacheSessionMaxEntries;
214+
}
215+
216+
@Config("lance.cache.session.max_entries")
217+
@ConfigDescription("Maximum number of cached sessions (default 100)")
218+
public LanceConfig setCacheSessionMaxEntries(int cacheSessionMaxEntries)
219+
{
220+
this.cacheSessionMaxEntries = cacheSessionMaxEntries;
221+
return this;
222+
}
223+
224+
public int getCacheSessionTtlMinutes()
225+
{
226+
return cacheSessionTtlMinutes;
227+
}
228+
229+
@Config("lance.cache.session.ttl_minutes")
230+
@ConfigDescription("Session cache TTL in minutes (default 60)")
231+
public LanceConfig setCacheSessionTtlMinutes(int cacheSessionTtlMinutes)
232+
{
233+
this.cacheSessionTtlMinutes = cacheSessionTtlMinutes;
234+
return this;
235+
}
236+
237+
public Long getCacheSessionIndexCacheSizeBytes()
238+
{
239+
return cacheSessionIndexCacheSizeBytes;
240+
}
241+
242+
@Config("lance.cache.session.index_cache_size_bytes")
243+
@ConfigDescription("Lance session index cache size in bytes (default: Lance default)")
244+
public LanceConfig setCacheSessionIndexCacheSizeBytes(Long cacheSessionIndexCacheSizeBytes)
245+
{
246+
this.cacheSessionIndexCacheSizeBytes = cacheSessionIndexCacheSizeBytes;
247+
return this;
248+
}
249+
250+
public Long getCacheSessionMetadataCacheSizeBytes()
251+
{
252+
return cacheSessionMetadataCacheSizeBytes;
253+
}
254+
255+
@Config("lance.cache.session.metadata_cache_size_bytes")
256+
@ConfigDescription("Lance session metadata cache size in bytes (default: Lance default)")
257+
public LanceConfig setCacheSessionMetadataCacheSizeBytes(Long cacheSessionMetadataCacheSizeBytes)
258+
{
259+
this.cacheSessionMetadataCacheSizeBytes = cacheSessionMetadataCacheSizeBytes;
260+
return this;
261+
}
262+
263+
public int getCacheDatasetMaxEntries()
264+
{
265+
return cacheDatasetMaxEntries;
266+
}
267+
268+
@Config("lance.cache.dataset.max_entries")
269+
@ConfigDescription("Maximum number of cached datasets (default 100)")
270+
public LanceConfig setCacheDatasetMaxEntries(int cacheDatasetMaxEntries)
271+
{
272+
this.cacheDatasetMaxEntries = cacheDatasetMaxEntries;
273+
return this;
274+
}
275+
276+
public int getCacheDatasetTtlMinutes()
277+
{
278+
return cacheDatasetTtlMinutes;
279+
}
280+
281+
@Config("lance.cache.dataset.ttl_minutes")
282+
@ConfigDescription("Dataset cache TTL in minutes (default 30)")
283+
public LanceConfig setCacheDatasetTtlMinutes(int cacheDatasetTtlMinutes)
284+
{
285+
this.cacheDatasetTtlMinutes = cacheDatasetTtlMinutes;
286+
return this;
287+
}
201288
}

0 commit comments

Comments
 (0)