0 votes
0 replies
15 views

I am working on a co-simulation framework. As part of its job, it connects to several other tools to send and receive information from them as time in the simulation progresses. This data mostly takes ...
Eike Schulte's user avatar
0 votes
0 answers
97 views

I am using the ADBC Flight SQL driver to query a StarRocks database. This works well (and is insanely fast) when the query is a SELECT on a single table. But as soon as I add a JOIN to the query, all ...
usdn's user avatar
  • 512
0 votes
1 answer
102 views

I am trying to plot a histogram using a huge file (a 45 GB .tsv with 600M rows and 10 columns). The file is structured as follows: Image | X | Y | Channel1 | Channel2 | Channel3 | Channel4 | Channel1/Channel2 | ...
João Ribeiro's user avatar
3 votes
1 answer
212 views

I set up a folder of partitioned parquet files for a project at work, and I'm experiencing severe performance issues: the aggregation takes several hours. I made this minimal example to show the ...
Arthur's user avatar
  • 2,492
0 votes
1 answer
283 views

I'm working on writing data to an Iceberg table using PyIceberg (0.6.0+) with a Ceph S3-compatible backend, via Lakekeeper (https://github.com/lakekeeper/lakekeeper) as my REST catalog and metadata ...
amavi's user avatar
  • 21
0 votes
0 answers
64 views

I'm working on a Tauri application that uses tauri-specta for type safety and I can't figure out how to properly serialize dates. This is the file where most of the serialization and deserialization ...
Andrew's user avatar
  • 642
0 votes
0 answers
181 views

struct Widget { std::string foo; std::string bar; int baz; }; So far, I've been saving Widget structs directly to binary files. To read them back, I use reinterpret_cast to convert raw ...
remo's user avatar
  • 1
-3 votes
1 answer
2k views

I'm trying to read an .arrow format file with Python pandas. pandas does not have a read_arrow function. However, it does have read_csv, read_parquet, and other similarly named functions. How can I ...
user2138149's user avatar
  • 18.8k
0 votes
0 answers
41 views

In Java Apache Arrow, I have an existing VectorSchemaRoot that's created following this documentation: BitVector bitVector = new BitVector("boolean", allocator); bitVector.allocateNew(); for ...
jjbskir's user avatar
  • 11.4k
0 votes
0 answers
77 views

We are looking at developing an exchange and archival format for data that can be represented as multiple tables: one-to-three tables to be specific, each with a different schema. I am looking at ...
Szabolcs's user avatar
  • 25.9k
0 votes
0 answers
110 views

I'm working with a Rust-based data processing pipeline using the polars and arrow2 crates. I have a flow where I batch-read CSVs and write them to an Arrow IPC file using IpcWriter with compression ...
Nirav Patel's user avatar
0 votes
0 answers
41 views

While chasing better performance and cleaner code, I've run into this problem: I've found that UDFs are seemingly connection-dependent, not database-dependent. If I, for instance, used a trigger on ...
Desmond Spicer's user avatar
0 votes
0 answers
310 views

Issue Writing Polars DataFrame in Chunks to Arrow/Parquet Without Corruption. What I Am Trying to Do: I'm trying to write a Polars DataFrame in chunks to either an Arrow IPC file or a Parquet file ...
Nirav Patel's user avatar
0 votes
2 answers
176 views

I have a setup where I'm using two connections for SQLite: a DBAPI-based SQLite connection from Arrow ADBC, so I can ingest and fetch Arrow data, and a native sqlite3 ...
Desmond Spicer's user avatar
0 votes
2 answers
111 views

I am using the ODBC driver via .NET 4.8 to connect to Dremio, but I am getting this error: System.Data.Odbc.OdbcException HResult=0x80131937 Message=ERROR [HYC00] [Apache Arrow][Flight SQL] (100) ...
chrisb's user avatar
  • 1,425
