Optimize for Arrow in-memory format

As seen in #1328, there is a substantial performance gap between running on a graph already stored in C++ vectors (~2 seconds) and benchmarking end to end from Python + pandas (~9 seconds). [Benchmark script](https://github.com/adsharma/leiden-communities-openmp/blob/python2/test/benchmark_nk.py)

Many data scientists use Python and Arrow (a standardized columnar format) to read graphs into memory.

Polars (implemented in Rust) uses Arrow natively. pandas has supported it optionally since 2.0:

```
import pandas as pd
df = pd.DataFrame({"a": [1, 2, 3]}).convert_dtypes(dtype_backend="pyarrow")
```

Optimizing for this use case could unlock even more performance for NetworKit users.

Optimize for Arrow in-memory format #1331
