[C#] Performance issue of reading StringArray

### Describe the enhancement requested

The general zero-copy principle does not work well for StringArray in the C# library. The value buffer is UTF-8 encoded, while C# strings are UTF-16, so reading each value requires a UTF-8 decode and a new string allocation. This is especially costly when reading a DictionaryArray whose value type is string: the dictionary is guaranteed to store unique strings, but the StringArray API forces reader code to decode the same value again every time an index that refers to it is encountered. In our profiling, we read a dictionary array of strings and an int column from the same RecordBatch, and calls to StringArray.GetString() dominated CPU time compared to reading the int column.

The C++ library, on the other hand, does not have this issue, because the Arrow C++ API exposes std::string and std::string_view, which work with UTF-8 natively.
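As a possible reader-side workaround for the dictionary case, the sketch below decodes each dictionary value once and then resolves rows by index lookup, instead of calling GetString() per row. It is only an illustration, not a proposed library change: the class and method names (`DictionaryStringReader`, `ReadAll`) are hypothetical, and it assumes the dictionary values are a `StringArray` and the indices are an `Int32Array`.

```csharp
#nullable enable
using System;
using Apache.Arrow;

// Hypothetical helper, not part of Apache.Arrow.
public static class DictionaryStringReader
{
    // Decodes every distinct dictionary value once, then maps each row
    // to its already-decoded string through the index array.
    public static string?[] ReadAll(DictionaryArray column)
    {
        // Assumption: string dictionary values and Int32 indices.
        var values = (StringArray)column.Dictionary;
        var indices = (Int32Array)column.Indices;

        // One UTF-8 decode per dictionary entry.
        var decoded = new string?[values.Length];
        for (int i = 0; i < values.Length; i++)
        {
            decoded[i] = values.GetString(i); // null for a null dictionary slot
        }

        // Per-row work is now an array lookup; no decoding happens here.
        var rows = new string?[indices.Length];
        for (int row = 0; row < indices.Length; row++)
        {
            int? index = indices.GetValue(row); // null for a null row
            rows[row] = index.HasValue ? decoded[index.Value] : null;
        }
        return rows;
    }
}
```

With this approach, rows that refer to the same dictionary entry share a single string instance, so the decode cost scales with the number of distinct values rather than the number of rows.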

### Component(s)

C#

[C#] Performance issue of reading StringArray #41047
