Closed
Description
We're reaching a point where we may need to be careful about decisions that increase code size:
- Instantiating too many templates for code that isn't performance sensitive, or where several instantiations do the same thing (e.g. an Int32Type kernel may do the same thing as a Date32Type kernel)
- Inlining functions that don't need to be inline
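To illustrate the first point, here is a minimal sketch, not Arrow's actual kernel code. The structs `Int32Type`, `Date32Type`, and `Int64Type` are simplified stand-ins, and `CountEqual` / `CountEqualFor` are hypothetical names invented for this example. The idea is to template the loop on the physical C type rather than the logical type, so that two logical types with the same storage share one instantiation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins for logical types (assumption: NOT Arrow's real classes).
struct Int32Type { using c_type = std::int32_t; };
struct Date32Type { using c_type = std::int32_t; };  // same storage as Int32Type
struct Int64Type { using c_type = std::int64_t; };

// Hypothetical kernel body, templated on the physical C type only. It is
// instantiated once per storage type instead of once per logical type.
template <typename CType>
std::size_t CountEqual(const CType* values, std::size_t length, CType needle) {
  assert(values != nullptr || length == 0);
  std::size_t count = 0;
  for (std::size_t i = 0; i < length; ++i) {
    if (values[i] == needle) ++count;
  }
  return count;
}

// Thin wrapper per logical type. It only forwards, so Int32Type and
// Date32Type both end up calling CountEqual<std::int32_t>.
template <typename LogicalType>
std::size_t CountEqualFor(
    const std::vector<typename LogicalType::c_type>& values,
    typename LogicalType::c_type needle) {
  using CType = typename LogicalType::c_type;
  return CountEqual<CType>(values.data(), values.size(), needle);
}
```

With this layout, `CountEqualFor<Int32Type>` and `CountEqualFor<Date32Type>` resolve to the same `CountEqual<std::int32_t>` function, so the loop is compiled once for both. Whether a given real kernel can be shared this way depends on whether its behavior truly ignores the logical type.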
Code size also tends to correlate with compilation time, though not always.
I'll use this umbrella issue to organize issues related to reducing compiled code size.
At this moment (2020-05-27), here are the 25 largest object files in a -O2 build, in ascending order of size:
524896 src/arrow/CMakeFiles/arrow_objlib.dir/array/builder_dict.cc.o
531920 src/arrow/CMakeFiles/arrow_objlib.dir/filesystem/s3fs.cc.o
552000 src/arrow/CMakeFiles/arrow_objlib.dir/json/converter.cc.o
575920 src/arrow/CMakeFiles/arrow_objlib.dir/csv/converter.cc.o
595112 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_string.cc.o
645728 src/arrow/CMakeFiles/arrow_objlib.dir/type.cc.o
683040 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_set_lookup.cc.o
702232 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/reader.cc.o
729912 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/coo_converter.cc.o
752776 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csc_converter.cc.o
752776 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csr_converter.cc.o
877680 src/arrow/CMakeFiles/arrow_objlib.dir/array/dict_internal.cc.o
885624 src/arrow/CMakeFiles/arrow_objlib.dir/builder.cc.o
919072 src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
941776 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_internal.cc.o
1055248 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_simple.cc.o
1233304 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_compare.cc.o
1265160 src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
1343480 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csf_converter.cc.o
1346928 src/arrow/CMakeFiles/arrow_objlib.dir/array.cc.o
1502568 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_hash.cc.o
1609760 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_numeric.cc.o
1794416 src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
2759552 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_filter.cc.o
7609432 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_take.cc.o
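A listing of this shape can be regenerated as the code changes. The issue does not say how the numbers were produced, so the following is only a sketch, assuming they are on-disk file sizes in bytes. It uses C++17 `std::filesystem`; the names `ObjectFile`, `CollectObjectFiles`, `KeepLargest`, and `PrintReport` are hypothetical helpers written for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <iostream>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

struct ObjectFile {
  std::uintmax_t size;  // on-disk size in bytes (assumed meaning of the numbers above)
  fs::path path;
};

// Recursively collect every regular file under `root` whose extension is ".o".
// Errors (missing directory, unreadable entries) are skipped rather than thrown.
std::vector<ObjectFile> CollectObjectFiles(const fs::path& root) {
  std::vector<ObjectFile> files;
  std::error_code walk_ec;
  fs::recursive_directory_iterator it(root, walk_ec), end;
  for (; !walk_ec && it != end; it.increment(walk_ec)) {
    std::error_code entry_ec;
    if (!it->is_regular_file(entry_ec) || entry_ec) continue;
    if (it->path().extension() != ".o") continue;
    const std::uintmax_t size = it->file_size(entry_ec);
    if (!entry_ec) files.push_back({size, it->path()});
  }
  return files;
}

// Sort ascending by size (path breaks ties) and keep only the `n` largest.
std::vector<ObjectFile> KeepLargest(std::vector<ObjectFile> files, std::size_t n) {
  std::sort(files.begin(), files.end(),
            [](const ObjectFile& a, const ObjectFile& b) {
              if (a.size != b.size) return a.size < b.size;
              return a.path < b.path;
            });
  if (files.size() > n) {
    files.erase(files.begin(), files.end() - static_cast<std::ptrdiff_t>(n));
  }
  return files;
}

// Print one "size path" line per file, matching the layout of the list above.
void PrintReport(const std::vector<ObjectFile>& files) {
  for (const ObjectFile& f : files) {
    std::cout << f.size << ' ' << f.path.generic_string() << '\n';
  }
}
```

Calling `PrintReport(KeepLargest(CollectObjectFiles("build"), 25))` from a small `main` would print the 25 largest object files under a build directory; the directory name `build` is a placeholder. File size includes symbols and any debug sections, so it is only a rough proxy for the code that ends up in the final library.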
Reporter: Wes McKinney / @wesm
Assignee: Wes McKinney / @wesm
Related issues:
- [C++] Reducing the code size of the tensor module (relates to)
- [C++] Reduce number of take kernels (relates to)
- [C++] Reduce generated code in vector_hash.cc (relates to)
- [C++] Reduce generated code in compute/kernels/scalar_compare.cc (relates to)
- [C++] Optimize Filter implementation (relates to)
- [C++] diff.cc is extremely slow to compile (relates to)
- [C++] Refine TransferBitmap template parameters (relates to)
- [C++] BitUtil::SetBitsTo probably doesn't need to be inline (is related to)
Note: This issue was originally created as ARROW-8970. Please see the migration documentation for further details.