error: try to reduce amount of codegen by forcefully uninlining error constructors #379
I had tried this several weeks ago, but recent exploration in #373 prompted me to revisit it. Indeed, `cargo llvm-lines` reveals that Jiff is emitting a _ton_ of code to LLVM just due to its error values. I don't know how much of this is inherent to Jiff's attention to good error messages or whether it's tied to how I've gone about it.

With that said... this isn't a Pareto distribution. It's not like Jiff's error handling is contributing 80% of Jiff's LLVM lines or something. It's just the biggest chunk. But that chunk seems to be around 11% or so.

This PR tries to reduce that chunk, but... doesn't succeed much. This PR almost certainly has perf regressions (I haven't measured yet), but I wanted to put it up anyway to give folks an idea of what I'm trying.

Also, here's the first ~30% of where Jiff is emitting LLVM lines (as run from the root of Jiff's repository):

```
 Lines          Copies        Function name
 -----          ------        -------------
 203564         5977          (TOTAL)
 10182 (5.0%, 5.0%)    86 (1.4%, 1.4%)   <jiff::error::Error as jiff::error::ErrorContext>::with_context
 5688 (2.8%, 7.8%)    204 (3.4%, 4.9%)   core::result::Result<T,E>::map_err
 3801 (1.9%, 9.7%)    145 (2.4%, 7.3%)   <core::result::Result<T,E> as core::ops::try_trait::Try>::branch
 2837 (1.4%, 11.1%)   116 (1.9%, 9.2%)   core::option::Option<T>::ok_or_else
 2120 (1.0%, 12.1%)     2 (0.0%, 9.3%)   jiff::fmt::friendly::printer::SpanPrinter::print_duration_designators
 1822 (0.9%, 13.0%)     2 (0.0%, 9.3%)   jiff::fmt::temporal::printer::SpanPrinter::print_span
 1658 (0.8%, 13.8%)     1 (0.0%, 9.3%)   jiff::fmt::strtime::parse::Parser::parse
 1652 (0.8%, 14.6%)     1 (0.0%, 9.3%)   jiff::fmt::strtime::format::Formatter<W,L>::format_one
 1548 (0.8%, 15.4%)     6 (0.1%, 9.4%)   jiff::fmt::temporal::printer::DateTimePrinter::print_time
 1312 (0.6%, 16.0%)   110 (1.8%, 11.3%)  <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
 1260 (0.6%, 16.6%)     6 (0.1%, 11.4%)  jiff::fmt::temporal::printer::DateTimePrinter::print_date
 1219 (0.6%, 17.2%)    47 (0.8%, 12.1%)  core::option::Option<T>::map
 1160 (0.6%, 17.8%)    22 (0.4%, 12.5%)  core::option::Option<T>::map_or
 1071 (0.5%, 18.3%)    28 (0.5%, 13.0%)  core::result::Result<T,E>::unwrap
 997 (0.5%, 18.8%)     21 (0.4%, 13.3%)  core::option::Option<T>::or_else
 986 (0.5%, 19.3%)      1 (0.0%, 13.4%)  jiff::span::Span::from_invariant_nanoseconds
 976 (0.5%, 19.8%)     16 (0.3%, 13.6%)  <jiff::util::rangeint::RangedDebug<_,_> as core::fmt::Debug>::fmt
 920 (0.5%, 20.2%)     10 (0.2%, 13.8%)  <jiff::util::rangeint::ri32<_,_> as jiff::util::rangeint::RFrom<jiff::util::rangeint::ri64<_,_>>>::rfrom
 920 (0.5%, 20.7%)      4 (0.1%, 13.9%)  jiff::fmt::temporal::printer::DateTimePrinter::print_offset_rounded
 912 (0.4%, 21.1%)      6 (0.1%, 14.0%)  jiff::fmt::friendly::printer::DesignatorWriter<W>::write
 890 (0.4%, 21.6%)      2 (0.0%, 14.0%)  jiff::fmt::temporal::printer::SpanPrinter::print_duration
 840 (0.4%, 22.0%)     12 (0.2%, 14.2%)  jiff::util::rangeint::ri8<_,_>::get
 828 (0.4%, 22.4%)      6 (0.1%, 14.3%)  core::slice::<impl [T]>::binary_search_by
 819 (0.4%, 22.8%)     32 (0.5%, 14.8%)  core::result::Result<T,E>::unwrap_or_else
 802 (0.4%, 23.2%)      2 (0.0%, 14.9%)  jiff::fmt::friendly::printer::SpanPrinter::print_span_hms
 777 (0.4%, 23.6%)     10 (0.2%, 15.0%)  jiff::util::rangeint::Composite<T>::map
 758 (0.4%, 24.0%)      2 (0.0%, 15.1%)  jiff::fmt::friendly::printer::SpanPrinter::print_duration_hms
 744 (0.4%, 24.3%)      8 (0.1%, 15.2%)  core::array::try_from_fn_erased
 744 (0.4%, 24.7%)      2 (0.0%, 15.2%)  jiff::fmt::temporal::printer::DateTimePrinter::print_pieces
 720 (0.4%, 25.0%)      2 (0.0%, 15.3%)  jiff::fmt::friendly::printer::SpanPrinter::print_span_designators_non_fraction
 700 (0.3%, 25.4%)     10 (0.2%, 15.4%)  <core::iter::adapters::zip::Zip<A,B> as core::iter::adapters::zip::ZipImpl<A,B>>::next
 700 (0.3%, 25.7%)      7 (0.1%, 15.5%)  <jiff::util::rangeint::ri8<_,_> as jiff::util::rangeint::RFrom<jiff::util::rangeint::ri64<_,_>>>::rfrom
 682 (0.3%, 26.1%)      6 (0.1%, 15.6%)  jiff::fmt::strtime::format::<impl jiff::fmt::strtime::Extension>::write_int
 660 (0.3%, 26.4%)     10 (0.2%, 15.8%)  <core::iter::adapters::enumerate::Enumerate<I> as core::iter::traits::iterator::Iterator>::next
 659 (0.3%, 26.7%)      1 (0.0%, 15.8%)  jiff::fmt::rfc2822::DateTimePrinter::print_civil_with_offset
 600 (0.3%, 27.0%)      6 (0.1%, 15.9%)  <jiff::util::rangeint::ri8<_,_> as jiff::util::rangeint::RFrom<jiff::util::rangeint::ri32<_,_>>>::rfrom
 588 (0.3%, 27.3%)     12 (0.2%, 16.1%)  jiff::util::rangeint::ri8<_,_>::try_new
 588 (0.3%, 27.6%)      1 (0.0%, 16.1%)  jiff::tz::db::zoneinfo::inner::walk
 584 (0.3%, 27.9%)     86 (1.4%, 17.6%)  <core::result::Result<T,jiff::error::Error> as jiff::error::ErrorContext>::with_context::{{closure}}
 583 (0.3%, 28.2%)      1 (0.0%, 17.6%)  jiff::civil::date::DateDifference::since_with_largest_unit
 581 (0.3%, 28.4%)     11 (0.2%, 17.8%)  jiff::error::RangeError::new
 574 (0.3%, 28.7%)      2 (0.0%, 17.8%)  core::slice::sort::stable::quicksort::stable_partition
 552 (0.3%, 29.0%)      2 (0.0%, 17.9%)  core::str::pattern::TwoWaySearcher::next_back
 545 (0.3%, 29.3%)     97 (1.6%, 19.5%)  <T as jiff::util::rangeint::RInto<U>>::rinto
 544 (0.3%, 29.5%)      2 (0.0%, 19.5%)  jiff::fmt::strtime::format::write_offset
 536 (0.3%, 29.8%)      4 (0.1%, 19.6%)  jiff::fmt::temporal::printer::DateTimePrinter::print_datetime
```

Other than error handling, it doesn't look like there is any one thing that is contributing the most here. That means that I think real improvements here are probably going to require combing through these and ensuring that the number of copies of each function is as small as it can be. In other words, I don't really see any low-hanging fruit here, but maybe I'm not looking at this right.
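The "forceful uninlining" in the title can be sketched in isolation: keep the error *branch* in the caller (it's on the hot path), but move the formatting-heavy error *construction* into a single outlined function. The names below are illustrative, not Jiff's actual code:

```rust
// Error construction is marked cold and never inlined, so the formatting
// machinery is instantiated once instead of being duplicated into every
// caller that can produce this error.
#[cold]
#[inline(never)]
fn range_error(what: &'static str, value: i64) -> String {
    format!("parameter '{what}' with value {value} is out of range")
}

fn check_second(value: i64) -> Result<i64, String> {
    if !(0..=59).contains(&value) {
        // The branch itself stays inlined in the caller, but the error
        // value is built via a single outlined call.
        return Err(range_error("second", value));
    }
    Ok(value)
}
```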
FWIW, this PR does decrease the number of LLVM lines, but only marginally:

```
 Lines          Copies        Function name
 -----          ------        -------------
 195380         5768          (TOTAL)
 3801 (1.9%, 1.9%)    145 (2.5%, 2.5%)   <core::result::Result<T,E> as core::ops::try_trait::Try>::branch
 3286 (1.7%, 3.6%)     86 (1.5%, 4.0%)   <core::result::Result<T,jiff::error::Error> as jiff::error::ErrorContext>::with_context
 2851 (1.5%, 5.1%)     96 (1.7%, 5.7%)   core::result::Result<T,E>::map_err
 2837 (1.5%, 6.5%)    116 (2.0%, 7.7%)   core::option::Option<T>::ok_or_else
 2120 (1.1%, 7.6%)      2 (0.0%, 7.7%)   jiff::fmt::friendly::printer::SpanPrinter::print_duration_designators
 1822 (0.9%, 8.6%)      2 (0.0%, 7.7%)   jiff::fmt::temporal::printer::SpanPrinter::print_span
 1658 (0.8%, 9.4%)      1 (0.0%, 7.8%)   jiff::fmt::strtime::parse::Parser::parse
 1652 (0.8%, 10.3%)     1 (0.0%, 7.8%)   jiff::fmt::strtime::format::Formatter<W,L>::format_one
 1548 (0.8%, 11.0%)     6 (0.1%, 7.9%)   jiff::fmt::temporal::printer::DateTimePrinter::print_time
 1312 (0.7%, 11.7%)   110 (1.9%, 9.8%)   <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
 1260 (0.6%, 12.4%)     6 (0.1%, 9.9%)   jiff::fmt::temporal::printer::DateTimePrinter::print_date
 1219 (0.6%, 13.0%)    47 (0.8%, 10.7%)  core::option::Option<T>::map
 1160 (0.6%, 13.6%)    22 (0.4%, 11.1%)  core::option::Option<T>::map_or
 1071 (0.5%, 14.1%)    28 (0.5%, 11.6%)  core::result::Result<T,E>::unwrap
 997 (0.5%, 14.6%)     21 (0.4%, 11.9%)  core::option::Option<T>::or_else
 986 (0.5%, 15.1%)      1 (0.0%, 12.0%)  jiff::span::Span::from_invariant_nanoseconds
 976 (0.5%, 15.6%)     16 (0.3%, 12.2%)  <jiff::util::rangeint::RangedDebug<_,_> as core::fmt::Debug>::fmt
 925 (0.5%, 16.1%)     27 (0.5%, 12.7%)  <core::result::Result<T,jiff::error::Error> as jiff::error::ErrorContext>::with_context::imp
```

I wonder if my approach to error handling with just using strings everywhere is hurting things. The main alternative would be structured errors everywhere.
But there are _so many_ error messages that writing out structured definitions for each is terrifying to me (even if I were willing to use something like `thiserror` to reduce the boilerplate aspect of it).

Ref #373
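One way to shrink the per-type copies of a generic helper like `with_context` is the non-generic inner function trick: the generic shell stays trivial and forwards to a single non-generic function. This is a sketch of the general pattern under my own names, not Jiff's actual implementation:

```rust
// The generic shell is tiny, so duplicating it per `T` is cheap. All of
// the real work lives in `imp`, which is monomorphized exactly once no
// matter how many `T`s the shell is instantiated with.
fn attach_context<T>(r: Result<T, String>, ctx: &str) -> Result<T, String> {
    #[cold]
    #[inline(never)]
    fn imp(e: String, ctx: &str) -> String {
        format!("{ctx}: {e}")
    }
    r.map_err(|e| imp(e, ctx))
}
```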
Oof. Massive regressions:
This does seem to have a positive impact on compile times. My test setup here was to:

1. Add a `[patch.crates-io]` entry for `jiff` to Biff's `Cargo.toml`.
2. Run `touch path/to/jiff/src/lib.rs && touch src/main.rs && time cargo build --release` from the root of Biff's repository.

The timings I get are wildly variable. But the minimum time I got with this PR was 5.6s, while the minimum time I've seen for Jiff master is 6.3s. This is on an otherwise quiet machine with the CPU governor set to `performance`.

I'm going to see if I can tweak things here to try and gain back some of the runtime perf losses. Or maybe make them configurable.
I ended up being able to do this. The key was making sure that the error branch stays inlined in the calling code while the error constructors it calls are not duplicated.
It turns out that #379 ended up pretty badly regressing datetime parsing performance. In particular, the `with_context` routine on `Result<T, E>` was _eagerly_ calling the caller-provided closure even in the success case. That means error values (which are generally much more expensive to build than it is to parse a datetime) were being constructed on every parse. Ouch.

I'm not sure how I missed this. I ran the benchmarks in #379 and they didn't report anything egregious. I should double back and improve benchmark coverage.

Thankfully, this doesn't seem to regress the number of LLVM lines we're emitting:

```
 Lines          Copies        Function name
 -----          ------        -------------
 192746         2456          (TOTAL)
 3978 (2.1%, 2.1%)     1 (0.0%, 0.0%)   jiff::fmt::strtime::format::Formatter<W,L>::format_one
 3785 (2.0%, 4.0%)     1 (0.0%, 0.1%)   jiff::fmt::strtime::parse::Parser::parse
 3600 (1.9%, 5.9%)     2 (0.1%, 0.2%)   jiff::fmt::temporal::printer::SpanPrinter::print_span
 3340 (1.7%, 7.6%)     2 (0.1%, 0.2%)   jiff::fmt::friendly::printer::SpanPrinter::print_duration_designators
 3250 (1.7%, 9.3%)    86 (3.5%, 3.7%)   <core::result::Result<T,jiff::error::Error> as jiff::error::ErrorContext>::with_context::{{closure}}
 2910 (1.5%, 10.8%)    6 (0.2%, 4.0%)   jiff::fmt::temporal::printer::DateTimePrinter::print_time
 2190 (1.1%, 12.0%)    1 (0.0%, 4.0%)   jiff::span::Span::from_invariant_nanoseconds
 2166 (1.1%, 13.1%)    6 (0.2%, 4.3%)   jiff::fmt::temporal::printer::DateTimePrinter::print_date
 1760 (0.9%, 14.0%)    2 (0.1%, 4.4%)   jiff::fmt::temporal::printer::SpanPrinter::print_duration
 1588 (0.8%, 14.8%)    2 (0.1%, 4.4%)   jiff::fmt::friendly::printer::SpanPrinter::print_span_hms
 1450 (0.8%, 15.6%)    2 (0.1%, 4.5%)   jiff::fmt::friendly::printer::SpanPrinter::print_span_designators_non_fraction
 1432 (0.7%, 16.3%)    4 (0.2%, 4.7%)   jiff::fmt::temporal::printer::DateTimePrinter::print_offset_rounded
 1398 (0.7%, 17.0%)    2 (0.1%, 4.8%)   jiff::fmt::temporal::printer::DateTimePrinter::print_pieces
 1368 (0.7%, 17.8%)    6 (0.2%, 5.0%)   jiff::fmt::friendly::printer::DesignatorWriter<W>::write
 1364 (0.7%, 18.5%)    2 (0.1%, 5.1%)   jiff::fmt::friendly::printer::SpanPrinter::print_duration_hms
 1248 (0.6%, 19.1%)   16 (0.7%, 5.7%)   <jiff::util::rangeint::RangedDebug<_,_> as core::fmt::Debug>::fmt
 1243 (0.6%, 19.8%)    1 (0.0%, 5.8%)   jiff::fmt::rfc2822::DateTimePrinter::print_civil_with_offset
 1222 (0.6%, 20.4%)   13 (0.5%, 6.3%)   jiff::util::rangeint::ri8<_,_>::new
 1108 (0.6%, 21.0%)   28 (1.1%, 7.5%)   core::result::Result<T,E>::unwrap
 1097 (0.6%, 21.5%)    1 (0.0%, 7.5%)   jiff::shared::tzif::Header::parse
 1086 (0.6%, 22.1%)   22 (0.9%, 8.4%)   core::option::Option<T>::map_or
```

Kudos to @charliermarsh for finding this while profiling uv.
I had tried this several weeks ago, but recent exploration in #373
prompted me to revisit it. Indeed,
`cargo llvm-lines` reveals that Jiff is emitting a fair bit of code to
LLVM just due to its error values. I
don't know how much of this is inherent to Jiff's attention to good
error messages or whether it's tied to how I've gone about it.
With that said... This isn't a Pareto distribution. It's not like Jiff's
error handling is contributing to 80% of Jiff's LLVM lines or something.
It's just the biggest chunk. But that chunk seems to be around 11% or
so.
This PR tries to reduce that chunk, mostly by being a little more
careful about which error constructors are inlined and which aren't.
Basically, we want the error branch to be inlined into the calling
code (because that's likely a critical path), but we want everything
within the branch to be a call to a function that is ideally not
duplicated too much. I think there's a limit to how well we can do here,
but this PR does seem to improve things without runtime regressions.
Here's the first ~30% of where Jiff is emitting LLVM lines (as run from
the root of Jiff's repository) on current
master:

Other than error handling, it doesn't look like there is any one thing
that is contributing the most here. That means that I think real
improvements here are probably going to require combing through these
and ensuring that the number of copies of each function is as small as
it can be.
This PR does decrease the number of LLVM lines, but only marginally:
I wonder if my approach to error handling with just using strings
everywhere is hurting things. The main alternative would be structured
errors everywhere. But there are so many error messages that writing
out structured definitions for each is terrifying to me (even if I was
willing to use something like
`thiserror` to reduce the boilerplate aspect of it).
In terms of actual compile times, this does seem to have a positive
impact. My test setup here was to:
1. Add a `[patch.crates-io]` entry for `jiff` to Biff's `Cargo.toml`.
2. Run `touch path/to/jiff/src/lib.rs && touch src/main.rs && time cargo build --release` from the root of Biff's repository.
The timings I get are wildly variable. But the minimum time I got with
this PR was 5.8s while the minimum time I've seen for Jiff master is
6.3s. This is on an otherwise quiet machine with the CPU governor set
to `performance`.

Finally, running the benchmarks has some noise, but there are no obvious
regressions:
Given the reduction in LLVM lines, the seeming reduction in compile
times and the lack of obvious runtime regressions, I think this PR is a
strict improvement.
Ref #373