fix(profiling): fix crash due to Frame cache eviction #17456
Conversation
Performance SLOsComparing candidate kowalski/fix-profiling-fix-crash-due-to-frame-cache-eviction (90de791) with baseline main (c26e618) 📈 Performance Regressions (3 suites)📈 iastaspects - 118/118✅ add_aspectTime: ✅ 103.453µs (SLO: <130.000µs 📉 -20.4%) vs baseline: +0.9% Memory: ✅ 43.906MB (SLO: <46.000MB -4.6%) vs baseline: +5.6% ✅ add_inplace_aspectTime: ✅ 101.932µs (SLO: <130.000µs 📉 -21.6%) vs baseline: -2.0% Memory: ✅ 43.890MB (SLO: <46.000MB -4.6%) vs baseline: +5.3% ✅ add_inplace_noaspectTime: ✅ 28.286µs (SLO: <40.000µs 📉 -29.3%) vs baseline: -0.4% Memory: ✅ 43.971MB (SLO: <46.000MB -4.4%) vs baseline: +5.4% ✅ add_noaspectTime: ✅ 49.437µs (SLO: <70.000µs 📉 -29.4%) vs baseline: ~same Memory: ✅ 43.889MB (SLO: <46.000MB -4.6%) vs baseline: +5.4% ✅ bytearray_aspectTime: ✅ 261.200µs (SLO: <400.000µs 📉 -34.7%) vs baseline: +0.9% Memory: ✅ 43.935MB (SLO: <46.000MB -4.5%) vs baseline: +5.6% ✅ bytearray_extend_aspectTime: ✅ 665.136µs (SLO: <800.000µs 📉 -16.9%) vs baseline: +2.7% Memory: ✅ 43.846MB (SLO: <46.000MB -4.7%) vs baseline: +5.5% ✅ bytearray_extend_noaspectTime: ✅ 275.805µs (SLO: <400.000µs 📉 -31.0%) vs baseline: +5.5% Memory: ✅ 43.861MB (SLO: <46.000MB -4.7%) vs baseline: +5.2% ✅ bytearray_noaspectTime: ✅ 144.126µs (SLO: <300.000µs 📉 -52.0%) vs baseline: +2.1% Memory: ✅ 43.878MB (SLO: <46.000MB -4.6%) vs baseline: +5.3% ✅ bytes_aspectTime: ✅ 220.359µs (SLO: <300.000µs 📉 -26.5%) vs baseline: -0.5% Memory: ✅ 43.844MB (SLO: <46.000MB -4.7%) vs baseline: +5.4% ✅ bytes_noaspectTime: ✅ 134.346µs (SLO: <200.000µs 📉 -32.8%) vs baseline: ~same Memory: ✅ 43.870MB (SLO: <46.000MB -4.6%) vs baseline: +5.5% ✅ bytesio_aspectTime: ✅ 3.868ms (SLO: <5.000ms 📉 -22.6%) vs baseline: +0.9% Memory: ✅ 43.769MB (SLO: <46.000MB -4.8%) vs baseline: +5.1% ✅ bytesio_noaspectTime: ✅ 318.748µs (SLO: <420.000µs 📉 -24.1%) vs baseline: -0.8% Memory: ✅ 43.887MB (SLO: <46.000MB -4.6%) vs baseline: +5.4% ✅ capitalize_aspectTime: ✅ 88.446µs (SLO: <300.000µs 📉 -70.5%) vs baseline: -0.8% Memory: 
✅ 43.879MB (SLO: <46.000MB -4.6%) vs baseline: +5.3% ✅ capitalize_noaspectTime: ✅ 270.103µs (SLO: <300.000µs -10.0%) vs baseline: +5.8% Memory: ✅ 43.894MB (SLO: <46.000MB -4.6%) vs baseline: +5.3% ✅ casefold_aspectTime: ✅ 87.908µs (SLO: <500.000µs 📉 -82.4%) vs baseline: -0.9% Memory: ✅ 43.880MB (SLO: <46.000MB -4.6%) vs baseline: +5.5% ✅ casefold_noaspectTime: ✅ 317.849µs (SLO: <500.000µs 📉 -36.4%) vs baseline: +2.0% Memory: ✅ 43.758MB (SLO: <46.000MB -4.9%) vs baseline: +5.3% ✅ decode_aspectTime: ✅ 87.339µs (SLO: <100.000µs 📉 -12.7%) vs baseline: +0.8% Memory: ✅ 44.014MB (SLO: <46.000MB -4.3%) vs baseline: +5.8% ✅ decode_noaspectTime: ✅ 156.709µs (SLO: <210.000µs 📉 -25.4%) vs baseline: +1.3% Memory: ✅ 43.871MB (SLO: <46.000MB -4.6%) vs baseline: +5.5% ✅ encode_aspectTime: ✅ 85.123µs (SLO: <200.000µs 📉 -57.4%) vs baseline: +0.5% Memory: ✅ 43.927MB (SLO: <46.000MB -4.5%) vs baseline: +5.3% ✅ encode_noaspectTime: ✅ 144.834µs (SLO: <200.000µs 📉 -27.6%) vs baseline: +0.5% Memory: ✅ 43.931MB (SLO: <46.000MB -4.5%) vs baseline: +5.5% ✅ format_aspectTime: ✅ 14.605ms (SLO: <19.200ms 📉 -23.9%) vs baseline: ~same Memory: ✅ 43.881MB (SLO: <46.000MB -4.6%) vs baseline: +5.3% ✅ format_map_aspectTime: ✅ 16.354ms (SLO: <21.500ms 📉 -23.9%) vs baseline: -0.9% Memory: ✅ 43.967MB (SLO: <46.000MB -4.4%) vs baseline: +5.5% ✅ format_map_noaspectTime: ✅ 364.961µs (SLO: <500.000µs 📉 -27.0%) vs baseline: -3.0% Memory: ✅ 43.959MB (SLO: <46.000MB -4.4%) vs baseline: +5.7% ✅ format_noaspectTime: ✅ 310.029µs (SLO: <500.000µs 📉 -38.0%) vs baseline: -1.8% Memory: ✅ 43.931MB (SLO: <46.000MB -4.5%) vs baseline: +5.5% ✅ index_aspectTime: ✅ 139.159µs (SLO: <300.000µs 📉 -53.6%) vs baseline: 📈 +10.0% Memory: ✅ 43.824MB (SLO: <46.000MB -4.7%) vs baseline: +5.4% ✅ index_noaspectTime: ✅ 40.469µs (SLO: <300.000µs 📉 -86.5%) vs baseline: +0.1% Memory: ✅ 43.830MB (SLO: <46.000MB -4.7%) vs baseline: +5.2% ✅ join_aspectTime: ✅ 206.696µs (SLO: <300.000µs 📉 -31.1%) vs baseline: -3.4% Memory: ✅ 43.889MB (SLO: 
<46.000MB -4.6%) vs baseline: +5.5% ✅ join_noaspectTime: ✅ 141.754µs (SLO: <300.000µs 📉 -52.7%) vs baseline: -1.6% Memory: ✅ 43.887MB (SLO: <46.000MB -4.6%) vs baseline: +5.2% ✅ ljust_aspectTime: ✅ 502.977µs (SLO: <700.000µs 📉 -28.1%) vs baseline: -3.0% Memory: ✅ 43.795MB (SLO: <46.000MB -4.8%) vs baseline: +5.2% ✅ ljust_noaspectTime: ✅ 273.824µs (SLO: <300.000µs -8.7%) vs baseline: +2.7% Memory: ✅ 43.831MB (SLO: <46.000MB -4.7%) vs baseline: +5.5% ✅ lower_aspectTime: ✅ 307.620µs (SLO: <500.000µs 📉 -38.5%) vs baseline: -1.1% Memory: ✅ 43.794MB (SLO: <46.000MB -4.8%) vs baseline: +5.2% ✅ lower_noaspectTime: ✅ 244.007µs (SLO: <300.000µs 📉 -18.7%) vs baseline: +1.5% Memory: ✅ 43.889MB (SLO: <46.000MB -4.6%) vs baseline: +5.5% ✅ lstrip_aspectTime: ✅ 0.278ms (SLO: <3.000ms 📉 -90.7%) vs baseline: +0.2% Memory: ✅ 43.888MB (SLO: <46.000MB -4.6%) vs baseline: +5.4% ✅ lstrip_noaspectTime: ✅ 0.179ms (SLO: <3.000ms 📉 -94.0%) vs baseline: +0.5% Memory: ✅ 43.814MB (SLO: <46.000MB -4.8%) vs baseline: +5.3% ✅ modulo_aspectTime: ✅ 14.242ms (SLO: <18.750ms 📉 -24.0%) vs baseline: -0.6% Memory: ✅ 43.886MB (SLO: <46.000MB -4.6%) vs baseline: +5.1% ✅ modulo_aspect_for_bytearray_bytearrayTime: ✅ 14.826ms (SLO: <19.350ms 📉 -23.4%) vs baseline: ~same Memory: ✅ 43.935MB (SLO: <46.000MB -4.5%) vs baseline: +5.3% ✅ modulo_aspect_for_bytesTime: ✅ 14.400ms (SLO: <18.900ms 📉 -23.8%) vs baseline: +0.1% Memory: ✅ 43.835MB (SLO: <46.000MB -4.7%) vs baseline: +4.9% ✅ modulo_aspect_for_bytes_bytearrayTime: ✅ 14.658ms (SLO: <19.150ms 📉 -23.5%) vs baseline: +0.7% Memory: ✅ 43.952MB (SLO: <46.000MB -4.5%) vs baseline: +5.1% ✅ modulo_noaspectTime: ✅ 0.364ms (SLO: <3.000ms 📉 -87.9%) vs baseline: -1.8% Memory: ✅ 43.832MB (SLO: <46.000MB -4.7%) vs baseline: +5.3% ✅ replace_aspectTime: ✅ 18.368ms (SLO: <24.000ms 📉 -23.5%) vs baseline: -0.2% Memory: ✅ 43.874MB (SLO: <46.000MB -4.6%) vs baseline: +5.1% ✅ replace_noaspectTime: ✅ 285.876µs (SLO: <400.000µs 📉 -28.5%) vs baseline: +1.2% Memory: ✅ 43.935MB (SLO: 
<46.000MB -4.5%) vs baseline: +5.6% ✅ repr_aspectTime: ✅ 316.054µs (SLO: <420.000µs 📉 -24.7%) vs baseline: -2.7% Memory: ✅ 43.850MB (SLO: <46.000MB -4.7%) vs baseline: +5.3% ✅ repr_noaspectTime: ✅ 46.883µs (SLO: <90.000µs 📉 -47.9%) vs baseline: -0.7% Memory: ✅ 43.857MB (SLO: <46.000MB -4.7%) vs baseline: +5.5% ✅ rstrip_aspectTime: ✅ 382.985µs (SLO: <500.000µs 📉 -23.4%) vs baseline: -2.0% Memory: ✅ 43.980MB (SLO: <46.000MB -4.4%) vs baseline: +5.6% ✅ rstrip_noaspectTime: ✅ 186.735µs (SLO: <300.000µs 📉 -37.8%) vs baseline: +2.4% Memory: ✅ 43.911MB (SLO: <46.000MB -4.5%) vs baseline: +5.5% ✅ slice_aspectTime: ✅ 186.011µs (SLO: <300.000µs 📉 -38.0%) vs baseline: +0.8% Memory: ✅ 43.824MB (SLO: <46.000MB -4.7%) vs baseline: +5.1% ✅ slice_noaspectTime: ✅ 54.352µs (SLO: <90.000µs 📉 -39.6%) vs baseline: +0.6% Memory: ✅ 43.871MB (SLO: <46.000MB -4.6%) vs baseline: +5.2% ✅ stringio_aspectTime: ✅ 4.459ms (SLO: <5.000ms 📉 -10.8%) vs baseline: 📈 +14.9% Memory: ✅ 43.809MB (SLO: <46.000MB -4.8%) vs baseline: +4.8% ✅ stringio_noaspectTime: ✅ 351.934µs (SLO: <500.000µs 📉 -29.6%) vs baseline: -0.7% Memory: ✅ 43.855MB (SLO: <46.000MB -4.7%) vs baseline: +5.6% ✅ strip_aspectTime: ✅ 275.634µs (SLO: <350.000µs 📉 -21.2%) vs baseline: +1.0% Memory: ✅ 43.810MB (SLO: <46.000MB -4.8%) vs baseline: +5.3% ✅ strip_noaspectTime: ✅ 179.645µs (SLO: <240.000µs 📉 -25.1%) vs baseline: +0.5% Memory: ✅ 43.824MB (SLO: <46.000MB -4.7%) vs baseline: +5.3% ✅ swapcase_aspectTime: ✅ 346.177µs (SLO: <500.000µs 📉 -30.8%) vs baseline: -0.8% Memory: ✅ 43.955MB (SLO: <46.000MB -4.4%) vs baseline: +5.6% ✅ swapcase_noaspectTime: ✅ 275.442µs (SLO: <400.000µs 📉 -31.1%) vs baseline: -0.3% Memory: ✅ 43.852MB (SLO: <46.000MB -4.7%) vs baseline: +5.3% ✅ title_aspectTime: ✅ 330.083µs (SLO: <500.000µs 📉 -34.0%) vs baseline: +0.7% Memory: ✅ 43.913MB (SLO: <46.000MB -4.5%) vs baseline: +5.4% ✅ title_noaspectTime: ✅ 267.079µs (SLO: <400.000µs 📉 -33.2%) vs baseline: +1.5% Memory: ✅ 43.821MB (SLO: <46.000MB -4.7%) vs baseline: 
+5.1% ✅ translate_aspectTime: ✅ 518.608µs (SLO: <700.000µs 📉 -25.9%) vs baseline: +2.8% Memory: ✅ 43.824MB (SLO: <46.000MB -4.7%) vs baseline: +4.9% ✅ translate_noaspectTime: ✅ 429.777µs (SLO: <500.000µs 📉 -14.0%) vs baseline: -1.1% Memory: ✅ 43.873MB (SLO: <46.000MB -4.6%) vs baseline: +5.4% ✅ upper_aspectTime: ✅ 300.824µs (SLO: <500.000µs 📉 -39.8%) vs baseline: -0.5% Memory: ✅ 43.795MB (SLO: <46.000MB -4.8%) vs baseline: +5.3% ✅ upper_noaspectTime: ✅ 246.773µs (SLO: <400.000µs 📉 -38.3%) vs baseline: +3.3% Memory: ✅ 43.873MB (SLO: <46.000MB -4.6%) vs baseline: +5.3% 📈 iastaspectsospath - 24/24✅ ospathbasename_aspectTime: ✅ 524.510µs (SLO: <700.000µs 📉 -25.1%) vs baseline: 📈 +22.2% Memory: ✅ 43.823MB (SLO: <46.000MB -4.7%) vs baseline: +5.5% ✅ ospathbasename_noaspectTime: ✅ 442.424µs (SLO: <700.000µs 📉 -36.8%) vs baseline: +1.4% Memory: ✅ 43.868MB (SLO: <46.000MB -4.6%) vs baseline: +5.5% ✅ ospathjoin_aspectTime: ✅ 643.100µs (SLO: <700.000µs -8.1%) vs baseline: +1.9% Memory: ✅ 43.842MB (SLO: <46.000MB -4.7%) vs baseline: +5.3% ✅ ospathjoin_noaspectTime: ✅ 650.010µs (SLO: <700.000µs -7.1%) vs baseline: +1.2% Memory: ✅ 43.817MB (SLO: <46.000MB -4.7%) vs baseline: +5.1% ✅ ospathnormcase_aspectTime: ✅ 360.487µs (SLO: <700.000µs 📉 -48.5%) vs baseline: +2.2% Memory: ✅ 43.854MB (SLO: <46.000MB -4.7%) vs baseline: +5.5% ✅ ospathnormcase_noaspectTime: ✅ 364.782µs (SLO: <700.000µs 📉 -47.9%) vs baseline: +1.0% Memory: ✅ 43.942MB (SLO: <46.000MB -4.5%) vs baseline: +5.7% ✅ ospathsplit_aspectTime: ✅ 497.730µs (SLO: <700.000µs 📉 -28.9%) vs baseline: +0.9% Memory: ✅ 43.762MB (SLO: <46.000MB -4.9%) vs baseline: +5.3% ✅ ospathsplit_noaspectTime: ✅ 502.271µs (SLO: <700.000µs 📉 -28.2%) vs baseline: -0.2% Memory: ✅ 43.821MB (SLO: <46.000MB -4.7%) vs baseline: +5.4% ✅ ospathsplitdrive_aspectTime: ✅ 379.438µs (SLO: <700.000µs 📉 -45.8%) vs baseline: +0.2% Memory: ✅ 43.885MB (SLO: <46.000MB -4.6%) vs baseline: +5.7% ✅ ospathsplitdrive_noaspectTime: ✅ 72.239µs (SLO: <700.000µs 📉 -89.7%) vs 
baseline: -1.1% Memory: ✅ 43.856MB (SLO: <46.000MB -4.7%) vs baseline: +5.5% ✅ ospathsplitext_aspectTime: ✅ 469.585µs (SLO: <700.000µs 📉 -32.9%) vs baseline: +1.9% Memory: ✅ 43.797MB (SLO: <46.000MB -4.8%) vs baseline: +5.5% ✅ ospathsplitext_noaspectTime: ✅ 477.921µs (SLO: <700.000µs 📉 -31.7%) vs baseline: +1.7% Memory: ✅ 43.800MB (SLO: <46.000MB -4.8%) vs baseline: +5.3% 📈 iastaspectssplit - 12/12✅ rsplit_aspectTime: ✅ 167.704µs (SLO: <250.000µs 📉 -32.9%) vs baseline: 📈 +10.6% Memory: ✅ 43.804MB (SLO: <46.000MB -4.8%) vs baseline: +5.3% ✅ rsplit_noaspectTime: ✅ 163.805µs (SLO: <250.000µs 📉 -34.5%) vs baseline: +1.8% Memory: ✅ 43.952MB (SLO: <46.000MB -4.5%) vs baseline: +5.6% ✅ split_aspectTime: ✅ 152.720µs (SLO: <250.000µs 📉 -38.9%) vs baseline: +0.7% Memory: ✅ 43.790MB (SLO: <46.000MB -4.8%) vs baseline: +5.2% ✅ split_noaspectTime: ✅ 160.364µs (SLO: <250.000µs 📉 -35.9%) vs baseline: -0.6% Memory: ✅ 43.889MB (SLO: <46.000MB -4.6%) vs baseline: +5.3% ✅ splitlines_aspectTime: ✅ 148.527µs (SLO: <250.000µs 📉 -40.6%) vs baseline: +2.0% Memory: ✅ 43.787MB (SLO: <46.000MB -4.8%) vs baseline: +5.3% ✅ splitlines_noaspectTime: ✅ 160.293µs (SLO: <250.000µs 📉 -35.9%) vs baseline: +4.0% Memory: ✅ 43.859MB (SLO: <46.000MB -4.7%) vs baseline: +5.2% 🟡 Near SLO Breach (5 suites)🟡 djangosimple - 30/30✅ appsecTime: ✅ 19.641ms (SLO: <22.300ms 📉 -11.9%) vs baseline: -0.3% Memory: ✅ 69.383MB (SLO: <73.500MB -5.6%) vs baseline: +5.2% ✅ exception-replay-enabledTime: ✅ 1.363ms (SLO: <1.450ms -6.0%) vs baseline: ~same Memory: ✅ 67.603MB (SLO: <71.500MB -5.4%) vs baseline: +5.1% ✅ iastTime: ✅ 19.684ms (SLO: <22.250ms 📉 -11.5%) vs baseline: -0.4% Memory: ✅ 69.380MB (SLO: <75.000MB -7.5%) vs baseline: +5.0% ✅ profilerTime: ✅ 15.105ms (SLO: <16.550ms -8.7%) vs baseline: -0.1% Memory: ✅ 60.163MB (SLO: <61.000MB 🟡 -1.4%) vs baseline: +5.7% ✅ resource-renamingTime: ✅ 19.508ms (SLO: <21.750ms 📉 -10.3%) vs baseline: -0.6% Memory: ✅ 69.530MB (SLO: <73.500MB -5.4%) vs baseline: +5.3% ✅ 
span-code-originTime: ✅ 20.074ms (SLO: <28.200ms 📉 -28.8%) vs baseline: +1.0% Memory: ✅ 69.363MB (SLO: <75.000MB -7.5%) vs baseline: +5.0% ✅ tracerTime: ✅ 19.669ms (SLO: <21.750ms -9.6%) vs baseline: ~same Memory: ✅ 69.422MB (SLO: <75.000MB -7.4%) vs baseline: +5.1% ✅ tracer-and-profilerTime: ✅ 20.968ms (SLO: <23.500ms 📉 -10.8%) vs baseline: -0.2% Memory: ✅ 71.339MB (SLO: <75.000MB -4.9%) vs baseline: +5.1% ✅ tracer-dont-create-db-spansTime: ✅ 19.834ms (SLO: <21.500ms -7.7%) vs baseline: +0.3% Memory: ✅ 69.392MB (SLO: <75.000MB -7.5%) vs baseline: +5.1% ✅ tracer-minimalTime: ✅ 16.681ms (SLO: <17.500ms -4.7%) vs baseline: ~same Memory: ✅ 69.396MB (SLO: <75.000MB -7.5%) vs baseline: +5.2% ✅ tracer-nativeTime: ✅ 19.708ms (SLO: <21.750ms -9.4%) vs baseline: +1.1% Memory: ✅ 69.404MB (SLO: <72.500MB -4.3%) vs baseline: +5.2% ✅ tracer-no-cachesTime: ✅ 17.489ms (SLO: <19.650ms 📉 -11.0%) vs baseline: +0.6% Memory: ✅ 69.272MB (SLO: <75.000MB -7.6%) vs baseline: +4.9% ✅ tracer-no-databasesTime: ✅ 19.231ms (SLO: <20.100ms -4.3%) vs baseline: ~same Memory: ✅ 69.344MB (SLO: <75.000MB -7.5%) vs baseline: +5.1% ✅ tracer-no-middlewareTime: ✅ 19.408ms (SLO: <21.500ms -9.7%) vs baseline: +0.1% Memory: ✅ 69.383MB (SLO: <75.000MB -7.5%) vs baseline: +5.0% ✅ tracer-no-templatesTime: ✅ 19.528ms (SLO: <22.000ms 📉 -11.2%) vs baseline: +1.2% Memory: ✅ 69.343MB (SLO: <73.500MB -5.7%) vs baseline: +5.0% 🟡 otelspan - 22/22✅ add-eventTime: ✅ 40.731ms (SLO: <47.150ms 📉 -13.6%) vs baseline: -0.4% Memory: ✅ 41.427MB (SLO: <47.000MB 📉 -11.9%) vs baseline: +5.9% ✅ add-metricsTime: ✅ 236.157ms (SLO: <344.800ms 📉 -31.5%) vs baseline: +0.2% Memory: ✅ 45.731MB (SLO: <47.500MB -3.7%) vs baseline: +5.6% ✅ add-tagsTime: ✅ 277.708ms (SLO: <330.000ms 📉 -15.8%) vs baseline: +2.0% Memory: ✅ 45.767MB (SLO: <47.500MB -3.6%) vs baseline: +5.3% ✅ get-contextTime: ✅ 83.691ms (SLO: <92.350ms -9.4%) vs baseline: ~same Memory: ✅ 41.583MB (SLO: <46.500MB 📉 -10.6%) vs baseline: +5.4% ✅ is-recordingTime: ✅ 39.098ms (SLO: 
<44.500ms 📉 -12.1%) vs baseline: -0.5% Memory: ✅ 41.173MB (SLO: <47.500MB 📉 -13.3%) vs baseline: +5.3% ✅ record-exceptionTime: ✅ 61.114ms (SLO: <67.650ms -9.7%) vs baseline: ~same Memory: ✅ 41.719MB (SLO: <47.000MB 📉 -11.2%) vs baseline: +5.1% ✅ set-statusTime: ✅ 45.202ms (SLO: <50.400ms 📉 -10.3%) vs baseline: +0.1% Memory: ✅ 41.131MB (SLO: <47.000MB 📉 -12.5%) vs baseline: +5.5% ✅ startTime: ✅ 40.420ms (SLO: <44.500ms -9.2%) vs baseline: +4.3% Memory: ✅ 41.136MB (SLO: <47.000MB 📉 -12.5%) vs baseline: +5.3% ✅ start-finishTime: ✅ 90.396ms (SLO: <91.000ms 🟡 -0.7%) vs baseline: +0.3% Memory: ✅ 38.751MB (SLO: <46.500MB 📉 -16.7%) vs baseline: +5.3% ✅ start-finish-telemetryTime: ✅ 91.929ms (SLO: <92.000ms 🟡 ~same) vs baseline: +0.3% Memory: ✅ 38.692MB (SLO: <46.500MB 📉 -16.8%) vs baseline: +5.2% ✅ update-nameTime: ✅ 40.211ms (SLO: <45.150ms 📉 -10.9%) vs baseline: -0.5% Memory: ✅ 41.127MB (SLO: <47.000MB 📉 -12.5%) vs baseline: +5.4% 🟡 samplingrules - 8/8✅ average_matchTime: ✅ 169.816µs (SLO: <290.000µs 📉 -41.4%) vs baseline: +0.7% Memory: ✅ 36.156MB (SLO: <38.000MB -4.9%) vs baseline: +5.3% ✅ high_matchTime: ✅ 214.941µs (SLO: <480.000µs 📉 -55.2%) vs baseline: -0.7% Memory: ✅ 36.176MB (SLO: <38.000MB -4.8%) vs baseline: +5.4% ✅ low_matchTime: ✅ 119.219µs (SLO: <120.000µs 🟡 -0.7%) vs baseline: +0.2% Memory: ✅ 701.561MB (SLO: <780.000MB 📉 -10.1%) vs baseline: +4.9% ✅ very_low_matchTime: ✅ 3.099ms (SLO: <8.500ms 📉 -63.5%) vs baseline: -0.6% Memory: ✅ 78.688MB (SLO: <85.000MB -7.4%) vs baseline: +5.0% 🟡 span - 26/26✅ add-eventTime: ✅ 19.520ms (SLO: <22.500ms 📉 -13.2%) vs baseline: -1.6% Memory: ✅ 38.456MB (SLO: <53.000MB 📉 -27.4%) vs baseline: +5.5% ✅ add-metricsTime: ✅ 89.687ms (SLO: <93.500ms -4.1%) vs baseline: +0.8% Memory: ✅ 42.850MB (SLO: <53.000MB 📉 -19.2%) vs baseline: +5.4% ✅ add-tagsTime: ✅ 148.101ms (SLO: <155.000ms -4.5%) vs baseline: ~same Memory: ✅ 42.865MB (SLO: <53.000MB 📉 -19.1%) vs baseline: +5.7% ✅ get-contextTime: ✅ 18.789ms (SLO: <20.500ms -8.3%) vs 
baseline: -1.4% Memory: ✅ 38.384MB (SLO: <53.000MB 📉 -27.6%) vs baseline: +5.6% ✅ is-recordingTime: ✅ 18.901ms (SLO: <20.500ms -7.8%) vs baseline: -1.0% Memory: ✅ 38.313MB (SLO: <53.000MB 📉 -27.7%) vs baseline: +5.6% ✅ record-exceptionTime: ✅ 38.522ms (SLO: <41.000ms -6.0%) vs baseline: -0.3% Memory: ✅ 38.809MB (SLO: <53.000MB 📉 -26.8%) vs baseline: +5.5% ✅ set-statusTime: ✅ 20.552ms (SLO: <22.000ms -6.6%) vs baseline: -1.2% Memory: ✅ 38.331MB (SLO: <53.000MB 📉 -27.7%) vs baseline: +5.6% ✅ startTime: ✅ 19.715ms (SLO: <20.500ms -3.8%) vs baseline: +4.2% Memory: ✅ 38.444MB (SLO: <53.000MB 📉 -27.5%) vs baseline: +5.8% ✅ start-finishTime: ✅ 58.498ms (SLO: <58.500ms 🟡 ~same) vs baseline: -0.1% Memory: ✅ 36.196MB (SLO: <38.000MB -4.7%) vs baseline: +5.1% ✅ start-finish-telemetryTime: ✅ 59.622ms (SLO: <60.000ms 🟡 -0.6%) vs baseline: -0.1% Memory: ✅ 36.255MB (SLO: <38.000MB -4.6%) vs baseline: +5.5% ✅ start-finish-traceid128Time: ✅ 60.896ms (SLO: <62.000ms 🟡 -1.8%) vs baseline: -0.4% Memory: ✅ 36.137MB (SLO: <38.000MB -4.9%) vs baseline: +5.0% ✅ start-traceid128Time: ✅ 18.745ms (SLO: <22.500ms 📉 -16.7%) vs baseline: -1.0% Memory: ✅ 38.333MB (SLO: <53.000MB 📉 -27.7%) vs baseline: +5.4% ✅ update-nameTime: ✅ 19.254ms (SLO: <22.000ms 📉 -12.5%) vs baseline: -1.8% Memory: ✅ 38.296MB (SLO: <53.000MB 📉 -27.7%) vs baseline: +4.9% 🟡 tracer - 6/6✅ largeTime: ✅ 33.081ms (SLO: <32.950ms +0.4%) vs baseline: ~same Memory: ✅ 37.827MB (SLO: <39.250MB -3.6%) vs baseline: +6.1% ✅ mediumTime: ✅ 3.308ms (SLO: <3.500ms -5.5%) vs baseline: -1.1% Memory: ✅ 36.294MB (SLO: <38.750MB -6.3%) vs baseline: +5.7% ✅ smallTime: ✅ 386.609µs (SLO: <390.000µs 🟡 -0.9%) vs baseline: +4.0% Memory: ✅ 36.235MB (SLO: <38.750MB -6.5%) vs baseline: +5.7%
|
50c3976 to 1113136
Codeowners resolved as |
1113136 to 730ee23
|
@codex review |
|
Codex Review: Didn't find any major issues. Swish! |
Have you tried a higher |
Variance was already extremely low so no, I didn't. I can of course, but I honestly don't expect this to change anything. |
|
Do you have the numbers for memory usage from DoE, since this is what we're potentially affecting? It's not in the summary table. |
Added to the description! Approximately +1% on average in the three runs but it might be from the outlier. |
730ee23 to bb2d1ca
Co-authored-by: Vlad Scherbich <vlad.scherbich@datadoghq.com>
https://datadoghq.atlassian.net/browse/PROF-13112

This fixes a crash in the Profiler codebase. The crash was caused by `ThreadInfo::unwind_task` trying to use `Frame` objects that it had a reference to (`Frame::Ref`) while those same Frames were being evicted from the cache, which can happen under high load.

```
Error UnixSignal: Process terminated with SI_KERNEL (SIGSEGV)
```

The proposed fix is to change `FrameStack` to extend `std::deque<Frame>` instead of `std::deque<Frame::Ref>`, which means pushing frames to `FrameStack` actually copies them and eliminates that race condition. This should be OK performance-wise because `Frame` is tiny and only has trivial fields to copy (no memory allocations, only integral types).

Another possible fix would be to "lock" certain `Frame` objects in the cache (to prevent them from being evicted while any Task using them is being unwound) but this would be much more complicated to get right and although it would lead to less copies, the performance picture is blurry for more subtle reasons. So I suggest we don't go ahead with the alternative unless we absolutely need to (which we don't seem to).

Running DoE with this change on the Enterprise archetype with asyncio/FastAPI (n=10) shows no significant difference (either in memory or CPU usage / latency), so I think this is safe to merge performance-wise.

<details>

| Commit | Run | p50 (ms) | p99 (ms) | CPU% | Mem (MB) |
|--------|-----|----------|----------|------|----------|
| `90de7913` (after) | 1 | 150.43 | 152.88 | 172.68 | 274.9 |
| `90de7913` (after) | 2 | 150.45 | 152.38 | 172.17 | 266.2 |
| `90de7913` (after) | 3 | 150.43 | 153.91 | 172.92 | 266.2 |
| `90de7913` (after) | 4 | 150.46 | 154.16 | 172.83 | 265.8 |
| `90de7913` (after) | 5 | 150.44 | 153.01 | 172.80 | 267.4 |
| `90de7913` (after) | 6 | 150.43 | 153.28 | 172.70 | 265.8 |
| `90de7913` (after) | 7 | 150.54 | 157.48 | 173.32 | 265.8 |
| `90de7913` (after) | 8 | 150.51 | 156.61 | 172.18 | 265.5 |
| `90de7913` (after) | 9 | 150.53 | 157.05 | 172.18 | 265.8 |
| `90de7913` (after) | 10 | 150.45 | 152.84 | 172.08 | 265.5 |
| `c26e6181` (before) | 1 | 150.51 | 152.80 | 173.63 | 265.5 |
| `c26e6181` (before) | 2 | 150.48 | 152.41 | 173.49 | 265.4 |
| `c26e6181` (before) | 3 | 150.47 | 156.31 | 172.53 | 265.9 |
| `c26e6181` (before) | 4 | 150.49 | 152.60 | 172.86 | 265.5 |
| `c26e6181` (before) | 5 | 150.40 | 152.86 | 172.99 | 266.0 |
| `c26e6181` (before) | 6 | 150.46 | 152.66 | 172.53 | 265.5 |
| `c26e6181` (before) | 7 | 150.49 | 154.91 | 172.74 | 265.3 |
| `c26e6181` (before) | 8 | 150.46 | 152.46 | 171.93 | 265.7 |
| `c26e6181` (before) | 9 | 150.52 | 156.70 | 172.96 | 265.7 |
| `c26e6181` (before) | 10 | 150.43 | 153.02 | 172.52 | 265.8 |

| Version | p50 (ms) | p99 (ms) | CPU% | Mem (MB) |
|---------|----------|----------|------|----------|
| `c26e6181` (before) | 150.470 ± 0.034 | 153.67 ± 1.66 | 172.82 ± 0.50 | 265.6 ± 0.2 |
| `90de7913` (after) | 150.465 ± 0.043 | 154.36 ± 1.94 | 172.59 ± 0.41 | 266.9 ± 2.9 |
| **Δ (after − before)** | **−0.005ms (0.00%)** | **+0.69ms (+0.45%)** | **−0.23pp (−0.13%)** | **+1.2MB (+0.47%)** |

</details>

Co-authored-by: thomas.kowalski <thomas.kowalski@datadoghq.com>
…ort 4.7] (#17600) https://datadoghq.atlassian.net/browse/PROF-13112 Backport of #17456. Co-authored-by: Thomas Kowalski <thomas.kowalski@datadoghq.com>
…17599) https://datadoghq.atlassian.net/browse/PROF-13112 Backport of #17456. Co-authored-by: Thomas Kowalski <thomas.kowalski@datadoghq.com>
Description
https://datadoghq.atlassian.net/browse/PROF-13112
Crash
This fixes a crash in the Profiler codebase.
The crash was caused by `ThreadInfo::unwind_task` trying to use `Frame` objects that it had a reference to (`Frame::Ref`) while those same Frames were being evicted from the cache, which can happen under high load.
Proposed fix
The proposed fix is to change `FrameStack` to extend `std::deque<Frame>` instead of `std::deque<Frame::Ref>`, which means pushing frames to `FrameStack` actually copies them and eliminates that race condition.
This should be OK performance-wise because `Frame` is tiny and only has trivial fields to copy (no memory allocations, only integral types).
Another possible fix would be to "lock" certain `Frame` objects in the cache (to prevent them from being evicted while any Task using them is being unwound), but this would be much more complicated to get right, and although it would lead to fewer copies, the performance picture is blurry for more subtle reasons.
So I suggest we don't go ahead with the alternative unless we absolutely need to (which we don't seem to).
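To make the lifetime issue concrete, here is a minimal single-threaded sketch of why storing copies is safe where storing references is not. Everything in it is an assumption for illustration: the `Frame` fields, the `FrameCache` class, its eviction policy, and the helper `top_line_after_eviction` are hypothetical stand-ins, not the profiler's actual code. Only the idea that `FrameStack` extends `std::deque<Frame>` comes from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <iterator>
#include <list>
#include <unordered_map>

// Hypothetical stand-in for the profiler's Frame: integral fields only,
// so a copy is a plain memberwise copy with no allocation.
struct Frame {
    using Key = std::uintptr_t;
    Key name = 0;
    Key filename = 0;
    int line = 0;
};

// Toy cache (not the real one): evicts the oldest entry once full.
class FrameCache {
public:
    explicit FrameCache(std::size_t capacity) : capacity_(capacity) {}

    Frame& lookup_or_insert(Frame::Key key, int line) {
        auto it = index_.find(key);
        if (it != index_.end()) return *it->second;
        if (!entries_.empty() && entries_.size() >= capacity_) {
            index_.erase(entries_.front().name);
            entries_.pop_front();  // the evicted Frame is destroyed here
        }
        entries_.push_back(Frame{key, key, line});
        index_[key] = std::prev(entries_.end());
        return entries_.back();
    }

private:
    std::size_t capacity_;
    std::list<Frame> entries_;
    std::unordered_map<Frame::Key, std::list<Frame>::iterator> index_;
};

// After the fix (per the description): the stack owns Frame values.
// Before the fix it extended std::deque<Frame::Ref>, so each element
// referred to an object owned by the cache.
class FrameStack : public std::deque<Frame> {};

// Push a frame, force its eviction, then read it back from the stack.
int top_line_after_eviction() {
    FrameCache cache(1);
    FrameStack stack;
    stack.push_back(cache.lookup_or_insert(1, 42));  // copies the Frame
    cache.lookup_or_insert(2, 7);                    // evicts key 1
    return stack.back().line;  // still valid: the stack holds its own copy
}
```

With a reference-holding stack, the last line of `top_line_after_eviction` would read a destroyed object, which is the kind of access that can surface as a SIGSEGV. The real crash involves eviction happening while a task is being unwound, which this sketch does not try to reproduce.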
Performance
Running DoE with this change on the Enterprise archetype with asyncio/FastAPI (n=10) shows no significant difference (either in memory or CPU usage / latency), so I think this is safe to merge performance-wise.
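Details
Raw runs

| Commit | Run | p50 (ms) | p99 (ms) | CPU% | Mem (MB) |
|--------|-----|----------|----------|------|----------|
| `90de7913` (after) | 1 | 150.43 | 152.88 | 172.68 | 274.9 |
| `90de7913` (after) | 2 | 150.45 | 152.38 | 172.17 | 266.2 |
| `90de7913` (after) | 3 | 150.43 | 153.91 | 172.92 | 266.2 |
| `90de7913` (after) | 4 | 150.46 | 154.16 | 172.83 | 265.8 |
| `90de7913` (after) | 5 | 150.44 | 153.01 | 172.80 | 267.4 |
| `90de7913` (after) | 6 | 150.43 | 153.28 | 172.70 | 265.8 |
| `90de7913` (after) | 7 | 150.54 | 157.48 | 173.32 | 265.8 |
| `90de7913` (after) | 8 | 150.51 | 156.61 | 172.18 | 265.5 |
| `90de7913` (after) | 9 | 150.53 | 157.05 | 172.18 | 265.8 |
| `90de7913` (after) | 10 | 150.45 | 152.84 | 172.08 | 265.5 |
| `c26e6181` (before) | 1 | 150.51 | 152.80 | 173.63 | 265.5 |
| `c26e6181` (before) | 2 | 150.48 | 152.41 | 173.49 | 265.4 |
| `c26e6181` (before) | 3 | 150.47 | 156.31 | 172.53 | 265.9 |
| `c26e6181` (before) | 4 | 150.49 | 152.60 | 172.86 | 265.5 |
| `c26e6181` (before) | 5 | 150.40 | 152.86 | 172.99 | 266.0 |
| `c26e6181` (before) | 6 | 150.46 | 152.66 | 172.53 | 265.5 |
| `c26e6181` (before) | 7 | 150.49 | 154.91 | 172.74 | 265.3 |
| `c26e6181` (before) | 8 | 150.46 | 152.46 | 171.93 | 265.7 |
| `c26e6181` (before) | 9 | 150.52 | 156.70 | 172.96 | 265.7 |
| `c26e6181` (before) | 10 | 150.43 | 153.02 | 172.52 | 265.8 |

Aggregated (mean ± stdev, n=10)

| Version | p50 (ms) | p99 (ms) | CPU% | Mem (MB) |
|---------|----------|----------|------|----------|
| `c26e6181` (before) | 150.470 ± 0.034 | 153.67 ± 1.66 | 172.82 ± 0.50 | 265.6 ± 0.2 |
| `90de7913` (after) | 150.465 ± 0.043 | 154.36 ± 1.94 | 172.59 ± 0.41 | 266.9 ± 2.9 |
| **Δ (after − before)** | **−0.005ms (0.00%)** | **+0.69ms (+0.45%)** | **−0.23pp (−0.13%)** | **+1.2MB (+0.47%)** |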
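The aggregated memory figures can be re-derived from the raw runs listed in this PR. The sketch below hard-codes the ten Mem (MB) values per commit and computes the mean and the sample standard deviation (dividing by n − 1). The function names are hypothetical, and the choice of the sample estimator is an assumption, though it does appear to match the reported ± values.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Arithmetic mean of a non-empty series.
double mean(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Sample standard deviation (n - 1 denominator); needs at least 2 values.
double sample_stdev(const std::vector<double>& xs) {
    const double m = mean(xs);
    double squares = 0.0;
    for (double x : xs) squares += (x - m) * (x - m);
    return std::sqrt(squares / static_cast<double>(xs.size() - 1));
}

// Mem (MB) per run for 90de7913 (after), copied from the raw runs.
std::vector<double> mem_after() {
    return {274.9, 266.2, 266.2, 265.8, 267.4,
            265.8, 265.8, 265.5, 265.8, 265.5};
}

// Mem (MB) per run for c26e6181 (before), copied from the raw runs.
std::vector<double> mem_before() {
    return {265.5, 265.4, 265.9, 265.5, 266.0,
            265.5, 265.3, 265.7, 265.7, 265.8};
}
```

Calling `mean(mem_after()) - mean(mem_before())` gives the memory delta in MB. Most of the spread in the "after" series comes from run 1 (274.9 MB); the mean of the other nine "after" runs is 266.0 MB, which is consistent with the remark in the conversation that the increase might be from the outlier.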