MetricsEventSource formats histogram quantiles with CurrentCulture, causing dotnet-counters to misparse them #98632

@KalleOlaviNiemitalo

Description

Run a program that publishes histogram metrics on Windows with a Finnish culture, and run dotnet-counters monitor to view these metrics. The tool then incorrectly displays the quantiles as Percentile=500, Percentile=9500, and Percentile=9900. The correct output would be Percentile=50, Percentile=95, and Percentile=99. The metric values are corrupted as well: a recorded value of 1.25 is displayed as 125.

The quantile boundaries are hardcoded in AggregationManager.s_defaultHistogramConfig as { 0.50, 0.95, 0.99 }. MetricsEventSource in .NET Runtime formats these with CurrentCulture, resulting in { "0,5", "0,95", "0,99" }. The dotnet-counters tool parses these strings with InvariantCulture, which treats the comma as a group separator rather than a decimal separator and therefore drops it, yielding { 5, 95, 99 }. The tool then multiplies these numbers by 100 to convert them to percentages, and the results are { 500, 9500, 9900 }.
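The round trip described above can be reproduced without a Finnish Windows installation. The following is a minimal sketch, not code from either repository: the comma-decimal NumberFormatInfo is an assumed stand-in for Finnish culture data, and RoundTripAsPercentile is a hypothetical helper that only mimics the format, parse, and multiply-by-100 sequence.

```csharp
using System;
using System.Globalization;

// Assumption: a cloned invariant NumberFormatInfo with ',' as the decimal
// separator stands in for the Finnish culture used in this report.
NumberFormatInfo commaDecimal = (NumberFormatInfo)NumberFormatInfo.InvariantInfo.Clone();
commaDecimal.NumberDecimalSeparator = ",";

// Hypothetical helper: format the way the runtime does under a comma-decimal
// culture, parse the way dotnet-counters does, then scale to a percentage.
double RoundTripAsPercentile(double quantile)
{
    string formatted = quantile.ToString(commaDecimal);                    // e.g. "0,95"
    double parsed = double.Parse(formatted, CultureInfo.InvariantCulture); // ',' is read as a group separator
    return parsed * 100;
}

foreach (double q in new[] { 0.50, 0.95, 0.99 })
{
    string shown = RoundTripAsPercentile(q).ToString(CultureInfo.InvariantCulture);
    Console.WriteLine($"{q.ToString(CultureInfo.InvariantCulture)} -> Percentile={shown}");
}
```

The printed percentiles should match the incorrect 500, 9500, and 9900 from the report, because double.Parse with InvariantCulture allows group separators by default and does not check where they appear.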

Reproduction Steps

  1. Configure the Windows user account with a Finnish culture.

  2. dotnet tool install --global dotnet-counters --version=8.0.510501

  3. dotnet tool install --global dotnet-trace --version=8.0.510501

  4. Create Histogram.csproj:

    <Project Sdk="Microsoft.NET.Sdk">
    
      <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net8.0</TargetFramework>
        <Nullable>enable</Nullable>
      </PropertyGroup>
    
    </Project>

  5. Create Program.cs:

    using System;
    using System.Diagnostics.Metrics;
    using System.Threading;
    
    Meter meter = new Meter("Demo");
    Histogram<double> histogram = meter.CreateHistogram<double>("demo");
    
    while (true)
    {
        histogram.Record(1.25);
        Thread.Sleep(TimeSpan.FromSeconds(1));
    }

  6. Start dotnet run and leave it running.

  7. In another console, start dotnet-counters monitor --name=Histogram --counters=Demo and leave it running.

  8. In another console, start dotnet-trace collect --name=Histogram --providers=System.Diagnostics.Metrics and leave it running.

Expected behavior

dotnet-counters should show 50, 95, and 99 as percentiles, and either 1.25 or 1,25 as the value.

Press p to pause, r to resume, q to quit.
    Status: Running

Name                                                           Current Value
[Demo]
    demo
        Percentile=50                                               1,25    
        Percentile=95                                               1,25    
        Percentile=99                                               1,25    

Actual behavior

dotnet-counters shows 500, 9500, and 9900 as percentiles. It doesn't make sense that these go beyond 100%.
It also incorrectly shows 125 as the value.

Press p to pause, r to resume, q to quit.
    Status: Running

Name                                                           Current Value
[Demo]
    demo
        Percentile=500                                               125    
        Percentile=9500                                              125    
        Percentile=9900                                              125    

Then terminate dotnet-trace and review the trace file. It shows that the quantiles were formatted as 0,5=1,25;0,95=1,25;0,99=1,25, i.e. in a culture-dependent format.

Regression?

Yes, this is a regression; .NET 7.0.16 works correctly.

Known Workarounds

Do this at the start of the program:

CultureInfo.DefaultThreadCurrentCulture = CultureInfo.InvariantCulture;

It affects the thread that AggregationManager.Start() creates.

Because MetricsEventSource.CommandHandler.FormatQuantiles is only ever called on that thread, it will then format the numbers with InvariantCulture, and dotnet-counters will be able to parse them correctly.

This workaround may have unwanted effects on other threads that the application or libraries create explicitly. I don't think it will affect CurrentCulture flow in async functions, though.
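To illustrate why the workaround reaches a thread that the application does not create itself, here is a small sketch. The worker thread is only a stand-in for the one that AggregationManager.Start() creates, and the sketch assumes that nothing has explicitly assigned CurrentCulture on the starting thread (an explicitly assigned culture would take precedence over the default).

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Threading;

// The workaround from this issue: set the default before the metrics thread exists.
CultureInfo.DefaultThreadCurrentCulture = CultureInfo.InvariantCulture;

bool workerUsesInvariant = false;
string formattedOnWorker = "";

// Stand-in for the thread created by AggregationManager.Start().
Thread worker = new Thread(() =>
{
    workerUsesInvariant = CultureInfo.CurrentCulture.Equals(CultureInfo.InvariantCulture);
    // StringBuilder.Append(double) formats with the thread's CurrentCulture,
    // which is the same call pattern FormatQuantiles uses.
    formattedOnWorker = new StringBuilder().Append(0.95).ToString();
});
worker.Start();
worker.Join();

Console.WriteLine($"Worker uses InvariantCulture: {workerUsesInvariant}");
Console.WriteLine($"0.95 formatted on worker: {formattedOnWorker}");
```

The same default also applies to any other thread whose culture has not been set explicitly, which is the side effect mentioned above.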

Configuration

.NET SDK 8.0.201 and .NET Runtime 8.0.2.
Windows 10 version 22H2 x64.
The bug is culture-dependent. It might also work differently on non-Windows operating systems, because of different culture data.
Not using Blazor.

Other information

MetricsEventSource formats the quantiles with CurrentCulture in FormatQuantiles(QuantileValue[] quantiles):

private static string FormatQuantiles(QuantileValue[] quantiles)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < quantiles.Length; i++)
    {
        sb.Append(quantiles[i].Quantile).Append('=').Append(quantiles[i].Value);
        if (i != quantiles.Length - 1)
        {
            sb.Append(';');
        }
    }
    return sb.ToString();
}
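One possible shape for a fix is to format both numbers with InvariantCulture explicitly, in the same way TransmitMetricValue does for other statistics. This is a sketch of that idea only, not necessarily the change the runtime would adopt: FormatQuantilesInvariant is a hypothetical name, and tuples stand in for the runtime's QuantileValue type so that the example is self-contained.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical variant of FormatQuantiles; (Quantile, Value) tuples replace
// the runtime's QuantileValue type for the purposes of this sketch.
string FormatQuantilesInvariant((double Quantile, double Value)[] quantiles)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < quantiles.Length; i++)
    {
        // Explicit InvariantCulture, so the thread's CurrentCulture is irrelevant.
        sb.Append(quantiles[i].Quantile.ToString(CultureInfo.InvariantCulture))
          .Append('=')
          .Append(quantiles[i].Value.ToString(CultureInfo.InvariantCulture));
        if (i != quantiles.Length - 1)
        {
            sb.Append(';');
        }
    }
    return sb.ToString();
}

// Simulate a comma-decimal culture on the current thread (assumed stand-in for Finnish).
CultureInfo commaCulture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
commaCulture.NumberFormat.NumberDecimalSeparator = ",";
CultureInfo.CurrentCulture = commaCulture;

Console.WriteLine(FormatQuantilesInvariant(new[] { (0.5, 1.25), (0.95, 1.25), (0.99, 1.25) }));
```

With this variant the payload uses '.' as the decimal separator even though the current thread's culture uses a comma, which is the format that ParseQuantiles in dotnet-counters expects.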

This bug was added in https://github.com/dotnet/runtime/pull/80753/files#diff-072334d2e8cacd4f335b7f48f0da8249a5619b6237e9ed311b029801145e8022L443-R443 before the .NET 8.0 release. The new behaviour is not consistent with how TransmitMetricValue uses InvariantCulture for other types of statistics:

Log.CounterRateValuePublished(sessionId, instrument.Meter.Name, instrument.Meter.Version, instrument.Name, instrument.Unit, FormatTags(stats.Labels),
    rateStats.Delta.HasValue ? rateStats.Delta.Value.ToString(CultureInfo.InvariantCulture) : "",
    rateStats.Value.ToString(CultureInfo.InvariantCulture));

In dotnet-counters, TraceEventExtensions.ParseQuantiles parses the quantiles with InvariantCulture: https://github.com/dotnet/diagnostics/blob/8c08c89a0643d31db91e119b1adb463be3e0ffe5/src/Microsoft.Diagnostics.Monitoring.EventPipe/Counters/TraceEventExtensions.cs#L509-L516

The multiplication by 100 (e.g. converting the already incorrect 99 to 9900%) seems to happen here: https://github.com/dotnet/diagnostics/blob/8c08c89a0643d31db91e119b1adb463be3e0ffe5/src/Tools/dotnet-counters/CounterMonitor.cs#L111
