
feat: replace Promtail with OpenTelemetry collector #2784

Merged: j-zimnowoda merged 76 commits into main from APL-1273 on Jan 15, 2026


merll (Contributor) commented Dec 9, 2025

📌 Summary

This PR upgrades Loki to the latest release and replaces Promtail-based log collection with the OpenTelemetry Collector.

To be added to Upgrade notes:

In this release, Loki is upgraded. In addition, Promtail is deprecated for log collection because it is no longer maintained. If Loki was not enabled before the upgrade, OpenTelemetry is used for log collection as soon as Loki is enabled, and no further action is needed. If Loki already holds logs from before the upgrade, the transition must be delayed to keep them accessible. OpenTelemetry requires a different storage format, so a fixed date must be defined that decides which format is read and written:

After the upgrade, the platform administrator should check the value of v13SchemaStartDate under Apps -> Loki settings (gear icon). During the upgrade, this value is set to a date at least one day in the future. On or after this date (UTC), enableOpenTelemetry can be changed to true. This uninstalls Promtail and enables OpenTelemetry log collection instead.

If access to previous logs is not needed, the platform administrator may instead remove all data from the Loki storage bucket (or create a new bucket) and set enableOpenTelemetry to true immediately. v13SchemaStartDate can then be set to an empty value.
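The date rule above can be sketched as a small check. This is only an illustration, not code from this PR: the function name is hypothetical, and it assumes v13SchemaStartDate is stored as an ISO date string (YYYY-MM-DD) and that an empty value means no old-format data has to be preserved.

```python
from datetime import date, datetime, timezone
from typing import Optional


def can_enable_open_telemetry(
    v13_schema_start_date: Optional[str],
    now: Optional[datetime] = None,
) -> bool:
    """Hypothetical helper: may enableOpenTelemetry be set to true yet?

    Assumes the start date is an ISO string such as "2026-01-16".
    An empty value is treated as "no transition date needed", which
    only applies on a fresh install or after old Loki data was removed.
    """
    if not v13_schema_start_date:
        return True
    # `now` must be timezone-aware; the comparison is done in UTC.
    now = now or datetime.now(timezone.utc)
    start = date.fromisoformat(v13_schema_start_date)
    return now.astimezone(timezone.utc).date() >= start
```

The comparison uses whole UTC days, matching the "on or after this date (UTC)" wording in the upgrade note.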

Most labels have been adjusted to match the former Promtail format. For efficiency reasons, however, not all of them have been migrated or indexed. For example, filename is still present in the metadata but is not indexed, and job has been dropped because it is mostly represented by other labels (e.g. service_name).

🔍 Reviewer Notes

The previous storage schema is not compatible with OpenTelemetry collection, which requires structured metadata. This leads to two scenarios:

  1. Fresh install, or logging not enabled before: Enabling the Loki app also enables the OpenTelemetry Operator and installs the log collector with the current storage format. Promtail is never installed.
  2. Loki was installed in a previous version with Promtail: Loki is upgraded, and the storage format transition is set to a date 26h in the future. The cutoff is per day, so the offset is 1d + 2h to account for delays when upgrading just before midnight. From that date on, the platform admin can enable the new collection by setting apps.loki.enableOpenTelemetry to true. This enables the OpenTelemetry Collector for logs and removes Promtail.
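The 26h offset in scenario 2 can be illustrated with a short sketch. It assumes the transition date is derived by adding 1d + 2h to the upgrade time and truncating to the UTC calendar day. The function and constant names are hypothetical, and the actual implementation in this PR may differ.

```python
from datetime import datetime, timedelta, timezone

# 1 day because the schema cutoff is per day, plus 2 hours of slack
# for upgrades that start shortly before midnight (UTC).
SCHEMA_TRANSITION_DELAY = timedelta(days=1, hours=2)


def v13_schema_start_date(upgrade_time: datetime) -> str:
    """Hypothetical helper: derive the schema start date as YYYY-MM-DD.

    `upgrade_time` must be timezone-aware. The result is the UTC
    calendar day that lies 26 hours after the upgrade time.
    """
    cutoff = upgrade_time.astimezone(timezone.utc) + SCHEMA_TRANSITION_DELAY
    return cutoff.date().isoformat()


# An upgrade at 10:00 UTC yields the next day, while an upgrade at
# 23:30 UTC skips one more day so the cutoff is not reached mid-upgrade.
print(v13_schema_start_date(datetime(2026, 1, 14, 10, 0, tzinfo=timezone.utc)))
print(v13_schema_start_date(datetime(2026, 1, 14, 23, 30, tzinfo=timezone.utc)))
```

Under this assumption, the extra two hours only change the result for upgrades started after 22:00 UTC, which is the "just before midnight" case the description mentions.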

🧹 Checklist

  • Code is readable, maintainable, and robust.
  • Unit tests added/updated

Ani1357 (Contributor) commented Jan 9, 2026

Tested by:

  1. Creating a cluster in prod (latest release v4.12.3)
  2. Upgrading to APL-1273
  3. Enabling Loki

I expected the OTel operator to be deployed automatically, but that never happened.

merll (Contributor, Author) commented Jan 9, 2026

> Tested by:
>
>   1. Creating a cluster in prod (latest release v4.12.3)
>   2. Upgrading to APL-1273
>   3. Enabling Loki
>
> I expected the OTel operator to be deployed automatically, but that never happened.

The OTel app should be enabled as soon as Loki is. Console does this based on the dependency information in core.yaml, so if it did not happen, Console was probably working with stale dependency information.
Console may not be the best place to handle this. I can also override the enabled flag in the Helmfile to make sure OTel is available.

merll (Contributor, Author) commented Jan 13, 2026

The OTel app is now enabled from Core whenever Loki is enabled, during or after the upgrade.
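The effect of this change can be pictured as a dependency override applied to the app values. The sketch below is only a conceptual illustration: the actual change is made in Core (the earlier comment mentions the Helmfile), and the function name and the "otel" key are hypothetical.

```python
from typing import Any, Dict

AppValues = Dict[str, Dict[str, Any]]


def resolve_enabled_apps(apps: AppValues) -> AppValues:
    """Hypothetical helper: force the OTel app on whenever Loki is on.

    Returns a copy of the app values so the input is not modified.
    The "otel" key is an assumed name used for illustration only.
    """
    resolved = {name: dict(cfg) for name, cfg in apps.items()}
    if resolved.get("loki", {}).get("enabled"):
        # Override any stale or explicit "disabled" flag for OTel.
        resolved.setdefault("otel", {})["enabled"] = True
    return resolved
```

Resolving the dependency at this level means the result no longer depends on whether Console has up-to-date dependency information.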

Ani1357 (Contributor) commented Jan 13, 2026

Could you update the datasource in line 332 of the core.yaml file to loki-v3? Otherwise, Loki is not selected as the default data source when opening Loki from a team view.

ElderMatt (Contributor) commented Jan 14, 2026

Tested clean install and upgrade. Promtail is removed when OpenTelemetry is enabled in the Loki values. The Loki plugin works in Grafana once v13SchemaStartDate has passed, and it uses the v13 schema.

Ani1357 (Contributor) left a comment

Tested both scenarios: clean install and upgrade when Loki was already available. All logs were preserved.

j-zimnowoda merged commit 6440cd9 into main on Jan 15, 2026
12 checks passed
j-zimnowoda deleted the APL-1273 branch on January 15, 2026 09:06

5 participants