Skip to content

Fix k8s-file log format for terminating F-sequence#629

Merged
jnovy merged 1 commit intocontainers:mainfrom
jnovy:ffix
Feb 4, 2026
Merged

Fix k8s-file log format for terminating F-sequence#629
jnovy merged 1 commit intocontainers:mainfrom
jnovy:ffix

Conversation

@jnovy
Copy link
Collaborator

@jnovy jnovy commented Feb 3, 2026

The k8s-file format requires the space separator between tag and content to be present even when content is empty. The terminating F-sequence generated for partial log lines was missing this mandatory space, causing podman logs to fail with "is not a valid container log line".

Changed terminating F-sequence from "F\n" to "F \n" and updated byte counts accordingly. Fixed test assertions to validate the correct format.

Fixes containers/podman#28014

The k8s-file format requires the space separator between tag and content
to be present even when content is empty. The terminating F-sequence
generated for partial log lines was missing this mandatory space,
causing podman logs to fail with "is not a valid container log line".

Changed terminating F-sequence from "F\n" to "F \n" and updated byte
counts accordingly. Fixed test assertions to validate the correct format.

Fixes containers/podman#28014

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
@jnovy
Copy link
Collaborator Author

jnovy commented Feb 3, 2026

@ricardobranco777 PTAL

@jnovy jnovy requested a review from Luap99 February 3, 2026 12:24
@ricardobranco777
Copy link
Contributor

@ricardobranco777 PTAL

Thanks for fixing it. Do you have a RPM I can inject in our tests to verify it?

@jnovy
Copy link
Collaborator Author

jnovy commented Feb 3, 2026

@Luap99
Copy link
Member

Luap99 commented Feb 3, 2026

A reference to the bad commit 59c2b94 would be nice.

Looking at that this does change the behavior still and I guess we would argue whether #252 is valid or not.

It now prints a final newline as part of podman logs which we did not previously. In practise this means that the output between podman run and podman logs differ because logs now always adds a final newline so logs no longer matches the real container output 1 to 1.

I don't know if this matters in the real world to anyone, I doubt the extra newline causes complains but what it means it does is it claim the container wrote a final newline when it did not, so conmon injecting output into the container stream.

both echo hola and printf hola now produce the same logs despite the later not printing a newline.

I don't care to much personally this here fixes at least the broken log format so sure LGTM in that regard

cc @mheon @haircommander

@jnovy
Copy link
Collaborator Author

jnovy commented Feb 4, 2026

How's the testing going @ricardobranco777 ?

@ricardobranco777
Copy link
Contributor

How's the testing going @ricardobranco777 ?

Sorry, I should've told you earlier. I found source RPM's and I need the binaries.

@jnovy
Copy link
Collaborator Author

jnovy commented Feb 4, 2026

@ricardobranco777 binary RPMs are available if you click e.g. [buildArch (conmon-2.2.0-2.fc42.src.rpm, x86_64)](https://koji.fedoraproject.org/koji/taskinfo?taskID=141874123) on the scratch build page, you have all arches there in the scratch build.

@ricardobranco777
Copy link
Contributor

I think it works. The podman e2e test fails differently now:

https://openqa-assets.opensuse.org/tests/5649499/file/podman_e2e-localintegration.txt

  [FAILED] Timed out after 0.042s.
  The function passed to Eventually failed at /var/tmp/podman/test/e2e/logs_test.go:465 with:
  Expected
      <string>: "...a72be
      "
  to equal               |
      <string>: "...a72be"
  In [It] at: /var/tmp/podman/test/e2e/logs_test.go:466 @ 02/04/26 07:31:55.76

  Full Stack Trace
    github.com/containers/podman/v5/test/e2e.init.func45.21()
    	/var/tmp/podman/test/e2e/logs_test.go:466 +0x3fc
  < Exit [It] podman logs partial log lines: json-file - /var/tmp/podman/test/e2e/logs_test.go:447 @ 02/04/26 07:31:55.76 (458ms)

@jnovy
Copy link
Collaborator Author

jnovy commented Feb 4, 2026

Thanks @ricardobranco777 - let me know if anything else needs to be fixed on conmon's side.

@jnovy jnovy merged commit 67718f2 into containers:main Feb 4, 2026
31 of 33 checks passed
@Luap99
Copy link
Member

Luap99 commented Feb 4, 2026

I think it works. The podman e2e test fails differently now:

https://openqa-assets.opensuse.org/tests/5649499/file/podman_e2e-localintegration.txt

  [FAILED] Timed out after 0.042s.
  The function passed to Eventually failed at /var/tmp/podman/test/e2e/logs_test.go:465 with:
  Expected
      <string>: "...a72be
      "
  to equal               |
      <string>: "...a72be"
  In [It] at: /var/tmp/podman/test/e2e/logs_test.go:466 @ 02/04/26 07:31:55.76

  Full Stack Trace
    github.com/containers/podman/v5/test/e2e.init.func45.21()
    	/var/tmp/podman/test/e2e/logs_test.go:466 +0x3fc
  < Exit [It] podman logs partial log lines: json-file - /var/tmp/podman/test/e2e/logs_test.go:447 @ 02/04/26 07:31:55.76 (458ms)

well this is exactly what I have mentioned above, this is a behavioural change by injecting an additional newline, we do explicitly test that this is not the case

			// Important: do not use OutputToString(), this will remove the trailing newline from the output.
			// However this test must make sure that there is no such extra newline.
			Expect(string(logc.Out.Contents())).To(Equal(content))

so if conmon now does this we must chnage the podman test to allow that, but I like to actually have some input from others if conmon doing this now is really sane

At the very least this here makes the log drivers behave differently!

$ podman --conmon /home/pholzing/CODE/conmon/bin/conmon  run --name test --log-driver journald  quay.io/libpod/testimage:20241011 printf "hola" && echo ---; podman logs test; echo ---
test
hola---
hola---
$ podman --conmon /home/pholzing/CODE/conmon/bin/conmon  run --name test --log-driver k8s-file  quay.io/libpod/testimage:20241011 printf "hola" && echo ---; podman logs test; echo ---
test
hola---
hola
---

IMO it would be best to just revert the original fix instead and no do that to preserve the behavior.

@jnovy
Copy link
Collaborator Author

jnovy commented Feb 4, 2026

Thanks for feedback @Luap99 - I'd rather keep the current fix as it's correct according to CRI specification [1] and it is also consistent across drivers. Let's coordinate on podman side to fix the parser. The issue is in log consumption, not log generation.

[1] https://github.com/kubernetes/design-proposals-archive/blob/main/node/kubelet-cri-logging.md

@Luap99
Copy link
Member

Luap99 commented Feb 4, 2026

Can you point out the exact text that says there must be a final F line? I don't see anything saying that there.

The tags fields can support multiple tags, delimited by :. Currently, only one tag is defined in CRI to support multi-line log entries: partial or full. Partial (P) is used when a log entry is split into multiple lines by the runtime, and the entry has not ended yet. Full (F) indicates that the log entry is completed -- it is either a single-line entry, or this is the last line of the multiple-line entry.

At least I do not interpret that in a way that a final F is required. In fact I would argue the opposite with that, P is when the log entry is not ended (i.e. no newline) and the container never wrote the newline so it should not be mark as ended IMO.

How does other implementations deal with that, i.e. containerd?

and it is also consistent across drivers.

as I have shown above it is not, the new k8s-log behavior injects a final newline while journald does not so that is not good that our tests see different behaviours based on the driver in use

@jnovy
Copy link
Collaborator Author

jnovy commented Feb 5, 2026

The CRI spec text you quoted doesn't explicitly define container termination behavior - you're correct. However, the spec's scope is limited to runtime-initiated line splitting, not container termination scenarios. The evidence that this behavior is expected comes from filed bugs against both conmon and containerd showing that missing terminating F-sequences break log consumers in production:

Conmon issue #252: Reports that running echo blah | tr '\n' ' ' produces only:

2021-03-29T21:16:20.753110058-04:00 stdout P blah

The bug report explicitly states the expected behavior should be:

2021-03-29T21:16:20.753110058-04:00 stdout P blah
2021-03-29T21:16:20.753110059-04:00 stdout F

and concludes: "conmon should be writing out a terminating F-sequence into the log if it sees a container pipe closing without a trailing newline" what I consider justified.

Containerd issue #4813: Shows containerd has the identical bug. When running printf "Hello from init container!", containerd outputs:

2020-12-07T15:32:38.994574754Z stdout P Hello from init container!

and stops. This breaks fluentd's concat plugin with timeout errors.

Grafana Loki issue #10998: Documents that when a container outputs echo -n "fuga" and terminates, promtail's CRI stage cannot flush the partial line. It remains buffered indefinitely (100+ minutes) until the buffer is forcibly flushed, at which point Loki may reject it as too old.

Regarding driver inconsistency - the journald driver doesn't inject the newline because it writes directly to journald using native journald fields, bypassing the CRI log format entirely. The k8s-file driver must comply with CRI format to be parseable by kubectl/log consumers.

This is a pre-existing architectural difference: journald has always had different semantics than file-based drivers. The proper solution would be documenting this difference or making journald also emit CRI-compliant logs (which would require significant rework and I'm not planning to do that).

Sources:

@Luap99
Copy link
Member

Luap99 commented Feb 5, 2026

(I do not appreciate being answered by an LLM)

Anyhow my point stands, the spec is not clear and the containerd issue was closed and not fixed AFAICT.
Having different behaviors for each driver is just bad so IMO the right thing is to revert and not do that.

@bitoku
Copy link
Contributor

bitoku commented Feb 9, 2026

@jnovy @Luap99
Hi, It seems like it broke CRI-O CI. I guess it should be reverted for now.

Specifically, this test is broken.

https://github.com/kubernetes/kubernetes/blob/48f67b9656c97fbbbb90562c06628c7418a7ebb4/test/e2e/common/node/runtime.go#L212-L226

You can find the failure in this CI.
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/cri-o_cri-o/9719/pull-ci-cri-o-cri-o-main-ci-e2e/2020819571723735040

This also affects critest CI. This one can be fixed in CRI-O though.

https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/cri-o_cri-o/9723/pull-ci-cri-o-cri-o-main-ci-rhel-critest/2020897290889203712

Personally I think this fix would be valid, but still upstream doesn't seem to be ready for it.
If we do that, we should bring it up upstream first, and this PR should be reverted until upstream can handle it properly.

I think this also should be reverted.
#592

bitoku added a commit to bitoku/cri-o that referenced this pull request Feb 9, 2026
bitoku added a commit to bitoku/cri-o that referenced this pull request Feb 9, 2026
…for details.

Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
@mheon
Copy link
Member

mheon commented Feb 9, 2026

At this point, if CRI-O is failing, I think we have an indication that the CRI docs aren't agreeing with reality. Concur with a revert.

jnovy added a commit to jnovy/conmon that referenced this pull request Feb 10, 2026
This reverts commit 67718f2.

Breaks CRI-O and Kubernetes CI. See PR containers#629 discussion.
jnovy added a commit to jnovy/conmon that referenced this pull request Feb 10, 2026
This reverts commit e25af44.

Breaks CRI-O and Kubernetes CI. See PR containers#629 discussion.
jnovy added a commit to jnovy/conmon that referenced this pull request Feb 10, 2026
After reverting PR containers#592 and containers#629, remove assertion expecting terminating
F-sequence that is no longer generated.
@jnovy jnovy mentioned this pull request Feb 10, 2026
@jnovy
Copy link
Collaborator Author

jnovy commented Feb 10, 2026

@bitoku - just created PR for revert - PTAL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

podman-logs fails with --log-driver

5 participants