Skip to content

Don't rely on nb_frames to be correct#25844

Merged
asmorkalov merged 1 commit intoopencv:4.xfrom
dietmar:dont_rely_on_nb_frames
Jul 5, 2024
Merged

Don't rely on nb_frames to be correct#25844
asmorkalov merged 1 commit intoopencv:4.xfrom
dietmar:dont_rely_on_nb_frames

Conversation

@dietmar
Copy link
Copy Markdown
Contributor

@dietmar dietmar commented Jul 2, 2024

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

This is an attempt to fix #4362; It probably needs discussion/review.

Similar to this issue originally reported in 2012 (but which also has comments from 2023), I have a video where it seems like OpenCV does not read all frames.

I put the problematic video file here: https://github.com/dietmar/opencv/blob/problem_video/samples/data/interviewsample.mp4
I received this file from a co-worker, unfortunately I don't know how exactly it was produced. It's the first 3.5 minutes of this video: https://www.youtube.com/watch?v=YawI85U7QtA

The problem I encounter is with looping the video's frames; OpenCV stops way too early, after 160 frames, using Python code like:

import sys
sys.path.append('build/python_loader')
import cv2
video_reader = cv2.VideoCapture('samples/data/interviewsample.mp4')
i = 0
while True:
    ret, frame = video_reader.read()
    if not ret:
        break
    i += 1
    print(f"Read {i:} frames")

Or equivalently in C++ something like:

VideoCapture cap(filename);

cout << "Start grabbing" << endl;
for (int i = 0;; i++)
{
    cap.read(frame);
    if (frame.empty()) {
        cerr << "frame.empty() is true, stopping" << endl;
        break;
    }
    cout << "Frame: " << i << endl;
}

It looks like the problem with this file is that ffmpeg reports nb_frames as 160, but there are "really" 6,554 frames:

$ ffprobe -i samples/data/interviewsample.mp4 -print_format json -loglevel fatal \
-show_streams -count_frames -select_streams v
{
    "streams": [
        {
            "index": 0,
            "codec_name": "h264",
            "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
            "profile": "Main",
            "codec_type": "video",
            "codec_tag_string": "avc1",
            "codec_tag": "0x31637661",
            "width": 1280,
            "height": 720,
            "coded_width": 1280,
            "coded_height": 720,
            "closed_captions": 0,
            "has_b_frames": 1,
            "sample_aspect_ratio": "1:1",
            "display_aspect_ratio": "16:9",
            "pix_fmt": "yuv420p",
            "level": 31,
            "color_range": "tv",
            "color_space": "bt709",
            "color_transfer": "bt709",
            "color_primaries": "bt709",
            "chroma_location": "left",
            "refs": 3,
            "is_avc": "true",
            "nal_length_size": "4",
            "r_frame_rate": "30000/1001",
            "avg_frame_rate": "30000/1001",
            "time_base": "1/90000",
            "start_pts": 3003,
            "start_time": "0.033367",
            "duration_ts": 19681662,
            "duration": "218.685133",
            "bit_rate": "295738",
            "bits_per_raw_sample": "8",
            "nb_frames": "160",            <------------- HERE
            "nb_read_frames": "6554",      <------------- HERE
            "disposition": {
                "default": 1,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0
            },
            "tags": {
                "language": "und",
                "handler_name": "ISO Media file produced by Google Inc. Created on: 05/23/2018.",
                "vendor_id": "[0][0][0][0]"
            }
        }
    ]
}

I traced the problem back to the CvCapture_FFMPEG::grabFrame() function in modules/videoio/src/cap_ffmpeg_impl.hpp (here) where we have this check:

if( ic->streams[video_stream]->nb_frames > 0 &&
    frame_number > ic->streams[video_stream]->nb_frames )
    return false;

So, such Python/C++ loops stop when the current frame_number is greater than the value of nb_frames, the latter coming from some file header presumably.

I propose to just remove this check. The code is not wrong in the sense that there is no programming error, and one could argue that we are just adhering to the file headers, as we have no reliable way of telling whether they are "incorrect".

However, I'd argue that OpenCV is currently relying too strongly on the information in the headers and behaves unexpectedly for most people: My "problematic" file from above plays fine (i.e., all 6,554 frames == 03:38) in video players (VLC, mplayer, Totem), so it looks like they don't rely on nb_frames to be correct.

ffmpeg itself also does not adhere to nb_frames when we ask it to actually do something with the video. For example:

$ ffmpeg -i samples/data/interviewsample.mp4 \
-vcodec libx264 -crf 32 samples/data/interviewsample_recoded.mp4

The above command interprets the input file as 6,554 frames long and produces an output file where the headers are now "correct":

$ ffprobe -i ~/Downloads/interviewsample_recoded.mp4 -print_format json -loglevel fatal \
-show_streams -count_frames -select_streams v

            ...
            "nb_frames": "6555",
            "nb_read_frames": "6555",
            ...

So my proposal is to grab as many frames from a video as we can, and not as many as the headers tell us are there. In doing so, OpenCV would handle such contradictory situations the same way as video players and the ffmpeg executable also seem to be handling them.

In my tests, just removing the check against nb_frames was unproblematic; at the end of the file, avcodec_send_packet returned something < 0 here and consequently my loops stopped gracefully.

Copy link
Copy Markdown
Contributor

@opencv-alalek opencv-alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@asmorkalov asmorkalov merged commit 88b28ee into opencv:4.x Jul 5, 2024
@dietmar
Copy link
Copy Markdown
Contributor Author

dietmar commented Jul 8, 2024

Thanks @asmorkalov and @opencv-alalek !

@asmorkalov asmorkalov mentioned this pull request Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python cv2.VideoCapture.read() does not read all frames

3 participants