Support one-time audio video reading#22930
Conversation
alalek
left a comment
There was a problem hiding this comment.
Need to fix compilation / related CI issues.
modules/videoio/test/test_audio.cpp
Outdated
| #ifdef _WIN32 | ||
| // validate audio frame size | ||
| EXPECT_NEAR(audioFrame.cols, samplesPerFrame, audioSamplesTolerance); | ||
| /EXPECT_NEAR(audioFrame.cols, samplesPerFrame, audioSamplesTolerance); | ||
| #endif |
| gint nbVideoStream; | ||
| gint nbAudioStream; |
| class GStreamerCapture CV_FINAL : public IVideoCapture | ||
| { | ||
| public: | ||
| struct StreamData |
There was a problem hiding this comment.
StreamData
This type is not really used in the current state (used once to replace videoStream / audioStream fields).
| GSafePtr<GstSample> usedVideoSample; | ||
| GSafePtr<GstSample> impendingAudioSample; | ||
| GSafePtr<GstSample> audioSample; | ||
| std::vector<GstSample*> audioSamples; |
There was a problem hiding this comment.
std::vector<GstSample*>
How is lifetime controlled?
There was a problem hiding this comment.
memory clearing is done manually
for (uint i = 0; i < audioSamples.size(); i++)
{
gst_sample_unref(audioSamples[i]);
}
audioSamples.clear();
There was a problem hiding this comment.
This doesn't work as many other error-prone attempts of manual memory management.
There are other places missed:
- close()
- destructor
Just add CV_CheckEQ(audioSamples.size(), (size_t)0, ""); into destructor and bailout from the middle of the capture loop. E.g.:
if (frame == 11)
break;
modules/videoio/test/test_audio.cpp
Outdated
| paramCombination("test_audio.mp4", 1, 0.15, CV_8UC3, 240, 320, 90, 132299, 30, 30., cv::CAP_GSTREAMER) | ||
| #ifdef _WIN32 | ||
| paramCombination("test_audio.mp4", 1, 0.15, CV_8UC3, 240, 320, 90, 131819, 30, 30., cv::CAP_MSMF) | ||
| , paramCombination("test_audio.mp4", 1, 0.15, CV_8UC3, 240, 320, 90, 131819, 30, 30., cv::CAP_MSMF), |
modules/videoio/test/test_audio.cpp
Outdated
| INSTANTIATE_TEST_CASE_P(/**/, Media, testing::ValuesIn(mediaParams)); | ||
| #endif // _WIN32 | ||
|
|
||
| #ifdef _WIN32 |
There was a problem hiding this comment.
Prefer backend check if possible.
(because GStreamer builds exist for Windows too)
There was a problem hiding this comment.
_WIN32 check is only for MSMF
There was a problem hiding this comment.
As I wrote above, GStreamer for Windows passes this check too.
is only for MSMF
it is not.
|
There are merge conflicts. Squash commits into one and then rebase onto upstream. |
alalek
left a comment
There was a problem hiding this comment.
Thank you for the update.
Please take a look on comments below.
| GSafePtr<GstSample> usedVideoSample; | ||
| GSafePtr<GstSample> impendingAudioSample; | ||
| GSafePtr<GstSample> audioSample; | ||
| std::vector<GstSample*> audioSamples; |
There was a problem hiding this comment.
This doesn't work as many other error-prone attempts of manual memory management.
There are other places missed:
- close()
- destructor
Just add CV_CheckEQ(audioSamples.size(), (size_t)0, ""); into destructor and bailout from the middle of the capture loop. E.g.:
if (frame == 11)
break;
| gint videoStream; | ||
| gint audioStream; |
There was a problem hiding this comment.
No need to move them - they were defined previous on lines 327-328 before duration
modules/videoio/test/test_audio.cpp
Outdated
| INSTANTIATE_TEST_CASE_P(/**/, Media, testing::ValuesIn(mediaParams)); | ||
| #endif // _WIN32 | ||
|
|
||
| #ifdef _WIN32 |
There was a problem hiding this comment.
As I wrote above, GStreamer for Windows passes this check too.
is only for MSMF
it is not.
modules/videoio/test/test_audio.cpp
Outdated
| ASSERT_TRUE(cap.grab()); | ||
|
|
||
| if (frame == 0) | ||
| if (backend == cv::CAP_MSMF) |
There was a problem hiding this comment.
Why is it applicable for MSMF only?
There was a problem hiding this comment.
Why is it applicable for MSMF only?
Because position tracking in gstreamer with the gst_element_query_position function is inaccurate. A longer video frame can be read anywhere as samples are captured, or not counted at all
There was a problem hiding this comment.
We still could dump these properties without test failures.
There was a problem hiding this comment.
Get it. Fixed
| case CAP_PROP_AUDIO_BASE_INDEX: | ||
| return audioBaseIndex; | ||
| case CAP_PROP_AUDIO_TOTAL_STREAMS: | ||
| CV_LOG_ONCE_WARNING(NULL, "OpenCV | GStreamer: CAP_PROP_AUDIO_TOTAL_STREAMS property not support"); |
| gint64 bufferedAudioDuration; | ||
| gint64 requiredAudioTime; | ||
| gint64 impendingVideoSampleTime; | ||
| gint64 usedVideoSampleTime; | ||
| gint64 videoSampleDuration; | ||
| gint64 audioSampleDuration; | ||
| gint64 impendingAudioSampleTime; | ||
| gint64 audioSampleTime; | ||
| gint64 chunkLengthOfBytes; | ||
| gint64 givenAudioTime; |
There was a problem hiding this comment.
duration
time
All these fields must have the clear unit suffix, e.g. NS for nanoseconds. For audio, inSamples could be applicable too.
| gint64 usedVideoSampleTime; | ||
| gint64 videoSampleDuration; | ||
| gint64 audioSampleDuration; | ||
| gint64 impendingAudioSampleTime; | ||
| gint64 audioSampleTime; | ||
| gint64 chunkLengthOfBytes; | ||
| gint64 givenAudioTime; |
There was a problem hiding this comment.
usedVideo
givenAudio
Why are here different prefixes?
usedVideoSample / audioSample - the same
There was a problem hiding this comment.
usedVideo
givenAudioWhy are here different prefixes?
usedVideoSample/audioSample- the same
Two samples are buffered for video, the one has prefix used. The sample that will be used in the next iteration has prefix impending. The logic is the same with the video timestaps. This is not necessary for audio, for timestamps prefix is given (duration of the got audio)
| audioSample.attach(gst_app_sink_pull_sample(GST_APP_SINK(audiosink.get()))); | ||
| if (!audioSample) | ||
| { | ||
| CV_LOG_ERROR(NULL, "videoio(MSMF): gst_app_sink_pull_sample() method is not succeeded"); |
There was a problem hiding this comment.
Why do we see this error message in tests?
[ RUN ] Audio.audio/1, where GetParam() = ("test_audio.mp4", 1, 132522, 0.15, GSTREAMER)
[ INFO:0@2.343] global cap_gstreamer.cpp:1378 open OpenCV | GStreamer: /home/alalek/projects/opencv/extra/testdata/highgui/audio/test_audio.mp4
[ INFO:0@2.343] global cap_gstreamer.cpp:1411 open OpenCV | GStreamer: mode - FILE
[ERROR:0@4.890] global cap_gstreamer.cpp:700 grabAudioFrame videoio(MSMF): gst_app_sink_pull_sample() method is not succeeded
[ OK ] Audio.audio/1 (2772 ms)
Why is it an error?
There was a problem hiding this comment.
Why is it an error?
The gst_app_sink_is_eos function does not return true in some cases until it reads an empty sample. Added a condition to track this case
| #ifdef _WIN32 | ||
| const paramCombination mediaParams[] = | ||
| { | ||
| paramCombination("test_audio.mp4", 1, 0.15, CV_8UC3, 240, 320, 90, 132299, 30, 30., cv::CAP_GSTREAMER) |
There was a problem hiding this comment.
Why do we have doubled "the same" videoTimestamp's on frames 85 and 86 (2866.67 ms)?
[ RUN ] Media.audio/0, where GetParam() = ("test_audio.mp4", 1, 0.15, 16, 240, 320, 90, 132299, 30, 30, GSTREAMER)
[ INFO:0@5.116] global cap_gstreamer.cpp:1378 open OpenCV | GStreamer: /home/alalek/projects/opencv/extra/testdata/highgui/audio/test_audio.mp4
[ INFO:0@5.116] global cap_gstreamer.cpp:1411 open OpenCV | GStreamer: mode - FILE
[ WARN:0@5.607] global cap_gstreamer.cpp:1672 open OpenCV | GStreamer warning: unable to query duration of stream
[ WARN:0@5.610] global cap_gstreamer.cpp:1703 open OpenCV | GStreamer warning: Cannot query video position: status=1, value=0, duration=-1
[ WARN:0@5.630] global cap_gstreamer.cpp:1878 getProperty OpenCV | GStreamer: CAP_PROP_AUDIO_SHIFT_NSEC property is not supported
[ WARN:0@5.630] global cap_gstreamer.cpp:1792 getProperty OpenCV | GStreamer: CAP_PROP_POS_MSEC property result may be unrealiable: https://github.com/opencv/opencv/issues/19025
video0 timestamp: 0 audio0 timestamp: 0 (audio shift nanoseconds: 0 , seconds: 0)
frame=0: audioFrameSize=1470 videoTimestamp=0 ms
frame=1: audioFrameSize=1470 videoTimestamp=33.3333 ms
frame=2: audioFrameSize=1470 videoTimestamp=66.6667 ms
frame=3: audioFrameSize=1469 videoTimestamp=100 ms
frame=4: audioFrameSize=1470 videoTimestamp=133.333 ms
frame...
frame=85: audioFrameSize=0 videoTimestamp=2866.67 ms
frame=86: audioFrameSize=1470 videoTimestamp=2866.67 ms
frame=87: audioFrameSize=1470 videoTimestamp=2900 ms
frame=88: audioFrameSize=1470 videoTimestamp=2933.33 ms
frame=89: audioFrameSize=1470 videoTimestamp=2966.67 ms
Total audio samples=132299
[ OK ] Media.audio/0 (2821 ms)
There was a problem hiding this comment.
As I wrote, the gst_element_query_position function is not accurate, this is the reason
There was a problem hiding this comment.
There is some random in output:
frame=0: audioFrameSize=1470 videoTimestamp=0 ms
frame=1: audioFrameSize=2940 videoTimestamp=33.3333 ms
frame=2: audioFrameSize=1470 videoTimestamp=100 ms
frame=3: audioFrameSize=1469 videoTimestamp=133.333 ms
frame=4: audioFrameSize=1470 videoTimestamp=166.667 ms
frame...
frame=85: audioFrameSize=1470 videoTimestamp=2866.67 ms
frame=86: audioFrameSize=1470 videoTimestamp=2900 ms
frame=87: audioFrameSize=1470 videoTimestamp=2933.33 ms
frame=88: audioFrameSize=0 videoTimestamp=2966.67 ms
frame=89: audioFrameSize=1470 videoTimestamp=2966.67 ms
frame=0: audioFrameSize=1470 videoTimestamp=0 ms
frame=1: audioFrameSize=1470 videoTimestamp=33.3333 ms
frame=2: audioFrameSize=1470 videoTimestamp=66.6667 ms
frame=3: audioFrameSize=1469 videoTimestamp=100 ms
frame=4: audioFrameSize=1470 videoTimestamp=133.333 ms
frame...
frame=85: audioFrameSize=2940 videoTimestamp=2833.33 ms
frame=86: audioFrameSize=1470 videoTimestamp=2900 ms
frame=87: audioFrameSize=0 videoTimestamp=2933.33 ms
frame=88: audioFrameSize=1470 videoTimestamp=2933.33 ms
frame=89: audioFrameSize=1470 videoTimestamp=2966.67 ms
Below timestamps for generic video-only scan (ignore crc64):
frame=80 frame_size=[320 x 240] videoTimestamp=2700 ms crc64=0FF87E8865B65E85
frame=81 frame_size=[320 x 240] videoTimestamp=2733.33 ms crc64=5B0016D23525B13E
frame=82 frame_size=[320 x 240] videoTimestamp=2766.67 ms crc64=7ACED82BA667C494
frame=83 frame_size=[320 x 240] videoTimestamp=2800 ms crc64=5F393F9FF6515BEF
frame=84 frame_size=[320 x 240] videoTimestamp=2833.33 ms crc64=1DA3C18F7CC6254F
frame=85 frame_size=[320 x 240] videoTimestamp=2866.67 ms crc64=2D2D5461D2C20E89
frame=86 frame_size=[320 x 240] videoTimestamp=2900 ms crc64=568E4DC494CEBB11
frame=87 frame_size=[320 x 240] videoTimestamp=2933.33 ms crc64=C73EDB4F7625C01D
frame=88 frame_size=[320 x 240] videoTimestamp=2966.67 ms crc64=B7BE9F635F4ED293
frame=89 frame_size=[320 x 240] videoTimestamp=3000 ms crc64=9349423711753500
VideoCapture capture(findDataFile("audio/test_audio.mp4"), CAP_GSTREAMER);
Mat frame;
for (int i = 0; ; i++)
{
if (!capture.grab())
break;
capture.retrieve(frame);
std::cout << "frame=" << i << " frame_size=" << frame.size() << " videoTimestamp=" << capture.get(CAP_PROP_POS_MSEC) << " ms " << cv::format("crc64=%016lX", crc64(frame)) << std::endl;
}|
There is failed test after the last update (reproduced locally): |
Support one-time audio video reading * stream switching functionality * audio+video pipeline with switch stream functionality * audio video sync * fixed sync * removed switch swtream functionality * changed test for gstreamer audio * fixed error * fixed error * fixed issue * fixed issue * fixed error * fixed error * fixed error
added to gsrtreamer: