Add retrieve encoded frame to VideoCapture #15290

Merged
alalek merged 25 commits into opencv:master from cudawarped:ffmpeg_raw_retrieve
Nov 18, 2019

@cudawarped
Contributor

@cudawarped cudawarped commented Aug 13, 2019

This pull request adds the ability to retrieve the encoded bitstream for a grabbed video frame.

Since #14774 cudacodec has been broken because it needs the capacity to fall back on ffmpeg for parsing certain video files and rtsp streams. This pull request adds the capacity to retrieve the encoded bit stream for each frame to the VideoCapture class as suggested in opencv/opencv_contrib#2180.

I have only built against the ffmpeg binaries (I am not sure how to build the plugin). I suspect that I may have broken the plugin functionality, because I simply copied the existing retrieve method in plugin_api.h and cap_ffmpeg.cpp.

force_builders=linux,docs
buildworker:Custom=linux-1
build_image:Custom=fedora:31

@cudawarped
Contributor Author

cudawarped commented Aug 15, 2019

Updated to use a separate VideoContainer subclass, as suggested by mshabunin.

@mshabunin
Contributor

@cudawarped , I have several general comments/questions:

  • do we really need a class hierarchy here? Can we make an independent VideoContainer class which will use parts of the same backend under the hood? For simplicity, we can make the interface only for a specific backend, i.e. omit the probing and detection part and force users to always provide a backend ID.
  • as I understand not all media containers store one frame in one packet, so generally it is not possible to read "one frame" from demuxer. One should continuously read packets and pass them to decoder until it returns "I have decoded frame(s)" result. Consider this example: https://github.com/NVIDIA/video-sdk-samples/blob/aa3544dcea2fe63122e4feb83bf805ea40e58dbe/Samples/AppDecode/AppDec/AppDec.cpp#L51-L64 . I can imagine an interface similar to std::istream or fread with main method like this: size_t readsome(uchar * buf, size_t sz).
  • we need some tests for this feature, there are some video files here: https://github.com/opencv/opencv_extra/tree/master/testdata/highgui/video . I believe file big_buck_bunny.h264 is a raw stream extracted from one of containers (don't remember which one), so we can use it as reference data stream.
  • it would be great to support reading several streams and returning metadata, but we can implement it later.
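
A rough sketch of the `readsome`-style interface mentioned in the second point, assuming an in-memory queue of packets in place of a real demuxer. `PacketSource` and `drain` are hypothetical names used only for illustration and are not part of OpenCV or this PR:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

typedef unsigned char uchar;

// Hypothetical stand-in for a demuxer: it owns a queue of packets of
// arbitrary size and exposes them as one continuous byte stream.
class PacketSource {
public:
    explicit PacketSource(std::vector<std::vector<uchar> > packets)
        : packets_(std::move(packets)), index_(0), offset_(0) {}

    // Copies up to sz bytes into buf, crossing packet boundaries as needed.
    // Returns the number of bytes written; 0 means end of stream (or sz == 0).
    std::size_t readsome(uchar* buf, std::size_t sz) {
        std::size_t written = 0;
        while (written < sz && index_ < packets_.size()) {
            const std::vector<uchar>& pkt = packets_[index_];
            const std::size_t n = std::min(sz - written, pkt.size() - offset_);
            if (n > 0)
                std::memcpy(buf + written, pkt.data() + offset_, n);
            written += n;
            offset_ += n;
            if (offset_ == pkt.size()) {  // packet consumed, move to the next one
                ++index_;
                offset_ = 0;
            }
        }
        return written;
    }

private:
    std::vector<std::vector<uchar> > packets_;
    std::size_t index_;   // packet currently being read
    std::size_t offset_;  // read position inside that packet
};

// Reads the whole stream in fixed-size chunks, the way a caller feeding a
// decoder would: keep reading until readsome reports nothing left.
std::vector<uchar> drain(PacketSource& src, std::size_t chunk) {
    std::vector<uchar> out;
    std::vector<uchar> buf(chunk);
    std::size_t n = 0;
    while ((n = src.readsome(buf.data(), buf.size())) > 0)
        out.insert(out.end(), buf.begin(), buf.begin() + n);
    return out;
}
```

With this shape the caller never needs to know where one packet ends and the next begins, which is the property the comment asks for when a frame does not map to exactly one packet.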

@cudawarped
Contributor Author

cudawarped commented Aug 23, 2019

@mshabunin thanks for the feedback. I have been changing my mind continually regarding your first point.

> * do we really need class hierarchy here, can we make independent `VideoContainer` class which will use parts of same backend under the hood? For simplicity, we can make interface only for specific backend, i.e. omit probing and detection part and force users to always provide backend ID.

I would prefer not to mess with the class hierarchy as I have done, if possible, but I couldn't think of a sensible way to use all the existing methods inside VideoCapture without inheriting from it. Are you suggesting we make VideoCapture a member of VideoContainer, as
VideoCapture videoCapture;
and expose additional methods in VideoCapture to allow icap to be manipulated, e.g.

bool VideoCapture::grabEncoded()
{
    CV_INSTRUMENT_REGION();
    bool ret = !icap.empty() ? icap->grabEncodedFrame() : false;
    if (!ret && throwOnFail)
        CV_Error(Error::StsError, "");
    return ret;
}

bool VideoContainer::grab()
{
    return videoCapture.grabEncoded();
}

or to have a completely separate class VideoContainer with member
Ptr<IVideoCapture> icap;
and copy all the required method implementations from VideoCapture to VideoContainer, or something else?

@mshabunin
Contributor

I was thinking about something like this:
// videoio.hpp
class VideoContainer {
public:
    bool open(const std::string &, int);
    size_t read(uchar *, size_t);
    bool isOpened() const;
    // maybe something like: cv::Codec getCodec() const;
private:
    Ptr<IVideoCapture> icap;
};
// cap.cpp
bool VideoContainer::open(...) {
    // body similar to VideoCapture::open, some parts can be extracted to static function and reused
    icap = backend->createRawCapture(...);
}
size_t VideoContainer::read(...) {
    // ...
    icap->readRaw(...);
    // ...
}
// backend.hpp (.cpp)
Ptr<IVideoCapture> StaticBackend::createRawCapture(...) {
    auto cap = createCapture(...);
    if (cap && cap->setRaw(true)) return cap;
    return 0;
}
// TODO: implement PluginBackend::createRawCapture later
// cap_interface.hpp
class IVideoCapture {
public:
    virtual bool setRaw(bool) { return false; }
    virtual size_t readRaw(...);
};
// cap_ffmpeg.cpp
class CvCapture_FFMPEG_proxy CV_FINAL : public cv::IVideoCapture {
 // implement setRaw and readRaw
};

Later we can rework internals and create hidden IVideoContainer or IVideoSource/IRawVideoSource classes with cleaner implementation. For now it is important to establish usable and convenient public interface (in videoio.hpp).
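
To make the shape of that proposal concrete, here is a small self-contained mock. It is only a sketch under stated assumptions: `std::shared_ptr` stands in for `cv::Ptr`, `FakeRawCapture` is a hypothetical backend serving bytes from memory, and the `IVideoCapture` shown here is a simplified stand-in, not the real internal OpenCV interface:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

typedef unsigned char uchar;

// Simplified stand-in for the internal capture interface.
// By default a backend does not support raw (undecoded) reads.
class IVideoCapture {
public:
    virtual ~IVideoCapture() {}
    virtual bool setRaw(bool) { return false; }
    virtual std::size_t readRaw(uchar*, std::size_t) { return 0; }
};

// Hypothetical backend that supports raw mode and serves bytes from memory.
class FakeRawCapture : public IVideoCapture {
public:
    explicit FakeRawCapture(std::vector<uchar> data)
        : data_(std::move(data)), pos_(0), raw_(false) {}

    bool setRaw(bool enable) override { raw_ = enable; return true; }

    std::size_t readRaw(uchar* buf, std::size_t sz) override {
        if (!raw_) return 0;
        const std::size_t n = std::min(sz, data_.size() - pos_);
        if (n > 0)
            std::memcpy(buf, data_.data() + pos_, n);
        pos_ += n;
        return n;
    }

private:
    std::vector<uchar> data_;
    std::size_t pos_;
    bool raw_;
};

// Mirrors createRawCapture from the comment above: the capture is handed
// back only if raw mode could be enabled on it.
std::shared_ptr<IVideoCapture> createRawCapture(std::shared_ptr<IVideoCapture> cap) {
    if (cap && cap->setRaw(true)) return cap;
    return nullptr;
}

// Independent class holding an IVideoCapture member, no inheritance.
class VideoContainer {
public:
    bool open(std::shared_ptr<IVideoCapture> backend) {
        icap_ = createRawCapture(std::move(backend));
        return isOpened();
    }
    bool isOpened() const { return icap_ != nullptr; }
    std::size_t read(uchar* buf, std::size_t sz) {
        return icap_ ? icap_->readRaw(buf, sz) : 0;
    }

private:
    std::shared_ptr<IVideoCapture> icap_;
};
```

A backend that keeps the default `setRaw` simply fails to open through `VideoContainer`, so backends without raw support need no changes.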

-setRaw to disable video decoding and enable bitstream filters from mp4 to h264 and h265.
-readRaw to return the raw undecoded/filtered bitstream.
Add createRawCapture to initiate a backend with setRaw enabled.
Remove inheritance and use an independent VideoContainer class with IVideoCapture member.
@cudawarped
Contributor Author

@mshabunin I have added the suggested changes and have a few additional queries.

  1. Do we need a setRaw(...) method which can be turned on and off or would it be better to a) remove the method and pass a flag to the open method which performs this at initialization or b) have a setRaw() method which doesn't take any arguments and only turns on this capability?

  2. I have not implemented any of the plugin parts because I am not sure how to build/test them. Can you advise?

  3. The mp4 test file you suggested is h263, so to perform the testing I copied the h264 and h265 files to mp4 using

    ffmpeg -i big_buck_bunny.h264 -codec copy big_buck_bunny_h264.mp4

    and

    ffmpeg -i big_buck_bunny.h265 -codec copy big_buck_bunny_h265.mp4

    I am not familiar with h265, but I found that big_buck_bunny.h265 had a mixture of start code lengths, both 0x00 0x00 0x01 and 0x00 0x00 0x00 0x01, which is different to the h265 parsed from big_buck_bunny_h265.mp4 with the hevc_mp4toannexb filter. It may be an idea to add an additional file big_buck_bunny_h265_annex.h265, converted using the hevc_mp4toannexb filter as

    ffmpeg -i big_buck_bunny_h265.mp4 -codec copy -bsf:v hevc_mp4toannexb big_buck_bunny_h265_annex.h265

    Additionally, in testing, when ffmpeg reads from big_buck_bunny_h265_annex.h265 the leading 0x00 is included at the end of each read instead of at the start of the next. This means that after the first read, each read from big_buck_bunny_h265.mp4 starts with 0x00 0x00 0x00 0x01 and each read from big_buck_bunny_h265_annex.h265 starts with 0x00 0x00 0x01, which I had to accommodate in the test.
    Should I add the additional test files to the other repo or have you a better suggestion?
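
One way a test could accommodate the two layouts described in point 3 is to compare only the payload of each read, skipping the leading start code and any trailing zero bytes. The sketch below assumes each read holds a single unit that begins with a start code, and relies on the rule that a NAL unit does not end in 0x00; `startCodeLength`, `payloadOf` and `samePayload` are hypothetical helpers, not code from this PR:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns 4 for 00 00 00 01, 3 for 00 00 01, and 0 if the buffer does not
// begin with an Annex B start code.
std::size_t startCodeLength(const std::uint8_t* data, std::size_t size) {
    if (size >= 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 && data[3] == 1)
        return 4;
    if (size >= 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
        return 3;
    return 0;
}

struct Span {
    std::size_t offset;  // first payload byte
    std::size_t length;  // number of payload bytes
};

// Locates the payload of one read: skips the leading start code and drops
// trailing zero bytes, which are assumed to belong to the next start code.
Span payloadOf(const std::uint8_t* data, std::size_t size) {
    const std::size_t begin = startCodeLength(data, size);
    std::size_t end = size;
    while (end > begin && data[end - 1] == 0)
        --end;
    Span s = { begin, end - begin };
    return s;
}

// True when two reads carry the same payload, whatever start code they use.
bool samePayload(const std::uint8_t* a, std::size_t na,
                 const std::uint8_t* b, std::size_t nb) {
    const Span pa = payloadOf(a, na);
    const Span pb = payloadOf(b, nb);
    return pa.length == pb.length &&
           std::memcmp(a + pa.offset, b + pb.offset, pa.length) == 0;
}
```

A read that contains several units would need a full scan for start codes instead of only looking at the front of the buffer.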

Remove VideoContainer from python bindings as it no longer returns a Mat.
Use opencv type uchar instead of unsigned char.
Add missing destructor to VideoContainer class.
Change api version defines to be consistent - most recent api version first.
@mshabunin
Contributor

  1. Option a) would require many changes in all backends (cap_interface.hpp, create* functions); option b) is fine.
  2. I think we can postpone plugin support.
  3. Why is it necessary to always add bitstream filter? Maybe it could be turned off by default?

Test files should be added to opencv_extra repository using the same branch name as this PR (ffmpeg_raw_retrieve), then both PRs will be tested together.

@cudawarped
Contributor Author

> 3. Why is it necessary to always add bitstream filter? Maybe it could be turned off by default?

Good point, I am not sure that it is; my only use case for this PR is to decode with cudacodec, where it is required. Would you prefer a separate function to initialize this after creation (useMp4ToAnexb()) or a bool passed to read (read(uchar** data, size_t* size, const bool mp4ToAnexB)), which will initialize this the first time it is passed?

@mshabunin
Contributor

OK, let's leave it enabled then. We will be able to extend interface later.

As for test data, I suggest replacing .h264 and .h265 with filtered streams and adding corresponding .mp4 files to leave the test simple. We can restore original files and add filtering test case later.

@cudawarped
Contributor Author

@mshabunin
Currently I am testing the raw read functionality in two ways:

  1. Verifying that the bit stream read from an h264 (or h265) encoded video is identical to that parsed from the same encoded video in an mp4 container.
  2. Verifying that when the bit stream read in (1) is written directly to a binary file with the correct extension (h264 or h265), it can be read by VideoCapture, with the decoded frames being identical to those read from the raw file in (1).

The ffmpeg version which OpenCV is currently built against is

-- avcodec: YES (56.60.100)
-- avformat: YES (56.40.101)
-- avutil: YES (54.31.100)
-- swscale: YES (3.1.101)

which causes (1) to fail in the following way.

When big_buck_bunny_h264.mp4 is read and passed through the h264_mp4toannexb filter with an ffmpeg version from before this fix, the start code added to the parameter sets is 0x00 0x00 0x01 and not 0x00 0x00 0x00 0x01 as expected. The raw h264 file big_buck_bunny.h264, on the other hand, has fixed start codes of 0x00 0x00 0x00 0x01, meaning a bit-level comparison is not possible.
Therefore, for (1) to work on both old and new ffmpeg versions, the bit stream would need to be parsed searching for both sizes of start code. In your opinion, would this still be a valid test or should this test be dropped?
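
For reference, parsing for both start code sizes could look roughly like the sketch below: split each stream into units at every 0x00 0x00 0x01 and compare the units rather than the raw bytes. It is an illustration only, assuming well-formed Annex B input in which units never end in 0x00; `splitNalUnits` and `sameNalUnits` are hypothetical names:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Splits an Annex B byte stream into unit payloads. A 4-byte start code is
// seen as a zero byte followed by the 3-byte code, so zeros left at the end
// of a unit are trimmed before the unit is stored.
std::vector<Bytes> splitNalUnits(const Bytes& s) {
    std::vector<Bytes> units;
    const std::size_t n = s.size();
    std::size_t i = 0;
    std::size_t start = 0;  // first byte of the unit being collected
    bool open = false;      // true once a start code has been seen
    while (i + 3 <= n) {
        if (s[i] == 0 && s[i + 1] == 0 && s[i + 2] == 1) {
            if (open) {
                std::size_t end = i;
                while (end > start && s[end - 1] == 0)
                    --end;
                units.push_back(Bytes(s.begin() + start, s.begin() + end));
            }
            start = i + 3;
            open = true;
            i += 3;
        } else {
            ++i;
        }
    }
    if (open) {  // last unit runs to the end of the stream
        std::size_t end = n;
        while (end > start && s[end - 1] == 0)
            --end;
        units.push_back(Bytes(s.begin() + start, s.begin() + end));
    }
    return units;
}

// True when both streams carry the same units in the same order, regardless
// of whether 3-byte or 4-byte start codes separate them.
bool sameNalUnits(const Bytes& a, const Bytes& b) {
    return splitNalUnits(a) == splitNalUnits(b);
}
```

Such a comparison is weaker than a byte-for-byte check, which is the trade-off the question above is asking about.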

@mshabunin
Contributor

@cudawarped , the test (1) should be dropped then.

@cudawarped
Contributor Author

@mshabunin test dropped and redundant PR opencv/opencv_extra#673 adding additional test data closed.

@cudawarped
Contributor Author

@mshabunin is there anything I can do to prepare this PR for merging so we can fix cudacodec?

@mshabunin
Contributor

@cudawarped , sorry for delay, I'll take a look next week.
Did you manage to make it work with the decoder? Do you have an example application?

@cudawarped
Contributor Author

@mshabunin, no worries, thanks for all your help.

> Did you manage to make it work with the decoder?

The cudacodec decoder implementation from the latest commit to opencv/opencv_contrib#2180 should successfully decode frames retrieved using this VideoContainer class and pass the included tests.

> Do you have an example application?

Are the tests sufficient?

Additionally I don't know if the ffmpeg plugin will compile against this.

@cudawarped
Contributor Author

This change also applies the #9739 fix, with the change described here, to the master branch. I believe it has already been fixed on the 3.4 branch.

#endif

typedef CvResult (CV_API_CALL *cv_videoio_retrieve_cb_t)(int stream_idx, unsigned const char* data, int step, int width, int height, int cn, void* userdata);
typedef CvResult (CV_API_CALL* cv_videoio_retrieve_raw_cb_t)(unsigned const char* data, int step, void* userdata);
Contributor

int step should be size_t size

Contributor Author

@cudawarped cudawarped Oct 22, 2019

@mshabunin Does Capture_retreive_raw need to use a callback or can it just be defined as

CvResult(CV_API_CALL *Capture_retreive_raw)(CvPluginCapture handle, uchar** data, size_t* size);

@param callback retrieve callback (synchronous)
@param userdata callback context data

@note API-CALL 13, API-Version == 0
Contributor

I guess it should be API-Version == 1 and API_VERSION should be increased. @alalek, is it correct?

VideoCodec_NumCodecs,

// Uncompressed YUV
VideoCodec_YUV420 = (('I' << 24) | ('Y' << 16) | ('U' << 8) | ('V')), // Y,U,V (4:2:0)
Contributor

Some items have FourCC code, others not. Perhaps we can get rid of this enumeration completely and use FourCC codes for new properties.

Contributor Author

@mshabunin Those `Uncompressed YUV` entries are codecs returned by the nvcuvid video parser and can be removed; I incorrectly included them here.

I could use FOURCC codes for the remaining codecs, but because there can be a many-to-one mapping from ffmpeg FOURCC codes to codec (h264, x264 etc.), it seemed more suitable to return the ffmpeg codec. Do you agree?
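
To illustrate the many-to-one point, here is a minimal sketch that packs a FourCC the same way as the `VideoCodec_YUV420` value quoted above (first character in the high byte) and maps several tags to one codec. The `Codec` enum, `codecFromFourcc` and the particular tag list are assumptions chosen for illustration, not the table used by ffmpeg or by this PR:

```cpp
#include <cstdint>
#include <map>

// Packs four characters into 32 bits, first character in the high byte,
// matching the (('I' << 24) | ('Y' << 16) | ('U' << 8) | ('V')) pattern.
constexpr std::uint32_t fourcc(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8) |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

// Hypothetical codec identifiers.
enum class Codec { Unknown, H264, HEVC };

// Several distinct FourCC tags resolve to the same codec, so the tag alone
// does not round-trip: the codec can be derived from the tag, not vice versa.
Codec codecFromFourcc(std::uint32_t tag) {
    static const std::map<std::uint32_t, Codec> table = {
        { fourcc('H', '2', '6', '4'), Codec::H264 },
        { fourcc('h', '2', '6', '4'), Codec::H264 },
        { fourcc('X', '2', '6', '4'), Codec::H264 },
        { fourcc('a', 'v', 'c', '1'), Codec::H264 },
        { fourcc('H', 'E', 'V', 'C'), Codec::HEVC },
        { fourcc('h', 'v', 'c', '1'), Codec::HEVC },
    };
    const std::map<std::uint32_t, Codec>::const_iterator it = table.find(tag);
    return it == table.end() ? Codec::Unknown : it->second;
}
```

Returning the codec directly spares the caller from carrying such a table, which seems to be the motivation behind the question above.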

/* 12*/cv_writer_write
/* 12*/cv_writer_write,
/* 13*/cv_capture_read_raw
/* 14*/cv_capture_set_raw
Contributor

This table is not consistent with one in plugin_api.h. Build with -DVIDEOIO_PLUGIN_LIST=ffmpeg option fails.

Contributor Author

@cudawarped cudawarped Oct 22, 2019

Now builds with -DVIDEOIO_PLUGIN_LIST=ffmpeg flag although the plugin code could still be incorrect.

}
} catch(const cv::Exception& e) {
if(throwOnFail && apiPreference != CAP_ANY) throw;
}
Contributor

Please restore formatting of this block.

@mshabunin mshabunin self-assigned this Nov 18, 2019
@alalek alalek merged commit 0867e31 into opencv:master Nov 18, 2019
asmorkalov pushed a commit that referenced this pull request Oct 25, 2023
…ncapsulation

videoio: Add raw encoded video stream muxing to cv::VideoWriter with CAP_FFMPEG #24363

Allow raw encoded video streams (e.g. h264[5]) to be encapsulated by `cv::VideoWriter` to video containers (e.g. mp4/mkv).

Operates in a similar way to #15290 where encapsulation is enabled by setting the `VideoWriterProperties::VIDEOWRITER_PROP_RAW_VIDEO` flag when constructing `cv::VideoWriter` e.g.
```
VideoWriter container(fileNameOut, api, fourcc, fps, { width, height }, { VideoWriterProperties::VIDEOWRITER_PROP_RAW_VIDEO, 1 });
```
and each raw encoded frame is passed as a single row of a `CV_8U` `cv::Mat`.

The main reason for this PR is to allow `cudacodec::VideoWriter` to output its encoded streams to a suitable container, see opencv/opencv_contrib#3569.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake