
Conversation

@sguada
Contributor

@sguada sguada commented Jun 30, 2014

This is a tentative approach to adding transformation layers that would allow operations like crop_mirror and center_scale on the blobs.

The first use would be to help separate the data source from pre-processing. The source would produce a vector of blobs, one per image, which would pass through the transformation layers and finally be concatenated into one blob.

The concept is similar to doing map(transformation, bottom_blobs, top_blobs).
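The map idea can be sketched as follows; `Blob` and `MapTransform` here are illustrative stand-ins for this discussion, not Caffe's actual types:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Minimal stand-in for a Caffe blob (illustrative only).
struct Blob { std::vector<float> data; };

// The map(transformation, bottom_blobs, top_blobs) idea: apply the same
// transformation to each (bottom, top) blob pair, one blob per image.
void MapTransform(const std::function<void(const Blob&, Blob*)>& transform,
                  const std::vector<Blob*>& bottom,
                  const std::vector<Blob*>& top) {
  for (std::size_t i = 0; i < bottom.size(); ++i) {
    transform(*bottom[i], top[i]);
  }
}
```

A concat layer would then merge the per-image top blobs back into a single batch blob.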

@Yangqing What do you think?

@jeffdonahue
Contributor

Note that this would have substantial speed and memory overhead vs. the current implementation as all of this stuff is currently done on the next batch by the prefetch thread while the current batch is being run through the net; this moves it to the forward pass. Unless I'm misunderstanding the intent here.

@shelhamer
Member

In the past the idea came up of a PREFETCH phase so that we could specify
the data processing and have it run before forward to avoid overheads like
these. Or perhaps we need a "stage" instead of just phase since these
options differ at train and test.

At any rate, I agree that whatever the design it needs to not incur a bunch
of memory and speed costs.

Evan Shelhamer

@sguada
Contributor Author

sguada commented Jun 30, 2014

I was planning on keeping the prefetch of the next batch within the data
layers.

The transform layers don't need to know about that. The data layer will
have a set of internal transform layers that run in the prefetch thread
on the next batch while the rest of the net processes the current batch.

I was also thinking of adding multiple threads to process the blobs in
parallel within the transform layers.

We could discuss the architecture later in person.
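The overlap described here can be sketched like this; the names are hypothetical and this uses plain std::thread, not Caffe's actual prefetch code:

```cpp
#include <thread>
#include <vector>

// Illustrative batch type, not Caffe's.
struct Batch { std::vector<float> data; };

// Stand-in for "read datum, then crop/mirror/scale in the prefetch thread".
Batch LoadAndTransformNext() {
  Batch b;
  b.data.assign(4, 1.0f);
  return b;
}

// A worker thread loads and transforms the next batch while the main
// thread would be running the net on the current one.
Batch ForwardWithPrefetch(const Batch& current) {
  Batch next;
  std::thread prefetch([&next] { next = LoadAndTransformNext(); });
  // ... run forward/backward on `current` here, concurrently ...
  (void)current;
  prefetch.join();  // transformed next batch is ready for the next step
  return next;
}
```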

Sergio

@jeffdonahue
Contributor

Even if these layers are run inside the prefetch thread there's still a speed and memory cost (at least for most of the transformations). For example, when cropping is enabled, we never store the entire uncropped image in a blob, which would be a huge cost in both speed and memory if you're training on small patches of the input image. Since mirroring doesn't change the size, it could be done "in place" to use almost no additional memory, but you still have the speed cost of copying the un-mirrored image in the first place.
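A rough sketch of the point about cropping during decode; `CropOnDecode` is a hypothetical helper, not Caffe's API. Copying only the crop window means the full image is never materialized as a blob:

```cpp
#include <cstddef>
#include <vector>

// Copy only crop_h*crop_w values out of a row-major width-wide image,
// so no buffer of the full image size is ever allocated downstream.
std::vector<float> CropOnDecode(const std::vector<float>& image,
                                std::size_t width,
                                std::size_t off_h, std::size_t off_w,
                                std::size_t crop_h, std::size_t crop_w) {
  std::vector<float> out;
  out.reserve(crop_h * crop_w);
  for (std::size_t h = 0; h < crop_h; ++h) {
    for (std::size_t w = 0; w < crop_w; ++w) {
      out.push_back(image[(off_h + h) * width + (off_w + w)]);
    }
  }
  return out;
}
```

A transform layer placed after the data source would instead receive the full uncropped image as a blob first, which is the overhead being discussed.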

@sguada
Contributor Author

sguada commented Jun 30, 2014

There will be some memory overhead if we transform all images at once, but we could apply the transformations one image at a time and iterate to avoid extra memory.

I think this approach should be faster if we can process several images in parallel with different threads.

We could also add a getCropMirror_fromdatum for the case of small patches, or store Blobs in LevelDB instead of datum and save one memory copy.

@shelhamer
Member

We could pull the current method into a base data layer that still does
everything as-is and then inherit, no? If we align the interfaces, that
is. There'd be a certain amount of redundant code, but much less.

Let's talk this afternoon in person like @sguada suggested.


@sguada
Contributor Author

sguada commented Jun 30, 2014

We can probably abstract most of the common parts, but that will depend
on whether we assume the input data is a datum or not. I don't see an easy
way to do the same for datum, images read from files, cv::Mat and HDF5 data.

I think that in the long run having the ability to do different
pre-processing steps will pay off, for instance if we want to add color
jittering or have videos as inputs.

Sergio

@shelhamer
Member

Agreed: the point of this re-design is to make everything more configurable,
a matter of prototxt instead of code, and that is a worthy change. Any
refactoring for simplicity is only a side goal.

but that will depend if we assume the input data is datum or not.

Maybe it's best if we turn everything into blobs, and if we pay a per-batch
memory hit, so be it.


@kloudkl
Contributor

kloudkl commented Jul 1, 2014

Back in #244, I tried many APIs to unify the preprocessing steps for different data formats but gave up.

The new design would possibly consist of data IO including prefetching, data format conversions, and finally data content transformations. The data IO takes care of the various data sources such as leveldb/lmdb, HDF5, images on disk, and images in memory. To avoid replicating the preprocessing code for each format, the raw data should be converted into Blob.

The layers have been a very important part of Caffe: they have unified method interfaces and we are all accustomed to wrapping many things into them. But certainly not everything needs to be a layer. The conversions and transformations are better put in simpler classes that allow for in-place data manipulation, for example:

template <typename Dtype>
class BaseInPlaceOperation {
 public:
  virtual ~BaseInPlaceOperation() {}
  // Transform the blob in place, without allocating a second copy.
  virtual void apply(Blob<Dtype>* data) = 0;
};

template <typename Dtype>
class Crop : public BaseInPlaceOperation<Dtype> {
  ...
};

I have used boost::thread to parallelize the performance-critical parts in a few projects. It is cross-platform and has a much more flexible API than pthread. Multiple threads for a common task can be effectively managed in a thread_group. We should definitely switch to it to be future-proof.
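The thread_group pattern can be sketched as follows; for a self-contained example this uses C++11 std::thread rather than boost::thread, but the structure (spawn one worker per blob, join them all) is the same:

```cpp
#include <thread>
#include <vector>

// One worker thread per blob, all joined before the transformed data is
// used; the transform runs in place, allocating no extra copies.
void ParallelScale(std::vector<std::vector<float> >& blobs, float factor) {
  std::vector<std::thread> workers;
  for (std::vector<float>& blob : blobs) {
    workers.emplace_back([&blob, factor] {
      for (float& v : blob) v *= factor;  // in-place transform
    });
  }
  for (std::thread& t : workers) t.join();
}
```

With boost, the workers would be added to a `boost::thread_group` and joined with `join_all()` instead of the explicit loop.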

@bhack
Contributor

bhack commented Jul 1, 2014

One of the advantages of boost threads is that it will be very easy to switch to C++11 threads once the CUDA version lets us move to a modern version of gcc.
Will we need optimized OpenCV operators when we want to handle image transformations? On one side we need to think about data transforms other than images; on the other, it could be useful to experiment with transformations for dataset augmentation / synthetic dataset generation (http://bouthilx.wordpress.com/tag/sampling/), and probably some kind of "infinite" transformation that renews a small part of the training set every n iterations, with or without a logic controlling the loss.

@bhack
Contributor

bhack commented Sep 20, 2014

@sguada Can we discuss here whether this layer could be compatible with, and useful for, data augmentation (lighting changes, elastic distortion, affine transformation, Gaussian noise, motion blur, etc.)? I think some operations (when we handle image data) are probably easier and better optimized in OpenCV, but at this level we are handling blobs, so I don't know how we could introduce pluggable augmentation operators.

@sguada
Contributor Author

sguada commented Sep 20, 2014

@bhack maybe doing data augmentation would be easier within Transform_Data. With #1070, during the transformation of cv::Mat to Blob, we could use any OpenCV routine.
Although having Transform Layers could be useful in some cases, it wouldn't be very efficient to convert a Blob back to cv::Mat for data augmentation.

@jeffdonahue
Contributor

Closing as abandoned. We agree with the motivation (data transformers becoming layers of their own rather than applied by each data layer) but this needs a bit more thought (prefetching etc., see discussion above) and a rebase.

@whjxnyzh123

@sguada Is there RGB jittering in Caffe? Thank you.

@sguada
Contributor Author

sguada commented Jun 1, 2015

I don't think so, but you could implement it easily using other layers.

Sergio
