@ronghanghu
Member

This PR creates a powerful Matlab interface for Caffe. It has long been an issue in Caffe that the Matlab interface is not as powerful as the Python interface. #501 (followed by #1913) attempted to resolve this by adding more functions. However, the features in #1913 are still limited (e.g. it does not allow creating multiple nets or training a net), and development of #1913 seems to have stalled. Since #1913 is already called MatCaffe2, I'll call this PR MatCaffe3 :)

This Matlab interface for Caffe implements more features than #1913, with a comparable or even smaller amount of code, by creating wrappers for caffe::Solver, caffe::Net, caffe::Layer and caffe::Blob in the Matlab interface. In this interface, almost everything that can be done in PyCaffe can also be done here (except for a MATLAB_LAYER, which can be added in the future).

This PR was originally add-only and non-invasive, leaving the old Matlab interface in place in case someone still wanted to use it. Note: the old Matlab wrapper doesn't work right now due to #1970. As suggested by @shelhamer, this PR now removes the old Matlab wrapper, but keeps the HDF5 example and modifies the image classification example to use BVLC CaffeNet.

Update message:

  • caffe/matlab/+caffe/imagenet/ilsvrc_2012_mean.mat has been updated in #2527 ("Update ilsvrc_2012_mean.mat to W x H x C, update demo and add comments") to contain mean_data in Width x Height x Channels with BGR channel order and single precision, which makes it consistent with the data format Caffe supports. Previously the image_mean in ilsvrc_2012_mean.mat was in Height x Width x Channels with BGR, which was neither Matlab's image format nor Caffe's data format, but somewhere in the middle, and inconsistent with read_mean.
  • In this new interface, I did not provide specific mean subtraction functions, image preparation functions or image classification functions, as these depend on which method you are using. For mean subtraction, some methods do image mean subtraction (e.g. CaffeNet) and some do channel mean subtraction (e.g. VGG Net). For taking crops, CaffeNet first resizes to 256x256 and takes 10 crops, while some methods resize the image to have min(h, w)=256 and take 10 crops from the 256x256 central region, and some first resize to min(h, w)=256 and take 10 crops from the 4 corners + center (and flips) of the resized image instead of its 256x256 central region. Since the approach to preparing input can be arbitrary, instead of providing a series of such image preparation functions, I decided to leave that to users and illustrate it in caffe/matlab/demo/classification_demo.m for CaffeNet.
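The two mean subtraction styles mentioned above differ only in the shape of what is subtracted. The following is a minimal sketch in plain Matlab on synthetic data; the variable names and the numeric mean values are placeholders chosen for illustration, not values taken from CaffeNet or VGG.

```matlab
% Synthetic stand-ins (assumptions): a 256 x 256 x 3 single-precision image
% in W x H x C layout with BGR channel order, and a mean image of equal size.
im_data    = 255 * rand(256, 256, 3, 'single');
mean_image = 100 * ones(256, 256, 3, 'single');

% Image mean subtraction: subtract a full per-pixel mean image.
im_minus_image_mean = im_data - mean_image;

% Channel mean subtraction: subtract one scalar per channel (B, G, R order).
% These three numbers are placeholders, not the published means of any model.
channel_mean = single([100, 110, 120]);
im_minus_channel_mean = bsxfun(@minus, im_data, reshape(channel_mean, 1, 1, 3));
```

bsxfun is used so that the sketch also runs on Matlab versions without implicit expansion.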

MATLAB

The MATLAB interface -- matcaffe -- is the caffe package in caffe/matlab, with which you can integrate Caffe into your Matlab code.

In MatCaffe, you can

  • Create multiple Nets in Matlab
  • Do forward and backward computation
  • Access any layer within a network, and any parameter blob in a layer
  • Get and set data or diff of any blob within a network, not restricted to input blobs or output blobs
  • Save a network's parameters to file, and load parameters from file
  • Reshape a blob and reshape a network
  • Edit network parameters and do network surgery
  • Create multiple Solvers in Matlab for training
  • Resume training from solver snapshots
  • Access train net and test nets in a solver
  • Run for a certain number of iterations and give back control to Matlab
  • Intermingle arbitrary Matlab code with gradient steps

An ILSVRC image classification demo is in caffe/matlab/demo/classification_demo.m (you need to download BVLC CaffeNet from Model Zoo to run it).

Build MatCaffe

Build MatCaffe with make all matcaffe. After that, you may test it using make mattest.

Common issue: if you run into error messages like libstdc++.so.6:version 'GLIBCXX_3.4.15' not found during make mattest, then it usually means that your Matlab's runtime libraries do not match your compile-time libraries. You may need to do the following before you start Matlab:

export LD_LIBRARY_PATH=/opt/intel/mkl/lib/intel64:/usr/local/cuda/lib64
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6

Or the equivalent based on where things are installed on your system, and do make mattest again to see if the issue is fixed. Note: this issue is sometimes more complicated since during its startup Matlab may overwrite your LD_LIBRARY_PATH environment variable. You can run !ldd ./matlab/+caffe/private/caffe_.mexa64 (the mex extension may differ on your system) in Matlab to see its runtime libraries, and preload your compile-time libraries by exporting them to your LD_PRELOAD environment variable.

After successful building and testing, add this package to Matlab search PATH by starting matlab from caffe root folder and running the following commands in Matlab command window.

addpath ./matlab

You can save your Matlab search PATH by running savepath so that you don't have to run the command above again every time you use MatCaffe.

Use MatCaffe

MatCaffe is very similar to PyCaffe in usage.

The examples below show detailed usage and assume you have downloaded BVLC CaffeNet from Model Zoo and started matlab from the caffe root folder.

model = './models/bvlc_reference_caffenet/deploy.prototxt';
weights = './models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel';

Set mode and device

Mode and device should always be set BEFORE you create a net or a solver.

Use CPU:

caffe.set_mode_cpu();

Use GPU and specify its gpu_id:

caffe.set_mode_gpu();
caffe.set_device(gpu_id);

Create a network and access its layers and blobs

Create a network:

net = caffe.Net(model, weights, 'test'); % create net and load weights

Or

net = caffe.Net(model, 'test'); % create net but not load weights
net.copy_from(weights); % load weights

which creates a net object as

  Net with properties:

           layer_vec: [1x23 caffe.Layer]
            blob_vec: [1x15 caffe.Blob]
              inputs: {'data'}
             outputs: {'prob'}
    name2layer_index: [23x1 containers.Map]
     name2blob_index: [15x1 containers.Map]
         layer_names: {23x1 cell}
          blob_names: {15x1 cell}

The two containers.Map objects are useful to find the index of a layer or a blob by its name.
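As a rough illustration of how such a name-to-index lookup behaves, the sketch below builds a small containers.Map by hand in plain Matlab. The layer names and the variable name2index are hypothetical stand-ins; on a real net you would query net.name2layer_index or net.name2blob_index instead.

```matlab
% Hypothetical subset of layer names, in network order.
layer_names = {'conv1', 'relu1', 'pool1'};

% Map each name to its 1-based position, mirroring what a name-to-index map holds.
name2index = containers.Map(layer_names, 1:numel(layer_names));

idx     = name2index('pool1');        % look up an index by name
has_fc8 = isKey(name2index, 'fc8');   % check for a name before indexing
```

Checking with isKey first avoids the error that containers.Map raises for a missing key.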

You have access to every blob in this network. To fill blob 'data' with all ones:

net.blobs('data').set_data(ones(net.blobs('data').shape));

To multiply all values in blob 'data' by 10:

net.blobs('data').set_data(net.blobs('data').get_data() * 10);

Be aware that since Matlab is 1-indexed and column-major, the usual 4 blob dimensions in Matlab are [width, height, channels, num], and width is the fastest dimension. Also be aware that images are in BGR channels. Also, Caffe uses single-precision float data. If your data is not single, set_data will automatically convert it to single.
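To make "width is the fastest dimension" concrete, the sketch below uses plain Matlab (no Caffe calls) to compute the column-major linear index of a few elements of a hypothetical tiny blob shape. The shape and variable names are chosen only for illustration.

```matlab
% A hypothetical tiny blob shape: width 4, height 3, channels 2, num 1.
W = 4; H = 3; C = 2; N = 1;
blob_shape = [W, H, C, N];

% Linear (memory-order) positions; sub2ind is 1-based like all Matlab indexing.
idx_first  = sub2ind(blob_shape, 1, 1, 1, 1);  % 1
idx_next_w = sub2ind(blob_shape, 2, 1, 1, 1);  % 2: one step along width moves 1 element
idx_next_h = sub2ind(blob_shape, 1, 2, 1, 1);  % 5: one step along height moves W elements
idx_next_c = sub2ind(blob_shape, 1, 1, 2, 1);  % 13: one step along channels moves W*H elements
```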

You also have access to every layer, so you can do network surgery. For example, to multiply conv1 parameters by 10:

net.params('conv1', 1).set_data(net.params('conv1', 1).get_data() * 10); % set weights
net.params('conv1', 2).set_data(net.params('conv1', 2).get_data() * 10); % set bias

Alternatively, you can use

net.layers('conv1').params(1).set_data(net.layers('conv1').params(1).get_data() * 10);
net.layers('conv1').params(2).set_data(net.layers('conv1').params(2).get_data() * 10);

To save the network you just modified:

net.save('my_net.caffemodel');

To get a layer's type (string):

layer_type = net.layers('conv1').type;

Forward and backward

Forward pass can be done using net.forward or net.forward_prefilled. Function net.forward takes in a cell array of N-D arrays containing data of input blob(s) and outputs a cell array containing data from output blob(s). Function net.forward_prefilled uses existing data in input blob(s) during forward pass, takes no input and produces no output. After creating some data for input blobs like data = rand(net.blobs('data').shape); you can run

res = net.forward({data});
prob = res{1};

Or

net.blobs('data').set_data(data);
net.forward_prefilled();
prob = net.blobs('prob').get_data();

Backward is similar using net.backward or net.backward_prefilled and replacing get_data and set_data with get_diff and set_diff. After creating some gradients for output blobs like prob_diff = rand(net.blobs('prob').shape); you can run

res = net.backward({prob_diff});
data_diff = res{1};

Or

net.blobs('prob').set_diff(prob_diff);
net.backward_prefilled();
data_diff = net.blobs('data').get_diff();

However, the backward computation above doesn't get correct results, because Caffe decides that the network does not need backward computation. To get correct backward results, you need to set 'force_backward: true' in your network prototxt.

After performing forward or backward pass, you can also get the data or diff in internal blobs. For example, to extract pool5 features after forward pass:

pool5_feat = net.blobs('pool5').get_data();

Reshape

Assume you want to run 1 image at a time instead of 10:

net.blobs('data').reshape([227 227 3 1]); % reshape blob 'data'
net.reshape();

Then the whole network is reshaped, and now net.blobs('prob').shape should be [1000 1].

Training

Assuming you have created training and validation lmdbs following our ImageNet Tutorial, create a solver to train on the ILSVRC 2012 classification dataset:

solver = caffe.Solver('./models/bvlc_reference_caffenet/solver.prototxt');

which creates a solver object as

  Solver with properties:

          net: [1x1 caffe.Net]
    test_nets: [1x1 caffe.Net]

To train:

solver.solve();

Or train for only 1000 iterations (so that you can do something to its net before training more iterations)

solver.step(1000);

To get iteration number:

iter = solver.iter();

To get its network:

train_net = solver.net;
test_net = solver.test_nets(1);

To resume from a snapshot "your_snapshot.solverstate":

solver.restore('your_snapshot.solverstate');

Input and output

The caffe.io class provides basic input functions load_image and read_mean. For example, to read the ILSVRC 2012 mean file (assuming you have downloaded the imagenet example auxiliary files by running ./data/ilsvrc12/get_ilsvrc_aux.sh):

mean_data = caffe.io.read_mean('./data/ilsvrc12/imagenet_mean.binaryproto');

To read Caffe's example image and resize it to [width, height], supposing we want width = 256; height = 256;

im_data = caffe.io.load_image('./examples/images/cat.jpg');
im_data = imresize(im_data, [width, height]); % resize using Matlab's imresize

Keep in mind that width is the fastest dimension and channels are BGR, which is different from the usual way that Matlab stores an image. If you don't want to use caffe.io.load_image and prefer to load an image by yourself, you can do

im_data = imread('./examples/images/cat.jpg'); % read image
im_data = im_data(:, :, [3, 2, 1]); % convert from RGB to BGR
im_data = permute(im_data, [2, 1, 3]); % permute width and height
im_data = single(im_data); % convert to single precision
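Going the other way can be handy when you want to display data taken from a blob. The sketch below applies the same steps in reverse on a small synthetic image and checks that the round trip is lossless. The array im stands in for the output of imread, and all variable names are illustrative.

```matlab
% Synthetic H x W x 3 RGB uint8 image (assumption: stands in for imread output).
im = uint8(randi([0 255], 4, 6, 3));   % height 4, width 6

% Forward: Matlab image -> Caffe layout (W x H x C, BGR, single).
im_data = im(:, :, [3, 2, 1]);          % RGB -> BGR
im_data = permute(im_data, [2, 1, 3]);  % swap height and width
im_data = single(im_data);              % convert to single precision

% Inverse: Caffe layout -> Matlab image (H x W x C, RGB, uint8).
im_back = permute(im_data, [2, 1, 3]);  % swap width and height back
im_back = im_back(:, :, [3, 2, 1]);     % BGR -> RGB
im_back = uint8(im_back);               % exact here, since values are integers in 0..255
```

The inverse is only exact for unmodified pixel values; after mean subtraction you would need to add the mean back before casting to uint8.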

Also, you may take a look at caffe/matlab/demo/classification_demo.m to see how to prepare input by taking crops from an image.

We show in caffe/matlab/hdf5creation how to read and write HDF5 data with Matlab. We do not provide extra functions for data output as Matlab itself is already quite powerful in output.

Clear nets and solvers

Call caffe.reset_all() to clear all solvers and stand-alone nets you have created.

@ronghanghu ronghanghu force-pushed the matcaffe3 branch 8 times, most recently from 4181b6e to 10b0f85 Compare May 25, 2015 03:14
@ducha-aiki
Contributor

Great job!

@ronghanghu ronghanghu force-pushed the matcaffe3 branch 9 times, most recently from 2a9d33d to 3a9b8e5 Compare May 26, 2015 00:52
@shelhamer
Member

Thanks Ronghang for the excellent improvements and thorough PR message!

@s-gupta @bharath272 could you double-check this when you have a chance? Once that's done we can merge.

@shelhamer
Member

I did a quick check and this builds + passes the matcaffe tests with matlab2014a. Good work!

A more thorough check by Saurabh and Bharath will come soon, but until then let's consider finishing details @ronghanghu.

  • Drop the old matcaffe? This is strictly better and the old interface is losing compatibility with the core.
  • Add the thorough description in this PR as an example or docs? It could go in the docs/tutorial/interfaces.md or a few m files in caffe/matlab

@ronghanghu ronghanghu force-pushed the matcaffe3 branch 7 times, most recently from 944a9b5 to 28f45ad Compare May 28, 2015 09:17
@ronghanghu
Member Author

@shelhamer The old matcaffe has been removed but the HDF5 example is kept. Also, an image classification demo is provided. PR message is added into docs/tutorial/interfaces.md

Could you add label MATLAB to this PR?

shelhamer added a commit that referenced this pull request May 29, 2015
MatCaffe: overhaul and improve the MATLAB interface
@shelhamer shelhamer merged commit ae4a5b1 into BVLC:master May 29, 2015
@ronghanghu
Member Author

Great!

@Trekky12

In the old interface it was necessary to subtract the mean image from the input data. Is this still necessary with the functions read_mean and load_image, or can you just pass the im_data to the net's forward pass?

Is the following code correct?

model = './models/bvlc_reference_caffenet/deploy.prototxt';
weights = './models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel';
mean_data = caffe.io.read_mean('./data/ilsvrc12/imagenet_mean.binaryproto');
im_data = caffe.io.load_image('./examples/images/cat.jpg');
net = caffe.Net(model, weights, 'test'); % create net and load weights
res = net.forward({im_data});
prob = res{1};

Or can you please adjust the classification_demo.m to the new functions caffe.io.read_mean and caffe.io.load_image?

Thank you for your help!

@ronghanghu
Member Author

Hi, @Trekky12

Your code is NOT correct. You still need to do mean subtraction. You can prepare input images the same way you did with the old Matlab wrapper, or you can load an image and convert it to W x H x 3 BGR single format using caffe.io.load_image, and then subtract the mean_data returned by caffe.io.read_mean.

So the correct code is:

model = './models/bvlc_reference_caffenet/deploy.prototxt';
weights = './models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel';   
net = caffe.Net(model, weights, 'test'); % create net and load weights

% caffe.io.load_image returns W x H x 3, BGR image data
im_data = caffe.io.load_image('./examples/images/cat.jpg');
im_data = imresize(im_data, [256 256]); % resize to 256 x 256

% caffe.io.read_mean returns W x H x 3, BGR mean data
mean_data = caffe.io.read_mean('./data/ilsvrc12/imagenet_mean.binaryproto');
im_data = im_data - mean_data; % subtract mean

% take 10 crops
im_crops = zeros(227, 227, 3, 10, 'single');
im_crops(:, :, :, 1) = im_data( 1:227,  1:227, :); % upper left
im_crops(:, :, :, 2) = im_data( 1:227, 30:256, :); % lower left
im_crops(:, :, :, 3) = im_data(30:256,  1:227, :); % upper right
im_crops(:, :, :, 4) = im_data(30:256, 30:256, :); % lower right
im_crops(:, :, :, 5) = im_data(15:241, 15:241, :); % center
im_crops(:, :, :, 6:10) = im_crops(end:-1:1, :, :, 1:5); % horizontal flip

res = net.forward({im_crops});
prob = res{1};
prob = mean(prob, 2); % take average prob over 10 crops
[max_prob, predict] = max(prob)
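If you want the top few classes rather than only the best one, the averaged scores can be sorted. Below is a minimal sketch in plain Matlab on synthetic scores; the 1000 x 10 array stands in for the output of net.forward on 10 crops, and the names avg_prob, top5_idx and top5_prob are illustrative.

```matlab
% Synthetic 1000 x 10 score matrix (assumption: one column per crop),
% normalized so that each column sums to 1 like a softmax output.
prob = rand(1000, 10, 'single');
prob = bsxfun(@rdivide, prob, sum(prob, 1));

avg_prob = mean(prob, 2);                         % average over the 10 crops

% Sort classes by averaged score, highest first, and keep the top 5.
[sorted_prob, sorted_idx] = sort(avg_prob, 'descend');
top5_idx  = sorted_idx(1:5);                      % 1-based class indices
top5_prob = sorted_prob(1:5);
```

Because Matlab indices are 1-based, a class index k here corresponds to label k - 1 if the labels in your dataset start at 0.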

For mean subtraction, note that caffe/matlab/+caffe/imagenet/ilsvrc_2012_mean.mat, which is inherited from the old wrapper, contains the image_mean variable in height x width x 3, BGR channels. However, caffe.io.read_mean('./data/ilsvrc12/imagenet_mean.binaryproto'), which is also inherited from the old wrapper, returns mean_data in width x height x 3, also BGR channels, which is consistent with the way a Caffe blob stores data. (This discrepancy has since been addressed in #2527.)

This is the same as the old interface, where you could load the mean either with caffe('read_mean', path_to_your_mean_protobin) or by using image_mean in ilsvrc_2012_mean.mat (but be aware of the difference mentioned above). In the new interface caffe('read_mean', path_to_your_mean_protobin) has been replaced by caffe.io.read_mean(path_to_your_mean_protobin). The ilsvrc_2012_mean.mat is provided (inherited) at caffe/matlab/+caffe/imagenet/ilsvrc_2012_mean.mat in case you didn't (or don't want to) download the imagenet example auxiliary files. But for other mean files created by Caffe's compute_image_mean tool, you should use caffe.io.read_mean(path_to_your_mean_protobin). (Addressed in #2527.)

I did not provide specific mean subtraction or image preparation functions, as these depend on which method you are using. For example, some methods do image mean subtraction and some do channel mean subtraction (e.g. VGG). For taking crops, CaffeNet first resizes to 256x256 and takes 10 crops. However, some methods resize to min(h, w)=256 and take 10 crops from the 256x256 central region, and some first resize to min(h, w)=256 and take 10 crops from the 4 corners + center (and flips). Instead of providing a series of such image preparation functions, I decided to leave that to users and illustrate it in caffe/matlab/demo/classification_demo.m for CaffeNet.

Since the approach to preparing input can be arbitrary, if you use another method with its own way of preparing input, e.g. resizing to have min(h, w) = 256 and taking crops from the center + 4 corners, you should implement it yourself.
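As one possible starting point for such a custom scheme, the sketch below works out the target size for a "short side = 256" resize and the indices of the 256 x 256 central region, in W x H layout. It is an illustration under stated assumptions, not code from this PR: the input size is hypothetical, and a zero array stands in for the resized image, which in practice would come from something like imresize(im_data, [new_w new_h]).

```matlab
% Hypothetical original image size in W x H layout.
w = 500; h = 375;
short_side = 256;

% Scale so that the shorter side becomes 256, keeping the aspect ratio.
scale = short_side / min(w, h);
new_w = max(short_side, round(w * scale));
new_h = max(short_side, round(h * scale));

% Stand-in for the resized image (assumption: real code would resize im_data here).
im_resized = zeros(new_w, new_h, 3, 'single');

% 1-based start offsets of the 256 x 256 central region.
x0 = floor((new_w - short_side) / 2) + 1;
y0 = floor((new_h - short_side) / 2) + 1;
im_center = im_resized(x0:x0 + short_side - 1, y0:y0 + short_side - 1, :);
```

From im_center, the 10 crops can then be taken with the same index ranges as in the CaffeNet code above.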

@ronghanghu
Member Author

@shelhamer although classification_demo.m works correctly right now, one potential issue is that the image_mean variable in ./matlab/+caffe/imagenet/ilsvrc_2012_mean.mat (inherited from the old matcaffe) is in height x width x 3, which is inconsistent with the width x height x 3 returned by caffe.io.read_mean(path_to_mean_protobin) and inconsistent with Caffe's data format. Although this issue also existed in the old Matlab wrapper, as discussed above, I feel it should be addressed somehow.

So possible solutions can be:

  1. Remove ilsvrc_2012_mean.mat and ask user to download imagenet example auxiliary files. Let users load that mean with caffe.io.read_mean. Update classification_demo.m
  2. Remove ilsvrc_2012_mean.mat and make a copy of ./data/ilsvrc12/imagenet_mean.binaryproto to ./matlab/+caffe/imagenet/ilsvrc_2012_mean.binaryproto. Update classification_demo.m
  3. Permute height and width in ilsvrc_2012_mean.mat and update classification_demo.m. Use a new variable name such as mean_data instead of image_mean in that file.
  4. Or leave everything as-is
    or other solution that you feel is the best.

I can send a PR today to address it.
This is now addressed in #2527

@ronghanghu ronghanghu deleted the matcaffe3 branch May 29, 2015 17:22
@shelhamer
Member

I like solution 3 for now. It could be that neither pycaffe nor matcaffe should have their formats of the ilsvrc mean bundled (the .mat and .npy) and instead load the auxiliary ilsvrc data, but let's make a joint change later.

At least what's there now should be consistent, so send a PR for matching up the mean dimensions with the matcaffe standard. Thanks.

@ronghanghu
Member Author

@shelhamer addressed in #2527

@jeffdonahue
Contributor

Is there any reason to prefix subdirectories with + (as in ./matlab/+caffe)? It's inconsistent with our other directory structures, and in general using punctuation other than . in filenames just tends to create problems (though maybe + specifically happens to be fine).

@shelhamer
Member

@jeffdonahue I was confused by that too, but it turns out that's how a MATLAB package is designated. It namespaces all the functions so one can call caffe.set_device() and the like instead of caffe('method_name', args, ...).

@jeffdonahue
Contributor

Ah, MATLAB... but got it, thanks for the explanation @shelhamer.

@shelhamer
Member

Yeah, leave it to MATLAB to have the One True Plus.

@ronghanghu
Member Author

@jeffdonahue @shelhamer I didn't like the plus sign, but had to follow the matlab way to build a package. See
http://www.mathworks.com/help/matlab/matlab_oop/scoping-classes-with-packages.html

Also, the folder name "private" in ./matlab/+caffe/private is also a Matlab functionality to seal functions:
http://www.mathworks.com/help/matlab/matlab_prog/private-functions.html

@shelhamer
Member

No worries @ronghanghu. You made the right choice turning it into a package.
