A simple tutorial about Caffe-TensorFlow model conversion

17 Apr, 2018

Introduction

Because Caffe is a good deep learning framework, many pre-trained Caffe models are available, so it is useful to know how to convert them into TensorFlow models. The conversion process is tricky enough that I decided to write it down, hoping it will help others.

Note:

The source code and other related files of this tutorial can be found at: https://github.com/imWildCat/a-simple-tutorial-about-caffe-tensorflow-model-conversion
The original pre-trained Caffe model in this tutorial is located at: https://github.com/choosehappy/public/tree/master/DL%20tutorial%20Code/4-lymphocyte/models

Pre-requisites

Operating System: macOS or Linux
Protobuf library
CMake 2.8 or newer
Python 2.7 (required by caffe-tensorflow, better if you could use virtualenv)
TensorFlow 1.x installed (tested with TensorFlow 1.7)

Note: this tutorial does not require Caffe to be installed unless you want to convert the mean files.

Major steps

Step 1: Upgrade Caffe .prototxt (optional)

Many .prototxt files use an outdated format and must be upgraded before this kind of model conversion; if yours is already up to date, you can skip this step. If you have Caffe installed, you can just use upgrade_net_proto_text (reference). However, Caffe is not easy to install on macOS, where caffe-net-upgrade is a good alternative.

You can follow the Build Instructions to build upgrade_caffe_layers. In this tutorial, we refer to the path of this executable as [path_to]/upgrade_caffe_layers. Here is an example usage:

➜ [path_to]/upgrade_caffe_layers deploy_train32.prototxt
Loading prototxt file ...
INFO: Reading the prototxt file from : /Users/wildcat/Downloads/201804/temp/caffe-tensorflow-sample-case-lymphoma/caffe-models/deploy_train32.prototxt
INFO: prototxt read successful
INFO: Network loaded is : CIFAR10_quick
INFO: Upgrading V1LayerParameter => LayerParameter
STATUS: upgrade successful.
INFO: upgraded net is written into net.prototxt
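
Before running the upgrade tool, it can help to check roughly whether a file needs upgrading at all. The sketch below is only a heuristic, assuming that legacy network definitions declare their layers in `layers { ... }` blocks while current ones use `layer { ... }`. The function name `looks_legacy` and the two sample strings are made up for illustration.

```python
import re

# Heuristic (assumption): legacy definitions use "layers { ... }" blocks,
# current definitions use "layer { ... }" blocks.
LEGACY_LAYER_BLOCK = re.compile(r"^\s*layers\s*\{", re.MULTILINE)


def looks_legacy(prototxt_text):
    """Return True if the text appears to use the legacy layer syntax."""
    return LEGACY_LAYER_BLOCK.search(prototxt_text) is not None


# Two tiny illustrative snippets (not the real model files).
old_style = 'name: "CIFAR10_quick"\nlayers {\n  name: "conv1"\n  type: CONVOLUTION\n}\n'
new_style = 'name: "CIFAR10_quick"\nlayer {\n  name: "conv1"\n  type: "Convolution"\n}\n'

print(looks_legacy(old_style))
print(looks_legacy(new_style))
```

For a real file, you would pass the contents of the .prototxt (for example `open('deploy_train32.prototxt').read()`) to the function. A negative result does not guarantee that the file is fully up to date.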

Step 2: Convert the model and the mean file

Convert the model

Here we will use caffe-tensorflow for the model conversion. One complication is that the original caffe-tensorflow repository is no longer maintained, so we use a fork instead: https://github.com/dhaase-de/caffe-tensorflow-python3 . (Although the fork claims to work with Python 3, I could only get it to work with Python 2.)

After cloning the source code, you can run python ./convert.py to convert the model. For more details, please read: https://github.com/dhaase-de/caffe-tensorflow-python3#3---convert-your-model

➜ python ./convert.py /path/to/net.prototxt --caffemodel /path/to/5_caffenet_train_w32_iter_600000.caffemodel --data-output-path case_tf.npy --code-output-path case_tf.py

------------------------------------------------------------
    WARNING: PyCaffe not found!
    Falling back to a pure protocol buffer implementation.
    * Conversions will be drastically slower.
    * This backend is UNTESTED!
------------------------------------------------------------

Type                 Name                                          Param               Output
----------------------------------------------------------------------------------------------
Data                 data                                             --      (10, 3, 32, 32)
Convolution          conv1                                 (32, 3, 5, 5)     (10, 32, 32, 32)
Pooling              pool1                                            --     (10, 32, 16, 16)
ReLU                 relu1                                            --     (10, 32, 16, 16)
Convolution          conv2                                (32, 32, 5, 5)     (10, 32, 16, 16)
Pooling              pool2                                            --       (10, 32, 8, 8)
Convolution          conv3                                (64, 32, 5, 5)       (10, 64, 8, 8)
Pooling              pool3                                            --       (10, 64, 4, 4)
InnerProduct         ip1                                      (64, 1024)       (10, 64, 1, 1)
InnerProduct         ip2                                         (2, 64)        (10, 2, 1, 1)
Softmax              prob                                             --        (10, 2, 1, 1)
Converting data...
Saving data...
Saving source...
Done.

Note:

Remember to replace /path/to with the real path to the related files.
net.prototxt and 5_caffenet_train_w32_iter_600000.caffemodel are the model files used in my case; feel free to change them.
case_tf.npy stores the weights (parameters) and case_tf.py stores the neural network architecture.
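
To check what the conversion produced, you can inspect the weights file directly. The sketch below assumes that case_tf.npy holds a pickled dictionary mapping each layer name to a dictionary of parameter arrays (such as weights and biases), which is how the generated loader appears to read it. `summarize_weights` and the stand-in demo file are hypothetical and used only for illustration.

```python
import os
import tempfile

import numpy as np


def summarize_weights(npy_path):
    """Map each layer name to the shapes of its parameter arrays."""
    # Assumption: the file stores a pickled dict, hence allow_pickle and .item().
    data = np.load(npy_path, allow_pickle=True, encoding='latin1').item()
    return {layer: {name: arr.shape for name, arr in params.items()}
            for layer, params in data.items()}


# Stand-in file for illustration only (these are not real converted weights).
demo = {'ip2': {'weights': np.zeros((64, 2), np.float32),
                'biases': np.zeros(2, np.float32)}}
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_tf.npy')
np.save(demo_path, demo)

summary = summarize_weights(demo_path)
print(summary)
```

With a real conversion, you would call `summarize_weights('case_tf.npy')` and compare the layer names with the table printed by convert.py.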

(Optional) Convert the mean file (Caffe needed)

Many Caffe models normalize their input with a mean file, so we must also convert the mean file to .npy and load it in TensorFlow. Otherwise, the predictions will not be correct.

# Ref: https://github.com/BVLC/caffe/issues/290#issuecomment-62846228
# Modified by WildCat
import caffe
import numpy as np
import sys

if len(sys.argv) != 3:
    print("Usage: python convert_protomean.py proto.mean out.npy")
    sys.exit(1)

# Parse the binary mean file into a Caffe blob
blob = caffe.proto.caffe_pb2.BlobProto()
with open(sys.argv[1], 'rb') as f:
    blob.ParseFromString(f.read())

# The array has shape (num, channels, height, width); keep the first mean image
arr = np.array(caffe.io.blobproto_to_array(blob))
np.save(sys.argv[2], arr[0])

In my case, the input is ./original-caffe-models/DB_train_w32_5.binaryproto and the output is mean.npy. Again, please feel free to change the path and name of the .binaryproto and mean.npy files.
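
The saved mean image keeps Caffe's channels-first layout (channels, height, width), while the TensorFlow input used later is channels-last (height, width, channels). A minimal NumPy sketch of that reordering follows. The random-free stand-in array replaces the real mean.npy, and `mean_to_hwc` is a hypothetical helper name.

```python
import numpy as np


def mean_to_hwc(mean_chw):
    """Reorder a (channels, height, width) mean image to (height, width, channels)."""
    return np.transpose(mean_chw, (1, 2, 0))


# Stand-in for np.load('mean.npy'): 3 channels, 32 x 32 pixels.
mean_chw = np.arange(3 * 32 * 32, dtype=np.float32).reshape(3, 32, 32)
mean_hwc = mean_to_hwc(mean_chw)

print(mean_chw.shape)
print(mean_hwc.shape)
```

The value at channel c, row y, column x of the original array ends up at position [y, x, c] of the result, so no pixel values change, only the axis order.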

Step 3: Finish the conversion by making predictions

import numpy as np
import tensorflow as tf
from case_tf import CIFAR10_quick

def check_correct(prob, path):
    neg_prob, pos_prob = prob
    is_pos = path.find('_p_') != -1  # find '_p_' in the file name
    
    if not is_pos and is_pos == (pos_prob > neg_prob):
        print(prob, path, 'True negative')
    
    return is_pos == (pos_prob > neg_prob)

# load the converted mean file
means = np.load('mean.npy')
mean_tensor = tf.transpose(tf.convert_to_tensor(means, dtype=tf.float32), [1, 2, 0])

def classify():
    '''Classify the given images using the converted CIFAR10_quick network.'''

    model_data_path = './case_tf.npy'

    image_file_name_pattern = './subs/*.png'
    
    NUM_OF_IMAGES = 100
    
    # according to the .prototxt
    IMAGE_SIZE = 32
    IMAGE_CHANNELS = 3
    
    # Create a placeholder for the input image
    input_node = tf.placeholder(tf.float32, shape=(None, IMAGE_SIZE, IMAGE_SIZE, IMAGE_CHANNELS))

    # Construct the network
    net = CIFAR10_quick({'data': input_node})

    # Read the image files through a filename queue
    filename_queue = tf.train.string_input_producer(tf.train.match_filenames_once(image_file_name_pattern))
    
    reader = tf.WholeFileReader()
    key, value = reader.read(filename_queue)
    
    my_img = tf.image.decode_png(value)
    
 
    with tf.Session() as sess:

        sess.run(tf.local_variables_initializer())
        sess.run(tf.global_variables_initializer())  
        
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)


        print('Load weights...')
        net.load(data_path=model_data_path, session=sess)


        image_list = []
        image_path_list = []

        print('Making predictions...')
        
        for _ in range(0, NUM_OF_IMAGES):
            # Fetch the image and its path in a single run; separate
            # sess.run calls would each read a different file from the queue.
            single_image, image_path = sess.run([my_img, key])

            # Convert the image channel order from RGB to BGR
            reversed_image = tf.reverse(single_image, [-1])
            reversed_image = tf.cast(reversed_image, tf.float32)

            final_image = tf.subtract(reversed_image, mean_tensor)

            image_list.append(final_image)
            image_path_list.append(image_path)
        
        input_images = sess.run(tf.stack(image_list))
        probs = sess.run(net.get_output(), feed_dict={input_node: input_images})
        
        acc_list = []
        # Materialize the pairs so they can be iterated twice and sliced
        predictions = list(zip(probs, image_path_list))
        for prob, path in predictions:
            acc_list.append(check_correct(prob, path))
            
        print('accuracy: {}'.format(acc_list.count(True) / float(len(acc_list))))
        
        for prob, path in predictions[:20]:
            print('Image: {}, prob: {}'.format(path, prob))

        coord.request_stop()
        coord.join(threads, stop_grace_period_secs=2)

if __name__ == '__main__':
    classify()
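
The per-image preprocessing in the loop above (channel reversal, cast, and mean subtraction) can also be sketched in plain NumPy, which avoids creating new graph operations for every image. This is only an illustration under assumptions: `preprocess` is a hypothetical helper, the mean is assumed to be already in (height, width, channels) layout with BGR channel order, and the arrays below are synthetic stand-ins for a decoded PNG and the converted mean.

```python
import numpy as np


def preprocess(image_rgb, mean_bgr_hwc):
    """Reverse RGB to BGR, cast to float32 and subtract the mean image."""
    image_bgr = image_rgb[..., ::-1].astype(np.float32)
    return image_bgr - mean_bgr_hwc


# Synthetic stand-ins: a pure red 32 x 32 RGB image and a constant mean of 1.
image = np.zeros((32, 32, 3), dtype=np.uint8)
image[..., 0] = 255
mean = np.ones((32, 32, 3), dtype=np.float32)

result = preprocess(image, mean)
print(result.shape)
print(result[0, 0])
```

After the reversal, the red value sits in the last channel, which is where a BGR-trained model expects it.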

Note:

Parts of the converted code (case_tf.py) might not be correct. For example, you may need to quote the layer names, changing .conv(5, 5, 32, 1, 1, relu=False, name=conv1) to .conv(5, 5, 32, 1, 1, relu=False, name='conv1').
We have to convert the image channel order from RGB to BGR because the original Caffe model was trained with the BGR convention used by OpenCV:
```
reversed_image = tf.reverse(single_image, [-1])
```
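
If the generated case_tf.py contains many unquoted layer names, fixing them by hand gets tedious. The following sketch shows one possible way to quote them with a regular expression. It assumes the only problem is a bare identifier after `name=`. `quote_layer_names` is a hypothetical helper, and you should review the result before using it.

```python
import re

# Match "name=" followed by a bare identifier (not already quoted).
UNQUOTED_NAME = re.compile(r"\bname=(?!['\"])([A-Za-z_][A-Za-z0-9_/]*)")


def quote_layer_names(source):
    """Wrap bare layer names after 'name=' in single quotes."""
    return UNQUOTED_NAME.sub(r"name='\1'", source)


broken = ".conv(5, 5, 32, 1, 1, relu=False, name=conv1)"
fixed = quote_layer_names(broken)
print(fixed)
```

Names that are already quoted are left untouched, so running the helper twice gives the same result as running it once.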

After this step, you should be able to run the model successfully.

Conclusion

Converting a Caffe model to TensorFlow is a time-consuming task, even though this article is not long. I hope it helps you deal with this kind of problem.