CoreML Model conversion tutorial

In June 2017, Apple released a brand new framework called CoreML, which allows developers to build more intelligent apps using machine learning. The framework works with its own model type, the mlmodel, which runs on any Apple OS device. This article will guide you through deploying a pre-trained convolutional neural network (CNN) that performs face segmentation in an iOS application using CoreML.

Background

On June 4th, 2018, Apple announced CreateML, which makes it possible to train CoreML models directly. However, this framework is still in beta and will only be released in fall 2018, so for now it is not possible to train a CoreML model directly. Pre-trained models (trained with Caffe, PyTorch, etc.) therefore have to be converted into CoreML models and then embedded into an iOS app. Apple provides coremltools, a Python library containing the algorithms and functions needed to convert pre-trained models to CoreML. In our case, we trained our face segmentation model with the PyTorch framework and want to convert it to CoreML.

However, coremltools doesn't support PyTorch models. One possible workaround is to go through ONNX (Open Neural Network Exchange), a model format designed for interoperability between different machine learning frameworks. It was initially developed by Microsoft and Facebook, later joined by Amazon, and counts AMD, Nvidia, and IBM among its partners. ONNX provides the definition of an extensible computation graph model, as well as definitions of built-in operators and standard data types, and is supported by a growing set of frameworks, converters, runtimes, compilers, and visualizers.
We found an ONNX-to-CoreML converter on GitHub, so the plan is to (i) convert the PyTorch model to an ONNX model, and then (ii) convert the ONNX model to a CoreML model. All of the conversion code is written in Python 3.6 with PyTorch 0.4.0, coremltools 0.8, and onnx-coreml 0.1.1.

PyTorch model to ONNX model

The ONNX export module is integrated into PyTorch (torch.onnx) and allows us to export a PyTorch model to an ONNX one, so this part should not be the trickiest. First, we load the PyTorch model we want to convert and create a dummy input that acts as a stand-in for the model's real input. To do so, we instantiate a variable with the same shape as the model's input; in our case a 3x256x256 array, i.e. an RGB image. If the PyTorch model doesn't use any special layers or operators, getting the ONNX-translated model is straightforward.
Fig 1 : dummy_input variable, of the same format as the model's input, which will be passed to the convert function
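To make this step concrete, here is a minimal sketch of the export; the FaceSegmentationNet class and the file names are placeholders for illustration, not the actual project code:

```python
import torch
from torch.autograd import Variable

# Placeholder: replace with the actual model class and trained weights
from model import FaceSegmentationNet

# Load the trained PyTorch model and switch it to inference mode
model = FaceSegmentationNet()
model.load_state_dict(torch.load("face_segmentation.pth", map_location="cpu"))
model.eval()

# Dummy input with the same shape as the real input: a 3x256x256 RGB image,
# plus a batch dimension
dummy_input = Variable(torch.randn(1, 3, 256, 256))

# The dummy input is traced through the network to build the ONNX graph
torch.onnx.export(model, dummy_input, "face_segmentation.onnx", verbose=True)
```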

Unfortunately, when we export the model to ONNX, some of the operators used in the PyTorch model are not supported. This is the case for hardtanh, which is introduced by the ReLU6 layers of the well-known MobileNet architecture (ref: https://arxiv.org/abs/1704.04861). To convert these operations, we had to work around some tricky parts and add methods and functionality to the PyTorch and ONNX source code. Thankfully, ONNX provides a tutorial on adding export support for unsupported operators. For example, the ATen operator behind HardTanh can be standardized in ONNX: we just need to define a symbolic function in a source file so the exporter knows how to express the HardTanh operator as an ONNX node.

ONNX also provides a checker that verifies whether an ONNX model is valid. In our case, the checker throws an error because of the HardTanh operator, since it is only translated into an ONNX-readable node rather than truly converted to a standard ONNX operator. The tricky consequence is that even the translated ONNX model might not run correctly on its own. This is not a hurdle for us, though, since the ONNX model is only a bridge between PyTorch and CoreML.
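To give an idea of what this looks like in practice, here is a sketch of such a symbolic function; the exact signature and the file it belongs in depend on the PyTorch version, so treat it as an illustration rather than the code we shipped:

```python
# Sketch of a symbolic function for hardtanh, written in the style of torch/onnx/symbolic.py.
# It tells the exporter to emit a "Hardtanh" node carrying the clamp bounds as
# attributes instead of failing on the unsupported ATen operator.
def hardtanh(g, input, min_val, max_val, inplace=False):
    return g.op("Hardtanh", input, min_value_f=min_val, max_value_f=max_val)
```

The exported graph can then be passed to the ONNX checker, which is where the non-standard HardTanh node gets flagged:

```python
import onnx

onnx_model = onnx.load("face_segmentation.onnx")
# Raises an exception if the graph contains invalid or unknown operators;
# in our case this is triggered by the "Hardtanh" node
onnx.checker.check_model(onnx_model)
```
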
ONNX model to CoreML model

The next step is to convert the ONNX model to CoreML. Here we faced a similar issue: the conversion cannot handle the HardTanh operator because it doesn't exist in CoreML. We solved this by declaring HardTanh as a custom layer and handling it during conversion with the following code:

Fig 2 : Interpret the HardTanh operator as a Custom Layer
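As a sketch of what this conversion function could look like (the attribute names and default values are assumptions; the actual code shown in the figure may differ):

```python
from coremltools.proto import NeuralNetwork_pb2

def convert_hardtanh(node):
    """Describe an ONNX Hardtanh node as a CoreML custom layer."""
    params = NeuralNetwork_pb2.CustomLayerParams()
    # Name of the class that will implement the layer in the iOS app
    params.className = "HardTanh"
    params.description = "Hardtanh activation clamping its input between min_value and max_value"
    # Forward the clamp bounds so the iOS implementation can read them back
    params.parameters["min_value"].doubleValue = node.attrs.get("min_value", -1.0)
    params.parameters["max_value"].doubleValue = node.attrs.get("max_value", 1.0)
    return params
```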

In this way, whenever the converter (provided by the onnx-coreml library) finds a HardTanh operator in the model, it replaces it with the custom layer described by the convert_hardtanh function, which will later have to be implemented on the CoreML side. In that function we pass the converter the custom layer information, including its two parameters, "min_value" and "max_value".
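For completeness, registering that function with the converter might look like the following; the add_custom_layers and custom_conversion_functions parameter names follow the onnx-coreml documentation for later releases and are an assumption here:

```python
from onnx_coreml import convert

# Ask the converter to emit a custom layer whenever it meets an operator it
# cannot map to a built-in CoreML layer, and tell it how to describe HardTanh
coreml_model = convert(
    onnx_model,
    add_custom_layers=True,
    custom_conversion_functions={"Hardtanh": convert_hardtanh},
)
coreml_model.save("FaceSegmentation.mlmodel")
```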

Following this tutorial, we were able to get our CoreML model by defining HardTanh as a custom layer. Once the CoreML model is embedded in an iOS project, it needs a HardTanh class to know which operations to perform for this custom layer. In this class we implemented the HardTanh function:
Fig 3 : Definition of the HardTanh function
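For reference, HardTanh simply clamps its input between two bounds:

HardTanh(x) = minValue if x < minValue, x if minValue <= x <= maxValue, maxValue if x > maxValue

with minValue = -1 and maxValue = 1 for the standard HardTanh function.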

Fig 4 : Graphical representation of the HardTanh function

Fig 5 : Swift implementation of the HardTanh function, with the definition-based algorithm (left) and a simplified algorithm (right), where x is the input, y the result, and minValue and maxValue the parameters (-1 and 1 for the standard HardTanh function, as in figure 3)

Test and performance comparison

We tested the converted model on both iOS and macOS and compared the performance.


Fig 6 : Face segmentation obtained by the PyTorch model (left) and by the converted CoreML model on an iOS device (right)


For each model, the results are similar to those of the Python application. We ran some benchmarks to compare the runtime (the time spent in the prediction call) of each model under several configurations.
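As a rough illustration, a simple timing loop of this kind can be used to measure the PyTorch-side runtime; this is only a sketch, not the exact benchmark code behind the charts below:

```python
import time
import torch

def benchmark(model, input_tensor, runs=100):
    """Return the total time in seconds spent in `runs` prediction calls."""
    model.eval()
    with torch.no_grad():
        start = time.time()
        for _ in range(runs):
            model(input_tensor)
        return time.time() - start

# Example: total runtime of 100 predictions on a random 3x256x256 image
# total_seconds = benchmark(model, torch.randn(1, 3, 256, 256))
```
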
We also ran into issues with models that contain custom layers: by default they run predictions on the CPU only, and computing the custom layers is quite slow, so a single prediction could take over 10 seconds. To run predictions on the GPU, we had to write a custom Metal file (Metal is Apple's low-level, hardware-accelerated graphics and compute API). This file tells the GPU which operations to perform when it encounters a custom layer: it contains a compute "kernel" describing the HardTanh function, which is then bound to our custom layer class.


Fig 7 : Runtime of 100 prediction requests on CoreML and PyTorch models without a custom layer

Fig 8 : Model request runtime proportions for CoreML and PyTorch models under several configurations, without a custom layer
Fig 9 : Model request runtime proportions for CoreML and PyTorch models under several configurations, with a custom layer

According to the charts above, the CoreML models are considerably faster than the PyTorch ones. There is a huge difference between running a model with a custom layer on the CPU and on the GPU (I didn't include the CPU results on macOS, as a single prediction takes more than 15 seconds), and CoreML models run on the GPU are always faster than the PyTorch models. So if we want to run a CoreML model on iOS devices, using the GPU seems mandatory; this is what allows a smooth application with a high frame rate.

We are now able to convert PyTorch models to CoreML models and embed them in iOS apps. As we have seen, this can be easy for basic models but very tricky for complex ones, because the libraries we used are still under development and do not yet support every PyTorch feature.
