TensorFlow is one of the most popular machine learning frameworks; it lets us build various models with little effort. There are several ways to put these models into production, such as exposing them through a web service API, and this article will introduce how to build model prediction APIs with TensorFlow’s SavedModel mechanism.
Iris DNN Estimator
First, let’s build the famous iris classifier with TensorFlow’s pre-made DNN estimator. A full walkthrough can be found on TensorFlow’s website (Premade Estimators), and I have created a repository on GitHub (iris_dnn.py) for you to fork and work with. Here’s the gist of training the model:
```python
feature_columns = [tf.feature_column.numeric_column(key=key)
                   for key in train_x.keys()]
```
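The rest of the training step follows the Premade Estimators tutorial; here is a rough sketch, where `train_x`, `train_y`, and `train_input_fn` are helpers from that tutorial (assumed, not defined here):

```python
# Build a DNN with two hidden layers for the three iris classes.
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 10],
    n_classes=3)

# Train on the iris training set via an input function.
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, batch_size=100),
    steps=1000)
```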
Export as SavedModel
TensorFlow provides the SavedModel utility to let us export the trained model for future prediction and serving. `Estimator` exposes an `export_savedmodel` method, which takes two arguments: the export directory and a receiver function. The latter defines what kind of input data the exported model accepts. Usually we will use TensorFlow’s `Example` type, which contains the features of one or more items. For instance, an iris data item can be defined as an `Example` proto like the one sketched below.
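Here is a sketch of such a proto built with the `tf.train` message types; the feature names are the iris columns, and the values are a single sample measurement:

```python
import tensorflow as tf

example = tf.train.Example(features=tf.train.Features(feature={
    'SepalLength': tf.train.Feature(float_list=tf.train.FloatList(value=[5.1])),
    'SepalWidth':  tf.train.Feature(float_list=tf.train.FloatList(value=[3.3])),
    'PetalLength': tf.train.Feature(float_list=tf.train.FloatList(value=[1.7])),
    'PetalWidth':  tf.train.Feature(float_list=tf.train.FloatList(value=[0.5])),
}))
```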
The receiver function needs to parse the incoming serialized `Example` object into a map of tensors for the model to consume. TensorFlow provides some utility functions to help build it: we first turn the `feature_columns` array into a parsing specification (a map of feature name to `Feature` spec), and then use it to build the receiver function, as sketched below.
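A sketch of these two steps plus the export call, assuming the TensorFlow 1.x utilities `make_parse_example_spec` and `build_parsing_serving_input_receiver_fn` (the `export` directory name is only an example):

```python
# Parsing spec: maps each feature name to a parsing configuration.
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)

# Receiver function that parses serialized Example protos with that spec.
serving_input_receiver_fn = (
    tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec))

# Export the trained estimator as a SavedModel.
classifier.export_savedmodel('export', serving_input_receiver_fn)
```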
Inspect SavedModel with CLI Tool
Each export creates a timestamped directory containing the graph definition and variables of the trained model.
```
export/1524907728/saved_model.pb
```
TensorFlow provides a command line tool to inspect the exported model, or even run predictions with it.
```
$ saved_model_cli show --dir export/1524906774 \
    --tag_set serve --signature_def serving_default
```
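The CLI can also run a prediction directly. A sketch, where the `predict` signature name and its `examples` input key are assumptions to verify with the `show` command first:

```
$ saved_model_cli run --dir export/1524906774 \
    --tag_set serve --signature_def predict \
    --input_examples 'examples=[{"SepalLength":[5.1],"SepalWidth":[3.3],"PetalLength":[1.7],"PetalWidth":[0.5]}]'
```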
Serve SavedModel with contrib.predictor
The `contrib.predictor` package provides a convenient method, `from_saved_model`, which loads the model from the export directory and builds a predict function from it.
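A sketch of how the predictor can be used; the export path reuses the timestamped directory from the CLI example above, and the `inputs` feed key assumes the default classification signature (verify the actual keys with `saved_model_cli`):

```python
import tensorflow as tf
from tensorflow.contrib import predictor

# Load model from export directory, and make a predict function.
predict_fn = predictor.from_saved_model('export/1524906774')

def serialize(sl, sw, pl, pw):
    # Wrap one iris measurement in a serialized tf.train.Example.
    feature = {name: tf.train.Feature(float_list=tf.train.FloatList(value=[v]))
               for name, v in zip(
                   ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'],
                   [sl, sw, pl, pw])}
    return tf.train.Example(
        features=tf.train.Features(feature=feature)).SerializeToString()

examples = [serialize(5.1, 3.3, 1.7, 0.5),
            serialize(5.9, 3.0, 4.2, 1.5),
            serialize(6.9, 3.1, 5.4, 2.1)]
predictions = predict_fn({'inputs': examples})
print(predictions)
```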
We can tidy up the prediction outputs to make the result clearer:
| SepalLength | SepalWidth | PetalLength | PetalWidth | ClassID | Probability |
|---|---|---|---|---|---|
| 5.1 | 3.3 | 1.7 | 0.5 | 0 | 0.998268 |
| 5.9 | 3.0 | 4.2 | 1.5 | 1 | 0.997769 |
| 6.9 | 3.1 | 5.4 | 2.1 | 2 | 0.951286 |
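For reference, one way to assemble such a table from the `predictions` dict in the sketch above, assuming the `scores` key of the default classification signature (other signatures expose different keys):

```python
import numpy as np
import pandas as pd

measurements = [[5.1, 3.3, 1.7, 0.5],
                [5.9, 3.0, 4.2, 1.5],
                [6.9, 3.1, 5.4, 2.1]]
df = pd.DataFrame(measurements,
                  columns=['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'])

# Class id is the argmax over per-class scores; probability is the max score.
scores = predictions['scores']
df['ClassID'] = np.argmax(scores, axis=1)
df['Probability'] = np.max(scores, axis=1)
print(df)
```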
Under the hood, `from_saved_model` uses the `saved_model.loader` to load the exported model into a TensorFlow session, extracts the input/output definitions, creates the necessary tensors, and invokes `session.run` to get the results. I wrote a simple example of this workflow (iris_sess.py), or you can refer to TensorFlow’s source code in saved_model_predictor.py. `saved_model_cli` works the same way.
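Here is a condensed sketch of that workflow; the `serving_default` signature and its `inputs`/`scores` tensor keys are assumptions (check them with `saved_model_cli`), and the serialized `examples` list is the one from the predictor sketch above:

```python
import tensorflow as tf

with tf.Session(graph=tf.Graph()) as sess:
    # Load the SavedModel into the session and pick the serving signature.
    meta_graph_def = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], 'export/1524906774')
    signature = meta_graph_def.signature_def['serving_default']

    # Resolve the tensors declared by the signature.
    input_tensor = sess.graph.get_tensor_by_name(signature.inputs['inputs'].name)
    scores_tensor = sess.graph.get_tensor_by_name(signature.outputs['scores'].name)

    # Feed serialized tf.train.Example strings, as in the predictor sketch above.
    print(sess.run(scores_tensor, feed_dict={input_tensor: examples}))
```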
Serve SavedModel with TensorFlow Serving
Finally, let’s see how to use TensorFlow’s side project, TensorFlow Serving, to expose our trained model to the outside world.
Setup TensorFlow ModelServer
The TensorFlow ModelServer is written in C++. A convenient way to install it is via the APT package repository: following the official document, we add the TensorFlow Serving distribution URI as a package source and then install the binary.
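At the time of writing, registering the package source looks roughly like the commands below; treat the exact URI and key location as assumptions and check the official document for the current values:

```
$ echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" \
    | tee /etc/apt/sources.list.d/tensorflow-serving.list
$ curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
$ apt-get update
```

With the source in place, the server binary can be installed: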
```
$ apt-get install tensorflow-model-server
```
Then use the following command to start a ModelServer, which will automatically pick up the latest model from the export directory.
```
$ tensorflow_model_server --port=9000 --model_base_path=/root/export
```
Request Remote Model via SDK
TensorFlow Serving is based on gRPC and Protocol Buffers. To make remote procedure calls, we need to install the TensorFlow Serving API along with its dependencies. Note that TensorFlow officially provides the client SDK only for Python 2.7, but there is a contributed Python 3.x package available on PyPI.
```
$ pip install tensorflow-serving-api-python3==1.7.0
```
The procedure is straightforward: we create the gRPC connection (mostly boilerplate), assemble some `Example` instances, send them to the remote server, and get the predictions back, as sketched below. Full code can be found in iris_remote.py.
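A sketch of the client, written against the `grpc.beta` style API that the 1.7-era serving package shipped with (newer releases use `grpc.insecure_channel` plus `prediction_service_pb2_grpc.PredictionServiceStub` instead); the model name `default` assumes the server was started without an explicit `--model_name` flag:

```python
from grpc.beta import implementations
from tensorflow_serving.apis import classification_pb2, prediction_service_pb2

# Create connection, boilerplate of gRPC.
channel = implementations.insecure_channel('localhost', 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

# Assemble a classification request carrying one Example proto.
request = classification_pb2.ClassificationRequest()
request.model_spec.name = 'default'
example = request.input.example_list.examples.add()
example.features.feature['SepalLength'].float_list.value.append(5.1)
example.features.feature['SepalWidth'].float_list.value.append(3.3)
example.features.feature['PetalLength'].float_list.value.append(1.7)
example.features.feature['PetalWidth'].float_list.value.append(0.5)

# Send to the remote server and read back the per-class results.
response = stub.Classify(request, 10.0)  # 10-second timeout
print(response.result.classifications[0].classes)
```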