TensorFlow is one of the most popular machine learning frameworks, letting us build a wide variety of models with little effort. There are several ways to put these models into production, such as behind a web service API, and this article will introduce how to build model prediction APIs with TensorFlow's SavedModel mechanism.
First, let's build the famous iris classifier with TensorFlow's pre-made DNN estimator. The full walkthrough can be found on TensorFlow's website (Premade Estimators), and I have created a repository on GitHub (iris_dnn.py) for you to fork and work with. Here's the gist of training the model:
feature_columns = [tf.feature_column.numeric_column(key=key)
                   for key in train_x.keys()]
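To make the gist concrete, here is a minimal training sketch. The tiny random dataset, the `hidden_units` values, and the step count are assumptions standing in for the tutorial's real iris data:

```python
import numpy as np
import tensorflow as tf

# A tiny synthetic stand-in for the iris training data (the real
# tutorial loads these columns from CSV files instead).
FEATURE_KEYS = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
train_x = {key: np.random.rand(32).astype(np.float32) for key in FEATURE_KEYS}
train_y = np.random.randint(0, 3, size=32)

feature_columns = [tf.feature_column.numeric_column(key=key)
                   for key in train_x.keys()]

# Pre-made DNN estimator: two hidden layers, three iris species.
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 10],
    n_classes=3)

def input_fn():
    # Stream (features, label) pairs in shuffled batches.
    dataset = tf.data.Dataset.from_tensor_slices((dict(train_x), train_y))
    return dataset.shuffle(32).repeat().batch(16)

classifier.train(input_fn=input_fn, steps=10)
```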
TensorFlow provides the SavedModel utility to let us export the trained model for later prediction and serving.
Estimator exposes an export_savedmodel method, which requires two arguments: the export directory and a receiver function. The latter defines what kind of input data the exported model accepts. Usually we will use TensorFlow's Example type, which contains the features of one or more items. For instance, an iris data item can be defined as:
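A sketch of such an Example proto; the feature names follow the iris tutorial's columns, and the values are made up for illustration:

```python
import tensorflow as tf

# One iris item encoded as a tf.train.Example proto; each feature
# holds a single float value (values here are illustrative only).
example = tf.train.Example(features=tf.train.Features(feature={
    'SepalLength': tf.train.Feature(float_list=tf.train.FloatList(value=[5.1])),
    'SepalWidth': tf.train.Feature(float_list=tf.train.FloatList(value=[3.3])),
    'PetalLength': tf.train.Feature(float_list=tf.train.FloatList(value=[1.7])),
    'PetalWidth': tf.train.Feature(float_list=tf.train.FloatList(value=[0.5])),
}))

# The proto is serialized to bytes before being sent to the model.
serialized = example.SerializeToString()
```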
The receiver function needs to be able to parse the incoming serialized Example object into a map of tensors for the model to consume. TensorFlow provides some utility functions to help build it. We first transform the feature_columns array into a parsing specification (a map from feature name to a Feature spec), and then use it to build the receiver function.
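Those two steps might look like the following sketch; it redefines feature_columns for self-containment, and the `'export'` directory name is an assumption:

```python
import tensorflow as tf

feature_columns = [tf.feature_column.numeric_column(key=key)
                   for key in ['SepalLength', 'SepalWidth',
                               'PetalLength', 'PetalWidth']]

# Step 1: turn the feature columns into the parsing specification,
# a map from feature name to a FixedLenFeature describing how to parse it.
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)

# Step 2: build a receiver function that parses serialized Example
# protos into the tensors the model consumes.
serving_input_receiver_fn = (
    tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec))

# Export (assumes `classifier` is the trained estimator from earlier):
# classifier.export_savedmodel('export', serving_input_receiver_fn)
```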
Each export creates a timestamped directory containing the trained model's graph definition (saved_model.pb) and its weights (the variables/ directory).
TensorFlow provides a command line tool, saved_model_cli, to inspect the exported model, or even run predictions with it:
$ saved_model_cli show --dir export/1524906774 \
    --tag_set serve --signature_def serving_default
In TensorFlow's contrib.predictor package, there is a convenient method to build a predictor function from an exported model.
# Load the model from the export directory, and make a predict function.
predict_fn = tf.contrib.predictor.from_saved_model('export/1524906774')
We can tidy up the prediction outputs to make the result clearer:
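For instance, a small helper could map raw outputs to species names. The `'class_ids'` and `'probabilities'` keys and the species order below are assumptions based on a DNNClassifier's usual prediction dict, so your predictor's result may be shaped differently:

```python
# Map raw predictor outputs to readable results. The dict keys and the
# species order are assumptions, not guaranteed by the exported model.
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

def tidy(predictions):
    results = []
    for class_id, probs in zip(predictions['class_ids'],
                               predictions['probabilities']):
        results.append({'species': SPECIES[int(class_id)],
                        'probability': float(max(probs))})
    return results

print(tidy({'class_ids': [0], 'probabilities': [[0.9, 0.08, 0.02]]}))
# → [{'species': 'Setosa', 'probability': 0.9}]
```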
Under the hood, from_saved_model uses the saved_model.loader to load the exported model into a TensorFlow session, extracts the input / output definitions, creates the necessary tensors, and invokes session.run to get the results. I wrote a simple example (iris_sess.py) of this workflow; you can also refer to TensorFlow's source code, since saved_model_cli works this way as well.
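That workflow can be sketched as a single function (TF 1.x API; the signature keys `'inputs'` and `'scores'` are assumptions that depend on how the model was exported):

```python
import tensorflow as tf

def predict_with_session(export_dir, serialized_examples):
    """Load a SavedModel into a session and run a prediction by hand."""
    with tf.Session(graph=tf.Graph()) as sess:
        # Load the exported model under the 'serve' tag.
        meta_graph = tf.saved_model.loader.load(
            sess, [tf.saved_model.tag_constants.SERVING], export_dir)
        # Extract input / output tensor names from the default signature.
        signature = meta_graph.signature_def['serving_default']
        input_name = signature.inputs['inputs'].name
        output_name = signature.outputs['scores'].name
        # Feed serialized Example protos and fetch the scores.
        return sess.run(output_name,
                        feed_dict={input_name: serialized_examples})
```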
Finally, let’s see how to use TensorFlow’s side project, TensorFlow Serving, to expose our trained model to the outside world.
The TensorFlow ModelServer is written in C++. A convenient way to install it is from the APT package repository: follow the official document to add the TensorFlow Serving distribution URI, then install the binary:
$ apt-get install tensorflow-model-server
Then use the following command to start a ModelServer, which will automatically pick up the latest model from the export directory:
$ tensorflow_model_server --port=9000 --model_base_path=/root/export
TensorFlow Serving is based on gRPC and Protocol Buffers, so to make remote procedure calls we need to install the TensorFlow Serving API, along with its dependencies. Note that TensorFlow only provides the client SDK for Python 2.7, but there is a contributed Python 3.x package available on PyPI:
$ pip install tensorflow-serving-api-python3==1.7.0
The procedure is straightforward: we create the connection, assemble some Example instances, send them to the remote server, and get the predictions back. Full code can be found in the repository.
# Create the connection, boilerplate of gRPC.
channel = implementations.insecure_channel('localhost', 9000)