Deploying a tf-serving model with docker

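Since I started building Python environment images with docker, I have found docker more and more convenient and friendly. Until now I had always deployed models by starting a flask service directly. Recently I have been working with a bert model, and to improve its inference efficiency on CPU servers I decided to use tf-serving. It is not much faster, but every bit helps, and it also saves copying the model files around on every direct deployment.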

1. Model preparation

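tf-serving needs the model as a static-graph file for inference, so the ckpt file must first be converted to pb format. This step requires a script that defines the inputs and outputs, loads the pretrained ckpt file, and then saves the model in pb format.

Below is the code, adapted from the official source: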

import json

from absl import app
from absl import flags
import tensorflow as tf

# The import paths below assume the TensorFlow Model Garden layout used by the
# official script; adjust them to the version you have installed.
from official.modeling import tf_utils
from official.nlp.bert import configs as bert_configs
from official.nlp.modeling.networks import BertEncoder
# SentenceEmbedding is the author's own model; import it from your project.

FLAGS = flags.FLAGS

flags.DEFINE_string("config_path", None, "Path to the JSON config of the custom model.")
flags.DEFINE_string("bert_config_file", None, "Path to the BERT config JSON file.")
flags.DEFINE_string("model_checkpoint_path", None, "Path to the pretrained ckpt file.")
flags.DEFINE_string("export_path", None, "Directory to write the exported model to.")


class BertServing(tf.keras.Model):
    """Bert transformer encoder model for serving."""

    def __init__(self, config, bert_config, name_to_features, name="serving_model"):
        super(BertServing, self).__init__(name=name)
        cfg = bert_config
        self.bert_encoder = BertEncoder(
            vocab_size=cfg.vocab_size,
            hidden_size=cfg.hidden_size,
            num_layers=cfg.num_hidden_layers,
            num_attention_heads=cfg.num_attention_heads,
            intermediate_size=cfg.intermediate_size,
            activation=tf_utils.get_activation(cfg.hidden_act),
            dropout_rate=cfg.hidden_dropout_prob,
            attention_dropout_rate=cfg.attention_probs_dropout_prob,
            max_sequence_length=cfg.max_position_embeddings,
            type_vocab_size=cfg.type_vocab_size,
            initializer=tf.keras.initializers.TruncatedNormal(
                stddev=cfg.initializer_range),
            embedding_width=cfg.embedding_size,
            return_all_encoder_outputs=True)
        # Wrap the encoder in the custom inference model.
        self.model = SentenceEmbedding(self.bert_encoder, config)
        # ckpt = tf.train.Checkpoint(model=self.bert_encoder)
        # init_checkpoint = self.config['bert_model_path']
        #
        # ckpt.restore(init_checkpoint).assert_existing_objects_matched()
        self.name_to_features = name_to_features

    def call(self, inputs):
        input_word_ids = inputs["input_word_ids"]
        input_mask = inputs["input_mask"]
        input_type_ids = inputs["input_type_ids"]

        infer_input = {
            "input_word_ids": input_word_ids,
            "input_mask": input_mask,
            "input_type_ids": input_type_ids,
        }
        encoder_outputs = self.model(infer_input)
        return encoder_outputs

    def serve_body(self, input_ids, input_mask=None, segment_ids=None):
        if segment_ids is None:
            # Requires CLS token is the first token of inputs.
            segment_ids = tf.zeros_like(input_ids)
        if input_mask is None:
            # The mask has 1 for real tokens and 0 for padding tokens.
            input_mask = tf.where(
                tf.equal(input_ids, 0), tf.zeros_like(input_ids),
                tf.ones_like(input_ids))

        inputs = dict(
            input_word_ids=input_ids,
            input_mask=input_mask,
            input_type_ids=segment_ids)
        return self.call(inputs)

    @tf.function
    def serve(self, input_ids, input_mask=None, segment_ids=None):
        outputs = self.serve_body(input_ids, input_mask, segment_ids)
        # Returns a dictionary to control SignatureDef output signature.
        return {"outputs": outputs}

    @tf.function
    def serve_examples(self, inputs):
        features = tf.io.parse_example(inputs, self.name_to_features)
        # tf.Example only stores int64, so cast the features down to int32.
        for key in list(features.keys()):
            t = features[key]
            if t.dtype == tf.int64:
                t = tf.cast(t, tf.int32)
            features[key] = t
        return self.serve(
            features["input_word_ids"],
            input_mask=features["input_mask"] if "input_mask" in features else None,
            segment_ids=features["input_type_ids"]
            if "input_type_ids" in features else None)

    @classmethod
    def export(cls, model, export_dir):
        if not isinstance(model, cls):
            raise ValueError("Invalid model instance: %s, it should be a %s" %
                             (model, cls))

        signatures = {
            "serving_default":
                model.serve.get_concrete_function(
                    # Token ids are integers, so the spec uses int32
                    # (the flattened listing had float32 here).
                    input_ids=tf.TensorSpec(
                        shape=[None, None], dtype=tf.int32, name="inputs")),
        }
        if model.name_to_features:
            signatures["serving_examples"] = model.serve_examples.get_concrete_function(
                tf.TensorSpec(shape=[None], dtype=tf.string, name="examples"))
        tf.saved_model.save(model, export_dir=export_dir, signatures=signatures)


def main(_):
    config_path = FLAGS.config_path
    with open(config_path, 'r') as fr:
        config = json.load(fr)
    sequence_length = config['seq_len']
    if sequence_length is not None and sequence_length > 0:
        name_to_features = {
            "input_word_ids": tf.io.FixedLenFeature([sequence_length], tf.int64),
            "input_mask": tf.io.FixedLenFeature([sequence_length], tf.int64),
            "input_type_ids": tf.io.FixedLenFeature([sequence_length], tf.int64),
        }
    else:
        name_to_features = None

    bert_config = bert_configs.BertConfig.from_json_file(FLAGS.bert_config_file)
    serving_model = BertServing(
        config=config, bert_config=bert_config, name_to_features=name_to_features)
    # Load the pretrained weights into the encoder before exporting.
    checkpoint = tf.train.Checkpoint(model=serving_model.bert_encoder)
    checkpoint.restore(
        FLAGS.model_checkpoint_path).assert_existing_objects_matched()
    # .run_restore_ops()
    BertServing.export(serving_model, FLAGS.export_path)


if __name__ == "__main__":
    app.run(main)

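My changes are mainly in the model's __init__, plus the call and serve_body functions: replace them with your own model's inference logic, and the model is then saved in pb format. My folder layout is as follows: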

2. Model config file and multi-model deployment

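Before saving the model, set its storage path carefully. One thing to watch is the model version: it is normally a number, and if no version can be detected, starting the service fails with an error. As in the directory layout above, the model1 path contains a folder named with the number 1, which represents the model version.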
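The version check described above can be approximated with a small script before starting the container. This is only a sketch: list_model_versions is a hypothetical helper, and the temporary model1/1/saved_model.pb layout is built purely for illustration. It assumes that only sub-directories with purely numeric names count as versions.

```python
import os
import tempfile


def list_model_versions(base_path):
    """Return the numeric version sub-directories of a model path, sorted."""
    versions = []
    for name in os.listdir(base_path):
        full_path = os.path.join(base_path, name)
        # Assumption: only directories with purely numeric names are versions.
        if os.path.isdir(full_path) and name.isdigit():
            versions.append(int(name))
    return sorted(versions)


# Build a throwaway layout: model1/1/saved_model.pb plus a non-version folder.
root = tempfile.mkdtemp()
model_dir = os.path.join(root, "model1")
os.makedirs(os.path.join(model_dir, "1"))
os.makedirs(os.path.join(model_dir, "notes"))
open(os.path.join(model_dir, "1", "saved_model.pb"), "wb").close()

found_versions = list_model_versions(model_dir)
print(found_versions)
```

An empty result for a model path would be a hint that the service may fail to load that model.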

To serve multiple models, create a path for each model and write a models.config file with the following content:

model_config_list: {
  config: {
    name: "model1"
    base_path: "/models/my_model/model1"
    model_platform: "tensorflow"
    model_version_policy: {all: {}}
  }
  config: {
    name: "model2"
    base_path: "/models/my_model/model2"
    model_platform: "tensorflow"
  }
}

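Here base_path is the file path inside the docker container; before deployment, the model files must be copied to the corresponding path inside the container.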
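One possible way to start the service with this config is sketched below. The post itself does not show the launch command, so everything here is an assumption: the tensorflow/serving image, the host paths /data/my_model and /data/models.config, and the choice to bind-mount the files with -v rather than copy them into the container. The helper only builds the command; it does not run docker.

```python
import shlex


def build_serving_command(host_model_dir, host_config_file,
                          image="tensorflow/serving"):
    """Build (but do not run) a docker command that starts TF Serving."""
    return [
        "docker", "run", "--rm",
        "-p", "8500:8500",  # gRPC port
        "-p", "8501:8501",  # HTTP (REST) port
        # Hypothetical host paths mounted at the paths used in models.config.
        "-v", f"{host_model_dir}:/models/my_model",
        "-v", f"{host_config_file}:/models/models.config",
        image,
        "--model_config_file=/models/models.config",
    ]


command = build_serving_command("/data/my_model", "/data/models.config")
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run once the paths match your machine.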

If model1 also has several version directories, add the following to its entry in the config file:

model_version_policy:{all:{}}

The URL then has to identify the model version. The following URLs show what each form means:

Select a specific version:

'http://localhost:8501/v1/models/model1/versions/100001:predict'

Default to the latest version:

'http://localhost:8501/v1/models/model1:predict'

Select model2:

'http://localhost:8501/v1/models/model2:predict'
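The three URL forms above follow one pattern, which a small helper can capture. This is a minimal sketch; predict_url and its parameters are hypothetical names, not part of tf-serving.

```python
def predict_url(model_name, version=None, host="localhost", port=8501):
    """Build a predict URL; omitting the version selects the latest one."""
    url = f"http://{host}:{port}/v1/models/{model_name}"
    if version is not None:
        url += f"/versions/{version}"
    return url + ":predict"


url_pinned = predict_url("model1", version=100001)
url_latest = predict_url("model1")
url_model2 = predict_url("model2")
print(url_pinned)
print(url_latest)
print(url_model2)
```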

3. Calling the interface

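HTTP calls use port 8501, while gRPC calls use port 8500. The gRPC interface is mostly used for image data: the data has to be converted to tensor format beforehand, so the interface is somewhat faster. For NLP, the HTTP interface is sufficient.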
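An HTTP call can be sketched with the standard library alone. The example assumes the model is exported with a single-input serving_default signature, so each row of "instances" is one padded sequence of token ids. The token ids below are made up for illustration, and build_predict_request and predict are hypothetical helper names.

```python
import json
import urllib.request


def build_predict_request(url, batch_token_ids):
    """Create a POST request in the row ("instances") format."""
    # All rows must have the same length, since they form one input tensor.
    body = json.dumps({"instances": batch_token_ids}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(url, batch_token_ids, timeout=10.0):
    """Send the request to a running server and return the predictions."""
    request = build_predict_request(url, batch_token_ids)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]


# Made-up token ids, padded with 0 to a common length.
request = build_predict_request(
    "http://localhost:8501/v1/models/model1:predict",
    [[101, 2769, 102, 0], [101, 872, 1962, 102]],
)
sent_payload = json.loads(request.data)
print(request.get_method(), request.full_url)
```

Calling predict(...) with the same arguments performs the actual request once the container is running; only the request construction is executed here.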
