Streamlit: Use Data Apps to Better Test Your Model

Introduction

We use all kinds of techniques to determine how well our model performs, from building a reliable validation set to k-fold cross-validation to devising fancy metrics. However, nothing beats looking at the raw output. When you look at sample outputs, you notice things that none of the other methods can tell you.

For example, if you are performing object detection on crops, you might see that when the wind bends the crop in a certain direction, the bounding boxes don’t enclose it properly.

However, looking at sample outputs is tedious. Say we want to try various NMS (non-maximum suppression) values, and we want to try each of them on a bunch of images. We can always write a function for this, but running it again and again is boring.
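To make the knob concrete, here is a minimal pure-Python sketch of greedy NMS and of how its IoU threshold changes which boxes survive. It is an illustration only, not the wheat model’s code: the helper names `iou` and `nms` and the three sample boxes are made up for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Made-up detections: two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, nms(boxes, scores, threshold))
```

With these sample boxes, thresholds 0.3 and 0.5 drop the second box (it overlaps the first with an IoU of about 0.68), while 0.7 keeps all three. This is exactly the kind of difference that is easier to judge on a real image than from a metric.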

Wouldn’t it be nice to have an app where we can upload an image and use a slider to adjust the NMS value? Enter Streamlit.

What is Streamlit?

According to its website, Streamlit is an open-source web framework that lets data scientists and machine learning engineers create beautiful, performant apps in just a few hours, all in pure Python.

Installation

pip install streamlit

Getting started

For this demo, I will be using the model I trained for wheat object detection using a FasterRCNN. You can go back to that article, train the model yourself or download mine, and then follow along.

A Streamlit app is just a standard Python file that runs from top to bottom. We will start by creating a file called app.py that writes a Markdown heading to the page with st.write().

We run it as follows:

streamlit run app.py

This opens the app in our browser.

Notice that we used a # inside st.write(). This tells Streamlit to treat the text as Markdown, so it is rendered as a heading. The first thing we need is a file uploader to upload images to Streamlit. Again, this takes little more than a call to st.file_uploader().

We pass the allowed file types (jpg and png in this case), and when a file is attached, we read it using PIL. When we save app.py now, Streamlit detects that the source file has changed and asks whether we want to rerun. We do, so we select “always rerun” to have changes reflected automatically. The app in the browser now shows the file uploader.

To display the image, we can use st.image(). However, after generating predictions, we want to replace the input image with the predicted image. To do so, we create an empty container and display everything in it.

Another thing I like about using a container is that you can set use_column_width=True and not worry about resizing the image. Now we can drag and drop images into our app.

Finally, we convert the image to a tensor, load our model, generate outputs, and write the result to the container.

To vary the NMS value, we can use a slider, an input box to type the value, or ‘+’ and ‘-’ buttons to increase or decrease it. I’m going to go with the slider.

And that’s it. In just about 50 lines of code, we have a super useful app for testing our model. You can also create templates for various problems such as image classification, segmentation, and object detection, and reuse them every time you train a new model.

Finally, when we change the NMS value using the slider, we don’t want to generate predictions again for a value we have already tried. To avoid that, you can stick Streamlit’s caching decorator on top of the prediction function, and it will cache the results for you. This way, it won’t rerun the model for a threshold it has already seen but will just use the cached result. Cool, isn’t it?

Conclusion

That’s it for this article. I really recommend that everyone try Streamlit and create data apps. It works well for all kinds of data, and caching helps with expensive operations such as working with large data frames. Give it a try.

If you want to learn more about deep learning, check out my deep learning series.

~Happy learning

Translated from: https://towardsdatascience.com/streamlit-use-data-apps-to-better-test-your-model-4a14dad235f5
