TensorFlow Semi-Supervised Object Detection Architecture

Object detection is one of the most popular and widely used computer vision methods today. The goal is not only to determine whether an object is present in the image, as in most common classification problems, but also to point out the location of each object of interest. This makes it the necessary approach when multiple objects may appear in the same image.

One of the challenges of this method is creating the dataset, since the positions of all objects in each image must be set manually, which takes a lot of time over a large number of observations.

This process is inefficient, expensive, and time-consuming, especially for problems that require labeling dozens of objects in each image or that demand specialized knowledge.

Based on this, I created a TensorFlow Semi-supervised Object Detection Architecture (TSODA) to iteratively train an object detection model and use it to automatically label new images based on a confidence threshold, aggregating them into the next training round.

In this article, I'll show you the steps needed to reproduce this approach in your own object detection project. With it, you'll be able to create labels for your images automatically while measuring the model's performance!

Table of contents:

  1. How TSODA Works
  2. Example Application
  3. Implementation
  4. Results
  5. Conclusion

How TSODA Works

TSODA works like any other semi-supervised method: training uses both labeled and unlabeled data, unlike the more common supervised approach.

An initial model is trained on strongly labeled data annotated by hand and learns some features from it. The model then runs inference on the unlabeled data, and the newly labeled images are aggregated into a new training round.

The whole idea can be illustrated by the following image:

[Figure 1: TSODA workflow. Source: Author]

This operation is repeated until a stop criterion is reached: either the configured number of executions or no remaining unlabeled data.

As the schema shows, a confidence threshold of 80% was initially configured. This is an important parameter because the new images will be used in a new training round, and incorrect labels could introduce undesirable noise that undermines model performance.
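The loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not the repository's code: `train` and `predict` are hypothetical callables standing in for the real training and inference steps, and each image gets a single label instead of bounding boxes to keep the example short.

```python
from typing import Callable, Dict, List, Tuple

# (label, confidence) returned by the hypothetical predict() callable.
Prediction = Tuple[str, float]


def self_training_loop(
    labeled: Dict[str, str],
    unlabeled: List[str],
    train: Callable[[Dict[str, str]], object],
    predict: Callable[[object, str], Prediction],
    threshold: float = 0.8,
    max_iterations: int = 5,
) -> Tuple[Dict[str, str], List[str]]:
    """Train, auto-label confident images, and repeat until a stop criterion."""
    labeled = dict(labeled)
    remaining = list(unlabeled)
    for _ in range(max_iterations):  # stop criterion 1: number of executions
        if not remaining:            # stop criterion 2: no unlabeled data left
            break
        model = train(labeled)
        still_unlabeled = []
        for image in remaining:
            label, confidence = predict(model, image)
            if confidence > threshold:
                labeled[image] = label         # aggregate into the next round
            else:
                still_unlabeled.append(image)  # retry in a later iteration
        remaining = still_unlabeled
    return labeled, remaining
```

Images below the threshold stay in the unlabeled pool, so a later, better-trained model gets another chance to label them.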

The purpose of TSODA is to offer a simple and fast way to use semi-supervised learning in your object detection project.

Example Application

To exemplify the approach and test that everything works properly, I took a random sample of 1,100 images from the Asirra dataset, with 50% per class.

The images were labeled manually for later comparison; you can download the same data on Kaggle.

I used Single Shot Multibox Detector (SSD) as the object detection architecture and Inception as the base network, instead of the VGG-16 used in the original paper.

SSD and Inception offer a good trade-off between training speed and accuracy, so I think they are a good starting point. In each iteration, TSODA needs to save a checkpoint of the trained model, run inference on new images, and load the model to train it again, so faster training allows more iterations and lets more images be aggregated into the learning.

Testing performance

To test TSODA's performance, only 100 labeled images of each class were provided, to be split into training and test sets, while 900 were left unlabeled. This simulates a situation where little time was spent creating the labeled dataset. The results were compared to a model trained with all the manually labeled images.

The data were randomly split into 80% for training and 20% for testing.

Implementation

As the name suggests, the whole architecture is built in the TensorFlow environment, version 2.x.

This new TF version is not yet fully compatible with object detection, and some parts were difficult to adapt. However, in the coming months it will be the default and most used version of TF across projects, which is why I think it is important to adapt the code to it.

To create TSODA, new scripts and folders were added to a fork of the TF Model Garden repository, so you can clone it and run your semi-supervised project with just small modifications. The structure will also be familiar to those who work with TF.

You can clone my repository to easily follow these steps, or adapt your own TF models repository.

The work was done inside models/research/object_detection, where you will find the following folders and files:

  • inference_from_model.py: Runs the trained model to infer new images.

  • generate_xml.py and generate_tfrecord.py: Both are used to create the train and test TF records used to train the object detection model (these scripts are adapted from the raccoon dataset).

  • test_images and train_images folders: Hold the JPG images and XML files that will be used.

  • unlabeled_images and labeled_images folders: Contain, respectively, all images without labels and the images automatically labeled by the algorithm, which are then divided between the training and test folders to keep the split ratio.

Inside the utils folder there are also a few things:

  • generate_xml.py: Takes the model inference and generates a new XML file that is stored inside the labeled_images folder.

  • visualization_utils.py: Includes some modifications to capture the model inference and pass it to the "generateXml" class.

That's it. This is all you need to have in your repository!

Preparing Environment

To run this project, you will need nothing!

The training process lives in a Google Colab notebook, so training your model is fast and simple. You literally just need to replace my images with yours and choose another base model if you want.

Make a copy of the original Colab notebook in your Google Drive and execute it.

If you really want to run TSODA on your own machine, the installation requirements are listed at the beginning of the Jupyter notebook. Just follow them, and don't forget to also install TF 2.x. I recommend creating a virtual environment.

Understanding the code

inference_from_model.py is responsible for loading the saved_model.pb created during training and using it to make new inferences on the unlabeled images. Most of the code was taken from object_detection_tutorial.ipynb, found in the colab_tutorials folder.

If you don't want to use Colab for training, you'll need to replace the paths at the beginning of the file.

Another important method in this file is partition_data, which splits the inferred images (stored in the labeled_images folder) into training and test sets so that the same ratio is kept.

One thing you may want to change is the split ratio. I chose an 80/20 proportion, but if you want something different, you can set it in the method parameter.
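As a rough idea of what such a split can look like, here is a minimal sketch using only the standard library. It is not the repository's partition_data: the signature, the `test_ratio` and `seed` parameters, and the assumption that each image has an XML file with the same base name are all illustrative.

```python
import random
import shutil
from pathlib import Path


def partition_data(labeled_dir, train_dir, test_dir, test_ratio=0.2, seed=None):
    """Move image/XML pairs from labeled_dir into train_dir and test_dir.

    Hypothetical sketch: assumes every 'name.jpg' may have a 'name.xml' beside it.
    Returns (number of training images, number of test images).
    """
    labeled_dir, train_dir, test_dir = Path(labeled_dir), Path(train_dir), Path(test_dir)
    train_dir.mkdir(parents=True, exist_ok=True)
    test_dir.mkdir(parents=True, exist_ok=True)

    images = sorted(labeled_dir.glob("*.jpg"))
    random.Random(seed).shuffle(images)       # random but reproducible with a seed
    n_test = round(len(images) * test_ratio)  # e.g. 20% of the images go to test

    for index, image in enumerate(images):
        target = test_dir if index < n_test else train_dir
        annotation = image.with_suffix(".xml")
        shutil.move(str(image), str(target / image.name))
        if annotation.exists():
            shutil.move(str(annotation), str(target / annotation.name))
    return len(images) - n_test, n_test
```

Calling `partition_data("labeled_images", "train_images", "test_images", test_ratio=0.2)` would keep the 80/20 proportion used in this article.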

visualization_utils.py is where the bounding boxes are drawn onto the image, so we use it to get the boxes' positions, the class name, and the file name, and pass them to our XML generator. The code in that file covers most of this process.

The XML is generated if a box is detected in the image with a confidence level higher than the one specified.

All of this information arrives in generate_xml.py, and the XML is created using ElementTree.
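To make the idea concrete, here is a minimal sketch of thresholding detections and writing them as a Pascal VOC-style annotation with ElementTree. It is not the repository's generate_xml.py: the function name and parameters are hypothetical, and only a small subset of the VOC fields is written. It assumes boxes arrive normalized as [ymin, xmin, ymax, xmax], the order used by TensorFlow Object Detection API outputs.

```python
import xml.etree.ElementTree as ET


def detections_to_voc_xml(filename, width, height, boxes, scores, class_names,
                          threshold=0.8):
    """Return a VOC-style XML string, or None if no box beats the threshold.

    Hypothetical helper: boxes are normalized [ymin, xmin, ymax, xmax].
    """
    kept = [(box, name)
            for box, score, name in zip(boxes, scores, class_names)
            if score > threshold]
    if not kept:
        return None  # nothing confident enough: the image stays unlabeled

    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)

    for (ymin, xmin, ymax, xmax), name in kept:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        bndbox = ET.SubElement(obj, "bndbox")
        # Convert normalized coordinates to pixel coordinates.
        ET.SubElement(bndbox, "xmin").text = str(int(xmin * width))
        ET.SubElement(bndbox, "ymin").text = str(int(ymin * height))
        ET.SubElement(bndbox, "xmax").text = str(int(xmax * width))
        ET.SubElement(bndbox, "ymax").text = str(int(ymax * height))
    return ET.tostring(root, encoding="unicode")
```

Returning None when no detection passes the threshold mirrors the rule above: no confident box, no XML file.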

The code includes comments that will help you understand how everything works.

Results

To evaluate model performance, I used mean Average Precision (mAP). If you have doubts about how it works, check out this.

The first test trained a model for 4,000 epochs using all the strongly labeled images.

The training took about twenty-one minutes, and the results are shown in Table 1.

Table 1: mAP using all images correctly labeled for training and test. (Source: Author)

As expected, the model achieved a high mAP, mainly at lower IoU thresholds.
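Since mAP is reported at different IoU (Intersection over Union) thresholds, it helps to see how IoU itself is computed. The function below is a generic sketch, not code from the TSODA repository, and assumes boxes are given as (xmin, ymin, xmax, ymax) in pixels.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0
```

A prediction counts as correct only when its IoU with a ground-truth box reaches the chosen threshold, which is why mAP tends to be higher at lower IoU thresholds.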

The second test used the same configuration but with TSODA, starting from just 100 labeled images per class. In each iteration, the model was trained for 1,000 epochs and then used to infer and create new labeled images. The results are shown in Figure 2.

Figure 2: Model convergence in TSODA. (Source: Author)

The whole training process took thirty-eight minutes, about seventeen minutes more than the previous one, and the model reached a worse final mAP, as shown in Table 2:

Table 2: Final mAP in the first TSODA test. (Source: Author)

As Table 3 reveals, most images were annotated in the first iteration and aggregated into the training set. This could mean that the minimum confidence threshold isn't high enough: in the first 1,000 training epochs the model has not yet converged properly, possibly creating wrong annotations.

Table 3: Number of remaining unlabeled images per iteration. (Source: Author)

TSODA requires more time and more epochs to improve model performance and get close to the original method. This happens because adding new images to the training set causes a drop in mAP, since the model needs to learn to generalize to new patterns. Figure 2 shows this: the mAP decreases as new images are included and then starts increasing again as the model learns new features.

Figure 3 shows some examples of automatically annotated images. Notably, some labels are not very precisely marked, but they are enough to give the model more information.

Figure 3: Samples of auto-annotated images. As seen, the labels could fit the objects more tightly if drawn by a human. (Source: Author)

Some new experiments were performed with a different epoch increment behavior and a higher confidence threshold. The results are presented in Table 4:

Table 4: Results using a second configuration. (Source: Author)

Setting the confidence threshold to 90% gives a higher chance of a correct label in the predictions, which is an important factor for model convergence. In addition, the initial iteration was trained for 2,500 epochs instead of just 1,000: since the first iteration is where most images are labeled, the model needs to learn more features to be able to clear the higher confidence threshold. After the first iteration, each subsequent one adds 1,500 epochs, up to a limit of 8,500. These new configurations improved the final results.
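The epoch schedule of this second configuration can be written down as a small generator. This is a sketch under one reading of the text, namely that the numbers are cumulative epoch totals reached at the end of each iteration; the function name and parameters are hypothetical.

```python
def epoch_schedule(initial=2500, increment=1500, limit=8500):
    """Yield the cumulative epoch total reached at the end of each iteration.

    Assumption: 'increment' epochs are added per iteration until 'limit'.
    """
    total = initial
    while total <= limit:
        yield total
        total += increment


print(list(epoch_schedule()))  # [2500, 4000, 5500, 7000, 8500]
```

Under this reading, the second configuration runs at most five iterations before hitting the 8,500-epoch limit.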

TSODA may perform differently depending on the kind of object of interest and its complexity. The results could be improved by training for more epochs or setting a higher confidence threshold, with the drawback of increased training time. Also, the epoch increment per iteration must change depending on the problem, to control model convergence based on the number of unlabeled images and the threshold.

Nevertheless, this is a good alternative, since training time is cheaper than manual labeling time, which requires a human. TSODA was also built so that, with just a few modifications, it is possible to train a completely new large-scale model from scratch.

The auto-created labels can also be manually adjusted in some images, which can improve overall performance and is faster than creating all the labels by hand.

Conclusion

The proposed TSODA achieves satisfactory results in creating labels for unlabeled images, reaching results similar to a strongly labeled training approach with considerably less human effort. The solution is also adaptable to any other CNN detector architecture and is easy and fast to implement, helping the dataset creation process while measuring overall object detector performance.

Translated from: https://towardsdatascience.com/tensorflow-semi-supervised-object-detection-architecture-757b9c88f270
