Detection and Semantic Segmentation

FAU Lecture Notes on Deep Learning

These are the lecture notes for FAU’s YouTube lecture “Deep Learning”. This is a full transcript of the lecture video and the matching slides. We hope you enjoy it as much as the videos. This transcript was created largely automatically with deep learning techniques, and only minor manual modifications were made. Try it yourself! If you spot mistakes, please let us know!


Today’s topic is real-time object detection in complex scenes. Image created using gifify. Source: YouTube

Welcome back to deep learning! Today, we want to discuss single-shot detectors and how we can actually approach real-time object detection.

Image under CC BY 4.0 from the Deep Learning Lecture.

Okay, the fourth part of segmentation and object detection: single-shot detectors. So, can’t we just use the region proposal network as a detector in a “you only look once” fashion? This is the idea of YOLO, which is a single-shot detector. You only look once: you combine the bounding box prediction and the classification into a single network.

This is done by essentially subdividing the image into S × S cells. For every cell, you compute in parallel the class probability map and the bounding boxes with their confidence. This gives you B bounding boxes per cell, each with a confidence score, plus the class confidences, all produced by a CNN. So the CNN predicts S × S × (5B + C) values, where C is the number of classes. In the end, to produce the final object detection, you compute the overlap of the bounding box with the respective class probability map. This allows you to compute the average within the bounding box to produce the final class of the respective object. This way, you are able to handle complex scenes like this one, and it really runs in real time.
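To make the S × S × (5B + C) count concrete, here is a minimal Python sketch. The factor 5 stands for the four box coordinates plus one confidence per box. The function names, the ordering of values inside a cell (boxes first, then class probabilities), and the example values S = 7, B = 2, C = 20 are assumptions chosen for illustration and do not come from the lecture.

```python
def yolo_output_size(s, b, c):
    """Number of values the CNN predicts for an S x S grid."""
    # Each of the B boxes per cell carries 5 numbers: x, y, w, h, confidence.
    # Each cell additionally carries C class probabilities.
    return s * s * (5 * b + c)


def split_cell(values, b, c):
    """Split one cell's prediction vector into B boxes and C class probabilities.

    Assumed layout (hypothetical): B blocks of 5 box values, then C class values.
    """
    if len(values) != 5 * b + c:
        raise ValueError("expected 5 * B + C values per cell")
    boxes = [values[5 * i:5 * i + 5] for i in range(b)]
    class_probs = values[5 * b:5 * b + c]
    return boxes, class_probs


# Assumed example configuration: 7 * 7 * (5 * 2 + 20) = 1470 values.
print(yolo_output_size(7, 2, 20))
```

Under these assumptions, a single forward pass yields all 1470 numbers at once, which is what makes the detector single-shot.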

YOLO9000 specs. Image under CC BY 4.0 from the Deep Learning Lecture.

Then there’s YOLO9000, an improved version of YOLO that is advertised as better, faster, and stronger. It’s better because batch normalization is used. They also do high-resolution classification to improve the mean average precision by up to 6%. The anchor boxes, which are found by clustering over the training data, improve the recall by 7%. Training over multiple scales allows YOLO9000 to detect objects at different resolutions more easily. It’s faster because it uses a different CNN architecture that speeds up the forward pass. Finally, it’s stronger because it has a hierarchical detection on a tree that allows combining different object detection datasets. All of this allows YOLO9000 to detect up to 9,000 classes in real time or faster.
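The anchor boxes found by clustering can be sketched as a small k-means over the (width, height) pairs of the training boxes. The lecture only says “clustering”; using 1 - IoU as the distance, as well as the function names, the iteration count, and the seed, are assumptions made for this sketch.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def cluster_anchors(boxes, k, iterations=50, seed=0):
    """k-means over (width, height) pairs; highest IoU means smallest distance."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct training boxes
    for _ in range(iterations):
        # Assign every box to the anchor it overlaps most.
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            groups[best].append(box)
        # Move each anchor to the mean width and height of its group.
        new_anchors = []
        for anchor, group in zip(anchors, groups):
            if group:
                mean_w = sum(g[0] for g in group) / len(group)
                mean_h = sum(g[1] for g in group) / len(group)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchor)  # keep the anchor of an empty group
        if new_anchors == anchors:
            break  # assignments no longer change
        anchors = new_anchors
    return sorted(anchors)


# Hypothetical training boxes: three small and three large ones.
boxes = [(10, 10), (11, 10), (10, 11), (100, 50), (98, 52), (102, 49)]
print(cluster_anchors(boxes, k=2))
```

With box shapes as clearly separated as in this toy data, the two resulting anchors settle on one small and one large prior.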

YOLO9000 in action. Image created using gifify. Source: YouTube

There is also the single-shot multi-box detector (SSD) in [24]. It’s a popular alternative to YOLO. Like YOLO, it is a single-shot detector with only one forward pass through the CNN.

It’s called multi-box because this is the name of the bounding box regression technique in [15], and it’s obviously an object detector. It differs from YOLO in several aspects but shares the same core idea.

Now, you still have a problem with multiple resolutions, in particular if you think about tasks involving histological images, which have a very, very high resolution. For these, you can work with detectors like RetinaNet. It essentially uses a ResNet CNN encoder/decoder, which is very similar to what we’ve already seen in image segmentation. It uses a feature pyramid net that allows you to couple the different feature maps produced from the input image with those generated by the decoder. So you could say it’s very similar to a U-net. In contrast to U-net, it does a class and box prediction using a subnet on each of the scales of the feature pyramid net. So, you could say it’s a single-shot detector that uses U-net simultaneously for the class and box prediction. Also, it uses the focal loss that we will talk about in a couple of slides.

Let’s look a bit at the tradeoff between speed and accuracy. You can see that, generally, networks that are very accurate are not so fast. Here, the x-axis shows the GPU time and the y-axis the overall mean average precision. You can combine architectures like single-shot detectors, R-CNN, or ideas like Faster R-CNN with different feature extractors like Inception-ResNet, Inception, and so on. This allows us to produce many different combinations. If you spend more time on the computation, you can typically also increase the accuracy, and this is reflected in the graph.

The class imbalance is key to tackling the speed-accuracy tradeoff. All of those single-shot detectors evaluate many hypothesis locations, and most of them are really easy negatives. This imbalance is not addressed by the current training. In classical methods, we typically dealt with this using hard-negative mining. Now, the question is: “Can we change the loss function to pay less attention to easy examples?”

This idea brings us exactly to the focal loss. Here, we can essentially define the objectness, whether it’s an object or not, as binary. Then, you can model this as a Bernoulli distribution. The usual loss would simply be the cross-entropy, where you have the negative logarithm of the probability of the correct class. We can now adjust this to the so-called focal loss. Here, we introduce an additional parameter α, the imbalance weight, which is calculated as the inverse class frequency. Additionally, we introduce a hyperparameter γ that allows decreasing the influence of easy examples. You can see the influence of γ in the plot on the left-hand side. The more you increase γ, the more peaked the respective weight becomes, so that you can really concentrate on classes that are not very frequent.
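As a rough illustration of this adjustment, the sketch below compares the cross-entropy -log(p_t) with the focal loss -α_t (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the correct class. The function names and the default values α = 0.25 and γ = 2 are assumptions for this example; the lecture does not fix them. Predictions are assumed to lie strictly between 0 and 1.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for predicted object probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the correct class
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class imbalance weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy negative (background predicted with p = 0.01) versus a hard one (p = 0.9).
for p in (0.01, 0.9):
    print(p, cross_entropy(p, 0), focal_loss(p, 0))
```

With γ = 0 the extra factor disappears and only the α weighting remains. Larger γ shrinks the loss of well-classified examples much more strongly than that of misclassified ones, which is the effect described above.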

So, let’s summarize object detection. The main task is detecting bounding boxes and the associated classification. The sliding window approach was extremely inefficient. Region proposal networks reduce the number of candidates, but if you really want to go towards real-time, you have to use single-shot detectors like YOLO to avoid additional steps. Object detector concepts can, of course, be combined with arbitrary feature extraction and classification networks, as we’ve seen earlier. Also, keep in mind the speed-accuracy tradeoff. If you want to be very quick, you can, of course, reduce the number of bounding boxes that are predicted. You are then much faster, but you may miss true positives.

So, we have now discussed segmentation, object detection, and how to do object detection very quickly. Next time, we will look into the fusion of both, which is going to be instance segmentation. Thank you very much for watching this video, and I’m looking forward to seeing you in the next one.

The Yolo COCO object detector in action. Image created using gifify. Source: YouTube

If you liked this post, you can find more essays here, more educational material on Machine Learning here, or have a look at our Deep Learning Lecture. I would also appreciate a follow on YouTube, Twitter, Facebook, or LinkedIn in case you want to be informed about more essays, videos, and research in the future. This article is released under the Creative Commons 4.0 Attribution License and can be reprinted and modified if referenced. If you are interested in generating transcripts from video lectures, try AutoBlog.

Translated from: https://towardsdatascience.com/segmentation-and-object-detection-part-4-f1d0d213976b
