Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

Self-supervised learning, semi-supervised learning, pretraining, self-training, and robust representations are some of the hottest terms right now in Computer Vision and Deep Learning. The recent progress in self-supervised learning is astounding. In this vein, researchers at FAIR have come up with a new paper that introduces a new method to learn robust image representations.

Introduction

One of the most important goals of self-supervised learning is to learn robust representations without using labels. Recent works try to achieve this goal by combining two elements: a contrastive loss and image transformations. Basically, we want our model to learn more robust representations, not just high-level features, and to achieve a certain level of invariance to image transformations.

The contrastive loss explicitly compares pairs of image representations. It pushes apart the representations that come from different images while pulling together the representations that come from different transformations, or views, of the same image.

Computing all the pairwise comparisons on a large dataset is not practical. There are two ways to overcome this constraint. First, instead of comparing all pairs, we can approximate the loss by reducing the comparison to a fixed number of random images. Second, instead of approximating the loss, we can approximate the task, e.g. instead of discriminating between every pair of images, we discriminate between groups of images with similar features.

Clustering is a good example of this kind of approximation. However, clustering alone can't solve the problem, as it has its own limitations. For example, the clustering objective doesn't scale well with the dataset, because it requires a pass over the entire dataset to form image codes during training.

Proposal

To overcome the limitations listed above, the authors propose the following:

  1. Online Clustering Loss: The authors propose a scalable loss function that works with both large and small batch sizes and doesn't require extras like a memory bank or a momentum encoder. Theoretically, it can be scaled to an unlimited amount of data.

  2. Multi-Crop Strategy: A new augmentation technique that uses a mix of views with different resolutions in place of two full-resolution views, without increasing the memory or compute requirements much.

  3. Combining the above two into a single model that outperforms previous SSL methods, as well as supervised pretraining, on multiple downstream tasks.

Method

The ultimate goal of this exercise is to learn visual features in an online manner without supervision. To achieve this, the authors propose an online clustering-based self-supervised learning method.

But how is it different from typical clustering approaches?

Typical clustering methods like DeepCluster are offline, as they generally rely on two steps. In the first step, we cluster the image features of the entire dataset, and in the second step, we predict the clusters, or codes, for different image views. The fact that these methods require multiple passes over the dataset makes them unsuitable for online learning. Let us see how the authors tackle these problems step by step.

Online Clustering

  1. We have an image transformation set T. Each image x_n is transformed into an augmented view x_nt by applying a transformation t sampled from T.

  2. The augmented view x_nt is then mapped to a vector representation by applying a non-linear mapping f_θ (the image encoder).

  3. This feature vector is then projected onto the unit sphere, which IMO is just a normalization step, giving z_nt. Let's take a look at the order again: x_n → x_nt → f_θ(x_nt) → z_nt.

  4. We then compute the code q_nt for this vector z_nt by mapping it to a set of K trainable prototype vectors {c_1, c_2, ..., c_K}. The matrix formed by these vectors is denoted by C.
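The steps above can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions: random arrays stand in for the encoder output, the sizes `B`, `D`, `K` are toy values, and the prototypes are stored as rows of `C` (the paper stacks them as columns).

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project each vector onto the unit sphere (step 3)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

rng = np.random.default_rng(0)
B, D, K = 8, 16, 4                          # batch size, feature dim, prototypes (toy values)

features = rng.normal(size=(B, D))          # stand-in for encoder outputs f_theta(x_nt)
z = l2_normalize(features)                  # normalized features z_nt
C = l2_normalize(rng.normal(size=(K, D)))   # K prototype vectors c_1..c_K, one per row

scores = z @ C.T                            # (B, K) dot products z_nt . c_k
```

The `scores` matrix is what the codes q_nt are later computed from; how exactly is the subject of the online code computation section.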

Swapped Prediction Problem

We talked about image transformation, the feature vector projection, and code computation (q), but we haven't discussed why we are doing it this way. As said earlier, one of the goals of this whole exercise is to learn visual features online without any supervision. We want our models to learn robust representations that are consistent across different image views.

The authors propose to enforce consistency between codes from different augmentations of the same image. This is inspired by contrastive learning, but the difference is that instead of directly comparing the feature vectors, we compare the cluster assignments for different image views. How?

Once we have computed the features z_t and z_s from two different augmentations of the same image, we compute the codes q_t and q_s by mapping the feature vectors to the K prototypes. The authors then propose to use a swapped prediction problem with the following loss function:
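Written out in LaTeX, the swapped prediction loss is the following. This is my transcription of equation (1) of the paper, so please check it against the original:

```latex
L(z_t, z_s) = \ell(z_t, q_s) + \ell(z_s, q_t)
```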

Each term on the right-hand side of this equation is a cross-entropy loss that measures the fit between a feature z and a code q. The intuition behind this is that if the two features capture the same information, it should be possible to predict the code of one from the other feature. It is quite similar to contrastive learning, but here we are comparing the codes instead of comparing the features directly. If we expand one of the terms on the right-hand side, it looks like this:
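In LaTeX, and again as my transcription of the paper's definition rather than a copy of the original figure, one expanded term reads:

```latex
\ell(z_t, q_s) = -\sum_{k} q_s^{(k)} \log p_t^{(k)},
\qquad
p_t^{(k)} = \frac{\exp\left(\frac{1}{\tau} z_t^\top c_k\right)}
                 {\sum_{k'} \exp\left(\frac{1}{\tau} z_t^\top c_{k'}\right)}
```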

Here the softmax operation is applied to the dot products between z and the prototypes in C. The term τ is the temperature parameter. Taking this loss over all the images and pairs of data augmentations leads to the following loss function for the swapped prediction problem:
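Transcribed into LaTeX from the paper (so treat the exact layout as my reconstruction), the full loss over N images is:

```latex
-\frac{1}{N} \sum_{n=1}^{N} \sum_{s,t \sim \mathcal{T}}
\left[
  \frac{1}{\tau} z_{nt}^\top C q_{ns}
  + \frac{1}{\tau} z_{ns}^\top C q_{nt}
  - \log \sum_{k=1}^{K} \exp\left(\frac{z_{nt}^\top c_k}{\tau}\right)
  - \log \sum_{k=1}^{K} \exp\left(\frac{z_{ns}^\top c_k}{\tau}\right)
\right]
```

Each bracket is just the two cross-entropy terms of the swapped loss written out, using the fact that the entries of a code q sum to one.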

This loss function is jointly minimized with respect to the prototypes C and the parameters θ of the image encoder f, which is used to produce the features z_nt.
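To make the swapped prediction loss concrete, here is a minimal NumPy sketch for one batch. The function name `swapped_loss`, the row-wise layout of `C`, the default `tau=0.1`, and the uniform placeholder codes are all my own assumptions for illustration; in the actual method the codes come from the Sinkhorn-Knopp step and gradients flow only through the features and prototypes.

```python
import numpy as np

def log_softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)          # shift for numerical stability
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def swapped_loss(z_t, z_s, q_t, q_s, C, tau=0.1):
    """L(z_t, z_s) = l(z_t, q_s) + l(z_s, q_t), averaged over the batch.

    z_t, z_s: (B, D) normalized features of two views of the same images.
    q_t, q_s: (B, K) codes whose rows sum to 1, treated as fixed targets.
    C:        (K, D) prototype vectors, one per row.
    """
    log_p_t = log_softmax(z_t @ C.T / tau)           # (B, K) log-probabilities from view t
    log_p_s = log_softmax(z_s @ C.T / tau)           # (B, K) log-probabilities from view s
    loss_t = -(q_s * log_p_t).sum(axis=1)            # predict the code of view s from view t
    loss_s = -(q_t * log_p_s).sum(axis=1)            # predict the code of view t from view s
    return float((loss_t + loss_s).mean())

def unit(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

rng = np.random.default_rng(0)
B, D, K = 8, 16, 4                                   # toy sizes
z_t, z_s = unit(rng.normal(size=(B, D))), unit(rng.normal(size=(B, D)))
C = unit(rng.normal(size=(K, D)))
q_uniform = np.full((B, K), 1.0 / K)                 # placeholder codes, not real Sinkhorn output
loss = swapped_loss(z_t, z_s, q_uniform, q_uniform, C)
```

Note the "swap": the code from view s is the target for the prediction made from view t, and vice versa.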

Online Codes Computation

When we started this discussion, we talked about offline vs online clustering, but we haven't looked at what makes this method online.

In order to make this method online, the authors propose to compute the codes using only the image features within a batch. The codes are computed using the prototypes C such that all the examples in a batch are equally partitioned among the prototypes. The equipartition constraint is very important here, as it ensures that the codes for different images in a batch are distinct, thus preventing the trivial solution where every image has the same code.

Given B feature vectors Z = [z_1, z_2, ..., z_B], we are interested in mapping them to the prototypes C = [c_1, ..., c_K]. This mapping, i.e. the codes, is represented by Q = [q_1, ..., q_B], and Q is optimized to maximize the similarity between the features and the prototypes, i.e.
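In LaTeX, as my transcription of equation (3) of the paper (the set of admissible matrices, written here as a calligraphic Q, is defined a little further down):

```latex
\max_{Q \in \mathcal{Q}} \; \operatorname{Tr}\left(Q^\top C^\top Z\right) + \varepsilon H(Q),
\qquad
H(Q) = -\sum_{i,j} Q_{ij} \log Q_{ij}
```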

where H(Q) is the entropy function, and ε is a parameter that controls the smoothness of the mapping. The above expression represents an optimal transport problem (more about it later). We have the features and the prototypes, and with those we want to find the optimal codes. As far as I can tell, the entropy term only smooths the assignment; the equipartition itself is enforced by the constraint on Q described next.

Also, as we are working on mini-batches, the constraint is imposed on mini-batches and looks something like this:
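Transcribed into LaTeX from the paper (my reconstruction, with 1_B defined analogously as the vector of ones in dimension B), the constraint set is:

```latex
\mathcal{Q} = \left\{ Q \in \mathbb{R}_{+}^{K \times B} \;\middle|\;
  Q \mathbf{1}_B = \frac{1}{K} \mathbf{1}_K, \;
  Q^\top \mathbf{1}_K = \frac{1}{B} \mathbf{1}_B \right\}
```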

where 1_K denotes the vector of ones in dimension K. These constraints enforce that, on average, each prototype is selected at least B / K times in the batch.

Once a solution Q* is found for (3), there are two options we can go with. First, we can directly use the soft codes. Second, we can get discrete codes by rounding the solution. The authors found that discrete codes work well when computing codes in an offline manner on the full dataset. However, in the online setting where we are working with mini-batches, using the discrete codes performs worse than using the continuous codes. One explanation is that the rounding needed to obtain discrete codes is a more aggressive optimization step than gradient updates. While it makes the model converge rapidly, it leads to a worse solution. These soft codes Q* take the form of a normalized exponential matrix.
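In LaTeX, following the paper's notation as I have transcribed it:

```latex
Q^{*} = \operatorname{Diag}(u) \, \exp\left(\frac{C^\top Z}{\varepsilon}\right) \operatorname{Diag}(v)
```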

Here u and v are renormalization vectors computed with a small number of matrix multiplications using the iterative Sinkhorn-Knopp algorithm.
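A minimal NumPy sketch of this iteration is given below. It alternates row and column normalization of exp(scores / ε), which is what the renormalization vectors u and v amount to. The function name `sinkhorn`, the `(B, K)` input layout, and the defaults `eps=0.05` and `n_iters=3` are assumptions on my part (chosen to resemble what I recall from the authors' released code), not something stated in this post.

```python
import numpy as np

def sinkhorn(scores, eps=0.05, n_iters=3):
    """Approximate the soft codes Q* for one batch.

    scores: (B, K) array of dot products between features and prototypes.
    Returns a (B, K) array of soft codes; each row sums to 1.
    """
    B, K = scores.shape
    # Subtracting the global max only rescales Q, and that scale is normalized away.
    Q = np.exp((scores - scores.max()) / eps).T      # (K, B)
    Q /= Q.sum()
    for _ in range(n_iters):
        Q /= Q.sum(axis=1, keepdims=True)            # rows: equal total mass per prototype
        Q /= K
        Q /= Q.sum(axis=0, keepdims=True)            # columns: equal total mass per sample
        Q /= B
    return (Q * B).T                                 # rescale so each sample's code sums to 1

rng = np.random.default_rng(0)
scores = rng.uniform(-1.0, 1.0, size=(32, 4))        # toy scores for B=32, K=4
codes = sinkhorn(scores)
```

With only a few iterations the prototypes are approximately, not exactly, equally used; running more iterations pushes each column sum of `codes` towards B / K.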

Side note: Thanks to Amit Chaudhary for pointing out the relevant resources on the transportation polytope and the Sinkhorn-Knopp algorithm.

Working with small batches

When the number B of batch features is too small compared to the number of prototypes K, it is impossible to equally partition the batch among the K prototypes. Therefore, when working with small batches, the authors use features from the previous batches to augment the size of Z in (3), and only the codes of the current batch's features are used in the training loss.

The authors propose to store around 3K features, i.e., in the same range as the number of code vectors. This means that they only keep features from the last 15 batches with a batch size of 256, while contrastive methods typically need to store the last 65K instances obtained from the last 250 batches.
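The bookkeeping behind these numbers can be sketched with a bounded queue. `FeatureQueue` is a hypothetical helper of my own, and the tuples pushed into it are placeholders for real feature vectors; the point is only to show how keeping the last 15 batches works and what the sizes come to.

```python
from collections import deque

class FeatureQueue:
    """Keep the features of the most recent `max_batches` batches (hypothetical helper)."""

    def __init__(self, max_batches):
        # deque with maxlen drops the oldest batch automatically on append
        self.batches = deque(maxlen=max_batches)

    def push(self, batch_features):
        self.batches.append(list(batch_features))

    def stored(self):
        """All stored features, oldest first."""
        return [f for batch in self.batches for f in batch]

batch_size = 256
queue = FeatureQueue(max_batches=15)
for step in range(20):                                   # simulate 20 training steps
    queue.push([(step, i) for i in range(batch_size)])   # placeholder "features"

n_stored = len(queue.stored())                # 15 * 256 = 3840 stored features
contrastive_bank = 250 * batch_size           # 64000 instances, roughly the 65K quoted above
```

At each step the stored features would be concatenated with the current batch before computing the codes, and only the codes of the current batch would enter the loss.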

All of the above information is related to online clustering only. Nowhere did you discuss the new augmentation strategy. Trying to keep the blog post short? Huh!

Multi-crop: Augmenting views with smaller images

It is well known that random crops generally help, in both supervised and self-supervised learning. Comparing random crops of an image plays a central role, as it captures information about the relations between parts of a scene or an object.

Perfect. Let's take crops of sizes 4x4, 8x8, 16x16, 32x32, ... Enough data to make the bloody network learn, ha!

Well, you can do that, but increasing the number of crops quadratically increases the memory and compute requirements. To address this, the authors propose a new multi-crop strategy where they use:

  1. Two standard-resolution crops.
  2. V additional low-resolution crops that cover only small parts of the image.

The loss function in (1) is then generalized as:
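In LaTeX, as my transcription of the paper's generalized loss, where views t_1 and t_2 are the two standard-resolution crops and the remaining V views are the low-resolution ones:

```latex
L(z_{t_1}, z_{t_2}, \ldots, z_{t_{V+2}}) =
\sum_{i \in \{1, 2\}} \; \sum_{v=1}^{V+2} \mathbf{1}_{v \neq i} \, \ell(z_{t_v}, q_{t_i})
```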

The codes are computed using only the standard-resolution crops. Intuitively, computing codes for all the crops would increase the computational time. Also, a crop taken over a very small area doesn't add much information, and codes computed from such limited, partial information can degrade the overall performance.
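One way to see what "codes only from the standard-resolution crops" means for the loss is to enumerate which view predicts which code. The helper `multicrop_pairs` below is a hypothetical illustration of mine, using 0-based view indices where views 0 and 1 are the two standard-resolution crops.

```python
def multicrop_pairs(num_low_res):
    """List (code_view, predicting_view) index pairs for the multi-crop loss.

    Views 0 and 1 are the standard-resolution crops; views 2..V+1 are low-resolution.
    Codes come only from views 0 and 1; every other view is used to predict them.
    """
    num_views = 2 + num_low_res
    return [(i, v) for i in (0, 1) for v in range(num_views) if v != i]

pairs = multicrop_pairs(num_low_res=4)   # V = 4 low-resolution crops
num_terms = len(pairs)                   # 2 * (V + 1) cross-entropy terms
```

The number of loss terms therefore grows linearly in V, and the Sinkhorn step is run for just two views no matter how many low-resolution crops are added.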

Results

The authors performed a bunch of experiments. I won't list all the training details here; you can read about them directly in the paper. One important thing to note is that most of the hyperparameters were taken directly from the SimCLR paper, along with the LARS optimizer with a cosine learning rate schedule and the MLP projection head. I am listing some of the results here.

Transfer learning on downstream tasks

Conclusion

I liked this paper a lot. IMHO, this is one of the best papers on SSL to date. Not only does it try to address the problems associated with the instance discrimination task and contrastive learning, but it also proposes a very creative way forward. The biggest strength of this method is that it is online.

Translated from: https://medium.com/@nainaakash012/unsupervised-learning-of-visual-features-by-contrasting-cluster-assignments-fbedc8b9c3db
