Neural Synesthesia: When Art Meets GANs

Neural Synesthesia is an AI art project that aims to create new and unique audiovisual experiences with artificial intelligence. It does this through collaborations between humans and generative networks. The results feel almost like organic art. Swirls of color and images blend together as faces, scenery, objects, and architecture transform to music. There’s a sense of things swinging between feeling unique and oddly familiar at the same time.

Neural Synesthesia was created by Xander Steenbrugge, an online content creator who got his start in data science while working on brain-computer interfaces. During his master's thesis, he helped build a system that classified imagined movement from brain signals. This system allowed patients with locked-in syndrome to manipulate physical objects with their minds. The experience impressed upon Steenbrugge the importance of machine learning and the potential for AI technology to build amazing things.

Outside of Neural Synesthesia, Steenbrugge works with a startup using machine learning for drug discovery and runs a popular YouTube channel. He’s also working on wzrd.ai, a platform that uses AI to augment audio with immersive video. In this interview, we talk about Neural Synesthesia’s inspiration and how it works, and discuss AI and creativity.

What were the inspirations for Neural Synesthesia?

I’ve always had a fascination for aesthetics. Examples are mountain panoramas, indie game design, scuba diving in coral reefs, psychedelic experiences, and films by Tarkovsky. Beautiful visual scenes have the power to convey meaning without words. It’s almost like a primal, visual language we all speak intuitively.

When I saw the impressive advances in generative models (especially GANs), I started imagining where this could lead. Just like the camera and the projector brought about the film industry, I wondered what narratives could be built on top of the deep learning revolution. To get hands-on with this, my first idea was to simply tweak the existing codebases for GANs to allow for direct visualization of audio. This was how Neural Synesthesia was born.

How much work did you do for the first Neural Synesthesia piece? Did you face any unique challenges?

I think coding for the first rendered video took over six months because I was doing it in my spare time. The biggest challenge was how to manipulate the GAN's latent input space using features extracted from the audio track. I wanted to create a satisfying match between visual and auditory perception for viewers.

Here’s a little insight into what I do: I apply a Fourier Transform to extract time-varying frequency components from the audio. I also perform harmonic/percussive decomposition, which basically separates the melody from the rhythmic components of the track. These three signals (instantaneous frequency content, melodic energy, and beats) are then combined to manipulate the GAN's latent space, resulting in visuals that are directly controlled by the audio.
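
To make that pipeline concrete, here is a minimal sketch of the feature-extraction step using librosa and NumPy. It is not the project's actual code: the hop size, the normalization, and especially the random linear maps that push the features into the latent space are illustrative assumptions. Only the three signals themselves (spectrogram frequency content, melodic energy from the harmonic component, and a beat envelope from the percussive component) follow the description above.

```python
# Hypothetical sketch of an audio-to-latent pipeline (not the project's code).
import numpy as np
import librosa

def audio_to_latents(audio_path, latent_dim=512, fps=30, seed=0):
    y, sr = librosa.load(audio_path, sr=22050, mono=True)
    hop = sr // fps                                    # one analysis frame per video frame

    # 1) Instantaneous frequency content: magnitude of the Fourier Transform.
    spec = np.abs(librosa.stft(y, n_fft=2048, hop_length=hop))

    # 2) Harmonic/percussive decomposition: melody vs. rhythm.
    y_harm, y_perc = librosa.effects.hpss(y)
    melodic_energy = librosa.feature.rms(y=y_harm, hop_length=hop)[0]
    beat_envelope = librosa.onset.onset_strength(y=y_perc, sr=sr, hop_length=hop)

    n_frames = min(spec.shape[1], len(melodic_energy), len(beat_envelope))

    def norm(x):                                       # scale each signal to [0, 1]
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    spec = norm(spec[:, :n_frames])
    melodic_energy = norm(melodic_energy[:n_frames])
    beat_envelope = norm(beat_envelope[:n_frames])

    # Combine the three signals into per-frame latent vectors. The fixed random
    # projections below are stand-ins for whatever mapping the artist really uses.
    rng = np.random.default_rng(seed)
    freq_proj = rng.standard_normal((latent_dim, spec.shape[0])) / np.sqrt(spec.shape[0])
    base = rng.standard_normal(latent_dim)             # a "home" point in latent space
    melody_dir = rng.standard_normal(latent_dim)       # direction swept by the melody
    beat_dir = rng.standard_normal(latent_dim)         # direction pulsed by the beats

    latents = (
        base[None, :]
        + (freq_proj @ spec).T                         # frequency content -> texture drift
        + melodic_energy[:, None] * melody_dir         # melodic energy    -> slow sweeps
        + beat_envelope[:, None] * beat_dir            # beats             -> sharp pulses
    )

    # Temporal smoothing so the visuals do not flicker from frame to frame.
    kernel = np.ones(5) / 5.0
    smooth = lambda col: np.convolve(col, kernel, mode="same")
    return np.apply_along_axis(smooth, 0, latents)     # shape: (n_frames, latent_dim)
```

Each row of the returned array would then be fed to a pretrained generator, one latent vector per video frame, which is what ties the visuals directly to the audio.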

Is every image dataset unique? How do you collect images for these datasets, and how many images do you need?

I spent a lot of time collecting large and diverse image training data to create interesting generative models. Unlike most GAN work, which aims for realism, these datasets have aesthetics as their primary goal. Experimenting with various blends of image collections is time-consuming, since GAN training requires lots of compute and I don’t exactly have a data center at my disposal.

Most of the datasets I use are image sets I’ve encountered over the years. I saved them because I knew one day I’d have a use for them. I’ve always had an interest in aesthetics so when I discover something that triggers that sixth sense, I save it.

Most GAN papers use datasets of more than 50,000 images, but in practice you can get away with fewer examples. The first step is to start from a pre-trained GAN model that has already been trained on a large dataset. This means the convolutional filters in the model are already well-shaped and contain useful information about the visual world. Secondly, there’s data augmentation, which is basically flipping or rotating an image to effectively increase the amount of training data. Since I don’t really care about sample realism, I can actually afford to do very aggressive image augmentation. This results in many more training images than actual source images. For example, the model I used for a recent performance at Tate Modern had only 3,000 real images, aggressively augmented to a training set of around 70,000.
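
As a rough illustration of that kind of aggressive, realism-agnostic augmentation (not the author's actual script), the sketch below expands a folder of roughly 3,000 source images into about 70,000 training images by saving around 24 randomly flipped, rotated, cropped, and color-jittered variants of each one. The directory names, output resolution, and variant count are hypothetical.

```python
# Hypothetical offline augmentation script: ~3,000 images x 24 variants ≈ 70,000.
import random
from pathlib import Path
from PIL import Image, ImageEnhance

SRC_DIR = Path("datasets/source")       # curated source images (hypothetical path)
OUT_DIR = Path("datasets/augmented")    # augmented training set written here
VARIANTS_PER_IMAGE = 24
OUT_DIR.mkdir(parents=True, exist_ok=True)
random.seed(0)

def random_variant(img: Image.Image) -> Image.Image:
    if random.random() < 0.5:                                           # horizontal flip
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    img = img.rotate(random.uniform(-30, 30), resample=Image.BILINEAR)  # rotation
    w, h = img.size                                                     # crop 70-100% of the frame
    scale = random.uniform(0.7, 1.0)
    cw, ch = int(w * scale), int(h * scale)
    left, top = random.randint(0, w - cw), random.randint(0, h - ch)
    img = img.crop((left, top, left + cw, top + ch)).resize((512, 512), Image.LANCZOS)
    img = ImageEnhance.Color(img).enhance(random.uniform(0.7, 1.3))     # mild color jitter
    img = ImageEnhance.Brightness(img).enhance(random.uniform(0.8, 1.2))
    return img

for path in sorted(SRC_DIR.glob("*.jpg")):
    source = Image.open(path).convert("RGB")
    for i in range(VARIANTS_PER_IMAGE):
        random_variant(source).save(OUT_DIR / f"{path.stem}_{i:02d}.jpg", quality=95)
```

Because realism is not the goal here, distortions this strong (large rotations, tight crops, noticeable color shifts) are acceptable; a photorealism-oriented GAN would normally use gentler settings.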

Recently, a lot of new research explicitly addresses the low-data regime for GANs (such as what you can find here, here, and here). My current codebase leverages these techniques to train GANs with as little as a few hundred images.
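
The linked papers aren't reproduced here, but one widely used idea from this line of work is to apply the same differentiable augmentations to both real and generated images before the discriminator sees them (as in DiffAugment and adaptive discriminator augmentation), which keeps the discriminator from memorizing a tiny real dataset. The PyTorch fragment below is only an illustrative sketch of that idea; `G`, `D`, the optimizer, and the data pipeline are placeholders, not code from this project.

```python
# Illustrative sketch of differentiable augmentation for low-data GAN training.
import torch
import torch.nn.functional as F

def diff_augment(x: torch.Tensor) -> torch.Tensor:
    """Random brightness shift and horizontal flip, both differentiable w.r.t. x."""
    x = x + (torch.rand(x.size(0), 1, 1, 1, device=x.device) - 0.5) * 0.4
    flip = torch.rand(x.size(0), device=x.device) < 0.5
    return torch.where(flip[:, None, None, None], x.flip(dims=[3]), x)

def discriminator_step(G, D, real, opt_D, latent_dim=512):
    """One D update with the standard non-saturating logistic loss; G and D are placeholders."""
    z = torch.randn(real.size(0), latent_dim, device=real.device)
    fake = G(z).detach()
    # Key point: the SAME augmentation policy is applied to reals and fakes,
    # so the discriminator cannot simply memorize the few real images.
    loss = F.softplus(-D(diff_augment(real))).mean() + F.softplus(D(diff_augment(fake))).mean()
    opt_D.zero_grad()
    loss.backward()
    opt_D.step()
    return loss.item()

# The generator step would likewise pass G(z) through diff_augment before D;
# because the augmentations are differentiable, gradients still reach G.
```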

You talk about Neural Synesthesia as a collaboration between yourself and AI. What kind of potential do you see for the future of creative projects utilizing AI technology?

This is actually the most interesting part of the entire project. I usually set out with specific intentions as to what type of visual I want to create. I then curate my dataset, tune the parameters of the training script, and start training the model. A full training run usually requires a few days to converge. Very quickly though, the model starts returning samples that are often unexpected and surprising. This sets an intriguing feedback loop into motion, where I change the code of the model, the model responds with different samples, I react, and it goes on. The creative process is no longer fully under my control; I am effectively collaborating with an AI system to create these works.

I truly believe this is the biggest strength of this approach: you are not limited by your own imagination. There’s an entirely alien system that is also influencing the same space of ideas, often in unexpected and interesting ways. This leads you as a creator into areas you never would have wandered by yourself.

Looking at the tremendous pace of progress in the field of AI strongly motivates me to imagine what might be possible 10 years from now. After all, modern Deep Learning is only 8 years old! I expect that Moore’s law will continue to bring more powerful computing capabilities, that AI models will continue to scale with more compute, and that the possibilities of this medium will follow this exponential trend.

Neural Synesthesia in its current form is a prototype. It’s a version 0.1 of a grander idea to leverage deep learning as the core component of the advanced interactive media experiences of the future.

What kind of creative works do you have planned for the future of Neural Synesthesia? Do you have any goals or future plans?

I’ve always been fascinated by the overview effect, where astronauts describe how seeing the Earth in its entirety from space profoundly changes their worldview, kindling the awareness that we are all part of the same, fragile ecosystem, suspended in the blackness of space.

To me, this is great evidence that profound, alienating experiences can have spectacular effects on people’s choices and behaviors. And what we need is a shift in perception away from tribal feelings of us versus them. We need to move towards a global society with common goals and common challenges.

Our world is increasingly facing global issues that stem from our locally-centered worldviews. These views are deeply rooted in our genes; we evolved in small tribes that only needed to attend to their local environments. However, the world is evolving towards a globally connected web of events, where the present can no longer be disconnected from the system as a whole. For example, look at climate change, and people fighting over artificially drawn borders of nationality, race, or even gender.

As such, my long-term vision is to create rich, immersive experiences with the power to shift perspectives. Cinema 2.0, if you will. I imagine an interactive experience, where a group of people can enter an AI-generated world (e.g. using Virtual Reality headsets) where the visual scenery is so utterly alien and breathtaking that it forces the mind to temporarily halt its usual narrative of describing what’s going on. This is essentially the goal of meditation: to experience the world as it is, emphasizing the experience of the present moment rather than the narrative we construct around it.

The goal, then, is to mimic the perceptual shift one can experience from a positive psychedelic experience, meditative insight, or a trip to space: to realize that our ‘normal’ world view is just a tiny sliver of what is possible to experience. I believe this perceptual shift is probably the most unique human characteristic. It allows the great wonder of imagination to power our world, and is the most powerful tool we have to tackle the world’s largest challenges.

From a technology standpoint, how far away are we from creating these basic “cinema 2.0” experiences?

I would say that from a technical point of view, we’re getting very close. The latest generative models (e.g. StyleGAN2 or BigGAN-deep) are able to create very realistic samples and allow for very high diversity. What is lacking at present are tools that let non-coders get creative with this technology. The main challenge, at least for me, is to create a compelling narrative.

You can see more of Steenbrugge’s Neural Synesthesia work at its dedicated homepage, and try out wzrd.ai here. He’s also active on YouTube and Twitter, and open to collaborating with other creatives who have similar ideas and aspirations. You can contact him at neuralsynesthesia@gmail.com.

Original article reposted with permission.

Translated from: https://medium.com/datadriveninvestor/neural-synesthesia-when-art-meets-gans-6453c7c0c5b8
