The 32 Best AI Papers of 2022: DALL·E 2, Stable Diffusion, ChatGPT and More

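This list of papers was compiled by Louis Bouchard, a PhD student at Mila, and is fairly reliable on the whole. His GitHub repository also has short video and written explainers, code links and more for many of the papers.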

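In the list below we have added each paper's main contributing institution. Institutions that contributed but sit late in the author order, and so look like courtesy affiliations, are ignored; a paper shared by two institutions counts as 0.5 for each. The counts seem to reflect where each company stands in AI: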

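Tier 1: Google (8 papers) and Meta (6 papers) hold the top two places. OpenAI has 3 papers, but two of them (DALL·E 2 and ChatGPT) were hugely influential; judged by flagship work, it arguably does not lose to the two giants.
Tier 2: NVIDIA, with 2.5 papers.
Tier 3: In China, Tencent, Baidu and Microsoft (via Microsoft Research Asia) have 1 paper each. Outside China, Samsung and Disney have 1 paper each, and Snap and Adobe have 0.5 each.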

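Universities account for 5.5 papers in total, fewer than either of the two giants on its own, which is a much weaker showing. Among them: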

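Tel Aviv ranks first by count with 1.5 papers, but Munich's Stable Diffusion was so influential that Munich should also be counted as first tier.
CMU and NTU (Nanyang Technological University) have 1 paper each: second tier.
USC and Berkeley have 0.5 each: third tier.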
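The counts above can be reproduced from the institution labels on the entries in the list. The following is a minimal sketch, not the article's own tooling. It assumes that a paper labelled with two institutions is split equally between them, that "Google DeepMind" is grouped under Google, and that Craiyon is counted as its own entry. The names `PAPERS`, `UNIVERSITIES` and `tally` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Entry number -> institutions taken from the label on each reference entry.
# English names are used here; "NTU" is Nanyang Technological University.
PAPERS = {
    1: ["Samsung"], 2: ["Tel Aviv"], 3: ["USC", "Snap"], 4: ["Google"],
    5: ["Tencent"], 6: ["Google"], 7: ["NVIDIA"], 8: ["OpenAI"],
    9: ["Google"], 10: ["Meta"], 11: ["Berkeley", "Adobe"], 12: ["Google"],
    13: ["Google"], 14: ["Craiyon"], 15: ["Meta"], 16: ["CMU"],
    17: ["Meta"], 18: ["Meta"], 19: ["Munich"], 20: ["NTU"],
    21: ["Tel Aviv", "NVIDIA"], 22: ["Microsoft"], 23: ["Meta"],
    24: ["OpenAI"], 25: ["Google"], 26: ["Google"], 27: ["NVIDIA"],
    28: ["Google"], 29: ["Meta"], 30: ["Baidu"], 31: ["OpenAI"],
    32: ["Disney"],
}

# Institutions treated as universities in the breakdown.
UNIVERSITIES = {"Tel Aviv", "Munich", "CMU", "NTU", "USC", "Berkeley"}


def tally(papers):
    """Give each institution an equal fractional share of every paper."""
    counts = defaultdict(float)
    for institutions in papers.values():
        share = 1.0 / len(institutions)  # assumption: equal split
        for name in institutions:
            counts[name] += share
    return dict(counts)


counts = tally(PAPERS)
university_total = sum(counts[name] for name in UNIVERSITIES)

# Print institutions from most to fewest papers, then the university total.
for name, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{name}: {n:g}")
print(f"Universities combined: {university_total:g}")
```

Under these assumptions the tally matches the figures quoted above: Google 8, Meta 6, OpenAI 3, NVIDIA 2.5, and 5.5 for universities combined.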

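By topic, large models, text-to-image generation and cross-modal learning were without doubt this year's hot spots; there are also several papers on GANs and other vision topics.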

[1] Samsung: Suvorov, R., Logacheva, E., Mashikhin, A., Remizova, A., Ashukha, A., Silvestrov, A., Kong, N., Goka, H., Park, K. and Lempitsky, V., 2022. Resolution-robust Large Mask Inpainting with Fourier Convolutions. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 2149–2159), https://arxiv.org/pdf/2109.07161.pdf

[2] Tel Aviv: Tzaban, R., Mokady, R., Gal, R., Bermano, A.H. and Cohen-Or, D., 2022. Stitch it in Time: GAN-Based Facial Editing of Real Videos. https://arxiv.org/abs/2201.08361

[3] USC & Snap: Kuang, Z., Olszewski, K., Chai, M., Huang, Z., Achlioptas, P. and Tulyakov, S., 2022. NeROIC: Neural Rendering of Objects from Online Image Collections. https://arxiv.org/pdf/2201.02533.pdf

[4] Google: Borsos, Z., Sharifi, M. and Tagliasacchi, M., 2022. SpeechPainter: Text-conditioned Speech Inpainting. https://arxiv.org/pdf/2202.07273.pdf

[5] Tencent: Wang, X., Li, Y., Zhang, H. and Shan, Y., 2021. Towards real-world blind face restoration with generative facial prior. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 9168–9178), https://arxiv.org/pdf/2101.04061.pdf

[6] Google: Piergiovanni, A.J., Casser, V., Ryoo, M.S. and Angelova, A., 2021. 4D-Net for learned multi-modal alignment. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 15435–15445), https://openaccess.thecvf.com/content/ICCV2021/papers/Piergiovanni_4D-Net_for_Learned_Multi-Modal_Alignment_ICCV_2021_paper.pdf.

[7] NVIDIA: Müller, T., Evans, A., Schied, C. and Keller, A., 2022. Instant Neural Graphics Primitives with a Multiresolution Hash Encoding, https://nvlabs.github.io/instant-ngp/assets/mueller2022instant.pdf

[8] OpenAI/DALL·E 2: Ramesh et al., 2022, Hierarchical Text-Conditional Image Generation with CLIP Latents, https://cdn.openai.com/papers/dall-e-2.pdf

[9] Google: Nitzan, Y., Aberman, K., He, Q., Liba, O., Yarom, M., Gandelsman, Y., Mosseri, I., Pritch, Y. and Cohen-Or, D., 2022. MyStyle: A Personalized Generative Prior. arXiv preprint arXiv:2203.17272.

[10] Meta/OPT: Zhang, Susan et al. OPT: Open Pre-trained Transformer Language Models. https://arxiv.org/abs/2205.01068

[11] Berkeley & Adobe: Epstein, D., Park, T., Zhang, R., Shechtman, E. and Efros, A.A., 2022. BlobGAN: Spatially Disentangled Scene Representations. arXiv preprint arXiv:2205.02837.

[12] Google DeepMind: Reed, S. et al., 2022. Gato: A Generalist Agent.

[13] Google/Imagen: Saharia et al., 2022. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding.

[14] Craiyon: Dayma et al., 2021. DALL·E Mini. doi:10.5281/zenodo.5146400. (A reproduction of DALL·E; only technical reports are available, and no formal paper was found.)

[15] Meta: NLLB Team et al., 2022, No Language Left Behind: Scaling Human-Centered Machine Translation. https://arxiv.org/abs/2207.04672

[16] CMU: Sheinin, M., Chan, D., O’Toole, M. and Narasimhan, S.G., 2022. Dual-Shutter Optical Vibration Sensing. Proc. IEEE CVPR. (CVPR 2022 best paper finalist.)

[17] Meta: Gafni, O., Polyak, A., Ashual, O., Sheynin, S., Parikh, D. and Taigman, Y., 2022. Make-a-scene: Scene-based text-to-image generation with human priors. https://arxiv.org/pdf/2203.13131.pdf

[18] Meta: Yang, G., Vo, M., Neverova, N., Ramanan, D., Vedaldi, A. and Joo, H., 2022. Banmo: Building animatable 3d neural models from many casual videos. In CVPR2022 (pp. 2863-2873). https://arxiv.org/abs/2112.12761

[19] Munich/Stable Diffusion: Rombach, R., Blattmann, A., Lorenz, D., Esser, P. and Ommer, B., 2022. High-resolution image synthesis with latent diffusion models. In CVPR2022 (pp. 10684–10695), https://arxiv.org/pdf/2112.10752.pdf

[20] NTU (Nanyang Technological University): Yang, J., Ang, Y.Z., Guo, Z., Zhou, K., Zhang, W. and Liu, Z., 2022. Panoptic Scene Graph Generation. arXiv preprint arXiv:2207.11247.

[21] Tel Aviv & NVIDIA: Gal, R., Alaluf, Y., Atzmon, Y., Patashnik, O., Bermano, A.H., Chechik, G. and Cohen-Or, D., 2022. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion.

[22] Microsoft: Ni, B., Peng, H., Chen, M., Zhang, S., Meng, G., Fu, J., Xiang, S. and Ling, H., 2022. Expanding Language-Image Pretrained Models for General Video Recognition. arXiv preprint arXiv:2208.02816.

[23] Meta/Make-A-Video: Singer et al., 2022. Make-A-Video: Text-To-Video Generation without Text-Video Data, https://makeavideo.studio/Make-A-Video.pdf

[24] OpenAI/Whisper: Radford, A., Kim, J.W., Xu, T., Brockman, G., McLeavey, C. and Sutskever, I., 2022. Robust Speech Recognition via Large-Scale Weak Supervision.

[25] Google: Poole, B., Jain, A., Barron, J.T. and Mildenhall, B., 2022. DreamFusion: Text-to-3D using 2D Diffusion. arXiv preprint arXiv:2209.14988.

[26] Google/Imagic: Kawar, B., Zada, S., Lang, O., Tov, O., Chang, H., Dekel, T., Mosseri, I. and Irani, M., 2022. Imagic: Text-Based Real Image Editing with Diffusion Models. arXiv preprint arXiv:2210.09276.

[27] NVIDIA: Balaji, Y. et al., 2022, eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers, https://arxiv.org/abs/2211.01324

[28] Google: Li, Z., Wang, Q., Snavely, N. and Kanazawa, A., 2022. InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images. In ECCV (pp. 515–534). Springer, Cham, https://arxiv.org/abs/2207.11148

[29] Meta/Galactica: Taylor et al., 2022. Galactica: A Large Language Model for Science.

[30] Baidu: Tang, J., Wang, K., Zhou, H., Chen, X., He, D., Hu, T., Liu, J., Zeng, G. and Wang, J., 2022. Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition. arXiv preprint arXiv:2211.12368.

[31] OpenAI/ChatGPT: ChatGPT: Optimizing Language Models for Dialogue.

[32] Disney/FRAN: Zoss et al., DisneyResearch, 2022. Production-Ready Face Re-Aging for Visual Effects, https://studios.disneyresearch.com

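Which papers on this list do you agree with, which do you think should not have been picked as the best, and which important papers are missing? Comments are welcome.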

