One-hot representation

  • assigns a unique index to each word → a high-dimensional sparse representation
  • cannot capture the semantic relatedness among words (in one-hot representation, cat is as far from dog as it is from bed; see the sketch after this list)
  • inflexible in dealing with new words in real-world scenarios
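A minimal numpy sketch of this limitation (the three-word vocabulary is made up for illustration): every pair of distinct one-hot vectors has zero cosine similarity, so cat is no closer to dog than to bed.

```python
import numpy as np

# Toy vocabulary; each word gets a unique index (hypothetical example).
vocab = {"cat": 0, "dog": 1, "bed": 2}

def one_hot(word, vocab):
    """Return the sparse one-hot vector for a word."""
    vec = np.zeros(len(vocab))
    vec[vocab[word]] = 1.0
    return vec

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Both similarities are 0: one-hot vectors treat cat/dog and cat/bed identically.
print(cosine(one_hot("cat", vocab), one_hot("dog", vocab)))  # 0.0
print(cosine(one_hot("cat", vocab), one_hot("bed", vocab)))  # 0.0
```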

Distributed representation

  • Representation learning aims to learn informative representations of objects from raw data automatically. Distributed representation has proven more efficient because its low dimensionality avoids the sparsity issue.
  • Deep learning is a typical approach for representation learning.

Development of representation learning in NLP
  • N-gram Model: predicts the next item in a sequence based on its previous n-1 items; a probabilistic language model.
  • Bag-of-Words: disregards the order of the words in the document; each word that appears in the document corresponds to a unique, nonzero dimension, and a score (e.g., the number of occurrences) is computed for each word to indicate its weight.
  • TF-IDF: extends BoW by taking the importance of different words into consideration rather than treating all words equally (see the sketch after this list).
  • Neural Probabilistic Language Model (NPLM): first assigns a distributed vector to each word, then uses a neural network to predict the next word; examples include feed-forward, recurrent, and LSTM-based neural network language models.
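A minimal sketch of the bag-of-words and TF-IDF rows above, assuming a recent scikit-learn; the two toy documents are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log"]  # toy corpus

# Bag-of-words: each column is a word, each entry is an occurrence count.
bow = CountVectorizer()
print(bow.fit_transform(docs).toarray())
print(bow.get_feature_names_out())

# TF-IDF: frequent but uninformative words (e.g. "the") get lower weights.
tfidf = TfidfVectorizer()
print(tfidf.fit_transform(docs).toarray().round(2))
```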

Word embeddings:

Word2Vec, GloVe, fastText

Inspired by NPLM, many methods emerged that embed words into distributed representations. In the NLP pipeline, word embeddings map discrete words into informative low-dimensional vectors.
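A minimal sketch of training static word embeddings with Word2Vec, assuming the gensim library (4.x API) and a toy tokenized corpus; real embeddings are trained on far larger text.

```python
from gensim.models import Word2Vec

# Toy tokenized corpus (illustrative only).
sentences = [
    ["the", "cat", "chases", "the", "mouse"],
    ["the", "dog", "chases", "the", "cat"],
    ["the", "dog", "sleeps", "on", "the", "bed"],
]

# Skip-gram Word2Vec: each word is mapped to a dense 50-dimensional vector.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=100)

print(model.wv["cat"].shape)              # (50,)
print(model.wv.similarity("cat", "dog"))  # non-trivial similarity, unlike one-hot
```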

Pre-trained Language Models (PLM):

ELMo, BERT

  • take complicated context in text into consideration
  • compute dynamic representations for words based on their context, which is especially useful for words with multiple meanings
  • follow the pre-training and fine-tuning pipeline
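A minimal sketch of context-dependent representations, assuming the Hugging Face transformers package and the public bert-base-uncased checkpoint: the same surface word "bank" receives different vectors in different sentences.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence, word):
    """Return the contextual vector of `word`'s token in `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]        # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return hidden[tokens.index(word)]

v1 = word_vector("he sat on the bank of the river", "bank")
v2 = word_vector("she deposited money in the bank", "bank")
# The two "bank" vectors differ, reflecting their different contexts.
print(torch.cosine_similarity(v1, v2, dim=0).item())
```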

The Pre-trained language model family

Applications

Neural Relation Extraction

  • Sentence-Level NRE: a basic form of sentence-level NRE consists of three components: (a) an input encoder that gives a representation for each input word (word embeddings, position embeddings, part-of-speech (POS) tag embeddings, WordNet hypernym embeddings); (b) a sentence encoder that computes either a single vector or a sequence of vectors to represent the original sentence; (c) a relation classifier that calculates the conditional probability distribution over all relations (see the sketch after this list).

  • Bag-Level NRE: utilizing information from multiple sentences (bag-level) rather than a single sentence (sentence-level) to decide if a relation holds between two entities. A basic form of bag-level NRE consists of four components: (a) an input encoder similar to sentence-level NRE, (b) a sentence encoder similar to sentence-level NRE, (c) a bag encoder which computes a vector representing all related sentences in a bag, and (d) a relation classifier similar to sentence-level NRE which takes bag vectors as input instead of sentence vectors.
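A minimal PyTorch sketch of the three sentence-level components above (an input encoder with word and position embeddings, a CNN sentence encoder, and a softmax relation classifier); all names and sizes are illustrative assumptions, not the exact architecture described in the book.

```python
import torch
import torch.nn as nn

class SentenceNRE(nn.Module):
    """Toy sentence-level relation extractor: (a) input encoder, (b) sentence encoder, (c) classifier."""

    def __init__(self, vocab_size, n_relations, max_len=100, word_dim=50, pos_dim=5, hidden=230):
        super().__init__()
        # (a) Input encoder: word embeddings plus position embeddings relative to the two entities.
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.pos1_emb = nn.Embedding(2 * max_len, pos_dim)
        self.pos2_emb = nn.Embedding(2 * max_len, pos_dim)
        # (b) Sentence encoder: a 1-D CNN with max pooling over the sequence.
        self.conv = nn.Conv1d(word_dim + 2 * pos_dim, hidden, kernel_size=3, padding=1)
        # (c) Relation classifier: conditional probability distribution over all relations.
        self.classifier = nn.Linear(hidden, n_relations)

    def forward(self, words, pos1, pos2):
        x = torch.cat([self.word_emb(words), self.pos1_emb(pos1), self.pos2_emb(pos2)], dim=-1)
        h = torch.relu(self.conv(x.transpose(1, 2)))  # (batch, hidden, seq_len)
        s = h.max(dim=2).values                       # max pooling -> sentence vector
        return torch.log_softmax(self.classifier(s), dim=-1)

# Usage with random toy inputs: 2 sentences of 20 tokens, 5 candidate relations.
model = SentenceNRE(vocab_size=1000, n_relations=5)
words = torch.randint(0, 1000, (2, 20))
pos = torch.randint(0, 200, (2, 20))
print(model(words, pos, pos).shape)  # torch.Size([2, 5])
```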

Topic Model

  • Topic modeling algorithms do not require any prior annotations or labeling of the documents.
  • A topic model is a generative model: every word in a document is generated by first choosing a topic with some probability and then choosing a word from that topic with some probability.
  • Given a document, LDA infers its topic distribution. In LDA, a document is generated as follows:

For each document in the collection, we generate the words in a two-stage process:

1. Randomly choose a distribution over topics.

2. For each word in the document,

• Randomly choose a topic from the distribution over topics in step #1.

• Randomly choose a word from the corresponding distribution over the vocabulary.

Diagram of the LDA generative process
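A minimal numpy simulation of the two-stage generative process above; the two topics, their word distributions, and the Dirichlet prior are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = ["gene", "dna", "cell", "ball", "team", "game"]
# Each topic is a distribution over the vocabulary (illustrative values).
topics = np.array([
    [0.4, 0.3, 0.25, 0.02, 0.02, 0.01],   # "genetics" topic
    [0.01, 0.02, 0.02, 0.3, 0.3, 0.35],   # "sports" topic
])

def generate_document(n_words, alpha=0.5):
    # Step 1: randomly choose a distribution over topics for this document.
    theta = rng.dirichlet([alpha] * len(topics))
    words = []
    for _ in range(n_words):
        # Step 2a: choose a topic from the document's topic distribution.
        z = rng.choice(len(topics), p=theta)
        # Step 2b: choose a word from that topic's distribution over the vocabulary.
        words.append(rng.choice(vocab, p=topics[z]))
    return theta, words

theta, doc = generate_document(10)
print(theta.round(2), doc)
```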

Assumptions of the LDA

  • One assumption that LDA makes is the bag-of-words assumption that the order of the words in the document does not matter.
  • Another assumption is that the order of documents does not matter. This may be unrealistic when analyzing long-running collections that span years or centuries; in such collections, we may want to assume that the topics change over time. One approach to this problem is the dynamic topic model, a model that respects the ordering of the documents and gives a richer posterior topical structure than LDA.
  • The third assumption about LDA is that the number of topics is assumed known and fixed.

Other

Key points

  • To build an effective machine learning system, we first transform useful information on raw data into internal representations such as feature vectors.
  • Conventional machine learning systems adopt careful feature engineering as preprocessing to build feature representations from raw data.
  • The distributional hypothesis that linguistic objects with similar distributions have similar meanings is the basis for distributed word representation learning.
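A minimal sketch of the distributional hypothesis in action: representing words by their sentence-level co-occurrence counts makes words that appear in similar contexts (cat, dog) more similar than words that do not (cat, car). The toy corpus is made up for illustration.

```python
import numpy as np
from collections import defaultdict
from itertools import combinations

corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]

# Count co-occurrences of word pairs within the same sentence.
counts = defaultdict(lambda: defaultdict(int))
words = sorted({w for s in corpus for w in s.split()})
for sent in corpus:
    for w1, w2 in combinations(sent.split(), 2):
        counts[w1][w2] += 1
        counts[w2][w1] += 1

# Each word's vector is its row of co-occurrence counts.
vectors = {w: np.array([counts[w][c] for c in words]) for w in words}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

# "cat" and "dog" share contexts ("the", "drinks"), so they are more similar than "cat" and "car".
print(cosine(vectors["cat"], vectors["dog"]), cosine(vectors["cat"], vectors["car"]))
```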

Chapter 6: Sememe Knowledge Representation

  • For example, the meaning of man can be considered as the combination of the sememes human, male, and adult.
  • WordNet is a large lexical database for the English language; HowNet annotates words in both Chinese and English with sememes.

An example of a word annotated with sememes in HowNet
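A minimal sketch of the compositional idea behind sememes: a word vector is assembled from the vectors of its sememes. The sememe inventory, the annotations, and the averaging scheme are illustrative assumptions, not HowNet's actual data or any model from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sememe inventory with random embeddings (illustrative only).
sememes = ["human", "male", "female", "adult", "child"]
sememe_emb = {s: rng.normal(size=8) for s in sememes}

# Toy sememe annotations in the spirit of HowNet.
annotations = {
    "man":   ["human", "male", "adult"],
    "woman": ["human", "female", "adult"],
    "boy":   ["human", "male", "child"],
}

def word_from_sememes(word):
    """Compose a word vector as the average of its sememe embeddings."""
    return np.mean([sememe_emb[s] for s in annotations[word]], axis=0)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# "man" shares two of its three sememes with both "woman" and "boy".
print(cosine(word_from_sememes("man"), word_from_sememes("woman")))
print(cosine(word_from_sememes("man"), word_from_sememes("boy")))
```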
