Python NLP Text Summarization: NLP (11) Extracting Text Summaries

Function from the gensim.summarization library:

gensim.summarization.summarize(text, ratio=0.2, word_count=None, split=False)

Parameters:

text : str
    Given text.
ratio : float, optional
    Number between 0 and 1 that determines the proportion of the number of sentences of the original text to be chosen for the summary.
word_count : int or None, optional
    Determines how many words the output will contain. If both parameters are provided, the ratio will be ignored.
split : bool, optional
    If True, a list of sentences will be returned. Otherwise, joined strings will be returned.
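To make the effect of ratio and split concrete, here is a minimal, self-contained sketch of a TextRank-style extractive summarizer. It is not gensim's implementation: the names split_sentences, similarity, rank_sentences and toy_summarize are hypothetical, the sentence splitter is deliberately naive, and the word-overlap similarity is a simplification assumed for illustration. It only mirrors the idea of ranking sentences in a similarity graph and keeping the top fraction given by ratio.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]


def similarity(words_a, words_b):
    """Word-overlap similarity, normalised by the log of the sentence lengths."""
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids log(1) + log(1) == 0 in the denominator
    overlap = len(set(words_a) & set(words_b))
    return overlap / (math.log(len(words_a)) + math.log(len(words_b)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank-style power iteration."""
    words = [re.findall(r'\w+', s.lower()) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(words[i], words[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def toy_summarize(text, ratio=0.2, split=False):
    """Keep the top `ratio` share of sentences, in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return [] if split else ''
    scores = rank_sentences(sentences)
    keep = max(1, int(len(sentences) * ratio))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    chosen = [sentences[i] for i in sorted(top)]
    return chosen if split else '\n'.join(chosen)
```

With a five-sentence input and ratio=0.4, toy_summarize returns two sentences; passing split=True returns them as a list instead of one joined string.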

Code

# Note: gensim.summarization was removed in gensim 4.0, so this script requires gensim < 4.0.
from gensim.summarization import summarize  # TextRank-based summarization algorithm
from bs4 import BeautifulSoup  # library for parsing HTML documents
import requests  # library for downloading HTTP resources

urls = {  # dictionary mapping paper title -> URL
    'Deconstructing Voice-over-IP':
        'http://scigen.csail.mit.edu/scicache/269/scimakelatex.25977.A.+G.+Hassan.html',
    'Exploration of the Location-Identity Split':
        'http://scigen.csail.mit.edu/scicache/270/scimakelatex.26087.Ali+Veli.Veli+Ali.Vel+Al.html',
}

# Actual abstracts of the two papers, for comparison:
# 1. The implications of ambimorphic archetypes have been far-reaching and pervasive. After years of natural research into consistent hashing, we argue the simulation of public-private key pairs, which embodies the confirmed principles of theory. Such a hypothesis might seem perverse but is derived from known results. Our focus in this paper is not on whether the well-known knowledge-based algorithm for the emulation of checksums by Herbert Simon runs in Θ( n ) time, but rather on exploring a semantic tool for harnessing telephony (Swale).
# 2. Superblocks must work. Given the current status of homogeneous configurations, security experts particularly desire the simulation of 802.11b. we consider how the Internet can be applied to the refinement of Scheme.

for key, url in urls.items():
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')
    data = soup.get_text()  # page text with the HTML tags stripped
    pos1 = data.find('1 Introduction') + len('1 Introduction')
    pos2 = data.find('Related Work')
    text = data[pos1:pos2].strip()  # the introduction section, between pos1 and pos2
    print('PAPER URL: {}'.format(url))
    print('TITLE: {}'.format(key))
    print('GENERATED SUMMARY: {}'.format(summarize(text)))
    print()
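One fragile point in the script above is the slicing step: str.find returns -1 when a marker is not present, so a page without the '1 Introduction' or 'Related Work' heading silently yields the wrong slice. A minimal sketch of a more defensive version follows; extract_between is a hypothetical helper name, not part of the original code or of any library.

```python
def extract_between(data, start_marker, end_marker):
    """Return the text between two markers, or None if either marker is missing.

    Hypothetical helper for illustration; the end marker is searched only
    after the start marker, so an earlier occurrence cannot truncate the slice.
    """
    start = data.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = data.find(end_marker, start)
    if end == -1:
        return None
    return data[start:end].strip()
```

In the loop, text = extract_between(data, '1 Introduction', 'Related Work') could replace the three pos1/pos2/text lines, with a check for None before calling summarize.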

Output:

PAPER URL: http://scigen.csail.mit.edu/scicache/269/scimakelatex.25977.A.+G.+Hassan.html

TITLE: Deconstructing Voice-over-IP

GENERATED SUMMARY: 。。。。。。

PAPER URL: http://scigen.csail.mit.edu/scicache/270/scimakelatex.26087.Ali+Veli.Veli+Ali.Vel+Al.html

TITLE: Exploration of the Location-Identity Split

GENERATED SUMMARY: 。。。。。。
