NLTK

Load every item from NLTK's book module

The book module contains all of the sample data

from nltk.book import *

*** Introductory Examples for the NLTK Book ***

Loading text1, ..., text9 and sent1, ..., sent9

Type the name of the text or sentence to view it.

Type: 'texts()' or 'sents()' to list the materials.

text1: Moby Dick by Herman Melville 1851

text2: Sense and Sensibility by Jane Austen 1811

text3: The Book of Genesis

text4: Inaugural Address Corpus

text5: Chat Corpus

text6: Monty Python and the Holy Grail

text7: Wall Street Journal

text8: Personals Corpus

text9: The Man Who Was Thursday by G . K . Chesterton 1908

text1

text2

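Searching text or topics

concordance looks up a word in a text and prints every occurrence together with its surrounding context.

similar finds words that occur in the same contexts as the search word; this is the kind of signal a search engine can use to judge relatedness.

common_contexts shows the contexts that two or more words have in common.

dispersion_plot draws a dispersion plot showing where words occur across a text.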
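To make the idea behind concordance concrete, here is a minimal sketch in plain Python. It is not NLTK's implementation: the helper name `concordance_lines`, the `width` parameter, and the sample sentence are all assumptions made for illustration, and context is measured in tokens rather than characters.

```python
def concordance_lines(tokens, word, width=3):
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    target = word.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = tokens[max(0, i - width):i]    # tokens before the match
            right = tokens[i + 1:i + 1 + width]   # tokens after the match
            lines.append(' '.join(left + [tok] + right))
    return lines

# A made-up token list standing in for a real text
tokens = 'the monstrous whale met a most monstrous size of ship'.split()
for line in concordance_lines(tokens, 'monstrous'):
    print(line)
```

The match is case-insensitive here, so 'Monstrous' and 'monstrous' are both found.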

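text1.concordance('monstrous')  # look up the word 'monstrous' in text1

# concordance

# UK [kənˈkɔːd(ə)ns], US [kənˈkɔrdns]

# n. harmony, agreement; an index of the words used in a text; an index of the important words in a work or in an author's collected works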

Displaying 11 of 11 matches:

ong the former , one was of a most monstrous size . ... This came towards us ,

ON OF THE PSALMS . " Touching that monstrous bulk of the whale or ork we have r

ll over with a heathenish array of monstrous clubs and spears . Some were thick

d as you gazed , and wondered what monstrous cannibal and savage could ever hav

that has survived the flood ; most monstrous and most mountainous ! That Himmal

they might scout at Moby Dick as a monstrous fable , or still worse and more de

th of Radney .'" CHAPTER 55 Of the Monstrous Pictures of Whales . I shall ere l

ing Scenes . In connexion with the monstrous pictures of whales , I am strongly

ere to enter upon those still more monstrous stories of them which are to be fo

ght have been rummaged out of this monstrous cabinet there is no telling . But

of Whale - Bones ; for Whales of a monstrous size are oftentimes cast up dead u

text2.concordance('affection')

Displaying 25 of 79 matches:

, however , and , as a mark of his affection for the three girls , he left them

t . It was very well known that no affection was ever supposed to exist between

deration of politeness or maternal affection on the side of the former , the tw

d the suspicion -- the hope of his affection for me may warrant , without impru

hich forbade the indulgence of his affection . She knew that his mother neither

rd she gave one with still greater affection . Though her late conversation wit

can never hope to feel or inspire affection again , and if her home be uncomfo

m of the sense , elegance , mutual affection , and domestic comfort of the fami

, and which recommended him to her affection beyond every thing else . His soci

ween the parties might forward the affection of Mr . Willoughby , an equally st

the most pointed assurance of her affection . Elinor could not be surprised at

he natural consequence of a strong affection in a young and ardent mind . This

opinion . But by an appeal to her affection for her mother , by representing t

every alteration of a place which affection had established as perfect with hi

e will always have one claim of my affection , which no other can possibly shar

f the evening declared at once his affection and happiness . " Shall we see you

ause he took leave of us with less affection than his usual behaviour has shewn

ness ." " I want no proof of their affection ," said Elinor ; " but of their en

onths , without telling her of his affection ;-- that they should part without

ould be the natural result of your affection for her . She used to be all unres

distinguished Elinor by no mark of affection . Marianne saw and listened with i

th no inclination for expense , no affection for strangers , no profession , an

till distinguished her by the same affection which once she had felt no doubt o

al of her confidence in Edward ' s affection , to the remembrance of every mark

was made ? Had he never owned his affection to yourself ?" " Oh , no ; but if

text1.similar('monstrous')

true contemptible christian abundant few part mean careful puzzled

mystifying passing curious loving wise doleful gamesome singular

delightfully perilous fearless

text2.similar('monstrous')

very so exceedingly heartily a as good great extremely remarkably

sweet vast amazingly

text2.common_contexts(['monstrous', 'very'])

a_pretty am_glad a_lucky is_pretty be_glad
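Output such as a_pretty or am_glad names a context: the word on the left and the word on the right of the search word. The sketch below shows one way to compute shared contexts in plain Python. It is an illustration under assumptions, not NLTK's actual code; the helper names `contexts_by_word` and `shared_contexts` and the sample tokens are hypothetical.

```python
from collections import defaultdict

def contexts_by_word(tokens):
    """Map each lowercased word to the set of (previous, next) word pairs around it."""
    table = defaultdict(set)
    for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
        table[word.lower()].add((prev.lower(), nxt.lower()))
    return table

def shared_contexts(tokens, w1, w2):
    """Contexts in which both w1 and w2 occur, formatted as 'left_right'."""
    table = contexts_by_word(tokens)
    common = table[w1.lower()] & table[w2.lower()]
    return sorted(f'{left}_{right}' for left, right in common)

# Made-up tokens in which 'monstrous' and 'very' fill the same slots
tokens = ('a monstrous pretty girl and a very pretty house '
          'am very glad am monstrous glad').split()
print(shared_contexts(tokens, 'monstrous', 'very'))
```

Two words that keep appearing in the same slots are treated as similar, which is why 'very' ranks first among the words similar to 'monstrous' in text2.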

# Show where each word occurs in the text, measured as the number of words from the beginning.

# Each stripe represents an instance of a word,

# and each row represents the entire text.

text4.dispersion_plot(['citizens', 'democracy', 'freedom', 'duties', 'America', 'liberty'])

# dispersion

# UK [dɪˈspɜːʃ(ə)n], US [dɪˈspɝʒn]

# n. scattering; [statistics][math] dispersion, deviation; dispelling
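A dispersion plot is built from word offsets: for each word of interest, the positions in the token sequence where it occurs. A minimal sketch of that underlying data, with a hypothetical helper `word_offsets` and a made-up sentence (the plotting step is left out):

```python
def word_offsets(tokens, targets):
    """Map each target word to the list of token positions where it occurs."""
    offsets = {t: [] for t in targets}
    for position, tok in enumerate(tokens):
        if tok in offsets:
            offsets[tok].append(position)
    return offsets

# Made-up tokens; positions are counted from 0
tokens = 'we the citizens value freedom and the duties of citizens'.split()
print(word_offsets(tokens, ['citizens', 'freedom', 'duties']))
```

Each list of positions becomes one row of stripes in the plot. A misspelled target simply yields an empty row, so an unexpectedly blank row is worth checking for typos.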

print(text3.generate('monstrous'))

None

Counting vocabulary

len(text3)

44764

sorted(set(text3))

[‘!‘,

"‘",

‘(‘,

‘)‘,

‘,‘,

‘,)‘,

‘.‘,

‘.)‘,

‘:‘,

‘;‘,

‘;)‘,

‘?‘,

‘?)‘,

‘A‘,

‘Abel‘,

‘Abelmizraim‘,

‘Abidah‘,

‘Abide‘,

‘Abimael‘,

‘Abimelech‘,

‘Abr‘,

‘Abrah‘,

‘Abraham‘,

‘Abram‘,

‘Accad‘,

‘Achbor‘,

‘Adah‘,

‘Adam‘,

‘Adbeel‘,

‘Admah‘,

‘Adullamite‘,

‘After‘,

‘Aholibamah‘,

‘Ahuzzath‘,

‘Ajah‘,

‘Akan‘,

‘All‘,

‘Allonbachuth‘,

‘Almighty‘,

‘Almodad‘,

‘Also‘,

‘Alvah‘,

‘Alvan‘,

‘Am‘,

‘Amal‘,

‘Amalek‘,

‘Amalekites‘,

‘Ammon‘,

‘Amorite‘,

‘Amorites‘,

‘Amraphel‘,

‘An‘,

‘Anah‘,

‘Anamim‘,

‘And‘,

‘Aner‘,

‘Angel‘,

‘Appoint‘,

‘Aram‘,

‘Aran‘,

‘Ararat‘,

‘Arbah‘,

‘Ard‘,

‘Are‘,

‘Areli‘,

‘Arioch‘,

‘Arise‘,

‘Arkite‘,

‘Arodi‘,

‘Arphaxad‘,

‘Art‘,

‘Arvadite‘,

‘As‘,

‘Asenath‘,

‘Ashbel‘,

‘Asher‘,

‘Ashkenaz‘,

‘Ashteroth‘,

‘Ask‘,

‘Asshur‘,

‘Asshurim‘,

‘Assyr‘,

‘Assyria‘,

‘At‘,

‘Atad‘,

‘Avith‘,

‘Baalhanan‘,

‘Babel‘,

‘Bashemath‘,

‘Be‘,

‘Because‘,

‘Becher‘,

‘Bedad‘,

‘Beeri‘,

‘Beerlahairoi‘,

‘Beersheba‘,

‘Behold‘,

‘Bela‘,

‘Belah‘,

‘Benam‘,

‘Benjamin‘,

‘Beno‘,

‘Beor‘,

‘Bera‘,

‘Bered‘,

‘Beriah‘,

‘Bethel‘,

‘Bethlehem‘,

‘Bethuel‘,

‘Beware‘,

‘Bilhah‘,

‘Bilhan‘,

‘Binding‘,

‘Birsha‘,

‘Bless‘,

‘Blessed‘,

‘Both‘,

‘Bow‘,

‘Bozrah‘,

‘Bring‘,

‘But‘,

‘Buz‘,

‘By‘,

‘Cain‘,

‘Cainan‘,

‘Calah‘,

‘Calneh‘,

‘Can‘,

‘Cana‘,

‘Canaan‘,

‘Canaanite‘,

‘Canaanites‘,

‘Canaanitish‘,

‘Caphtorim‘,

‘Carmi‘,

‘Casluhim‘,

‘Cast‘,

‘Cause‘,

‘Chaldees‘,

‘Chedorlaomer‘,

‘Cheran‘,

‘Cherubims‘,

‘Chesed‘,

‘Chezib‘,

‘Come‘,

‘Cursed‘,

‘Cush‘,

‘Damascus‘,

‘Dan‘,

‘Day‘,

‘Deborah‘,

‘Dedan‘,

‘Deliver‘,

‘Diklah‘,

‘Din‘,

‘Dinah‘,

‘Dinhabah‘,

‘Discern‘,

‘Dishan‘,

‘Dishon‘,

‘Do‘,

‘Dodanim‘,

‘Dothan‘,

‘Drink‘,

‘Duke‘,

‘Dumah‘,

‘Earth‘,

‘Ebal‘,

‘Eber‘,

‘Edar‘,

‘Eden‘,

‘Edom‘,

‘Edomites‘,

‘Egy‘,

‘Egypt‘,

‘Egyptia‘,

‘Egyptian‘,

‘Egyptians‘,

‘Ehi‘,

‘Elah‘,

‘Elam‘,

‘Elbethel‘,

‘Eldaah‘,

‘EleloheIsrael‘,

‘Eliezer‘,

‘Eliphaz‘,

‘Elishah‘,

‘Ellasar‘,

‘Elon‘,

‘Elparan‘,

‘Emins‘,

‘En‘,

‘Enmishpat‘,

‘Eno‘,

‘Enoch‘,

‘Enos‘,

‘Ephah‘,

‘Epher‘,

‘Ephra‘,

‘Ephraim‘,

‘Ephrath‘,

‘Ephron‘,

‘Er‘,

‘Erech‘,

‘Eri‘,

‘Es‘,

‘Esau‘,

‘Escape‘,

‘Esek‘,

‘Eshban‘,

‘Eshcol‘,

‘Ethiopia‘,

‘Euphrat‘,

‘Euphrates‘,

‘Eve‘,

‘Even‘,

‘Every‘,

‘Except‘,

‘Ezbon‘,

‘Ezer‘,

‘Fear‘,

‘Feed‘,

‘Fifteen‘,

‘Fill‘,

‘For‘,

‘Forasmuch‘,

‘Forgive‘,

‘From‘,

‘Fulfil‘,

‘G‘,

‘Gad‘,

‘Gaham‘,

‘Galeed‘,

‘Gatam‘,

‘Gather‘,

‘Gaza‘,

‘Gentiles‘,

‘Gera‘,

‘Gerar‘,

‘Gershon‘,

‘Get‘,

‘Gether‘,

‘Gihon‘,

‘Gilead‘,

‘Girgashites‘,

‘Girgasite‘,

‘Give‘,

‘Go‘,

‘God‘,

‘Gomer‘,

‘Gomorrah‘,

‘Goshen‘,

‘Guni‘,

‘Hadad‘,

‘Hadar‘,

‘Hadoram‘,

‘Hagar‘,

‘Haggi‘,

‘Hai‘,

‘Ham‘,

‘Hamathite‘,

‘Hamor‘,

‘Hamul‘,

‘Hanoch‘,

‘Happy‘,

‘Haran‘,

‘Hast‘,

‘Haste‘,

‘Have‘,

‘Havilah‘,

‘Hazarmaveth‘,

‘Hazezontamar‘,

‘Hazo‘,

‘He‘,

‘Hear‘,

‘Heaven‘,

‘Heber‘,

‘Hebrew‘,

‘Hebrews‘,

‘Hebron‘,

‘Hemam‘,

‘Hemdan‘,

‘Here‘,

‘Hereby‘,

‘Heth‘,

‘Hezron‘,

‘Hiddekel‘,

‘Hinder‘,

‘Hirah‘,

‘His‘,

‘Hitti‘,

‘Hittite‘,

‘Hittites‘,

‘Hivite‘,

‘Hobah‘,

‘Hori‘,

‘Horite‘,

‘Horites‘,

‘How‘,

‘Hul‘,

‘Huppim‘,

‘Husham‘,

‘Hushim‘,

‘Huz‘,

‘I‘,

‘If‘,

‘In‘,

‘Irad‘,

‘Iram‘,

‘Is‘,

‘Isa‘,

‘Isaac‘,

‘Iscah‘,

‘Ishbak‘,

‘Ishmael‘,

‘Ishmeelites‘,

‘Ishuah‘,

‘Isra‘,

‘Israel‘,

‘Issachar‘,

‘Isui‘,

‘It‘,

‘Ithran‘,

‘Jaalam‘,

‘Jabal‘,

‘Jabbok‘,

‘Jac‘,

‘Jachin‘,

‘Jacob‘,

‘Jahleel‘,

‘Jahzeel‘,

‘Jamin‘,

‘Japhe‘,

‘Japheth‘,

‘Jared‘,

‘Javan‘,

‘Jebusite‘,

‘Jebusites‘,

‘Jegarsahadutha‘,

‘Jehovahjireh‘,

‘Jemuel‘,

‘Jerah‘,

‘Jetheth‘,

‘Jetur‘,

‘Jeush‘,

‘Jezer‘,

‘Jidlaph‘,

‘Jimnah‘,

‘Job‘,

‘Jobab‘,

‘Jokshan‘,

‘Joktan‘,

‘Jordan‘,

‘Joseph‘,

‘Jubal‘,

‘Judah‘,

‘Judge‘,

‘Judith‘,

‘Kadesh‘,

‘Kadmonites‘,

‘Karnaim‘,

‘Kedar‘,

‘Kedemah‘,

‘Kemuel‘,

‘Kenaz‘,

‘Kenites‘,

‘Kenizzites‘,

‘Keturah‘,

‘Kiriathaim‘,

‘Kirjatharba‘,

‘Kittim‘,

‘Know‘,

‘Kohath‘,

‘Kor‘,

‘Korah‘,

‘LO‘,

‘LORD‘,

‘Laban‘,

‘Lahairoi‘,

‘Lamech‘,

‘Lasha‘,

‘Lay‘,

‘Leah‘,

‘Lehabim‘,

‘Lest‘,

‘Let‘,

‘Letushim‘,

‘Leummim‘,

‘Levi‘,

‘Lie‘,

‘Lift‘,

‘Lo‘,

‘Look‘,

‘Lot‘,

‘Lotan‘,

‘Lud‘,

‘Ludim‘,

‘Luz‘,

‘Maachah‘,

‘Machir‘,

‘Machpelah‘,

‘Madai‘,

‘Magdiel‘,

‘Magog‘,

‘Mahalaleel‘,

‘Mahalath‘,

‘Mahanaim‘,

‘Make‘,

‘Malchiel‘,

‘Male‘,

‘Mam‘,

‘Mamre‘,

‘Man‘,

‘Manahath‘,

‘Manass‘,

‘Manasseh‘,

‘Mash‘,

‘Masrekah‘,

‘Massa‘,

‘Matred‘,

‘Me‘,

‘Medan‘,

‘Mehetabel‘,

‘Mehujael‘,

‘Melchizedek‘,

‘Merari‘,

‘Mesha‘,

‘Meshech‘,

‘Mesopotamia‘,

‘Methusa‘,

‘Methusael‘,

‘Methuselah‘,

‘Mezahab‘,

‘Mibsam‘,

‘Mibzar‘,

‘Midian‘,

‘Midianites‘,

‘Milcah‘,

‘Mishma‘,

‘Mizpah‘,

‘Mizraim‘,

‘Mizz‘,

‘Moab‘,

‘Moabites‘,

‘Moreh‘,

‘Moreover‘,

‘Moriah‘,

‘Muppim‘,

‘My‘,

‘Naamah‘,

‘Naaman‘,

‘Nahath‘,

‘Nahor‘,

‘Naphish‘,

‘Naphtali‘,

‘Naphtuhim‘,

‘Nay‘,

‘Nebajoth‘,

‘Neither‘,

‘Night‘,

‘Nimrod‘,

‘Nineveh‘,

‘Noah‘,

‘Nod‘,

‘Not‘,

‘Now‘,

‘O‘,

‘Obal‘,

‘Of‘,

‘Oh‘,

‘Ohad‘,

‘Omar‘,

‘On‘,

‘Onam‘,

‘Onan‘,

‘Only‘,

‘Ophir‘,

‘Our‘,

‘Out‘,

‘Padan‘,

‘Padanaram‘,

‘Paran‘,

‘Pass‘,

‘Pathrusim‘,

‘Pau‘,

‘Peace‘,

‘Peleg‘,

‘Peniel‘,

‘Penuel‘,

‘Peradventure‘,

‘Perizzit‘,

‘Perizzite‘,

‘Perizzites‘,

‘Phallu‘,

‘Phara‘,

‘Pharaoh‘,

‘Pharez‘,

‘Phichol‘,

‘Philistim‘,

‘Philistines‘,

‘Phut‘,

‘Phuvah‘,

‘Pildash‘,

‘Pinon‘,

‘Pison‘,

‘Potiphar‘,

‘Potipherah‘,

‘Put‘,

‘Raamah‘,

‘Rachel‘,

‘Rameses‘,

‘Rebek‘,

‘Rebekah‘,

‘Rehoboth‘,

‘Remain‘,

‘Rephaims‘,

‘Resen‘,

‘Return‘,

‘Reu‘,

‘Reub‘,

‘Reuben‘,

‘Reuel‘,

‘Reumah‘,

‘Riphath‘,

‘Rosh‘,

‘Sabtah‘,

‘Sabtech‘,

‘Said‘,

‘Salah‘,

‘Salem‘,

‘Samlah‘,

‘Sarah‘,

‘Sarai‘,

‘Saul‘,

‘Save‘,

‘Say‘,

‘Se‘,

‘Seba‘,

‘See‘,

‘Seeing‘,

‘Seir‘,

‘Sell‘,

‘Send‘,

‘Sephar‘,

‘Serah‘,

‘Sered‘,

‘Serug‘,

‘Set‘,

‘Seth‘,

‘Shalem‘,

‘Shall‘,

‘Shalt‘,

‘Shammah‘,

‘Shaul‘,

‘Shaveh‘,

‘She‘,

‘Sheba‘,

‘Shebah‘,

‘Shechem‘,

‘Shed‘,

‘Shel‘,

‘Shelah‘,

‘Sheleph‘,

‘Shem‘,

‘Shemeber‘,

‘Shepho‘,

‘Shillem‘,

‘Shiloh‘,

‘Shimron‘,

‘Shinab‘,

‘Shinar‘,

‘Shobal‘,

‘Should‘,

‘Shuah‘,

‘Shuni‘,

‘Shur‘,

‘Sichem‘,

‘Siddim‘,

‘Sidon‘,

‘Simeon‘,

‘Sinite‘,

‘Sitnah‘,

‘Slay‘,

‘So‘,

‘Sod‘,

‘Sodom‘,

‘Sojourn‘,

‘Some‘,

‘Spake‘,

‘Speak‘,

‘Spirit‘,

‘Stand‘,

‘Succoth‘,

‘Surely‘,

‘Swear‘,

‘Syrian‘,

‘Take‘,

‘Tamar‘,

‘Tarshish‘,

‘Tebah‘,

‘Tell‘,

‘Tema‘,

‘Teman‘,

‘Temani‘,

‘Terah‘,

‘Thahash‘,

‘That‘,

‘The‘,

‘Then‘,

‘There‘,

‘Therefore‘,

‘These‘,

‘They‘,

‘Thirty‘,

‘This‘,

‘Thorns‘,

‘Thou‘,

‘Thus‘,

‘Thy‘,

‘Tidal‘,

‘Timna‘,

‘Timnah‘,

‘Timnath‘,

‘Tiras‘,

‘To‘,

‘Togarmah‘,

‘Tola‘,

‘Tubal‘,

‘Tubalcain‘,

‘Twelve‘,

‘Two‘,

‘Unstable‘,

‘Until‘,

‘Unto‘,

‘Up‘,

‘Upon‘,

‘Ur‘,

‘Uz‘,

‘Uzal‘,

‘We‘,

‘What‘,

‘When‘,

‘Whence‘,

‘Where‘,

‘Whereas‘,

‘Wherefore‘,

‘Which‘,

‘While‘,

‘Who‘,

‘Whose‘,

‘Whoso‘,

‘Why‘,

‘Wilt‘,

‘With‘,

‘Woman‘,

‘Ye‘,

‘Yea‘,

‘Yet‘,

‘Zaavan‘,

‘Zaphnathpaaneah‘,

‘Zar‘,

‘Zarah‘,

‘Zeboiim‘,

‘Zeboim‘,

‘Zebul‘,

‘Zebulun‘,

‘Zemarite‘,

‘Zepho‘,

‘Zerah‘,

‘Zibeon‘,

‘Zidon‘,

‘Zillah‘,

‘Zilpah‘,

‘Zimran‘,

‘Ziphion‘,

‘Zo‘,

‘Zoar‘,

‘Zohar‘,

‘Zuzims‘,

‘a‘,

‘abated‘,

‘abide‘,

‘able‘,

‘abode‘,

‘abomination‘,

‘about‘,

‘above‘,

‘abroad‘,

‘absent‘,

‘abundantly‘,

‘accept‘,

‘accepted‘,

‘according‘,

‘acknowledged‘,

‘activity‘,

‘add‘,

‘adder‘,

‘afar‘,

‘afflict‘,

‘affliction‘,

‘afraid‘,

‘after‘,

‘afterward‘,

‘afterwards‘,

‘aga‘,

‘again‘,

‘against‘,

‘age‘,

‘aileth‘,

‘air‘,

‘al‘,

‘alive‘,

‘all‘,

‘almon‘,

‘alo‘,

‘alone‘,

‘aloud‘,

‘also‘,

‘altar‘,

‘altogether‘,

‘always‘,

‘am‘,

‘among‘,

‘amongst‘,

‘an‘,

‘and‘,

‘angel‘,

‘angels‘,

‘anger‘,

‘angry‘,

‘anguish‘,

‘anointedst‘,

‘anoth‘,

‘another‘,

‘answer‘,

‘answered‘,

‘any‘,

‘anything‘,

‘appe‘,

‘appear‘,

‘appeared‘,

‘appease‘,

‘appoint‘,

‘appointed‘,

‘aprons‘,

‘archer‘,

‘archers‘,

‘are‘,

‘arise‘,

‘ark‘,

‘armed‘,

‘arms‘,

‘army‘,

‘arose‘,

‘arrayed‘,

‘art‘,

‘artificer‘,

‘as‘,

‘ascending‘,

‘ash‘,

‘ashamed‘,

‘ask‘,

‘asked‘,

‘asketh‘,

‘ass‘,

‘assembly‘,

‘asses‘,

‘assigned‘,

‘asswaged‘,

‘at‘,

‘attained‘,

‘audience‘,

‘avenged‘,

‘aw‘,

‘awaked‘,

‘away‘,

‘awoke‘,

‘back‘,

‘backward‘,

‘bad‘,

‘bade‘,

‘badest‘,

‘badne‘,

‘bak‘,

‘bake‘,

‘bakemeats‘,

‘baker‘,

‘bakers‘,

‘balm‘,

‘bands‘,

‘bank‘,

‘bare‘,

‘barr‘,

‘barren‘,

‘basket‘,

‘baskets‘,

‘battle‘,

‘bdellium‘,

‘be‘,

‘bear‘,

‘beari‘,

‘bearing‘,

‘beast‘,

‘beasts‘,

‘beautiful‘,

‘became‘,

‘because‘,

‘become‘,

‘bed‘,

‘been‘,

‘befall‘,

‘befell‘,

‘before‘,

‘began‘,

‘begat‘,

‘beget‘,

‘begettest‘,

‘begin‘,

‘beginning‘,

‘begotten‘,

‘beguiled‘,

‘beheld‘,

‘behind‘,

‘behold‘,

‘being‘,

‘believed‘,

‘belly‘,

‘belong‘,

‘beneath‘,

‘bereaved‘,

‘beside‘,

‘besides‘,

‘besought‘,

‘best‘,

‘betimes‘,

‘better‘,

‘between‘,

‘betwixt‘,

‘beyond‘,

‘binding‘,

‘bird‘,

‘birds‘,

‘birthday‘,

‘birthright‘,

‘biteth‘,

‘bitter‘,

‘blame‘,

‘blameless‘,

‘blasted‘,

‘bless‘,

‘blessed‘,

‘blesseth‘,

‘blessi‘,

‘blessing‘,

‘blessings‘,

‘blindness‘,

‘blood‘,

‘blossoms‘,

‘bodies‘,

‘boldly‘,

‘bondman‘,

‘bondmen‘,

‘bondwoman‘,

‘bone‘,

‘bones‘,

‘book‘,

‘booths‘,

‘border‘,

‘borders‘,

‘born‘,

‘bosom‘,

‘both‘,

‘bottle‘,

‘bou‘,

‘boug‘,

‘bough‘,

‘bought‘,

‘bound‘,

‘bow‘,

‘bowed‘,

‘bowels‘,

‘bowing‘,

‘boys‘,

‘bracelets‘,

‘branches‘,

‘brass‘,

‘bre‘,

‘breach‘,

‘bread‘,

‘breadth‘,

‘break‘,

‘breaketh‘,

‘breaking‘,

‘breasts‘,

‘breath‘,

‘breathed‘,

‘breed‘,

‘brethren‘,

‘brick‘,

‘brimstone‘,

‘bring‘,

‘brink‘,

‘broken‘,

‘brook‘,

‘broth‘,

‘brother‘,

‘brought‘,

‘brown‘,

‘bruise‘,

‘budded‘,

‘build‘,

‘builded‘,

‘built‘,

‘bulls‘,

‘bundle‘,

‘bundles‘,

‘burdens‘,

‘buried‘,

‘burn‘,

‘burning‘,

‘burnt‘,

‘bury‘,

‘buryingplace‘,

‘business‘,

‘but‘,

‘butler‘,

‘butlers‘,

‘butlership‘,

‘butter‘,

‘buy‘,

‘by‘,

‘cakes‘,

‘calf‘,

‘call‘,

‘called‘,

‘came‘,

‘camel‘,

‘camels‘,

‘camest‘,

‘can‘,

‘cannot‘,

‘canst‘,

‘captain‘,

‘captive‘,

‘captives‘,

‘carcases‘,

‘carried‘,

‘carry‘,

‘cast‘,

‘castles‘,

‘catt‘,

‘cattle‘,

‘caught‘,

‘cause‘,

‘caused‘,

‘cave‘,

‘cease‘,

‘ceased‘,

‘certain‘,

‘certainly‘,

‘chain‘,

‘chamber‘,

‘change‘,

‘changed‘,

‘changes‘,

‘charge‘,

‘charged‘,

‘chariot‘,

‘chariots‘,

‘chesnut‘,

‘chi‘,

‘chief‘,

‘child‘,

‘childless‘,

‘childr‘,

‘children‘,

‘chode‘,

‘choice‘,

‘chose‘,

‘circumcis‘,

‘circumcise‘,

‘circumcised‘,

‘citi‘,

‘cities‘,

‘city‘,

‘clave‘,

‘clean‘,

‘clear‘,

‘cleave‘,

‘clo‘,

‘closed‘,

‘clothed‘,

‘clothes‘,

‘cloud‘,

‘clusters‘,

‘co‘,

‘coat‘,

‘coats‘,

‘coffin‘,

‘cold‘,

...]

len(set(text3))

2789

len(text3)/len(set(text3))

16.050197203298673

text3.count('smote')

5

100*text4.count('a')/len(text4)

1.4643016433938312

def lexical_diversity(text):
    # lexical: UK [ˈleksɪk(ə)l], US [ˈlɛksɪkl]
    # adj. of or relating to vocabulary; of dictionaries or lexicography
    # diversity: UK [daɪˈvɜːsɪtɪ; dɪ-], US [dəˈvɝsəti]
    # n. variety; difference
    return len(text)/len(set(text))

def percentage(count, total):
    return 100*count/total

print('Lexical diversity of text3: {}'.format(lexical_diversity(text3)))

print('Percentage of text4 made up of the word "a": {}'.format(percentage(text4.count('a'),len(text4))))

Lexical diversity of text3: 16.050197203298673

Percentage of text4 made up of the word "a": 1.4643016433938312
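Both measures are easy to check by hand on a small input. The sketch below re-declares the two helpers so that it runs on its own; the eight-token list is made up for illustration. Note that this ratio is tokens divided by distinct types, so it reads as "each word is used this many times on average".

```python
def lexical_diversity(tokens):
    """Average number of times each distinct token is used."""
    return len(tokens) / len(set(tokens))

def percentage(count, total):
    """Share of `count` in `total`, as a percentage."""
    return 100 * count / total

# 8 tokens, 5 distinct types: the, cat, and, dog, bird
tokens = ['the', 'cat', 'and', 'the', 'dog', 'and', 'the', 'bird']
print(lexical_diversity(tokens))                      # 8 / 5
print(percentage(tokens.count('the'), len(tokens)))   # 3 of 8 tokens
```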

Lists

sent1 = ['Call', 'me', 'Ishmael', '.']

print('Contents of sent1: {}'.format(sent1))

print('Length of sent1: {}'.format(len(sent1)))

print('Lexical diversity of sent1: {}'.format(lexical_diversity(sent1)))

Contents of sent1: ['Call', 'me', 'Ishmael', '.']

Length of sent1: 4

Lexical diversity of sent1: 1.0

sent1,sent2,sent3,sent4  # these lists are predefined by nltk.book

(['Call', 'me', 'Ishmael', '.'],
 ['The', 'family', 'of', 'Dashwood', 'had', 'long', 'been', 'settled', 'in', 'Sussex', '.'],
 ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.'],
 ['Fellow', '-', 'Citizens', 'of', 'the', 'Senate', 'and', 'of', 'the', 'House', 'of', 'Representatives', ':'])

sent4+sent1

['Fellow', '-', 'Citizens', 'of', 'the', 'Senate', 'and', 'of', 'the', 'House',
 'of', 'Representatives', ':', 'Call', 'me', 'Ishmael', '.']

sent1.append('Some')

sent1

['Call', 'me', 'Ishmael', '.', 'Some']
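The two list operations used here behave differently: + builds a new list and leaves its operands alone, while append changes the list in place and returns None. That is why running the same append cell several times in a notebook keeps adding copies of the word. A minimal sketch, with variable names chosen only for this example:

```python
sent_a = ['Call', 'me', 'Ishmael', '.']
sent_b = ['Fellow', '-', 'Citizens']

combined = sent_b + sent_a        # new list; sent_a and sent_b are unchanged
result = sent_a.append('Some')    # mutates sent_a in place and returns None

print(combined)
print(result)
print(sent_a)

sent_a.append('Some')             # running append again adds a second copy
print(sent_a.count('Some'))
```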

Indexing lists

type(text4)

nltk.text.Text

text4[173]

'awaken'

text4.index('awaken')

173

text5[16715:16735]

['U86', 'thats', 'why', 'something', 'like', 'gamefly', 'is', 'so', 'good',
 'because', 'you', 'can', 'actually', 'play', 'a', 'full', 'game', 'without',
 'buying', 'it']

text6[1600:1625]

['We', "'", 're', 'an', 'anarcho', '-', 'syndicalist', 'commune', '.', 'We',
 'take', 'it', 'in', 'turns', 'to', 'act', 'as', 'a', 'sort', 'of',
 'executive', 'officer', 'for', 'the', 'week']
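Indexing and slicing follow the usual Python list rules: positions start at 0, index returns the position of the first occurrence, and a slice m:n includes position m but stops before n, so it holds n - m items. A small sketch on a made-up token list:

```python
tokens = ['We', "'", 're', 'an', 'anarcho', '-', 'syndicalist', 'commune', '.']

first = tokens.index('an')   # position of the first occurrence
window = tokens[3:7]         # positions 3, 4, 5, 6 (7 is excluded)
tail = tokens[-2:]           # the last two tokens

print(first, tokens[first])
print(window)
print(tail)
```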

Variables

sent1 = ['Call','me','Ishmael','.']

my_sent = ['Bravely','bold','Sir','Robin',',','rode','forth','from','Camelot','.']

noun_phrase = my_sent[1:4]

print('Sliced list: noun_phrase -> {}'.format(noun_phrase))

wOrDs = sorted(noun_phrase)

print('Sorted list: wOrDs -> {}'.format(wOrDs))

Sliced list: noun_phrase -> ['bold', 'Sir', 'Robin']

Sorted list: wOrDs -> ['Robin', 'Sir', 'bold']
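The sorted result places 'Robin' and 'Sir' before 'bold' because Python compares strings by character code, and uppercase letters come before lowercase ones. If a dictionary-style order is wanted, a key function can be supplied. A short sketch (the variable names are only for this example):

```python
noun_phrase = ['bold', 'Sir', 'Robin']

by_code_point = sorted(noun_phrase)                    # uppercase sorts first
case_insensitive = sorted(noun_phrase, key=str.lower)  # compare lowercased forms

print(by_code_point)
print(case_insensitive)
```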

Strings

name = 'bright'

print('First letter of name: {}'.format(name[0]))

print(name[:4])

print(name*2)

print(name + '!')

First letter of name: b

brig

brightbright

bright!

' '.join(['Monty', 'Python'])

'Monty Python'

'Monty Python'.split()

['Monty', 'Python']

saying = ['After','all','is','said','and','done','more','is','said','than','done']

tokens = set(saying)

tokens = sorted(tokens)

tokens[-2:]

['said', 'than']

fdist1 = FreqDist(text1)

vocabulary1 = fdist1.keys()

type(vocabulary1)

dict_keys

fdist1.plot(50, cumulative=True)

# Cumulative frequency plot for the 50 most frequently used words in Moby Dick, which

# account for nearly half of the tokens.

fdist1.hapaxes()  # the words that occur only once

[‘Herman‘,

‘Melville‘,

‘]‘,

‘ETYMOLOGY‘,

‘Late‘,

‘Consumptive‘,

‘School‘,

‘threadbare‘,

‘lexicons‘,

‘mockingly‘,

‘flags‘,

‘mortality‘,

‘signification‘,

‘HACKLUYT‘,

‘Sw‘,

‘HVAL‘,

‘roundness‘,

‘Dut‘,

‘Ger‘,

‘WALLEN‘,

‘WALW‘,

‘IAN‘,

‘RICHARDSON‘,

‘KETOS‘,

‘GREEK‘,

‘CETUS‘,

‘LATIN‘,

‘WHOEL‘,

‘ANGLO‘,

‘SAXON‘,

‘WAL‘,

‘HWAL‘,

‘SWEDISH‘,

‘ICELANDIC‘,

‘BALEINE‘,

‘BALLENA‘,

‘FEGEE‘,

‘ERROMANGOAN‘,

‘Librarian‘,

‘painstaking‘,

‘burrower‘,

‘grub‘,

‘Vaticans‘,

‘stalls‘,

‘higgledy‘,

‘piggledy‘,

‘gospel‘,

‘promiscuously‘,

‘commentator‘,

‘belongest‘,

‘sallow‘,

‘Pale‘,

‘Sherry‘,

‘loves‘,

‘bluntly‘,

‘Subs‘,

‘thankless‘,

‘Hampton‘,

‘Court‘,

‘hie‘,

‘refugees‘,

‘pampered‘,

‘Michael‘,

‘Raphael‘,

‘unsplinterable‘,

‘GENESIS‘,

‘JOB‘,

‘JONAH‘,

‘punish‘,

‘ISAIAH‘,

‘soever‘,

‘cometh‘,

‘incontinently‘,

‘perisheth‘,

‘PLUTARCH‘,

‘MORALS‘,

‘breedeth‘,

‘Whirlpooles‘,

‘Balaene‘,

‘arpens‘,

‘PLINY‘,

‘Scarcely‘,

‘TOOKE‘,

‘LUCIAN‘,

‘TRUE‘,

‘catched‘,

‘OCTHER‘,

‘VERBAL‘,

‘TAKEN‘,

‘MOUTH‘,

‘ALFRED‘,

‘890‘,

‘gudgeon‘,

‘retires‘,

‘MONTAIGNE‘,

‘APOLOGY‘,

‘RAIMOND‘,

‘SEBOND‘,

‘Nick‘,

‘RABELAIS‘,

‘cartloads‘,

‘STOWE‘,

‘ANNALS‘,

‘LORD‘,

‘BACON‘,

‘Touching‘,

‘ork‘,

‘DEATH‘,

‘sovereignest‘,

‘bruise‘,

‘HAMLET‘,

‘leach‘,

‘Mote‘,

‘availle‘,

‘returne‘,

‘againe‘,

‘worker‘,

‘Dinting‘,

‘paine‘,

‘thro‘,

‘maine‘,

‘FAERIE‘,

‘Immense‘,

‘til‘,

‘DAVENANT‘,

‘PREFACE‘,

‘GONDIBERT‘,

‘spermacetti‘,

‘Hosmannus‘,

‘Nescio‘,

‘VIDE‘,

‘Spencer‘,

‘Talus‘,

‘flail‘,

‘threatens‘,

‘jav‘,

‘lins‘,

‘WALLER‘,

‘SUMMER‘,

‘ISLANDS‘,

‘Commonwealth‘,

‘Civitas‘,

‘OPENING‘,

‘SENTENCE‘,

‘HOBBES‘,

‘LEVIATHAN‘,

‘Silly‘,

‘Mansoul‘,

‘chewing‘,

‘sprat‘,

‘PILGRIM‘,

‘PROGRESS‘,

‘Created‘,

‘PARADISE‘,

‘LOST‘,

‘---"‘,

‘Hugest‘,

‘Stretched‘,

‘Draws‘,

‘FULLLER‘,

‘PROFANE‘,

‘HOLY‘,

‘STATE‘,

‘DRYDEN‘,

‘ANNUS‘,

‘MIRABILIS‘,

‘aground‘,

‘EDGE‘,

‘TEN‘,

‘SPITZBERGEN‘,

‘PURCHAS‘,

‘wantonness‘,

‘fuzzing‘,

‘vents‘,

‘HERBERT‘,

‘INTO‘,

‘ASIA‘,

‘AFRICA‘,

‘SCHOUTEN‘,

‘SIXTH‘,

‘CIRCUMNAVIGATION‘,

‘Elbe‘,

‘ducat‘,

‘herrings‘,

‘GREENLAND‘,

‘Several‘,

‘Fife‘,

‘Anno‘,

‘1652‘,

‘Pitferren‘,

‘SIBBALD‘,

‘FIFE‘,

‘KINROSS‘,

‘Myself‘,

‘Sperma‘,

‘ceti‘,

‘fierceness‘,

‘RICHARD‘,

‘STRAFFORD‘,

‘LETTER‘,

‘BERMUDAS‘,

‘PHIL‘,

‘TRANS‘,

‘1668‘,

‘PRIMER‘,

‘COWLEY‘,

‘1729‘,

‘"...‘,

‘frequendy‘,

‘insupportable‘,

‘disorder‘,

‘ULLOA‘,

‘SOUTH‘,

‘AMERICA‘,

‘sylphs‘,

‘petticoat‘,

‘Oft‘,

‘Tho‘,

‘RAPE‘,

‘LOCK‘,

‘NAT‘,

‘wales‘,

‘JOHNSON‘,

‘COOK‘,

‘dung‘,

‘lime‘,

‘juniper‘,

‘UNO‘,

‘VON‘,

‘TROIL‘,

‘LETTERS‘,

‘BANKS‘,

‘SOLANDER‘,

‘1772‘,

‘Nantuckois‘,

‘JEFFERSON‘,

‘MEMORIAL‘,

‘MINISTER‘,

‘REFERENCE‘,

‘PARLIAMENT‘,

‘SOMEWHERE‘,

‘guarding‘,

‘protecting‘,

‘robbers‘,

‘BLACKSTONE‘,

‘Rodmond‘,

‘suspends‘,

‘attends‘,

‘FALCONER‘,

‘Bright‘,

‘roofs‘,

‘domes‘,

‘rockets‘,

‘Around‘,

‘unwieldy‘,

‘COWPER‘,

‘VISIT‘,

‘LONDON‘,

‘HUNTER‘,

‘DISSECTION‘,

‘SMALL‘,

‘SIZED‘,

‘aorta‘,

‘gushing‘,

‘PALEY‘,

‘THEOLOGY‘,

‘mammiferous‘,

‘hind‘,

‘BARON‘,

‘CUVIER‘,

‘COLNETT‘,

‘PURPOSE‘,

‘EXTENDING‘,

‘SPERMACETI‘,

‘Floundered‘,

‘chace‘,

‘peopling‘,

‘Gather‘,

‘Led‘,

‘instincts‘,

‘trackless‘,

‘Assaulted‘,

‘voracious‘,

‘spiral‘,

‘MONTGOMERY‘,

‘WORLD‘,

‘FLOOD‘,

‘Paean‘,

‘fatter‘,

‘Flounders‘,

‘CHARLES‘,

‘LAMB‘,

‘TRIUMPH‘,

‘1690‘,

‘OBED‘,

‘Susan‘,

‘HAWTHORNE‘,

‘TWICE‘,

‘bespeak‘,

‘raal‘,

‘COOPER‘,

‘PILOT‘,

‘Berlin‘,

‘Gazette‘,

‘ECKERMANN‘,

‘CONVERSATIONS‘,

‘GOETHE‘,

‘ESSEX‘,

‘WAS‘,

‘ATTACKED‘,

‘FINALLY‘,

‘DESTROYED‘,

‘OWEN‘,

‘CHACE‘,

‘FIRST‘,

‘SAID‘,

‘VESSEL‘,

‘YORK‘,

‘1821‘,

‘piping‘,

‘dimmed‘,

‘phospher‘,

‘ELIZABETH‘,

‘OAKES‘,

‘SMITH‘,

‘amounted‘,

‘440‘,

‘SCORESBY‘,

‘Mad‘,

‘agonies‘,

‘endures‘,

‘infuriated‘,

‘rears‘,

‘snaps‘,

‘propelled‘,

‘observers‘,

‘opportunities‘,

‘habitudes‘,

‘BEALE‘,

‘offensively‘,

‘artful‘,

‘mischievous‘,

‘FREDERICK‘,

‘DEBELL‘,

‘1840‘,

‘October‘,

‘Raise‘,

‘ay‘,

‘THAR‘,

‘bowes‘,

‘os‘,

‘ROSS‘,

‘ETCHINGS‘,

‘CRUIZE‘,

‘1846‘,

‘Globe‘,

‘transactions‘,

‘relate‘,

‘HUSSEY‘,

‘SURVIVORS‘,

‘parried‘,

‘MISSIONARY‘,

‘JOURNAL‘,

‘TYERMAN‘,

‘boldest‘,

‘persevering‘,

‘REPORT‘,

‘DANIEL‘,

‘SPEECH‘,

‘SENATE‘,

‘APPLICATION‘,

‘ERECTION‘,

‘BREAKWATER‘,

‘CAPTORS‘,

‘WHALEMAN‘,

‘ADVENTURES‘,

‘BIOGRAPHY‘,

‘GATHERED‘,

‘HOMEWARD‘,

‘COMMODORE‘,

‘PREBLE‘,

‘REV‘,

‘CHEEVER‘,

‘MUTINEER‘,

‘BROTHER‘,

‘ANOTHER‘,

‘MCCULLOCH‘,

‘COMMERCIAL‘,

‘reciprocal‘,

‘clews‘,

‘SOMETHING‘,

‘UNPUBLISHED‘,

‘CURRENTS‘,

‘Pedestrians‘,

‘recollect‘,

‘gateways‘,

‘VOYAGER‘,

‘ARCTIC‘,

‘NEWSPAPER‘,

‘TAKING‘,

‘RETAKING‘,

‘HOBOMACK‘,

‘MIRIAM‘,

‘FISHERMAN‘,

‘appliance‘,

‘RIBS‘,

‘TRUCKS‘,

‘Terra‘,

‘Del‘,

‘Fuego‘,

‘DARWIN‘,

‘NATURALIST‘,

";--‘",

‘!\‘"‘,

‘WHARTON‘,

‘Loomings‘,

‘spleen‘,

‘regulating‘,

‘circulation‘,

‘Whenever‘,

‘drizzly‘,

‘hypos‘,

‘philosophical‘,

‘Cato‘,

‘Manhattoes‘,

‘reefs‘,

‘downtown‘,

‘gazers‘,

‘Circumambulate‘,

‘Corlears‘,

‘Coenties‘,

‘Slip‘,

‘Whitehall‘,

‘Posted‘,

‘sentinels‘,

‘spiles‘,

‘pier‘,

‘lath‘,

‘counters‘,

‘desks‘,

‘loitering‘,

‘shady‘,

‘Inlanders‘,

‘lanes‘,

‘alleys‘,

‘attract‘,

‘dale‘,

‘dreamiest‘,

‘shadiest‘,

‘quietest‘,

‘enchanting‘,

‘Saco‘,

‘crucifix‘,

‘Deep‘,

‘mazy‘,

‘Tiger‘,

‘Tennessee‘,

‘Rockaway‘,

‘Persians‘,

‘deity‘,

‘Narcissus‘,

‘ungraspable‘,

‘hazy‘,

‘quarrelsome‘,

‘offices‘,

‘abominate‘,

‘toils‘,

‘trials‘,

‘barques‘,

‘schooners‘,

‘broiling‘,

‘buttered‘,

‘judgmatically‘,

‘peppered‘,

‘reverentially‘,

‘idolatrous‘,

‘dotings‘,

‘ibis‘,

‘roasted‘,

‘bake‘,

‘plumb‘,

‘Van‘,

‘Rensselaers‘,

‘Randolphs‘,

‘Hardicanutes‘,

‘lording‘,

‘tallest‘,

‘decoction‘,

‘Seneca‘,

‘Stoics‘,

‘Testament‘,

‘promptly‘,

‘rub‘,

‘infliction‘,

‘BEING‘,

‘PAID‘,

‘urbane‘,

‘ills‘,

‘monied‘,

‘consign‘,

‘prevalent‘,

‘violate‘,

‘Pythagorean‘,

‘commonalty‘,

‘police‘,

‘surveillance‘,

‘programme‘,

‘solo‘,

‘CONTESTED‘,

‘ELECTION‘,

‘PRESIDENCY‘,

‘UNITED‘,

‘STATES‘,

‘ISHMAEL‘,

‘BLOODY‘,

‘AFFGHANISTAN‘,

‘managers‘,

‘genteel‘,

‘comedies‘,

‘farces‘,

‘cunningly‘,

‘disguises‘,

‘cajoling‘,

‘unbiased‘,

‘freewill‘,

‘discriminating‘,

‘overwhelming‘,

‘undeliverable‘,

‘itch‘,

‘forbidden‘,

‘ignoring‘,

‘lodges‘,

‘Carpet‘,

‘Bag‘,

‘Manhatto‘,

‘candidates‘,

‘penalties‘,

‘Tyre‘,

‘Carthage‘,

‘imported‘,

‘cobblestones‘,

‘bitingly‘,

‘shouldering‘,

‘price‘,

‘fervent‘,

‘asphaltic‘,

‘pavement‘,

‘flinty‘,

‘projections‘,

‘soles‘,

‘Too‘,

‘cheapest‘,

‘cheeriest‘,

‘invitingly‘,

‘particles‘,

‘peer‘,

‘Angel‘,

‘Doom‘,

‘wailing‘,

‘gnashing‘,

‘Wretched‘,

‘entertainment‘,

‘Moving‘,

‘emigrant‘,

‘poverty‘,

‘creak‘,

‘lodgings‘,

‘zephyr‘,

‘hob‘,

‘toasting‘,

‘observest‘,

‘sashless‘,

‘glazier‘,

‘reasonest‘,

‘chinks‘,

‘crannies‘,

‘lint‘,

‘chattering‘,

‘shiverings‘,

‘cob‘,

‘redder‘,

‘Orion‘,

‘glitters‘,

‘conservatories‘,

‘president‘,

‘temperance‘,

‘blubbering‘,

‘straggling‘,

‘wainscots‘,

‘reminding‘,

‘oilpainting‘,

‘besmoked‘,

‘defaced‘,

‘unequal‘,

‘crosslights‘,

‘hags‘,

‘delineate‘,

‘bewitched‘,

‘ponderings‘,

‘boggy‘,

‘soggy‘,

‘squitchy‘,

‘froze‘,

‘heath‘,

‘icebound‘,

‘represents‘,

‘Horner‘,

‘foundered‘,

‘clubs‘,

‘harvesting‘,

‘hacking‘,

‘horrifying‘,

‘Mixed‘,

‘Nathan‘,

‘Swain‘,

‘corkscrew‘,

‘Blanco‘,

‘sojourning‘,

‘fireplaces‘,

‘duskier‘,

‘cockpits‘,

‘rarities‘,

‘Projecting‘,

‘Within‘,

‘shelves‘,

‘flasks‘,

‘bustles‘,

‘deliriums‘,

‘Abominable‘,

‘tumblers‘,

‘cylinders‘,

‘goggling‘,

‘deceitfully‘,

‘tapered‘,

‘Parallel‘,

‘pecked‘,

‘footpads‘,

‘Fill‘,

‘shilling‘,

‘examining‘,

‘SKRIMSHANDER‘,

‘accommodated‘,

‘unoccupied‘,

‘haint‘,

‘pose‘,

‘whalin‘,

‘decidedly‘,

‘objectionable‘,

‘wander‘,

‘Battery‘,

‘ruminating‘,

‘adorning‘,

‘potatoes‘,

‘sartainty‘,

‘diabolically‘,

‘steaks‘,

‘undress‘,

‘looker‘,

‘rioting‘,

‘Grampus‘,

‘seed‘,

‘Feegees‘,

‘tramping‘,

‘Enveloped‘,

‘bedarned‘,

‘eruption‘,

‘officiating‘,

‘brimmers‘,

‘complained‘,

‘potion‘,

‘colds‘,

‘catarrhs‘,

‘liquor‘,

‘arrantest‘,

‘topers‘,

‘obstreperously‘,

‘aloof‘,

‘desirous‘,

‘hilarity‘,

‘coffer‘,

‘Southerner‘,

‘mountaineers‘,

‘Alleghanian‘,

‘missed‘,

‘supernaturally‘,

‘congratulate‘,

‘multiply‘,

‘bachelor‘,

‘abominated‘,

‘tidiest‘,

‘bedwards‘,

‘shan‘,

‘tablecloth‘,

‘Skrimshander‘,

‘bump‘,

‘spraining‘,

‘eider‘,

‘yoking‘,

‘rickety‘,

‘whirlwinds‘,

‘knockings‘,

‘dismissed‘,

‘popped‘,

‘cherishing‘,

‘chuckled‘,

‘chuckle‘,

‘mightily‘,

‘catches‘,

‘bamboozingly‘,

‘overstocked‘,

‘toothpick‘,

‘rayther‘,

‘BROWN‘,

‘slanderin‘,

‘farrago‘,

‘BROKE‘,

‘Sartain‘,

‘Mt‘,

‘Hecla‘,

‘persist‘,

‘mystifying‘,

‘unsay‘,

‘criminal‘,

‘Wall‘,

‘purty‘,

‘sarmon‘,

‘rips‘,

‘tellin‘,

‘bought‘,

‘balmed‘,

‘curios‘,

‘sellin‘,

‘inions‘,

‘fooling‘,

‘idolators‘,

‘Depend‘,

‘reg‘,

‘lar‘,

‘spliced‘,

‘Johnny‘,

‘sprawling‘,

‘Arter‘,

‘glim‘,

‘jiffy‘,

‘irresolute‘,

‘vum‘,

‘WON‘,

‘Folding‘,

‘scrutiny‘,

‘porcupine‘,

‘moccasin‘,

‘ponchos‘,

‘parade‘,

‘rainy‘,

‘remembering‘,

‘commended‘,

‘cobs‘,

‘Nod‘,

‘footfall‘,

‘unlacing‘,

‘blackish‘,

‘plasters‘,

‘inkling‘,

‘Placing‘,

‘crammed‘,

‘scalp‘,

‘mildewed‘,

‘Ignorance‘,

‘parent‘,

‘nonplussed‘,

‘undressing‘,

‘checkered‘,

‘Thirty‘,

‘frogs‘,

‘quaked‘,

‘wrapall‘,

‘dreadnaught‘,

‘fumbled‘,

‘Remembering‘,

‘manikin‘,

‘tenpin‘,

‘andirons‘,

‘jambs‘,

‘bricks‘,

‘appropriate‘,

‘applying‘,

‘hastier‘,

‘withdrawals‘,

‘antics‘,

‘devotee‘,

‘extinguishing‘,

‘unceremoniously‘,

‘bagged‘,

‘sportsman‘,

‘woodcock‘,

‘uncomfortableness‘,

‘deliberating‘,

‘puffed‘,

‘sang‘,

‘Stammering‘,

‘conjured‘,

‘responses‘,

‘debel‘,

‘flourishing‘,

‘Angels‘,

‘flourishings‘,

‘peddlin‘,

‘sleepe‘,

‘grunted‘,

‘gettee‘,

‘motioning‘,

‘comely‘,

‘insured‘,

‘Counterpane‘,

‘parti‘,

‘triangles‘,

‘interminable‘,

‘caper‘,

‘supperless‘,

‘21st‘,

‘hemisphere‘,

‘sigh‘,

‘Sixteen‘,

‘ached‘,

‘coaches‘,

‘stockinged‘,

‘slippering‘,

‘misbehaviour‘,

‘unendurable‘,

‘stepmothers‘,

‘misfortunes‘,

‘steeped‘,

‘shudderingly‘,

‘confounding‘,

‘soberly‘,

‘recurred‘,

‘predicament‘,

‘unlock‘,

‘bridegroom‘,

‘clasp‘,

‘hugged‘,

‘rouse‘,

‘snore‘,

‘scratch‘,

‘Throwing‘,

‘expostulations‘,

‘unbecomingness‘,

‘matrimonial‘,

‘dawning‘,

‘overture‘,

‘innate‘,

‘compliment‘,

‘civility‘,

‘rudeness‘,

‘toilette‘,

‘dressing‘,

‘donning‘,

‘gaspings‘,

‘booting‘,

‘caterpillar‘,

‘outlandishness‘,

‘manners‘,

‘education‘,

‘undergraduate‘,

‘dreamt‘,

‘cowhide‘,

‘pinched‘,

‘curtains‘,

‘indecorous‘,

‘contented‘,

‘restricting‘,

‘donned‘,

‘lathering‘,

‘unsheathes‘,

‘whets‘,

‘Rogers‘,

‘cutlery‘,

‘Afterwards‘,

‘baton‘,

‘Breakfast‘,

‘pleasantly‘,

‘bountifully‘,

‘laughable‘,

‘bosky‘,

‘unshorn‘,

‘gowns‘,

‘toasted‘,

‘lingers‘,

‘tarried‘,

‘barred‘,

‘Grub‘,

‘Park‘,

‘assurance‘,

‘polish‘,

‘occasioned‘,

‘embarrassed‘,

‘bashfulness‘,

‘duelled‘,

‘winking‘,

‘tastes‘,

‘sheepishly‘,

‘bashful‘,

‘icicle‘,

‘admirer‘,

‘cordially‘,

‘grappling‘,

‘genteelly‘,

‘eschewed‘,

‘undivided‘,

‘6‘,

‘circulating‘,

‘nondescripts‘,

‘Chestnut‘,

‘jostle‘,

‘Regent‘,

‘Lascars‘,

‘Bombay‘,

‘Apollo‘,

‘Feegeeans‘,

‘Tongatobooarrs‘,

‘Erromanggoans‘,

‘Pannangians‘,

‘Brighggians‘,

‘weekly‘,

‘Vermonters‘,

‘stalwart‘,

‘frames‘,

‘felled‘,

‘strutting‘,

‘wester‘,

‘bombazine‘,

‘cloak‘,

‘mow‘,

‘gloves‘,

‘joins‘,

‘outfit‘,

‘waistcoats‘,

‘Hay‘,

‘Seed‘,

‘tract‘,

‘dearest‘,

‘pave‘,

‘eggs‘,

‘patrician‘,

‘parks‘,

‘scraggy‘,

‘scoria‘,

‘Herr‘,

‘dowers‘,

‘nieces‘,

‘reservoirs‘,

‘maples‘,

‘bountiful‘,

‘proffer‘,

‘passer‘,

‘cones‘,

‘blossoms‘,

‘superinduced‘,

‘carnation‘,

‘Salem‘,

‘sweethearts‘,

‘Puritanic‘,

‘Whaleman‘,

‘Wrapping‘,

‘Each‘,

‘quote‘,

‘TALBOT‘,

‘Near‘,

‘Desolation‘,

‘1st‘,

‘SISTER‘,

‘ROBERT‘,

‘WILLIS‘,

‘ELLERY‘,

‘NATHAN‘,

‘COLEMAN‘,

‘WALTER‘,

‘CANNY‘,

‘SETH‘,

‘GLEIG‘,

‘Forming‘,

‘ELIZA‘,

‘31st‘,

‘MARBLE‘,

‘SHIPMATES‘,

‘EZEKIEL‘,

‘HARDY‘,

‘AUGUST‘,

‘3d‘,

‘1833‘,

‘WIDOW‘,

‘Shaking‘,

‘glazed‘,

‘Affected‘,

‘relatives‘,

‘unhealing‘,

‘sympathetically‘,

‘wounds‘,

‘bleed‘,

‘blanks‘,

...]
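The three ideas used with the frequency distribution (most frequent words, words that occur only once, and cumulative share of all tokens) can be reproduced on a small scale with the standard library. The sketch below uses collections.Counter as a stand-in and is not how FreqDist is written internally; the token list is made up.

```python
from collections import Counter
from itertools import accumulate

tokens = 'the whale and the ship and the sea saw a storm'.split()
counts = Counter(tokens)

top = counts.most_common(2)                            # the two most frequent words
hapaxes = [w for w, c in counts.items() if c == 1]     # words seen exactly once
total = sum(counts.values())

# Running share of all tokens covered by the 1, 2, 3, ... most frequent words
cumulative = [running / total
              for running in accumulate(c for _, c in counts.most_common())]

print(top)
print(hapaxes)
print(cumulative[:2])
```

The cumulative list is what a cumulative frequency plot draws: here the two most frequent words already cover 5 of the 11 tokens.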

Fine-grained selection of words

the set of all w such that w is an element of V (the vocabulary) and w has property P

\(\{\, w \mid w \in V \text{ and } P(w) \,\}\)

The corresponding Python expression is given:

[w for w in V if p(w)]

V = set(text1)

long_words = [w for w in V if len(w)>15]

sorted(long_words)

['CIRCUMNAVIGATION',
 'Physiognomically',
 'apprehensiveness',
 'cannibalistically',
 'characteristically',
 'circumnavigating',
 'circumnavigation',
 'circumnavigations',
 'comprehensiveness',
 'hermaphroditical',
 'indiscriminately',
 'indispensableness',
 'irresistibleness',
 'physiognomically',
 'preternaturalness',
 'responsibilities',
 'simultaneousness',
 'subterraneousness',
 'supernaturalness',
 'superstitiousness',
 'uncomfortableness',
 'uncompromisedness',
 'undiscriminating',
 'uninterpenetratingly']
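The property P in the set-builder form can combine several conditions. As a sketch under assumptions, the example below keeps words that are both long and repeated; the thresholds (more than 7 characters, at least 2 occurrences) and the token list are made up for illustration, and collections.Counter stands in for a frequency distribution.

```python
from collections import Counter

tokens = ('whale ship whale harpooneer sea harpooneer '
          'whale harpooneer circumnavigation').split()
counts = Counter(tokens)
V = set(tokens)   # the vocabulary: distinct tokens

# {w | w in V and len(w) > 7 and count(w) >= 2}
frequent_long = sorted(w for w in V if len(w) > 7 and counts[w] >= 2)
print(frequent_long)
```

Very long words are often one-off oddities and very frequent words are often short function words, so requiring both length and frequency tends to surface words that characterise a text.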

This article is excerpted from Natural Language Processing with Python.
