
~/irstlm/bin/compile-lm \
  --text yes \
  news-commentary-v8.fr-en.lm.en.gz \
  news-commentary-v8.fr-en.arpa.en


compile-lm - compiles an ARPA format LM into an IRSTLM format one

USAGE:
    compile-lm [options] <input-file.lm> [output-file.blm]

DESCRIPTION:
    compile-lm reads a standard LM file in ARPA format and produces
    a compiled representation that the IRST LM toolkit can quickly
    read and process. LM file can be compressed.

OPTIONS:

Parameters:
    Help:      print this help
    d:      verbose output for --eval option; default is 0
    debug:      verbose output for --eval option; default is 0
    dict_load_factor:      sets the load factor for ngram cache; it should be a positive real value; default is 0
    dub:      dictionary upperbound to compute OOV word penalty: default 10^7
    e:      computes perplexity of the specified text file
    eval:      computes perplexity of the specified text file
    f:      filter a binary language model with a word list
    filter:      filter a binary language model with a word list
    h:      print this help
    i:      builds an inverted n-gram binary table for fast access; default if false
    invert:      builds an inverted n-gram binary table for fast access; default if false
    keepunigrams:      filter by keeping all unigrams in the table, default  is true
    ku:      filter by keeping all unigrams in the table, default  is true
    l:      maximum level to load from the LM; if value is larger than the actual LM order, the latter is taken
    level:      maximum level to load from the LM; if value is larger than the actual LM order, the latter is taken
    memmap:      uses memory map to read a binary LM
    mm:      uses memory map to read a binary LM
    ngram_load_factor:      sets the load factor for ngram cache; it should be a positive real value; default is false
    r:      computes N random calls on the specified text file
    randcalls:      computes N random calls on the specified text file
    s:      computes log-prob scores of n-grams from standard input
    score:      computes log-prob scores of n-grams from standard input
    sentence:      computes perplexity at sentence level (identified through the end symbol)
    t:      output is again in text format; default is false
    text:      output is again in text format; default is false
    tmpdir:      directory for temporary computation, default is either the environment variable TMP if defined or "/tmp")

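In other words, the --text option does not need a "yes" after it. I don't know why Hieu added "yes"; maybe it is a version difference? I'll send an email to the mailing list tonight and see.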
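If my reading of the help output is right, the command without "yes" would look roughly like the sketch below. This is untested guesswork, not a confirmed fix: it assumes --text is a bare switch in this IRSTLM version. IRSTLM_BIN and build_cmd are names I made up for illustration, and the script only prints the command so it can be checked before running it by hand.

```shell
#!/bin/sh
# Assumption: --text is a bare switch (no "yes"), as the help output suggests.
# IRSTLM_BIN is a hypothetical variable; its default matches the path used in the original command.
IRSTLM_BIN="${IRSTLM_BIN:-$HOME/irstlm/bin}"
LM_IN="news-commentary-v8.fr-en.lm.en.gz"
LM_OUT="news-commentary-v8.fr-en.arpa.en"

# Print the candidate command instead of executing it.
build_cmd() {
  printf '%s/compile-lm --text %s %s\n' "$IRSTLM_BIN" "$LM_IN" "$LM_OUT"
}

build_cmd
```

Other IRSTLM versions may parse the option differently, so it is worth checking the printed command against the help output of the installed compile-lm first.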


The command for building a language model with IRSTLM in Section 2.3.4 (Baseline System) of the Moses manual is incorrect.

