Python: PyQuery raises UnicodeDecodeError: 'gbk' codec can't decode byte 0xa0 in position 84 when opening a local file

Problem: opening a local file with pyquery.PyQuery raises an error. The installation is not the cause. The error is:

UnicodeDecodeError: 'gbk' codec can't decode byte 0xa0 in position 84: illegal multibyte sequence

That is, the 'gbk' codec cannot decode byte 0xa0 at position 84 because it does not form a valid multibyte sequence.

The code that triggers it:

from pyquery import PyQuery

result = PyQuery(filename='21.html')
print(result('p'))

Traceback (most recent call last):
  File "E:/project/allow/cuiqingcai_pdf/014_PyQuery.py", line 28, in <module>
    result = PyQuery(filename='21.html')
  File "E:\project\venv\lib\site-packages\pyquery\pyquery.py", line 230, in __init__
    elements = fromstring(html, self.parser)
  File "E:\project\venv\lib\site-packages\pyquery\pyquery.py", line 95, in fromstring
    result = getattr(etree, meth)(context)
  File "src\lxml\etree.pyx", line 3426, in lxml.etree.parse
  File "src\lxml\parser.pxi", line 1861, in lxml.etree._parseDocument
  File "src\lxml\parser.pxi", line 1881, in lxml.etree._parseFilelikeDocument
  File "src\lxml\parser.pxi", line 1776, in lxml.etree._parseDocFromFilelike
  File "src\lxml\parser.pxi", line 1187, in lxml.etree._BaseParser._parseDocFromFilelike
  File "src\lxml\parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src\lxml\parser.pxi", line 707, in lxml.etree._handleParseResult
  File "src\lxml\etree.pyx", line 316, in lxml.etree._ExceptionContext._raise_if_stored
  File "src\lxml\parser.pxi", line 370, in lxml.etree._FileReaderContext.copyToBuffer
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa0 in position 84: illegal multibyte sequence

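I searched Baidu, and many people suggested simply removing the Chinese characters from the file. I tried that and it did work, but it felt like a workaround rather than a fix, and I found nothing about the problem in the full official pyquery documentation either.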

Official documentation: https://pythonhosted.org/pyquery/

So I tried a different approach: first read the file as UTF-8 into a variable, then pass that variable to PyQuery as its argument.

I tried it and it worked (although I do not know exactly why).
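A likely explanation, offered as an assumption rather than something the pyquery documentation states: the traceback shows the error being raised while lxml reads from a file-like object, which is consistent with the file having been opened in text mode without an explicit encoding. In that case Python's open() falls back to the locale's preferred encoding, which is usually cp936 (GBK) on Simplified Chinese Windows, and a UTF-8 file can contain byte sequences that are illegal in GBK. The sketch below reproduces the same kind of failure with a made-up sample string; the sample and the try_decode helper are illustrative and not part of pyquery.

```python
import locale

# Encoding that open() falls back to when no encoding argument is given.
# On Simplified Chinese Windows this is typically cp936 (GBK).
default_encoding = locale.getpreferredencoding(False)

# Hypothetical sample standing in for the contents of 21.html, stored as UTF-8.
utf8_bytes = "<p>你</p>".encode("utf-8")


def try_decode(data, encoding):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


as_utf8 = try_decode(utf8_bytes, "utf-8")  # decodes cleanly
as_gbk = try_decode(utf8_bytes, "gbk")     # fails on a byte that is illegal in GBK

print("default encoding:", default_encoding)
print("decoding as gbk:", as_gbk)
```

This would also explain why removing the Chinese characters makes the error disappear: pure ASCII bytes decode the same way under both encodings.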

The working code:

from pyquery import PyQuery

with open("./21.html", "r", encoding="utf-8") as f:
    content = f.read()  # decode as UTF-8 explicitly instead of the platform default
result = PyQuery(content)
print(result('p'))

Output:

这意味着要把这段文字用斜体来显示
 如果强调太少
 如果规定了 !DOCTYPE，
 如
 asdfasdf
 0000000
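If the encoding of the HTML file is not known in advance, the same idea can be wrapped in a small helper that decodes the raw bytes itself. This is only a sketch: decode_html and read_html are hypothetical names, and the candidate list ("utf-8", then "gbk") is an assumption about which encodings are likely on a Chinese Windows machine. UTF-8 is tried first because it is strict, whereas GBK can sometimes decode UTF-8 bytes into garbage without raising an error.

```python
from pathlib import Path


def decode_html(data, encodings=("utf-8", "gbk")):
    """Decode raw bytes, trying each candidate encoding in order."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError(f"none of {encodings} could decode the data")


def read_html(path, encodings=("utf-8", "gbk")):
    """Read a file as bytes and decode it without relying on the platform default."""
    return decode_html(Path(path).read_bytes(), encodings)
```

The returned string can then be passed to PyQuery in the same way as the content variable, for example PyQuery(read_html("21.html")).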
