Fixing Hive Errors When Partition Values Contain Chinese Characters
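When Hive creates dynamic partitions and a partition value contains Chinese characters, it fails with the following error.
Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
The cause appears to be that the partition metadata tables use a different character set from the global one.
Two solutions were tried:
Solution 1: Change the MySQL configuration
Temporary change: log in to MySQL and set the following variables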
set character_set_client = utf8;
set character_set_connection = utf8;
set character_set_results = utf8;
SET collation_server = utf8_general_ci;
SET collation_database = utf8_general_ci;
Permanent change: edit the MySQL configuration file, then restart
[root@ambari03 etc] vim /etc/my.cnf
Add the following under [client]
[client]
default-character-set=utf8
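Add the following under [mysqld]
[mysqld]
# server-side option name; MySQL 5.5 and later reject default-character-set under [mysqld]
character-set-server=utf8
init_connect='SET NAMES utf8'
Add the following under [mysql]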
[mysql]
default-character-set=utf8
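Before restarting the server, it can help to confirm which character-set options each section of the file actually contains. The following is a minimal sketch using Python's standard configparser on a sample string, not on the real /etc/my.cnf. The helper name charset_settings and the sample text are assumptions for illustration, and the sample uses character-set-server under [mysqld], which is the server option name on MySQL 5.5 and later.

```python
import configparser

# Sample configuration mirroring the settings described above (illustrative only).
SAMPLE_MY_CNF = """
[client]
default-character-set=utf8

[mysqld]
character-set-server=utf8
init_connect='SET NAMES utf8'

[mysql]
default-character-set=utf8
"""


def charset_settings(text):
    """Return {(section, option): value} for character-set related options."""
    # allow_no_value tolerates bare flags that real my.cnf files often contain.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    found = {}
    for section in parser.sections():
        for option, value in parser.items(section):
            if "character-set" in option or option == "init_connect":
                found[(section, option)] = value
    return found


if __name__ == "__main__":
    for (section, option), value in charset_settings(SAMPLE_MY_CNF).items():
        print(f"[{section}] {option} = {value}")
```

A real my.cnf may contain directives that configparser does not understand, so this is only suitable as a quick check on simple files.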
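Then restart the MySQL service with systemctl restart mysqld.
This approach did not resolve the problem.
Solution 2: Change the character set of the Hive metastore tables in MySQL
Log in to MySQL and run the following statements to change the character set of the Hive metastore database and its tables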
alter database hive_meta default character set utf8;
alter table BUCKETING_COLS default character set utf8;
alter table CDS default character set utf8;
alter table COLUMNS_V2 default character set utf8;
alter table DATABASE_PARAMS default character set utf8;
alter table DBS default character set utf8;
alter table FUNCS default character set utf8;
alter table FUNC_RU default character set utf8;
alter table GLOBAL_PRIVS default character set utf8;
alter table PARTITIONS default character set utf8;
alter table PARTITION_KEYS default character set utf8;
alter table PARTITION_KEY_VALS default character set utf8;
alter table PARTITION_PARAMS default character set utf8;
-- alter table PART_COL_STATS default character set utf8;
alter table ROLES default character set utf8;
alter table SDS default character set utf8;
alter table SD_PARAMS default character set utf8;
alter table SEQUENCE_TABLE default character set utf8;
alter table SERDES default character set utf8;
alter table SERDE_PARAMS default character set utf8;
alter table SKEWED_COL_NAMES default character set utf8;
alter table SKEWED_COL_VALUE_LOC_MAP default character set utf8;
alter table SKEWED_STRING_LIST default character set utf8;
alter table SKEWED_STRING_LIST_VALUES default character set utf8;
alter table SKEWED_VALUES default character set utf8;
alter table SORT_COLS default character set utf8;
alter table TABLE_PARAMS default character set utf8;
alter table TAB_COL_STATS default character set utf8;
alter table TBLS default character set utf8;
alter table VERSION default character set utf8;
alter table BUCKETING_COLS convert to character set utf8;
alter table CDS convert to character set utf8;
alter table COLUMNS_V2 convert to character set utf8;
alter table DATABASE_PARAMS convert to character set utf8;
alter table DBS convert to character set utf8;
alter table FUNCS convert to character set utf8;
alter table FUNC_RU convert to character set utf8;
alter table GLOBAL_PRIVS convert to character set utf8;
alter table PARTITIONS convert to character set utf8;
alter table PARTITION_KEYS convert to character set utf8;
alter table PARTITION_KEY_VALS convert to character set utf8;
alter table PARTITION_PARAMS convert to character set utf8;
-- alter table PART_COL_STATS convert to character set utf8;
alter table ROLES convert to character set utf8;
alter table SDS convert to character set utf8;
alter table SD_PARAMS convert to character set utf8;
alter table SEQUENCE_TABLE convert to character set utf8;
alter table SERDES convert to character set utf8;
alter table SERDE_PARAMS convert to character set utf8;
alter table SKEWED_COL_NAMES convert to character set utf8;
alter table SKEWED_COL_VALUE_LOC_MAP convert to character set utf8;
alter table SKEWED_STRING_LIST convert to character set utf8;
alter table SKEWED_STRING_LIST_VALUES convert to character set utf8;
alter table SKEWED_VALUES convert to character set utf8;
alter table SORT_COLS convert to character set utf8;
alter table TABLE_PARAMS convert to character set utf8;
alter table TAB_COL_STATS convert to character set utf8;
alter table TBLS convert to character set utf8;
alter table VERSION convert to character set utf8;
SET character_set_client = utf8;
-- SET character_set_connection = utf8;
SET character_set_database = utf8;
SET character_set_results = utf8;
SET character_set_server = utf8;
-- SET collation_connection = utf8;
-- SET collation_database = utf8;
-- SET collation_server = utf8;
SET NAMES 'utf8';
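The two runs of ALTER TABLE statements above follow one pattern per table, so they can be generated rather than typed by hand. The sketch below is one way to do that in Python. The function name build_alter_statements and the skip parameter are hypothetical, and the table list is copied from the statements in this post, so it may not match the metastore schema of another Hive version.

```python
# Table names copied from the statements above; other Hive versions may differ.
METASTORE_TABLES = [
    "BUCKETING_COLS", "CDS", "COLUMNS_V2", "DATABASE_PARAMS", "DBS",
    "FUNCS", "FUNC_RU", "GLOBAL_PRIVS", "PARTITIONS", "PARTITION_KEYS",
    "PARTITION_KEY_VALS", "PARTITION_PARAMS", "PART_COL_STATS", "ROLES",
    "SDS", "SD_PARAMS", "SEQUENCE_TABLE", "SERDES", "SERDE_PARAMS",
    "SKEWED_COL_NAMES", "SKEWED_COL_VALUE_LOC_MAP", "SKEWED_STRING_LIST",
    "SKEWED_STRING_LIST_VALUES", "SKEWED_VALUES", "SORT_COLS",
    "TABLE_PARAMS", "TAB_COL_STATS", "TBLS", "VERSION",
]


def build_alter_statements(tables, charset="utf8", skip=()):
    """Build the 'default character set' and 'convert to' statements.

    Tables listed in `skip` are left out, mirroring the commented-out
    PART_COL_STATS lines in the post.
    """
    selected = [t for t in tables if t not in skip]
    statements = [
        f"alter table {t} default character set {charset};" for t in selected
    ]
    statements += [
        f"alter table {t} convert to character set {charset};" for t in selected
    ]
    return statements


if __name__ == "__main__":
    for line in build_alter_statements(METASTORE_TABLES, skip=("PART_COL_STATS",)):
        print(line)
```

The script only prints SQL text. Reviewing the output and backing up the metastore database before running it in MySQL is advisable, since convert to character set rewrites the stored data.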
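After this change, inserting into a table with a Chinese partition value works, and a query confirms the data was written, but Hive still reports an error.
The log shows the following error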
2021-02-23T10:54:26,249 ERROR [HiveServer2-Background-Pool: Thread-248] exec.StatsTask: Failed to run stats task
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Insert of object "org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics@710c136f" using statement "INSERT INTO PART_COL_STATS
(CS_ID
,AVG_COL_LEN
,BIT_VECTOR
,CAT_NAME
,COLUMN_NAME
,COLUMN_TYPE
,DB_NAME
,BIG_DECIMAL_HIGH_VALUE
,BIG_DECIMAL_LOW_VALUE
,DOUBLE_HIGH_VALUE
,DOUBLE_LOW_VALUE
,LAST_ANALYZED
,LONG_HIGH_VALUE
,LONG_LOW_VALUE
,MAX_COL_LEN
,NUM_DISTINCTS
,NUM_FALSES
,NUM_NULLS
,NUM_TRUES
,PART_ID
,PARTITION_NAME
,TABLE_NAME
) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)" failed : Incorrect string value: '\xE6\xB9\x96\xE5\x8C\x97' for column 'PARTITION_NAME' at row 1)
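The byte sequence in the error message is the UTF-8 encoding of the partition value, which a latin1 column cannot store. The short Python sketch below decodes the bytes from the log and checks which encodings can represent the value. The helper name fits_charset is hypothetical, and Python's latin-1 codec is used only as an approximation of MySQL's latin1 character set.

```python
# Bytes copied from the "Incorrect string value" message in the log.
RAW = b"\xE6\xB9\x96\xE5\x8C\x97"


def fits_charset(value, encoding):
    """Return True if `value` can be encoded with the given Python codec."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    partition_value = RAW.decode("utf-8")
    print(partition_value)
    # latin-1 here approximates MySQL's latin1; utf-8 corresponds to utf8.
    print("latin-1:", fits_charset(partition_value, "latin-1"))
    print("utf-8:", fits_charset(partition_value, "utf-8"))
```

The decoded value is the partition used in the insert statement, which points at the PARTITION_NAME column as the remaining latin1 column.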
In MySQL, inspect the PART_COL_STATS table in the metastore database
use metastore;
-- show create table PART_COL_STATS;
show full columns from PART_COL_STATS;
Part of the result (selected columns of the output):
Field                    Type            Collation
CS_ID                    bigint(20)
CAT_NAME                 varchar(256)    latin1_bin
DB_NAME                  varchar(128)    latin1_bin
TABLE_NAME               varchar(256)    latin1_bin
PARTITION_NAME           varchar(500)    latin1_bin
COLUMN_NAME              varchar(767)    latin1_bin
COLUMN_TYPE              varchar(128)    latin1_bin
PART_ID                  bigint(20)
LONG_LOW_VALUE           bigint(20)
LONG_HIGH_VALUE          bigint(20)
DOUBLE_HIGH_VALUE        double(53,4)
DOUBLE_LOW_VALUE         double(53,4)
BIG_DECIMAL_LOW_VALUE    varchar(4000)   latin1_bin
BIG_DECIMAL_HIGH_VALUE   varchar(4000)   latin1_bin
NUM_NULLS                bigint(20)
NUM_DISTINCTS            bigint(20)
BIT_VECTOR               blob
AVG_COL_LEN              double(53,4)
MAX_COL_LEN              bigint(20)
NUM_TRUES                bigint(20)
NUM_FALSES               bigint(20)
LAST_ANALYZED            bigint(20)
The PARTITION_NAME column is still latin1, so its character set needs to be changed to utf8
use metastore;
alter table PART_COL_STATS modify column PARTITION_NAME varchar(500) character set utf8;
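The same kind of statement can be derived from the column listing for any column that is still latin1. The sketch below works on a few rows copied from the output shown earlier. The function name modify_statements and the wanted parameter are hypothetical. Note that MODIFY COLUMN restates the full column definition, so attributes such as NOT NULL or a default value, which the listing here does not show, would need to be added by hand.

```python
# (Field, Type, Collation) rows copied from the listing above; None means
# the listing shows no collation for that column.
COLUMNS = [
    ("CS_ID", "bigint(20)", None),
    ("CAT_NAME", "varchar(256)", "latin1_bin"),
    ("DB_NAME", "varchar(128)", "latin1_bin"),
    ("TABLE_NAME", "varchar(256)", "latin1_bin"),
    ("PARTITION_NAME", "varchar(500)", "latin1_bin"),
    ("COLUMN_NAME", "varchar(767)", "latin1_bin"),
    ("PART_ID", "bigint(20)", None),
]


def modify_statements(table, columns, wanted, charset="utf8"):
    """Build MODIFY COLUMN statements for the wanted latin1 columns only."""
    statements = []
    for field, col_type, collation in columns:
        is_latin1 = collation is not None and collation.startswith("latin1")
        if field in wanted and is_latin1:
            statements.append(
                f"alter table {table} modify column {field} {col_type} "
                f"character set {charset};"
            )
    return statements


if __name__ == "__main__":
    for line in modify_statements("PART_COL_STATS", COLUMNS, {"PARTITION_NAME"}):
        print(line)
```

Restricting the change to the columns that actually receive Chinese text keeps it as narrow as the fix described in this post.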
After this change, inserting into a table with a Chinese partition value runs without errors.
A reference example follows
create database mydb;
use mydb;
-- ① Create the non-partitioned table stu
create table stu
(
name string,
age int
) row format delimited fields terminated by '\t';
-- ② Upload the data to the corresponding HDFS directory
-- ③ Create the partitioned table emp
create table emp
(
name string,
age int
)
partitioned by (provice string)
row format delimited fields terminated by '\t';
-- ④ Insert data into emp with a Chinese partition value
insert into emp partition(provice = "湖北")
select * from stu;