Found duplicate column(s) when inserting into hdfs://nameservice1/origin_data/events_7/data: `dt`;
1. Symptom
A Spark SQL job fails with the error below.
The problematic SQL:
select g.dt, frequent , wk , hr , user_id , k.`$name` as user_name , os , manufacturer , page_name , page_url , regexp_replace(button_name,'\\n|\\r|\\t','') as button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index,g.dt
from (
select dt , frequent , wk , hr , user_id , os , manufacturer , page_name , page_url , button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index
from day_total
union all select * from hour_total
union all select * from day_page
union all select * from day_button
union all select * from hour_error
union all select * from launch
union all select * from decision
union all select * from visit_back
union all select * from province
union all select * from os
union all select * from manufacturer
union all select * from roadmap1
union all select * from roadmap2
) g
left join users k on g.user_id = k.id
Full stack trace:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Found duplicate column(s) when inserting into hdfs://nameservice1/origin_data/events_7/data: `dt`;
	at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)
	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:65)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
	at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
	at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:656)
	at com.tcl.kudu.crumb_applet$.main(crumb_applet.scala:476)
	at com.tcl.kudu.crumb_applet.main(crumb_applet.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2. Solution
The outer SELECT projects the `dt` column twice: `g.dt` appears both at the beginning and at the end of the column list. Removing one of the two `g.dt` references fixes the error.
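As the stack trace shows, the write fails in Spark's `SchemaUtils.checkColumnNameDuplication`, which rejects an output schema containing the same column name more than once. A minimal Python sketch of that check (assuming case-insensitive name resolution, Spark's default with `spark.sql.caseSensitive=false`; the function name and error text here only mimic Spark's, they are not its actual code):

```python
from collections import Counter

def check_column_name_duplication(columns, case_sensitive=False):
    """Raise if any column name occurs more than once, the way Spark
    validates an output schema before writing to a path.
    Comparison is case-insensitive unless case_sensitive=True."""
    names = columns if case_sensitive else [c.lower() for c in columns]
    dups = sorted(name for name, n in Counter(names).items() if n > 1)
    if dups:
        raise ValueError(
            "Found duplicate column(s) when inserting: "
            + ", ".join(f"`{d}`" for d in dups))

# The failing query projects `dt` twice (g.dt at the start and again
# at the end of the SELECT list), so the check rejects the schema:
try:
    check_column_name_duplication(["dt", "frequent", "wk", "user_id", "dt"])
except ValueError as e:
    print(e)  # Found duplicate column(s) when inserting: `dt`
```

Dropping one of the duplicated projections makes the column list unique, so the check passes and the write to HDFS succeeds.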