1. Symptom

A Spark SQL job failed with the error shown below.

The offending SQL:

select g.dt, frequent , wk , hr , user_id , k.`$name` as user_name , os , manufacturer , page_name , page_url , regexp_replace(button_name,'\\n|\\r|\\t','') as button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index,g.dt
from (
select dt , frequent , wk , hr , user_id , os , manufacturer , page_name , page_url , button_name , button_type , first_visit_time
, last_visit_time , pv , session_cnt , page_cnt , session_dur , total_dur , load_dur , max_load_dur , min_load_dur , search_content , search_cnt
, max_search_dur , min_search_dur , total_search_dur , max_search_cnt , page_visit_dur , buy_time , error_reason , type , uv , father , son , index
from day_total
union all select * from hour_total
union all select * from day_page
union all select * from day_button
union all select * from hour_error
union all select * from launch
union all select * from decision
union all select * from visit_back
union all select * from province
union all select * from os
union all select * from manufacturer
union all select * from roadmap1
union all select * from roadmap2
) g
left join users k on g.user_id = k.id

Full error details:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Found duplicate column(s) when inserting into hdfs://nameservice1/origin_data/events_7/data: `dt`;
    at org.apache.spark.sql.util.SchemaUtils$.checkColumnNameDuplication(SchemaUtils.scala:85)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:65)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
    at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
    at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:656)
    at com.tcl.kudu.crumb_applet$.main(crumb_applet.scala:476)
    at com.tcl.kudu.crumb_applet.main(crumb_applet.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:926)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:935)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

2. Resolution

The outer query projects the dt column twice: `g.dt` appears both at the start of the select list and again at the end (`... , index, g.dt`). Spark checks the output schema for duplicate column names (in `SchemaUtils.checkColumnNameDuplication`, as the first frame of the stack trace shows) before writing to HDFS, so the insert fails. Deleting one of the two `g.dt` occurrences fixed the job.
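The corrected select list, with the trailing duplicate `, g.dt` dropped, would look like this (a sketch; the inner union-all subquery is elided as `...` and is unchanged from the original):

```sql
select g.dt, frequent, wk, hr, user_id, k.`$name` as user_name, os, manufacturer,
       page_name, page_url,
       regexp_replace(button_name, '\\n|\\r|\\t', '') as button_name,
       button_type, first_visit_time, last_visit_time, pv, session_cnt, page_cnt,
       session_dur, total_dur, load_dur, max_load_dur, min_load_dur,
       search_content, search_cnt, max_search_dur, min_search_dur,
       total_search_dur, max_search_cnt, page_visit_dur, buy_time, error_reason,
       type, uv, father, son, index   -- trailing ", g.dt" removed
from ( ... ) g
left join users k on g.user_id = k.id
```

If both copies of the column were actually needed downstream, aliasing one of them (e.g. `g.dt as dt2`) would also satisfy the duplicate-name check, since the check compares output column names, not source expressions.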
