Kylo Troubleshooting Notes, Part 1
Post-installation checklist for Kylo/NiFi
Firewall disabled:
systemctl stop firewalld.service
Ports 8400 and 8079 open:
iptables -A INPUT -p tcp --dport 8400 -j ACCEPT
MySQL has a kylo database and a kylo user with the correct grants
spark-shell starts normally
The relevant users have permission to operate on Hive (on Sentry-enabled clusters)
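The MySQL prerequisite can be sketched as below; the password value and the '%' host wildcard are assumptions and should match whatever kylo-services is configured with:

```shell
# Create the kylo schema and grant the kylo user access to it.
# Password 'kylo' and host '%' are placeholders -- adjust for your site.
mysql -u root -p <<'SQL'
CREATE DATABASE IF NOT EXISTS kylo CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON kylo.* TO 'kylo'@'%' IDENTIFIED BY 'kylo';
FLUSH PRIVILEGES;
SQL
```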
Issues
Directories and permissions
If the directories below were not created automatically during installation, check permissions or create them manually and assign the appropriate ownership.
hdfs dfs -mkdir /user/kylo
hdfs dfs -chown kylo:kylo /user/kylo
hdfs dfs -mkdir /user/nifi
hdfs dfs -chown nifi:nifi /user/nifi
hdfs dfs -mkdir /etl
hdfs dfs -chown nifi:nifi /etl
hdfs dfs -mkdir /model.db
hdfs dfs -chown nifi:nifi /model.db
hdfs dfs -mkdir /archive
hdfs dfs -chown nifi:nifi /archive
hdfs dfs -mkdir -p /app/warehouse
hdfs dfs -chown nifi:nifi /app/warehouse
The local /tmp directory also needs the right permissions; if the /tmp/kylo-nifi/ directory cannot be read and written, errors are likely.
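A minimal sketch of making that scratch directory usable (wide-open 777 matches the spirit of the note above; tighter ownership by the nifi user is usually preferable):

```shell
# Give the NiFi service user a usable local scratch directory.
mkdir -p /tmp/kylo-nifi
chmod -R 777 /tmp/kylo-nifi
```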
Elasticsearch indexes
EsIndexException in Kylo services logs
Problem
Kylo services log contains errors similar to: org.modeshape.jcr.index.elasticsearch.EsIndexException: java.io.IOException: Not Found
Solution
Pre-create the indexes used by Kylo in Elasticsearch. Execute this script: /opt/kylo/bin/create-kylo-indexes-es.sh
The script takes 4 parameters.
Example values:
host: localhost
rest-port: 9200
num-shards: 1
num-replicas: 1
Note: num-shards and num-replicas can be set to 1 for development environments
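Assuming the four parameters are positional in the order listed above, a development invocation would look like:

```shell
# host rest-port num-shards num-replicas
/opt/kylo/bin/create-kylo-indexes-es.sh localhost 9200 1 1
```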
Spark default compression codec
Error message: UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
cp /etc/hive/conf/hive-site.xml /etc/spark/conf/hive-site.xml
# Snappy isn't working well for Spark on Cloudera
echo "spark.io.compression.codec=lz4" >> /etc/spark/conf/spark-defaults.conf
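A quick check that the setting actually landed in the file, and that the Spark client still comes up:

```shell
grep 'spark.io.compression.codec' /etc/spark/conf/spark-defaults.conf
spark-submit --version
```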
return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
hdfs dfs -mkdir /user/dladmin
hdfs dfs -chown dladmin:dladmin /user/dladmin
Default paths are mostly HDP paths
In the NiFi templates, most flows default to HDP-style paths, so they must be changed to the correct paths for the current environment; otherwise Not Found errors occur.
The script generated by Prepare Script in the data_transformation template is broken
This is not fully confirmed: the conditional in Prepare Script appears to be faulty (or the environment may be misconfigured), so the generated script ends up containing both insertInto() and partitionBy(). The current workaround is to comment out the relevant code:
if (!isPreFeed && (sparkVersion == null || sparkVersion == "1")) {
// the following line is commented out
//script = script + ".partitionBy(\"processing_dttm\")"
}
Importing data with the templates
The data-type detection in File Filter is not entirely accurate, which causes problems in later processing, so when creating a Feed pay attention to each field's real data type.
Source vs. target data types
In the NiFi flow, the data format set in the Feed is the target format, produced by ETL from the source data; the source and target data formats therefore need to match, otherwise an error is raised.
[missing screenshot of the error]
This error occurs when spark-shell fails to start; inspecting the configuration showed that principal authentication for the NiFi service had failed.
This error occurs when HiveServer2 cannot be connected to:
[root@kylo2 soft]# beeline
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Beeline version 1.1.0-cdh5.15.0 by Apache Hive
beeline>
beeline> !connect jdbc:hive2://10.88.88.120:10000/default;principal=hive/kylo1.hypers.cc@KYLO.CC
scan complete in 2ms
Connecting to jdbc:hive2://10.88.88.120:10000/default;principal=hive/kylo1.hypers.cc@KYLO.CC
18/09/30 17:10:54 [main]: ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
... 35 more
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://10.88.88.120:10000/default;principal=hive/kylo1.hypers.cc@KYLO.CC: GSS initiate failed (state=08S01,code=0)
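The "Failed to find any Kerberos tgt" message means the shell has no ticket cached for the connecting user. A sketch of obtaining one before retrying beeline (the keytab path is an assumption; the principal is taken from the JDBC URL above):

```shell
# Acquire a ticket for the hive service principal, then inspect the cache.
kinit -kt /etc/security/keytabs/hive.keytab hive/kylo1.hypers.cc@KYLO.CC
klist
```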
Hive connection failure
kylo-services.log output:
2018-10-12 13:56:54 ERROR http-nio-8420-exec-6:ConnectionPool:182 - Unable to create initial connections of pool.
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://10.88.88.120:10000/default: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:215)
at org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:163)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:190)
... 147 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
... 150 more
Solution
vim /opt/kylo/kylo-services/conf/application.properties
hive.datasource.driverClassName=org.apache.hive.jdbc.HiveDriver
hive.datasource.url=jdbc:hive2://10.88.88.120:10000/default
hive.datasource.username=hive
hive.datasource.password=hive
hive.datasource.validationQuery=show tables 'test'
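After editing application.properties, restart kylo-services so the connection pool is rebuilt (service name per a standard Kylo install; the log path may differ on your system):

```shell
service kylo-services restart
tail -f /var/log/kylo-services/kylo-services.log
```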
Issue
kylo-services.log output:
2018-10-12 14:07:17 ERROR http-nio-8420-exec-5:ThrowableMapper:43 - toResponse() caught throwable
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLInvalidAuthorizationSpecException: Could not connect: Access denied for user 'kylo'@'10.88.88.122' (using password: NO)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLInvalidAuthorizationSpecException: Could not connect: Access denied for user 'kylo'@'10.88.88.122' (using password: NO)
at org.mariadb.jdbc.internal.util.ExceptionMapper.get(ExceptionMapper.java:135)
at org.mariadb.jdbc.internal.util.ExceptionMapper.getException(ExceptionMapper.java:101)
at org.mariadb.jdbc.internal.util.ExceptionMapper.throwException(ExceptionMapper.java:91)
at org.mariadb.jdbc.Driver.connect(Driver.java:109)
at org.apache.tomcat.jdbc.pool.PooledConnection.connectUsingDriver(PooledConnection.java:307)
at org.apache.tomcat.jdbc.pool.PooledConnection.connect(PooledConnection.java:200)
at org.apache.tomcat.jdbc.pool.ConnectionPool.createConnection(ConnectionPool.java:710)
at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:644)
at org.apache.tomcat.jdbc.pool.ConnectionPool.init(ConnectionPool.java:466)
at org.apache.tomcat.jdbc.pool.ConnectionPool.(ConnectionPool.java:143)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.pCreatePool(DataSourceProxy.java:115)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.createPool(DataSourceProxy.java:102)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:126)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)
... 126 more
Caused by: org.mariadb.jdbc.internal.util.dao.QueryException: Could not connect: Access denied for user 'kylo'@'10.88.88.122' (using password: NO)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.authentication(AbstractConnectProtocol.java:557)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.handleConnectionPhases(AbstractConnectProtocol.java:499)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connect(AbstractConnectProtocol.java:384)
at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:825)
at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:469)
at org.mariadb.jdbc.Driver.connect(Driver.java:104)
... 137 more
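The "(using password: NO)" in the trace shows kylo-services connected without a password; both sides need to agree. A sketch of fixing it (the password value is a placeholder, and the property names assume Kylo's standard Spring datasource configuration):

```shell
# Allow the kylo user to connect from the kylo-services host with a password...
mysql -u root -p -e "GRANT ALL PRIVILEGES ON kylo.* TO 'kylo'@'10.88.88.122' IDENTIFIED BY 'kylo'; FLUSH PRIVILEGES;"
# ...and confirm application.properties carries the same credentials.
grep '^spring.datasource' /opt/kylo/kylo-services/conf/application.properties
```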
Visual Query
hive.datasource.driverClassName=org.apache.hive.jdbc.HiveDriver
hive.datasource.url=jdbc:hive2://10.16.4.68:10000/default;principal=hive/kylo-cdh5.cs1cloud.internal@CS1CLOUD.INTERNAL
hive.datasource.username=hive
hive.datasource.password=123456
hive.datasource.validationQuery=show tables 'test'
...
##Also Cloudera url should be /metastore instead of /hive
hive.metastore.datasource.driverClassName=org.mariadb.jdbc.Driver
hive.metastore.datasource.url=jdbc:mysql://10.16.4.68:3306/hive
#hive.metastore.datasource.url=jdbc:mysql://10.16.4.68:3306/hive
hive.metastore.datasource.username=metastore
hive.metastore.datasource.password=hive123
hive.metastore.datasource.validationQuery=SELECT 1
hive.metastore.datasource.testOnBorrow=true
The Hive principal can reuse the NiFi principal.
Give the NiFi keytab file 777 permissions.
Set Hive impersonation to false.
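The keytab note above can be sketched as follows; the keytab path is an assumption, and 777 is what the note prescribes (a narrower mode with correct ownership is normally safer):

```shell
# Make the nifi keytab readable by the services that share the principal.
chmod 777 /etc/security/keytabs/nifi.keytab
ls -l /etc/security/keytabs/nifi.keytab
```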
2018-11-23 15:30:46,289 ERROR [Timer-Driven Process Thread-2] c.t.nifi.v2.ingest.HdiMergeTable HdiMergeTable[id=0b301887-5de0-3dce-725d-b6e81225713d] Unable to execute merge doMerge for StandardFlowFileRecord[uuid=c822a690-7d3e-4a1d-8a28-d18dbc35e18d,claim=,offset=0,name=701401108440816,size=0] due to java.lang.RuntimeException: Failed to execute query; routing to failure: java.lang.RuntimeException: Failed to execute query
java.lang.RuntimeException: Failed to execute query
at com.thinkbiganalytics.ingest.TableMergeSyncSupport.doExecuteSQL(TableMergeSyncSupport.java:759)
at com.thinkbiganalytics.ingest.HdiTableMergeSyncSupport.doRolling(HdiTableMergeSyncSupport.java:97)
at com.thinkbiganalytics.nifi.v2.ingest.HdiMergeTable.onTrigger(HdiMergeTable.java:267)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1147)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Verify that Hive's hive.metastore connection settings are correct.
In the Hive logs, find the user that was never added, then create that user's directory on HDFS.
echo "spark.io.compression.codec=lz4" >> /etc/spark/conf/spark-defaults.conf
eleaseLocks start=1544077817490 end=1544077817541 duration=51 from=org.apache.hadoop.hive.ql.Driver>
2018-12-06 14:30:17,611 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-67]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
2018-12-06 14:30:17,627 INFO org.apache.hive.service.cli.operation.OperationManager: [HiveServer2-Handler-Pool: Thread-54]: Closing operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=6dc999b0-deb5-43cc-9b54-a63945f81bc3]
Issue
Compression codec com.hadoop.compression.lzo.LzoCodec not found
2019-01-11 16:28:21,995 ERROR [Timer-Driven Process Thread-7] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=10f86baf-cc95-32df-4f6a-95237b99d4ed] Failed to write to HDFS due to java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
at org.apache.hadoop.io.compress.CompressionCodecFactory.(CompressionCodecFactory.java:180)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getCompressionCodec(AbstractHadoopProcessor.java:415)
Solution:
Processor: Archive Originals
Additional Classpath Resources: /usr/lib/hadoop-lzo/lib/hadoop-lzo.jar
Processor: Upload to HDFS
Additional Classpath Resources: /usr/lib/hadoop-lzo/lib/hadoop-lzo.jar
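It is worth confirming that the jar actually exists at that path on every NiFi node before pointing the processors at it:

```shell
ls -l /usr/lib/hadoop-lzo/lib/hadoop-lzo.jar
```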
Transform Exception: An error occurred while initializing the Spark Shell.
Solution: after restarting Kylo, click Visual Query in the UI and wait patiently until kylo-spark-shell.log shows the lines below, then run the query:
...Successfully registered client.
...
INFO SparkShellApp: Started SparkShellApp in 156.081 seconds (JVM running for 156.983)
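One way to wait for that marker instead of polling the UI (log path assumed from a default Kylo install):

```shell
tail -f /var/log/kylo-spark-shell/kylo-spark-shell.log | grep -m1 'Started SparkShellApp'
```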