Source: http://stackoverflow.com/questions/11130145/hadoop-multipleinputs-fails-with-classcastexception

Following up on my comment, the Javadoc for TaggedInputSplit confirms that you are probably wrongly casting the input split to a FileSplit:

/**
 * An {@link InputSplit} that tags another InputSplit with extra data for use
 * by {@link DelegatingInputFormat}s and {@link DelegatingMapper}s.
 */

My guess is your setup method looks something like this:

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    FileSplit split = (FileSplit) context.getInputSplit();
}
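
The failure mode can be reproduced without Hadoop. The sketch below uses hypothetical stand-in classes (Split, PlainFileSplit and TaggedSplit are invented for illustration and are not Hadoop types) to show why casting a wrapper directly to the type it wraps throws a ClassCastException:

```java
public class CastFailureDemo {
    // Hypothetical stand-ins for illustration only; these are not Hadoop classes.
    public abstract static class Split {}

    // Plays the role of FileSplit.
    public static class PlainFileSplit extends Split {
        final String path;
        public PlainFileSplit(String path) { this.path = path; }
    }

    // Plays the role of TaggedInputSplit: it wraps another split
    // but is not itself a PlainFileSplit.
    public static class TaggedSplit extends Split {
        final Split inner;
        public TaggedSplit(Split inner) { this.inner = inner; }
    }

    // Returns true when a direct cast to PlainFileSplit fails.
    public static boolean directCastFails(Split split) {
        try {
            PlainFileSplit ignored = (PlainFileSplit) split;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Split plain = new PlainFileSplit("/data/a.txt");
        Split tagged = new TaggedSplit(plain);
        System.out.println(directCastFails(plain));  // prints "false"
        System.out.println(directCastFails(tagged)); // prints "true"
    }
}
```

The wrapper and the wrapped type only share a common parent, so the cast compiles but fails at runtime, which matches the ClassCastException in the question title.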

Unfortunately TaggedInputSplit is not publicly visible, so you can't easily do an instanceof check, followed by a cast and a call to TaggedInputSplit.getInputSplit() to get the actual underlying FileSplit. So you'll need to either update the source yourself and re-compile and deploy, post a JIRA ticket asking for this to be fixed in a future version (if it hasn't already been actioned in 2+), or perform some nasty, nasty reflection hackery to get to the underlying InputSplit.

This is completely untested:

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    InputSplit split = context.getInputSplit();
    Class<? extends InputSplit> splitClass = split.getClass();

    FileSplit fileSplit = null;
    if (splitClass.equals(FileSplit.class)) {
        fileSplit = (FileSplit) split;
    } else if (splitClass.getName().equals(
            "org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit")) {
        // begin reflection hackery...
        try {
            Method getInputSplitMethod = splitClass.getDeclaredMethod("getInputSplit");
            getInputSplitMethod.setAccessible(true);
            fileSplit = (FileSplit) getInputSplitMethod.invoke(split);
        } catch (Exception e) {
            // wrap and re-throw error
            throw new IOException(e);
        }
        // end reflection hackery
    }
}

Reflection Hackery Explained:

Because TaggedInputSplit is declared with package-private (default) scope, it is not visible to classes outside the org.apache.hadoop.mapreduce.lib.input package, and therefore you cannot reference that class in your setup method. To get around this, we perform a number of reflection-based operations:

  1. Inspecting the class name, we can test for the type TaggedInputSplit using its fully qualified name:

    splitClass.getName().equals("org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit")

  2. We know we want to call the TaggedInputSplit.getInputSplit() method to recover the wrapped input split, so we use the Class.getDeclaredMethod(..) reflection method to acquire a reference to the method:

    Method getInputSplitMethod = splitClass.getDeclaredMethod("getInputSplit");

  3. The class still isn't publicly visible, so we call setAccessible(..) on the method to suppress Java's access checks; without it, invoke(..) would throw an IllegalAccessException:

    getInputSplitMethod.setAccessible(true);

  4. Finally we invoke the method on the reference to the input split and cast the result to a FileSplit (optimistically hoping it's an instance of this type!):

    fileSplit = (FileSplit) getInputSplitMethod.invoke(split);
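
The four steps above can also be tried in isolation. The following is a minimal, self-contained sketch that runs without Hadoop on the classpath; HiddenWrapper, getInner, wrap and unwrap are hypothetical names made up for this example, with HiddenWrapper standing in for the non-public TaggedInputSplit:

```java
import java.lang.reflect.Method;

public class ReflectionUnwrapDemo {
    // Hypothetical stand-in for a non-public wrapper class such as
    // TaggedInputSplit; it is not part of Hadoop.
    private static class HiddenWrapper {
        private final Object inner;
        HiddenWrapper(Object inner) { this.inner = inner; }
        public Object getInner() { return inner; }
    }

    // Factory so callers can obtain a wrapper without naming its type.
    public static Object wrap(Object inner) {
        return new HiddenWrapper(inner);
    }

    // Recovers the wrapped value using only the class name and reflection.
    // unwrap deliberately avoids referring to HiddenWrapper as a type,
    // mimicking code that cannot see the class at compile time.
    public static Object unwrap(Object candidate) {
        Class<?> cls = candidate.getClass();
        String hiddenName = ReflectionUnwrapDemo.class.getName() + "$HiddenWrapper";

        // Step 1: identify the wrapper by its fully qualified name.
        if (!cls.getName().equals(hiddenName)) {
            return candidate; // not wrapped, hand it back unchanged
        }
        try {
            // Step 2: look up the accessor method by name.
            Method getter = cls.getDeclaredMethod("getInner");
            // Step 3: suppress Java access checks for this Method object.
            getter.setAccessible(true);
            // Step 4: invoke it on the wrapper to get the inner object.
            return getter.invoke(candidate);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not unwrap " + hiddenName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(unwrap(wrap("part-00000"))); // prints "part-00000"
        System.out.println(unwrap("plain"));            // prints "plain"
    }
}
```

Matching on the class name as a string keeps the code compiling when the wrapper type cannot be referenced, at the cost of breaking silently if the class is ever renamed or moved.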

Reposted from: https://www.cnblogs.com/sunxucool/p/3727200.html
