A piece of advice up front: if you are using a 1.x version of Hadoop, I strongly suggest installing exactly the versions given in the blog post linked under item 1 below (single-node or pseudo-distributed both work). The main reason is that a mismatch between the Hadoop version and the Eclipse plugin version causes many problems that are troublesome to resolve.

First, the articles I consulted while setting up and running this Hadoop example:

1. Setting up a Hadoop environment in Eclipse:

http://www.cnblogs.com/xia520pi/archive/2012/05/20/2510723.html

2. Problems encountered while running Hadoop:

First problem: the job fails on XX.XX.XXX.WordCount$TokenizerMapper (typically the mapper class cannot be found at runtime). This is mainly caused by the plugin version mismatch, and I have not solved it head-on; I am still on 1.2.1, which ships with source. I read up on it and even hand-built an eclipse-plugin from the Hadoop source, following the articles on building the plugin from source, but the resulting jar still did not work. A way around the problem is described in http://www.cnblogs.com/spork/archive/2010/04/21/1717592.html (a series of several well-written posts; reading them all also gives a good overview of how Hadoop executes a job). The eventual workaround simulates the way Hadoop packages local class files into a jar, which is exactly what the EJob helper in the complete example below does.

With that resolved, according to the articles above the example should already run, but I hit another problem:

Problem two:

Running in Eclipse throws org.apache.hadoop.mapreduce.lib.input.InvalidInputException.

When executing WordCount, the configured input file was reported as not found, even though it was plainly there; I tried both relative and absolute paths, to no avail. Then it occurred to me that when Hadoop runs from the command line it uploads local files into HDFS, whereas my runs kept failing on a local file lookup, so the program should reference the remote (HDFS) address instead. Inspecting the configuration of a successfully executed Hadoop job, the file address looked like this:

mapred.output.dir = hdfs://172.16.236.11:9000/user/libin/output/wordcount-ec

So I changed the local program accordingly:

    FileInputFormat.addInputPath(job, new Path("hdfs://172.16.236.11:9000"
            + File.separator + otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path("hdfs://172.16.236.11:9000"
            + File.separator + otherArgs[1]));

With this change, code launched locally executes against the Hadoop server side.
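An alternative to hardcoding the URI into every Path is to set the default filesystem on the Configuration; a minimal sketch under the assumption of the same namenode address, after which bare paths like /user/libin/input/libin resolve against HDFS instead of the local disk:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class DefaultFsDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // In Hadoop 1.x the default filesystem is configured via
        // fs.default.name; with this set, paths without a scheme
        // resolve against HDFS rather than the local filesystem.
        conf.set("fs.default.name", "hdfs://172.16.236.11:9000");
        Job job = new Job(conf, "default-fs-demo"); // hypothetical job name
        FileInputFormat.addInputPath(job, new Path("/user/libin/input/libin"));
    }
}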

One remaining nuisance: every test run requires uploading the local input file to the Hadoop server first, which is a bit tedious. A workaround is to run the equivalent of hadoop fs -put programmatically, right before addInputPath, so the local file is pushed to HDFS automatically instead of by hand each time; see the sketch after the following link, and also:

http://younglibin.iteye.com/admin/blogs/1925109
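A minimal sketch of that idea using Hadoop's FileSystem API; the namenode address matches the job below, while the local path is a placeholder for this setup:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUpload {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Connect to the same namenode the WordCount job talks to.
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://172.16.236.11:9000"), conf);
        // Programmatic equivalent of `hadoop fs -put <local> <hdfs>`.
        fs.copyFromLocalFile(new Path("/home/libin/input/libin"),
                new Path("/user/libin/input/libin"));
        fs.close();
    }
}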

Here is a complete example program:

package com.younglibin.hadoop.test;

import java.io.File;
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

import com.younglibin.hadoop.EJob;

public class WordCount {

    public static class TokenizerMapper extends
            Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends
            Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        // Package the compiled classes under bin/ into a temp jar,
        // mimicking what `hadoop jar` does (see EJob below).
        File jarFile = EJob.createTempJar("bin");
        EJob.addClasspath("/home/libin/software/hadoop/hadoop-1.2.1/conf");
        ClassLoader classLoader = EJob.getClassLoader();
        Thread.currentThread().setContextClassLoader(classLoader);

        Configuration conf = new Configuration();
        conf.set("mapred.job.tracker", "172.16.236.11:9001");
        // Hardcode the HDFS input/output dirs for running inside Eclipse.
        args = new String[] { "/user/libin/input/libin",
                "/user/libin/output/wordcount-ec" };
        String[] otherArgs = new GenericOptionsParser(conf, args)
                .getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }

        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        // Point the job at the temp jar so the TaskTrackers can load our classes.
        ((JobConf) job.getConfiguration()).setJar(jarFile.toString());
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // NB: File.separator happens to be "/" on Linux; a literal "/"
        // would be more portable in an HDFS URI.
        FileInputFormat.addInputPath(job, new Path("hdfs://172.16.236.11:9000"
                + File.separator + otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(
                "hdfs://172.16.236.11:9000" + File.separator + otherArgs[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
And the EJob helper class it depends on (judging by the Apache license header, the jar-handling code is borrowed from Hadoop's own sources):

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package com.younglibin.hadoop;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.lang.reflect.Array;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class EJob {

    private static ArrayList<URL> classPath = new ArrayList<URL>();

    /** Unpack a jar file into a directory. */
    public static void unJar(File jarFile, File toDir) throws IOException {
        JarFile jar = new JarFile(jarFile);
        try {
            Enumeration entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = (JarEntry) entries.nextElement();
                if (!entry.isDirectory()) {
                    InputStream in = jar.getInputStream(entry);
                    try {
                        File file = new File(toDir, entry.getName());
                        if (!file.getParentFile().mkdirs()) {
                            if (!file.getParentFile().isDirectory()) {
                                throw new IOException("Mkdirs failed to create "
                                        + file.getParentFile().toString());
                            }
                        }
                        OutputStream out = new FileOutputStream(file);
                        try {
                            byte[] buffer = new byte[8192];
                            int i;
                            while ((i = in.read(buffer)) != -1) {
                                out.write(buffer, 0, i);
                            }
                        } finally {
                            out.close();
                        }
                    } finally {
                        in.close();
                    }
                }
            }
        } finally {
            jar.close();
        }
    }

    /**
     * Run a Hadoop job jar. If the main class is not in the jar's manifest,
     * then it must be provided on the command line.
     */
    public static void runJar(String[] args) throws Throwable {
        String usage = "jarFile [mainClass] args...";
        if (args.length < 1) {
            System.err.println(usage);
            System.exit(-1);
        }
        int firstArg = 0;
        String fileName = args[firstArg++];
        File file = new File(fileName);
        String mainClassName = null;
        JarFile jarFile;
        try {
            jarFile = new JarFile(fileName);
        } catch (IOException io) {
            throw new IOException("Error opening job jar: " + fileName)
                    .initCause(io);
        }
        Manifest manifest = jarFile.getManifest();
        if (manifest != null) {
            mainClassName = manifest.getMainAttributes().getValue("Main-Class");
        }
        jarFile.close();
        if (mainClassName == null) {
            if (args.length < 2) {
                System.err.println(usage);
                System.exit(-1);
            }
            mainClassName = args[firstArg++];
        }
        mainClassName = mainClassName.replaceAll("/", ".");
        File tmpDir = new File(System.getProperty("java.io.tmpdir"));
        tmpDir.mkdirs();
        if (!tmpDir.isDirectory()) {
            System.err.println("Mkdirs failed to create " + tmpDir);
            System.exit(-1);
        }
        final File workDir = File.createTempFile("hadoop-unjar", "", tmpDir);
        workDir.delete();
        workDir.mkdirs();
        if (!workDir.isDirectory()) {
            System.err.println("Mkdirs failed to create " + workDir);
            System.exit(-1);
        }
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                try {
                    fullyDelete(workDir);
                } catch (IOException e) {
                }
            }
        });
        unJar(file, workDir);
        classPath.add(new File(workDir + "/").toURL());
        classPath.add(file.toURL());
        classPath.add(new File(workDir, "classes/").toURL());
        File[] libs = new File(workDir, "lib").listFiles();
        if (libs != null) {
            for (int i = 0; i < libs.length; i++) {
                classPath.add(libs[i].toURL());
            }
        }
        ClassLoader loader = new URLClassLoader(classPath.toArray(new URL[0]));
        Thread.currentThread().setContextClassLoader(loader);
        Class<?> mainClass = Class.forName(mainClassName, true, loader);
        Method main = mainClass.getMethod("main", new Class[] { Array
                .newInstance(String.class, 0).getClass() });
        String[] newArgs = Arrays.asList(args).subList(firstArg, args.length)
                .toArray(new String[0]);
        try {
            main.invoke(null, new Object[] { newArgs });
        } catch (InvocationTargetException e) {
            throw e.getTargetException();
        }
    }

    /**
     * Delete a directory and all its contents. If we return false, the
     * directory may be partially-deleted.
     */
    public static boolean fullyDelete(File dir) throws IOException {
        File contents[] = dir.listFiles();
        if (contents != null) {
            for (int i = 0; i < contents.length; i++) {
                if (contents[i].isFile()) {
                    if (!contents[i].delete()) {
                        return false;
                    }
                } else {
                    // try deleting the directory
                    // this might be a symlink
                    boolean b = false;
                    b = contents[i].delete();
                    if (b) {
                        // this was indeed a symlink or an empty directory
                        continue;
                    }
                    // if not an empty directory or symlink let
                    // fullyDelete handle it.
                    if (!fullyDelete(contents[i])) {
                        return false;
                    }
                }
            }
        }
        return dir.delete();
    }

    /**
     * Add a directory or file to classpath.
     *
     * @param component
     */
    public static void addClasspath(String component) {
        if ((component != null) && (component.length() > 0)) {
            try {
                File f = new File(component);
                if (f.exists()) {
                    URL key = f.getCanonicalFile().toURL();
                    if (!classPath.contains(key)) {
                        classPath.add(key);
                    }
                }
            } catch (IOException e) {
            }
        }
    }

    /**
     * Add default classpath listed in bin/hadoop bash.
     *
     * @param hadoopHome
     */
    public static void addDefaultClasspath(String hadoopHome) {
        // Classpath initially contains conf dir.
        addClasspath(hadoopHome + "/conf");
        // For developers, add Hadoop classes to classpath.
        addClasspath(hadoopHome + "/build/classes");
        if (new File(hadoopHome + "/build/webapps").exists()) {
            addClasspath(hadoopHome + "/build");
        }
        addClasspath(hadoopHome + "/build/test/classes");
        addClasspath(hadoopHome + "/build/tools");
        // For releases, add core hadoop jar & webapps to classpath.
        if (new File(hadoopHome + "/webapps").exists()) {
            addClasspath(hadoopHome);
        }
        addJarsInDir(hadoopHome);
        addJarsInDir(hadoopHome + "/build");
        // Add libs to classpath.
        addJarsInDir(hadoopHome + "/lib");
        addJarsInDir(hadoopHome + "/lib/jsp-2.1");
        addJarsInDir(hadoopHome + "/build/ivy/lib/Hadoop/common");
    }

    /**
     * Add all jars in directory to classpath, sub-directory is excluded.
     *
     * @param dirPath
     */
    public static void addJarsInDir(String dirPath) {
        File dir = new File(dirPath);
        if (!dir.exists()) {
            return;
        }
        File[] files = dir.listFiles();
        if (files == null) {
            return;
        }
        for (int i = 0; i < files.length; i++) {
            if (files[i].isDirectory()) {
                continue;
            } else {
                addClasspath(files[i].getAbsolutePath());
            }
        }
    }

    /**
     * Create a temp jar file in "java.io.tmpdir".
     *
     * @param root
     * @return
     * @throws IOException
     */
    public static File createTempJar(String root) throws IOException {
        if (!new File(root).exists()) {
            return null;
        }
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
        final File jarFile = File.createTempFile("EJob-", ".jar", new File(
                System.getProperty("java.io.tmpdir")));
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                jarFile.delete();
            }
        });
        JarOutputStream out = new JarOutputStream(
                new FileOutputStream(jarFile), manifest);
        createTempJarInner(out, new File(root), "");
        out.flush();
        out.close();
        return jarFile;
    }

    private static void createTempJarInner(JarOutputStream out, File f,
            String base) throws IOException {
        if (f.isDirectory()) {
            File[] fl = f.listFiles();
            if (base.length() > 0) {
                base = base + "/";
            }
            for (int i = 0; i < fl.length; i++) {
                createTempJarInner(out, fl[i], base + fl[i].getName());
            }
        } else {
            out.putNextEntry(new JarEntry(base));
            FileInputStream in = new FileInputStream(f);
            byte[] buffer = new byte[1024];
            int n = in.read(buffer);
            while (n != -1) {
                out.write(buffer, 0, n);
                n = in.read(buffer);
            }
            in.close();
        }
    }

    /**
     * Return a classloader based on user-specified classpath and parent
     * classloader.
     *
     * @return
     */
    public static ClassLoader getClassLoader() {
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        if (parent == null) {
            parent = EJob.class.getClassLoader();
        }
        if (parent == null) {
            parent = ClassLoader.getSystemClassLoader();
        }
        return new URLClassLoader(classPath.toArray(new URL[0]), parent);
    }
}

The part of this code that is confusing at first is how the parameters get passed:

Briefly: the map input is determined by the FileInputFormat configured on the job, so the value that arrives in map() has already been produced by that input format. When the framework calls createRecordReader, the concrete subclass selected by the job configuration does the work; here that is TextInputFormat, which feeds map() one line of text per call. A sketch making this explicit follows.
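A hedged illustration: TextInputFormat is already the default in the new mapreduce API, so this call is normally omitted; the job name here is made up.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class InputFormatDemo {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "input-format-demo");
        // TextInputFormat.createRecordReader() returns a LineRecordReader,
        // which is what actually turns the input file into map() calls:
        //   key   = LongWritable, byte offset of the line within the split
        //   value = Text, the content of that line
        job.setInputFormatClass(TextInputFormat.class);
    }
}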
