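If you have not installed the HadoopIntellijPlugin plugin yet, see the previous article, "Setting Up a Hadoop Development Environment in IntelliJ IDEA (Part 1)". With the plugin installed, the next step is to import Hadoop's dependency JARs, which can be found under Hadoop's share/hadoop directory. The classic WordCount program is used here as the demonstration.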

1. Create a new Maven project

Enter a GroupId and an ArtifactId, then click Next --> Finish

2. Create a new class

Enter org.apache.hadoop.examples.WordCount as the name

Copy the following code into it

package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: splits each input line into tokens and emits (word, 1) per token.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sums the counts emitted for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Lets a job submitted from one OS (e.g. Windows) run on a cluster with another.
        conf.set("mapreduce.app-submission.cross-platform", "true");

        // To read the paths from the command line instead, use:
        // String[] otherArgs = (new GenericOptionsParser(conf, args)).getRemainingArgs();
        String[] otherArgs = new String[] { "/user/hadoop/input", "/user/hadoop/output" };
        if (otherArgs.length < 2) {
            System.err.println("Usage: wordcount <in> [<in>...] <out>");
            System.exit(2);
        }

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Delete the output directory if it already exists, so the job can be rerun.
        Path outputPath = new Path(otherArgs[1]);
        outputPath.getFileSystem(conf).delete(outputPath, true);

        // Every argument except the last is an input path; the last is the output path.
        for (int i = 0; i < otherArgs.length - 1; ++i) {
            FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
        }
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
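
To get a feel for what TokenizerMapper and IntSumReducer compute before touching a cluster, here is a minimal in-memory sketch in plain Java. It is not Hadoop code: the class name LocalWordCountSketch and the sample lines are made up for illustration, and a TreeMap stands in for the shuffle/sort step (for plain ASCII words its ordering matches the sorted keys a reducer would see).

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical helper for illustration only; it does not use any Hadoop classes.
public class LocalWordCountSketch {

    // "Map" step: tokenize each line on whitespace, as TokenizerMapper does.
    // "Reduce" step: add 1 to the running total of each word, as IntSumReducer does.
    static Map<String, Integer> countWords(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the files in the input directory.
        Map<String, Integer> counts = countWords(Arrays.asList("hello hadoop", "hello idea"));
        // Print word<TAB>count, one pair per line.
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

Running main prints hadoop 1, hello 2 and idea 1, one tab-separated pair per line. Comparing a small input against this sketch is a quick way to sanity-check the real job's result.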

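Note this line in WordCount's main method

String[] otherArgs = new String[] { "/user/hadoop/input", "/user/hadoop/output" };

The first element is the input directory (put the files whose words you want to count in it; you can create a file yourself and write a few words into it), and the second is the output directory (it must not exist before the job runs). Change both to your own directories. Each one can be given as a full hdfs:// URI or, because fs.defaultFS is set in the configuration files, as a path without the hdfs://host:port prefix, which is then resolved against fs.defaultFS.
For example, below is my file structure; suppose my IP is 192.168.xxx.123

Then my two arguments can be

String[] otherArgs = new String[] { "hdfs://192.168.xxx.123:9000/user/hadoop/input", "hdfs://192.168.xxx.123:9000/user/hadoop/output" };

or the shorter form without the prefix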

String[] otherArgs = new String[] { "/user/hadoop/input", "/user/hadoop/output" };
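
The two forms point at the same place because a path without a scheme borrows the scheme and authority from fs.defaultFS. The sketch below imitates that with java.net.URI only; it is not Hadoop's own resolution code, and the host name namenode.example and the class name DefaultFsResolutionSketch are assumptions for illustration.

```java
import java.net.URI;

// Illustration with java.net.URI; Hadoop's Path/FileSystem classes are not used here.
public class DefaultFsResolutionSketch {

    // Resolve a path against the default file system URI.
    // A path that already carries a scheme (hdfs://...) is returned unchanged.
    static URI qualify(String defaultFs, String path) {
        String base = defaultFs.endsWith("/") ? defaultFs : defaultFs + "/";
        return URI.create(base).resolve(path);
    }

    public static void main(String[] args) {
        // Hypothetical fs.defaultFS value; use the one from your core-site.xml.
        String defaultFs = "hdfs://namenode.example:9000";

        System.out.println(qualify(defaultFs, "/user/hadoop/input"));
        System.out.println(qualify(defaultFs, "hdfs://namenode.example:9000/user/hadoop/output"));
    }
}
```

Both calls print a fully qualified hdfs://namenode.example:9000/user/hadoop/... URI, which is why either form works in otherArgs as long as fs.defaultFS points at the right NameNode.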

If you do not want to delete the output folder by hand before every run, add the following code (the code given above already includes it)

Path outputPath = new Path(otherArgs[1]);
outputPath.getFileSystem(conf).delete(outputPath, true);

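3. Import the dependency JARs

After pasting the code you will see many errors, because the required dependency JARs have not been imported yet
Click File --> Project Structure

Select Modules, select your project, then click Dependencies, the + sign on the right, and JARs or directories

Add the folders shown in the figure below; they are in your hadoop/share/hadoop directory

as well as the common/lib directory

After adding them it looks like this

Then choose Artifacts --> JAR --> From modules with dependencies

For Module choose the one just configured; for Main Class choose org.apache.hadoop.examples.WordCount

Click OK

Check Include in project build and click OK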

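4. Configuration files

Copy these five files into the project's resources directory: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml and log4j.properties. The first four are your Hadoop configuration files; log4j.properties configures the program's log output, and without it you will not see error messages
My directory structure is as follows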
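
Hadoop looks these files up on the classpath, so a quick way to confirm that the resources directory is being picked up is to ask the class loader for them. This is a small sketch under that assumption; the class name ClasspathConfigCheck is made up, and it uses only the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; run it from inside the same project/module.
public class ClasspathConfigCheck {

    // Returns the names that cannot be found at the root of the classpath.
    static List<String> missingFromClasspath(String... names) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathConfigCheck.class.getClassLoader();
        }
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingFromClasspath(
                "core-site.xml", "hdfs-site.xml", "mapred-site.xml",
                "yarn-site.xml", "log4j.properties");
        if (missing.isEmpty()) {
            System.out.println("All five files are on the classpath");
        } else {
            System.out.println("Not found on the classpath: " + missing);
        }
    }
}
```

If a file is reported as missing, check that it sits directly in the resources directory and that the directory is marked as a resources root in the IDE.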

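5. Run and test

Now run WordCount; remember to start the Hadoop cluster first
Right-click WordCount --> Run 'WordCount.main()'

If the following error appears, the Java version in the project's compiler settings is wrong; check the Java compile version configured for the project and the environment

For a fix, see this article: https://blog.csdn.net/qq_22076345/article/details/82392236
The log output appears in the console at the bottom; wait for the run to finish

Right-click the folder and choose Refresh. A new output folder appears containing two files; the part-r-00000 file holds the word counts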
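
With the default text output, each line of part-r-00000 is a word, a tab, and its count. If you download the file and want to work with the numbers, a parser along these lines is enough. It is a sketch only: the class name PartFileParserSketch and the sample lines are made up, and it assumes the default tab separator has not been changed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that parses word<TAB>count lines from a local copy of the output.
public class PartFileParserSketch {

    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps the file's order
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("Expected word<TAB>count but got: " + line);
            }
            counts.put(line.substring(0, tab), Integer.parseInt(line.substring(tab + 1).trim()));
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // With an argument, read a local copy of part-r-00000; otherwise use made-up sample lines.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("hadoop\t1", "hello\t2", "idea\t1");
        parse(lines).forEach((word, count) -> System.out.println(word + " -> " + count));
    }
}
```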

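On the 8088 page (the YARN ResourceManager web UI) you can also see that the WordCount job just submitted to the cluster completed successfully

You can now use IDEA to develop and debug Hadoop programs!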
