HBase Java API (大章鱼 Edition)

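Please note before reading:
The environment used in this guide is provided by the 大章鱼 big data learning platform. In any other environment, the jar packages and program code may not work as written. To run the examples on a local virtual machine, follow 分布式数据应用 (Distributed Data Applications) instead.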

Objectives

1. Learn the basic syntax of HBase commands.
2. Understand how HBase development works.
3. Learn how to use the HBase Java API.

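Task Content

Use the Java API to perform basic operations on HBase tables. The task has four parts:

1. Write Java code that creates an HBase table.

2. Write Java code that deletes an HBase table.

3. Write Java code that writes data to an HBase table.

4. Write Java code that reads data from an HBase table.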

Steps

1. First check whether the Hadoop processes are already running. If they are not, change to the /apps/hadoop/sbin directory and start Hadoop.

    jps
    cd /apps/hadoop/sbin
    ./start-all.sh

Once the Hadoop processes are up, change to the HBase bin directory and start the HBase service.

    cd /apps/hbase/bin/
    ./start-hbase.sh
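
The jps check above can also be scripted. The following is a minimal sketch, not part of the original lab files: it takes the text printed by jps and reports which expected daemons are absent. The class name DaemonCheck, the method missingDaemons, and the daemon names in the sample are assumptions made for illustration; the names to expect depend on the Hadoop and HBase versions installed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DaemonCheck {
    /** Returns the expected daemon names that do not appear in the given jps output. */
    public static List<String> missingDaemons(String jpsOutput, List<String> expected) {
        Set<String> running = new HashSet<>();
        // Each jps line has the form "<pid> <main class name>".
        for (String line : jpsOutput.split("\\r?\\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                running.add(parts[1]);
            }
        }
        List<String> missing = new ArrayList<>();
        for (String name : expected) {
            if (!running.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical jps output; the daemon names are assumptions, not taken from the lab.
        String sample = "2401 NameNode\n2530 DataNode\n3011 Jps\n";
        List<String> expected = Arrays.asList("NameNode", "DataNode", "HMaster");
        System.out.println(missingDaemons(sample, expected)); // prints [HMaster]
    }
}
```

An empty result means every expected daemon was found, so there is no need to run the start scripts again.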

2. Change to the /data/hbase2 directory, creating the hbase2 folder first if it does not exist.

    mkdir -p /data/hbase2
    cd /data/hbase2

3. Use wget to download the files under http://192.168.1.100:60000/allfiles/hbase2.

    wget http://192.168.1.100:60000/allfiles/hbase2/hbasedemolib.tar.gz
    wget http://192.168.1.100:60000/allfiles/hbase2/hbasedemo.tar.gz
    wget http://192.168.1.100:60000/allfiles/hbase2/CreateMyTable.java
    wget http://192.168.1.100:60000/allfiles/hbase2/DeleteMyTable.java
    wget http://192.168.1.100:60000/allfiles/hbase2/GetData.java
    wget http://192.168.1.100:60000/allfiles/hbase2/PutData.java

4. Extract hbasedemolib.tar.gz, located in /data/hbase2, into /data/hbase2.

    tar zxvf hbasedemolib.tar.gz

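5. Open Eclipse and create a Java project named hbasedemo.

In the hbasedemo project, create a package named myhbase.

To add the jar packages the project depends on, right-click hbasedemo and choose Import.

In the dialog that opens, select File System under General and click Next.

In the next dialog, select the hbasedemolib folder under /data/hbase2, check Create top-level folder, and click Finish.

Then select all the files in hbasedemolib, right-click, and choose Build Path => Add to Build Path. This adds every jar to the project's build path.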

6. API for creating a table
Create a class named CreateMyTable that creates an HBase table named mytb with the column family mycf.

Program code

    package myhbase;

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.MasterNotRunningException;
    import org.apache.hadoop.hbase.ZooKeeperConnectionException;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateMyTable {
        public static void main(String[] args) throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
            String tableName = "mytb";
            String columnFamily = "mycf";
            create(tableName, columnFamily);
        }

        // Client configuration pointing at the local HDFS and ZooKeeper.
        public static Configuration getConfiguration() {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
            conf.set("hbase.zookeeper.quorum", "localhost");
            return conf;
        }

        // Create the table with one column family unless it already exists.
        public static void create(String tableName, String columnFamily)
                throws MasterNotRunningException, ZooKeeperConnectionException,
                IOException {
            HBaseAdmin hBaseAdmin = new HBaseAdmin(getConfiguration());
            try {
                if (hBaseAdmin.tableExists(tableName)) {
                    System.err.println("Table exists!");
                } else {
                    HTableDescriptor tableDesc = new HTableDescriptor(tableName);
                    tableDesc.addFamily(new HColumnDescriptor(columnFamily));
                    hBaseAdmin.createTable(tableDesc);
                    System.err.println("Create Table SUCCESS!");
                }
            } finally {
                hBaseAdmin.close(); // release the connection to the cluster
            }
        }
    }

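Run the program in Eclipse: in the CreateMyTable class file, right-click and choose Run As => Run on Hadoop to submit the task to Hadoop.

Next, check the newly created mytb table in HBase. Start the hbase shell command-line mode first.

    hbase shell

Run list to show the tables currently in HBase.

    list

Run the following command to view the structure of the new table.

    describe 'mytb'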

7. API for deleting a table
Create a class named DeleteMyTable that deletes the mytb table from HBase.

Program code

    package myhbase;

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class DeleteMyTable {
        public static void main(String[] args) throws IOException {
            String tableName = "mytb";
            delete(tableName);
        }

        // Client configuration pointing at the local HDFS and ZooKeeper.
        public static Configuration getConfiguration() {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
            conf.set("hbase.zookeeper.quorum", "localhost");
            return conf;
        }

        // A table must be disabled before it can be deleted.
        public static void delete(String tableName) throws IOException {
            HBaseAdmin hAdmin = new HBaseAdmin(getConfiguration());
            try {
                if (hAdmin.tableExists(tableName)) {
                    try {
                        hAdmin.disableTable(tableName);
                        hAdmin.deleteTable(tableName);
                        System.err.println("Delete table Success");
                    } catch (IOException e) {
                        // Report the cause instead of silently discarding it.
                        System.err.println("Delete table Failed: " + e.getMessage());
                    }
                } else {
                    System.err.println("table not exists");
                }
            } finally {
                hAdmin.close(); // release the connection to the cluster
            }
        }
    }

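In the DeleteMyTable class file, right-click and choose Run As => Run on Hadoop to submit the task to Hadoop.

After the run finishes in Eclipse, check in the HBase shell whether the mytb table has been deleted.

    list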

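8. API for writing data
An e-commerce site keeps a buyer information table named buyer. Each time a new user registers, the site's backend generates a log record and writes it to HBase.
Each record holds the user ID (buyer_id), registration date (reg_date), registration IP (reg_ip), and buyer status (buyer_status; 0 means frozen, 1 means normal). The fields are specified as "\t"-delimited, although the sample rows below are shown with commas. The data is as follows:

    User ID    Registration date    Registration IP    Buyer status
    20385,2010-05-04,124.64.242.30,1
    20386,2010-05-05,117.136.0.172,1
    20387,2010-05-06,114.94.44.230,1

Write the data to the HBase buyer table using buyer_id as the row key. The buyer table must exist before the data is inserted; if it does not, create it first in the HBase shell.

    create 'buyer','reg_date'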

Create a class named PutData that writes the three records above to the buyer table.

The program code is as follows:

    package myhbase;

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.MasterNotRunningException;
    import org.apache.hadoop.hbase.ZooKeeperConnectionException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PutData {
        public static void main(String[] args) throws MasterNotRunningException,
                ZooKeeperConnectionException, IOException {
            String tableName = "buyer";
            String columnFamily = "reg_date";
            // Row key = buyer_id; column qualifier = "<reg_date>:<field name>".
            put(tableName, "20385", columnFamily, "2010-05-04:reg_ip", "124.64.242.30");
            put(tableName, "20385", columnFamily, "2010-05-04:buyer_status", "1");

            put(tableName, "20386", columnFamily, "2010-05-05:reg_ip", "117.136.0.172");
            put(tableName, "20386", columnFamily, "2010-05-05:buyer_status", "1");

            put(tableName, "20387", columnFamily, "2010-05-06:reg_ip", "114.94.44.230");
            put(tableName, "20387", columnFamily, "2010-05-06:buyer_status", "1");
        }

        // Client configuration pointing at the local HDFS and ZooKeeper.
        public static Configuration getConfiguration() {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
            conf.set("hbase.zookeeper.quorum", "localhost");
            return conf;
        }

        // Write a single cell (row, family:qualifier, value) to the table.
        public static void put(String tableName, String row, String columnFamily,
                String column, String data) throws IOException {
            HTable table = new HTable(getConfiguration(), tableName);
            try {
                Put put = new Put(Bytes.toBytes(row));
                put.add(Bytes.toBytes(columnFamily),
                        Bytes.toBytes(column),
                        Bytes.toBytes(data));
                table.put(put);
                System.err.println("SUCCESS");
            } finally {
                table.close(); // flush pending writes and release resources
            }
        }
    }

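Run the program in Eclipse: in the PutData class file, right-click and choose Run As => Run on Hadoop to submit the task to Hadoop.

After the run finishes, check the contents of the buyer table in HBase.

    scan 'buyer'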
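
PutData hard-codes each cell. As an illustration of how those (row key, qualifier, value) triples relate to the raw records, the following sketch derives them from one record line. It uses only the Java standard library and does not talk to HBase. The class BuyerCells, the nested type CellSpec, and the method toCells are hypothetical names introduced here, and accepting either a comma or a tab as the delimiter is an assumption made because the sample rows and the stated format differ.

```java
import java.util.ArrayList;
import java.util.List;

public class BuyerCells {
    /** One cell to write: row key, column qualifier, and value (hypothetical helper type). */
    public static final class CellSpec {
        public final String row;
        public final String qualifier;
        public final String value;

        CellSpec(String row, String qualifier, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override
        public String toString() {
            return row + " " + qualifier + "=" + value;
        }
    }

    /** Splits one record on commas or tabs and returns the two cells PutData writes for it. */
    public static List<CellSpec> toCells(String line) {
        // Tolerate stray spaces around the delimiter, as in "2010-05-06 ,114.94.44.230".
        String[] f = line.trim().split("\\s*[,\\t]\\s*");
        if (f.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        String row = f[0];   // buyer_id is the row key
        String date = f[1];  // reg_date becomes the qualifier prefix
        List<CellSpec> cells = new ArrayList<>();
        cells.add(new CellSpec(row, date + ":reg_ip", f[2]));
        cells.add(new CellSpec(row, date + ":buyer_status", f[3]));
        return cells;
    }

    public static void main(String[] args) {
        String[] lines = {
            "20385,2010-05-04,124.64.242.30,1",
            "20386,2010-05-05,117.136.0.172,1",
            "20387,2010-05-06 ,114.94.44.230,1"
        };
        for (String line : lines) {
            for (CellSpec c : toCells(line)) {
                System.out.println(c);
            }
        }
    }
}
```

Each CellSpec corresponds to one put(tableName, row, "reg_date", qualifier, value) call in PutData, so a loop over the parsed cells could replace the six hard-coded calls.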

9. API for querying data
Create a class named GetData that queries the HBase buyer table for the row whose rowkey is 20386.

The program code is as follows:

    package myhbase;

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class GetData {
        public static void main(String[] args) throws IOException {
            String tableName = "buyer";
            get(tableName, "20386");
        }

        // Client configuration pointing at the local HDFS and ZooKeeper.
        public static Configuration getConfiguration() {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
            conf.set("hbase.zookeeper.quorum", "localhost");
            return conf;
        }

        // Fetch one row and print the two cells written by PutData.
        public static void get(String tableName, String rowkey) throws IOException {
            HTable table = new HTable(getConfiguration(), tableName);
            try {
                Get get = new Get(Bytes.toBytes(rowkey));
                Result result = table.get(get);
                byte[] family = Bytes.toBytes("reg_date");
                byte[] value1 = result.getValue(family, Bytes.toBytes("2010-05-05:reg_ip"));
                byte[] value2 = result.getValue(family, Bytes.toBytes("2010-05-05:buyer_status"));
                System.err.println("line1:SUCCESS");
                // Bytes.toString decodes UTF-8 and returns null for a missing cell,
                // so an absent value prints as "null" instead of throwing.
                System.err.println("line2:"
                        + Bytes.toString(value1) + "\t"
                        + Bytes.toString(value2));
            } finally {
                table.close(); // release resources held by the table handle
            }
        }
    }
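
HBase stores row keys, qualifiers, and values as raw bytes, and Result.getValue returns null when the requested cell does not exist. The sketch below illustrates the string and byte conversion that the lookup depends on, using only the Java standard library. The class CellText and its methods encode and decode are hypothetical helpers introduced here, not part of the lab files or of HBase.

```java
import java.nio.charset.StandardCharsets;

public class CellText {
    /** Encodes text as UTF-8, the encoding that HBase's Bytes.toBytes(String) uses. */
    public static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes a cell value as UTF-8, returning the fallback when the cell is absent (null). */
    public static String decode(byte[] value, String fallback) {
        return value == null ? fallback : new String(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] present = encode("117.136.0.172");
        byte[] absent = null; // what a lookup yields for a cell that was never written
        System.out.println(decode(present, "<missing>")); // prints 117.136.0.172
        System.out.println(decode(absent, "<missing>"));  // prints <missing>
    }
}
```

Naming the charset explicitly keeps the bytes identical on every machine, whereas String.getBytes() with no argument uses the platform default. Checking for null before decoding avoids a NullPointerException when the row or qualifier is not present.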

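Run the program in Eclipse: in the GetData class file, right-click and choose Run As => Run on Hadoop to submit the task to Hadoop.

After the run finishes, the result appears in the Eclipse console. Given the code above and the data written by PutData, it consists of the line line1:SUCCESS followed by a line that starts with line2: and shows the values 117.136.0.172 and 1 separated by a tab.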