HBase, Ambitious but Clumsy: From the Shell to IDEA Programming, with Notes and Pitfalls

Operating HBase through the shell

Foundation

Type hbase in a terminal to see how the hbase command is used:

[root@big]# hbase

Usage: hbase [<options>] <command> [<args>]

Options:

--config DIR    Configuration direction to use. Default: ./conf

--hosts HOSTS   Override the list in 'regionservers' file

Commands:

Some commands take arguments. Pass no args or -h for usage.

(Don't you dare miss the line above!)

shell           Run the HBase shell

hbck            Run the hbase 'fsck' tool

snapshot        Create a new snapshot of a table

wal             Write-ahead-log analyzer

hfile           Store file analyzer

zkcli           Run the ZooKeeper shell

upgrade         Upgrade hbase

master          Run an HBase HMaster node

regionserver    Run an HBase HRegionServer node

zookeeper       Run a Zookeeper server

rest            Run an HBase REST server

thrift          Run the HBase Thrift server

thrift2         Run the HBase Thrift2 server

clean           Run the HBase clean up script

classpath       Dump hbase CLASSPATH

mapredcp        Dump CLASSPATH entries required by mapreduce

pe              Run PerformanceEvaluation

ltt             Run LoadTestTool

version         Print the version

CLASSNAME       Run the class named CLASSNAME

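As the hint under Commands says, some commands take arguments. No need to worry, student, take it easy: the developers have done their best to make everything easy.

Going slowly gets you there faster.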

[root@big]# hbase upgrade

usage: $bin/hbase upgrade -check [-dir DIR]|-execute

-check       Run upgrade check; looks for HFileV1  under ${hbase.rootdir}

or provided 'dir' directory.

-dir <arg>   Relative path of dir to check for HFileV1s.

-execute     Run upgrade; zk and hdfs must be up, hbase down

-h,--help    Help

Read http://hbase.apache.org/book.html#upgrade0.96 before attempting upgrade

Example usage:

Run upgrade check; looks for HFileV1s under ${hbase.rootdir}:

$ bin/hbase upgrade -check

Run the upgrade:

$ bin/hbase upgrade -execute

[root@big]# hbase shell -h

Usage: shell [OPTIONS] [SCRIPTFILE [ARGUMENTS]]

--format=OPTION                Formatter for outputting results.

Valid options are: console, html.

(Default: console)

-d | --debug                   Set DEBUG log levels.

-h | --help                    This help.

-n | --noninteractive          Do not run within an IRB session

and exit with non-zero status on

first error.

[root@big]# hbase shell

15/10/09 10:35:10 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available

HBase Shell; enter 'help<RETURN>' for list of supported commands.

Type "exit<RETURN>" to leave the HBase Shell

Version 1.0.0-cdh5.4.2, rUnknown, Tue May 19 17:07:29 PDT 2015

hbase(main):001:0>

Once you are inside the HBase shell, it offers you help yet again:

HBase Shell; enter 'help<RETURN>' for list of supported commands.

do something

Create a table

create 'table_name','cf1','cf2','cf3'

Example:

hbase(main):002:0> create 'users','user_id','address','info'

0 row(s) in 0.8740 seconds

=> Hbase::Table - users

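You do not declare column names when creating a table; you name them when you insert data. This is why HBase is so flexible.

What if you leave out the column family names as well? That does not work: the shell's own help for create asks for at least one column family specification.

List the tables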

Example:

hbase(main):003:0> list

TABLE

users

1 row(s) in 0.0080 seconds

=> ["users"]

Show the table structure

desc 't1'

describe 't1'

Example:

hbase(main):007:0> describe

ERROR: wrong number of arguments (0 for 1)

Here is some help for this command:

Describe the named table. For example:

hbase> describe 't1'

hbase> describe 'ns1:t1'

Alternatively, you can use the abbreviated 'desc' for the same thing.

hbase> desc 't1'

hbase> desc 'ns1:t1'

hbase(main):008:0> describe 'users'

Table users is ENABLED

users

COLUMN FAMILIES DESCRIPTION

{NAME => 'address', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE'

, MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', M

IN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

{NAME => 'user_id', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE'

, MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

3 row(s) in 0.0270 seconds

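Note: each column family gets its own {} block describing its attributes.

Delete a table

Note: deleting a table takes two steps. First disable the table, then drop it.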

disable 't1'

drop 't1'

Example:

hbase(main):018:0> disable 'users'

0 row(s) in 1.3070 seconds

hbase(main):019:0> drop 'users'

0 row(s) in 0.1830 seconds

hbase(main):020:0> list

TABLE

0 row(s) in 0.0060 seconds

=> []

Add data

hbase(main):001:0> put 'users','xiaoming','info:age','24'

0 row(s) in 0.2950 seconds

hbase(main):002:0> put 'users','xiaoming','info:birthday','1987-04-24'

0 row(s) in 0.0100 seconds

hbase(main):003:0> put 'users','xiaoming','info:company','Alibaba'

0 row(s) in 0.0080 seconds

hbase(main):004:0> put 'users','xiaoming','address:country','China'

0 row(s) in 0.0080 seconds

hbase(main):005:0> put 'users','xiaoming','address:province','Zhejiang'

0 row(s) in 0.0110 seconds

hbase(main):006:0> put 'users','xiaoming','address:city','HangZhou'

0 row(s) in 0.0080 seconds

hbase(main):007:0> put 'users','zhangsan','info:birthday','1999-09-06'

0 row(s) in 0.0190 seconds

hbase(main):008:0> put 'users','zhangsan','info:favourite','football'

0 row(s) in 0.0080 seconds

hbase(main):009:0> put 'users','zhangsan','info:company','Tecent'

0 row(s) in 0.0110 seconds

hbase(main):010:0> put 'users','zhangsan','info:country','China'

0 row(s) in 0.0090 seconds

hbase(main):011:0> put 'users','zhangsan','info:province','Fujian'

0 row(s) in 0.0090 seconds

hbase(main):012:0> put 'users','zhangsan','info:city','Xiamen'

0 row(s) in 0.0270 seconds

hbase(main):013:0> put 'users','zhangsan','info:town','daxuecheng'

0 row(s) in 0.0080 seconds

Query

Key→Value

hbase(main):014:0> get 'users','xiaoming'

COLUMN                              CELL

address:city                       timestamp=1444360941172, value=HangZhou

address:country                    timestamp=1444360876501, value=China

address:province                   timestamp=1444360906626, value=Zhejiang

info:age                           timestamp=1444360701107, value=24

info:birthday                      timestamp=1444360779743, value=1987-04-24

info:company                       timestamp=1444360818861, value=Alibaba

6 row(s) in 0.0300 seconds

hbase(main):015:0> get 'users','zhangsan'

COLUMN                              CELL

info:birthday                      timestamp=1444361085631, value=1999-09-06

info:city                          timestamp=1444361275813, value=Xiamen

info:company                       timestamp=1444361177871, value=Tecent

info:country                       timestamp=1444361203037, value=China

info:favourite                     timestamp=1444361121247, value=football

info:province                      timestamp=1444361245038, value=Fujian

info:town                          timestamp=1444361343392, value=daxuecheng

7 row(s) in 0.0150 seconds

Although the two rows live in the same table, their columns can be completely different.

hbase(main):016:0> get 'users','xiaoming','address:city'

COLUMN                              CELL

address:city                       timestamp=1444360941172, value=HangZhou

1 row(s) in 0.0220 seconds

Update data

Change xiaoming's city value and see what happens.

hbase(main):018:0>  put 'users','xiaoming','address:city','Zhoushan'

0 row(s) in 0.0120 seconds

hbase(main):019:0> get 'users','xiaoming','address:city'

COLUMN                              CELL

address:city                       timestamp=1444361861897, value=Zhoushan

1 row(s) in 0.0110 seconds

What if you want to see the data under every timestamp?

View versioned data

The student asked for several versions but saw only one.

hbase(main):008:0> get 'users','xiaoming',{COLUMN => 'address:city',VERSIONS => 3}

COLUMN                              CELL

address:city                       timestamp=1444361861897, value=Zhoushan

1 row(s) in 0.0080 seconds

So the student looked at the table description:

hbase(main):007:0> desc 'users'

Table users is ENABLED

users

COLUMN FAMILIES DESCRIPTION

{NAME => 'address', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE'

, MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', M

IN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

{NAME => 'user_id', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE'

, MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

3 row(s) in 0.1170 seconds

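So this is where the problem lies: every column family has VERSIONS => '1', so only the latest version of each cell is kept.

Note that the table description consists of several {} blocks, exactly one per column family. Reasoning backwards, the student realized that these attributes can therefore be customized when the table is created.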
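The effect of VERSIONS can be sketched with a toy model in plain Java. This is not the HBase API: the class VersionedCellSketch and its methods are hypothetical names, and the sketch assumes a cell simply keeps its newest maxVersions values. Real HBase cleans up surplus versions lazily, while the sketch drops them immediately.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (hypothetical, not HBase code) of one cell in a family with a VERSIONS limit. */
class VersionedCellSketch {
    private final int maxVersions;
    // Sorted newest timestamp first, like the order in which a get lists versions.
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());

    VersionedCellSketch(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // Drop the oldest values once the family limit is exceeded.
        while (versions.size() > maxVersions) {
            versions.pollLastEntry();
        }
    }

    /** Returns up to 'requested' values, newest first. */
    List<String> get(int requested) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, String> e : versions.entrySet()) {
            if (out.size() == requested) {
                break;
            }
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCellSketch oneVersion = new VersionedCellSketch(1);
        oneVersion.put(1444360941172L, "HangZhou");
        oneVersion.put(1444361861897L, "Zhoushan");
        System.out.println(oneVersion.get(3)); // prints [Zhoushan]

        VersionedCellSketch threeVersions = new VersionedCellSketch(3);
        threeVersions.put(1444360941172L, "HangZhou");
        threeVersions.put(1444361861897L, "Zhoushan");
        System.out.println(threeVersions.get(3)); // prints [Zhoushan, HangZhou]
    }
}
```

With a limit of 1, asking for 3 versions still returns a single value, which matches what the student saw in the shell.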

Scan all data in the table

hbase(main):009:0> scan 'users'

ROW                                 COLUMN+CELL

xiaoming                           column=address:city, timestamp=1444361861897, value=Zhoushan

xiaoming                           column=address:country, timestamp=1444360876501, value=China

xiaoming                          column=address:province, timestamp=1444360906626, value=Zhejiang

xiaoming                           column=info:age, timestamp=1444360701107, value=24

xiaoming                           column=info:birthday, timestamp=1444360779743, value=1987-04-24

xiaoming                           column=info:company, timestamp=1444360818861, value=Alibaba

zhangsan                           column=info:birthday, timestamp=1444361085631, value=1999-09-06

zhangsan                           column=info:city, timestamp=1444361275813, value=Xiamen

zhangsan                           column=info:company, timestamp=1444361177871, value=Tecent

zhangsan                           column=info:country, timestamp=1444361203037, value=China

zhangsan                           column=info:favourite, timestamp=1444361121247, value=football

zhangsan                           column=info:province, timestamp=1444361245038, value=Fujian

zhangsan                           column=info:town, timestamp=1444361343392, value=daxuecheng

2 row(s) in 0.0540 seconds

Why 2 row(s)? Because xiaoming and zhangsan are the row keys, and the shell counts rows by row key rather than by cell.
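The difference between rows and cells can be sketched as a map of maps in plain Java. This is a toy model rather than HBase code: SparseRowsSketch is a hypothetical name, and the only assumption is that a table maps each row key to its own set of "family:qualifier" columns.

```java
import java.util.TreeMap;

/** Toy model (hypothetical, not HBase code): row key -> ("family:qualifier" -> value). */
class SparseRowsSketch {
    // Both levels are sorted, similar to the order a scan prints.
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    /** Number of distinct row keys. */
    int rowCount() {
        return rows.size();
    }

    /** Total number of cells across all rows. */
    int cellCount() {
        int n = 0;
        for (TreeMap<String, String> cells : rows.values()) {
            n += cells.size();
        }
        return n;
    }

    public static void main(String[] args) {
        SparseRowsSketch users = new SparseRowsSketch();
        users.put("xiaoming", "info:age", "24");
        users.put("xiaoming", "address:city", "Zhoushan");
        users.put("zhangsan", "info:favourite", "football");
        users.put("zhangsan", "info:town", "daxuecheng");
        users.put("zhangsan", "info:city", "Xiamen");
        // Each row carries its own columns; rows are counted by row key.
        System.out.println(users.cellCount() + " cells, " + users.rowCount() + " rows"); // prints 5 cells, 2 rows
    }
}
```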

Delete a column

hbase(main):010:0> get 'users','xiaoming','info:age'

COLUMN                              CELL

info:age                           timestamp=1444360701107, value=24

1 row(s) in 0.0090 seconds

hbase(main):011:0> delete 'users','xiaoming','info:age'

0 row(s) in 0.0450 seconds

hbase(main):012:0> get 'users','xiaoming','info:age'

COLUMN                              CELL

0 row(s) in 0.0080 seconds

Delete an entire row

hbase(main):013:0> get 'users','xiaoming'

COLUMN                              CELL

address:city                       timestamp=1444361861897, value=Zhoushan

address:country                    timestamp=1444360876501, value=China

address:province                   timestamp=1444360906626, value=Zhejiang

info:birthday                      timestamp=1444360779743, value=1987-04-24

info:company                       timestamp=1444360818861, value=Alibaba

5 row(s) in 0.0130 seconds

hbase(main):014:0> deleteall 'users','xiaoming'

0 row(s) in 0.0080 seconds

hbase(main):015:0> get 'users','xiaoming'

COLUMN                              CELL

0 row(s) in 0.0060 seconds

Count the rows in a table (that is, the number of row keys)

hbase(main):016:0> count 'users'

1 row(s) in 0.0210 seconds

=> 1

Empty a table

hbase(main):017:0> truncate 'users'

Truncating 'users' table (it may take a while):

- Disabling table...

- Truncating table...

0 row(s) in 1.8630 seconds

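Note: as the output shows, truncate still disables the table first.

HBase tries to help you

The student experiments with specifying the number of versions when creating a table.

Take a look:

HBase tries to help you. Amazing: so many explanations and examples.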

hbase(main):024:0> create 'users','user_id','address','info',{NAME => 'address',VERSIONS=>3}

Family 'address' already exists, the old one will be replaced

ERROR: Table already exists: users!

Here is some help for this command:

Creates a table. Pass a table name, and a set of column family

specifications (at least one), and, optionally, table configuration.

Column specification can be a simple string (name), or a dictionary

(dictionaries are described below in main help output), necessarily

including NAME attribute.

Examples:

Create a table with namespace=ns1 and table qualifier=t1

hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1

hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}

hbase> # The above in shorthand would be the following:

hbase> create 't1', 'f1', 'f2', 'f3'

hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}

hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

Table configuration options can be put at the end.

Examples:

hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']

hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']

hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'

hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }

hbase> # Optionally pre-split the table into NUMREGIONS, using

hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)

hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}

hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}

You can also keep around a reference to the created table:

hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then

call methods.

What is this?

hbase(main):001:0> desc 'users'

Table users is ENABLED

users

COLUMN FAMILIES DESCRIPTION

{NAME => 'address', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE'

, MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE', M

IN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

{NAME => 'user_id', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'NONE'

, MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

3 row(s) in 0.4290 seconds

Break them down: divide and conquer

NAME => 'address'

DATA_BLOCK_ENCODING => 'NONE'

BLOOMFILTER => 'ROW'

REPLICATION_SCOPE => '0'

VERSIONS => '3'

COMPRESSION => 'NONE'

MIN_VERSIONS => '0'

TTL => 'FOREVER'

KEEP_DELETED_CELLS => 'FALSE'

BLOCKSIZE => '65536'

IN_MEMORY => 'false'

BLOCKCACHE => 'true'

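Summary

  1. Going slowly gets you there faster.
  2. Just as quitting is usually quit, exit, q or Q, asking for help is usually -h, --help and the like.
  3. So when in doubt, type help.
  4. A habit for the HBase shell: when you mistype something, fix it with Delete or Backspace (to go back).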

Adding the dependencies in Gradle

First, add the dependencies

compile 'org.apache.hbase:hbase-common:1.0.0-cdh5.4.2'
compile 'org.apache.hbase:hbase-examples:1.0.0-cdh5.4.2'
// hbase-client-1.0.0-cdh5.4.2.jar
compile 'org.apache.hbase:hbase-client:1.0.0-cdh5.4.2'
compile 'org.apache.hbase:hbase:1.0.0-cdh5.4.2'

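Second, add the configuration file

Make sure the HBase environment is running and reachable; under CDH this is not a problem.

Copy the hbase-site.xml file from the HBase environment into the src/main/resources directory of the Gradle project.

This is similar to how the student used the API to work with HDFS.

Operating HBase through the Java API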

C:\Windows\System32\

Check the ZooKeeper logs on the CM platform:

Source code

package bigdata.thrill;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class HbaseInAction {

    // Shared configuration, loaded from the hbase-site.xml on the classpath.
    static Configuration conf = null;
    static {
        conf = HBaseConfiguration.create();
        // conf.set("hbase.zookeeper.quorum", "bigdata1,bigdata2,bigdata3");
        // You can find this configuration information in hbase-site.xml.
    }

    public static void main(String[] args) throws Exception {
        // Create the HBase table.
        String tableName = "blog2";
        String[] family = { "article", "author" };
        creatTable(tableName, family);

        // Put data into the table.
        String[] column1 = { "title", "content", "tag" };
        String[] value1 = {
                "Head First HBase",
                "HBase is the Hadoop database. Use it when you need random, realtime read/write access to your Big Data.",
                "Hadoop,HBase,NoSQL" };
        String[] column2 = { "name", "nickname" };
        String[] value2 = { "nicholas", "lee" };
        addData("rowkey1", "blog2", column1, value1, column2, value2);
        addData("rowkey2", "blog2", column1, value1, column2, value2);
        addData("rowkey3", "blog2", column1, value1, column2, value2);

        // CRUD: create, read (retrieve), update, delete
        getResultScannAll("blog2");
        // getResultScann("blog2", "rowkey4", "rowkey5");
        // getResult("blog2", "rowkey1");
        // getResultByColumn("blog2", "rowkey1", "author", "name");
        // updateTable("blog2", "rowkey1", "author", "name", "bin");
        // getResultByColumn("blog2", "rowkey1", "author", "name");
        // getResultByVersion("blog2", "rowkey1", "author", "name");
        // deleteColumn("blog2", "rowkey1", "author", "nickname");
        // deleteAllColumn("blog2", "rowkey1");
        // deleteTable("blog2");
    }

    /**
     * Creates a table.
     *
     * @param tableName table name
     * @param family    column family names
     */
    public static void creatTable(String tableName, String[] family) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            HTableDescriptor desc = new HTableDescriptor(tableName);
            for (int i = 0; i < family.length; i++) {
                desc.addFamily(new HColumnDescriptor(family[i]));
            }
            if (admin.tableExists(tableName)) {
                System.out.println("table Exists!");
                System.exit(0);
            } else {
                admin.createTable(desc);
                System.out.println("create table Success!");
            }
        } finally {
            admin.close();
        }
    }

    /**
     * Adds data to a table (suited to a fixed table whose column families are known).
     *
     * @param rowKey    row key
     * @param tableName table name
     * @param column1   qualifiers in the first column family (article)
     * @param value1    values for the first column family
     * @param column2   qualifiers in the second column family (author)
     * @param value2    values for the second column family
     */
    public static void addData(String rowKey, String tableName, String[] column1, String[] value1,
            String[] column2, String[] value2) throws IOException {
        Put put = new Put(Bytes.toBytes(rowKey)); // set the row key
        // HTable handles record-level operations: insert, delete, update, query.
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        try {
            // Get all column families of the table.
            HColumnDescriptor[] columnFamilies = table.getTableDescriptor().getColumnFamilies();
            for (int i = 0; i < columnFamilies.length; i++) {
                String familyName = columnFamilies[i].getNameAsString();
                if (familyName.equals("article")) { // put data into the article family
                    for (int j = 0; j < column1.length; j++) {
                        put.add(Bytes.toBytes(familyName), Bytes.toBytes(column1[j]), Bytes.toBytes(value1[j]));
                    }
                }
                if (familyName.equals("author")) { // put data into the author family
                    for (int j = 0; j < column2.length; j++) {
                        put.add(Bytes.toBytes(familyName), Bytes.toBytes(column2[j]), Bytes.toBytes(value2[j]));
                    }
                }
            }
            table.put(put);
            System.out.println("add data Success!");
        } finally {
            table.close();
        }
    }

    /** Prints every cell of a result; an empty result (for example a missing row) prints nothing. */
    private static void printResult(Result result, boolean withRow) {
        if (result.isEmpty()) {
            return; // Result.list() returns null for an empty result
        }
        for (KeyValue kv : result.list()) {
            if (withRow) {
                System.out.println("row:" + Bytes.toString(kv.getRow()));
            }
            System.out.println("family:" + Bytes.toString(kv.getFamily()));
            System.out.println("qualifier:" + Bytes.toString(kv.getQualifier()));
            System.out.println("value:" + Bytes.toString(kv.getValue()));
            System.out.println("timestamp:" + kv.getTimestamp());
            System.out.println("-------------------------------------------");
        }
    }

    /** Runs a scan against a table, prints every row, and releases the scanner and table. */
    private static void printScan(String tableName, Scan scan) throws IOException {
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        ResultScanner rs = null;
        try {
            rs = table.getScanner(scan);
            for (Result r : rs) {
                printResult(r, true);
            }
        } finally {
            if (rs != null) {
                rs.close();
            }
            table.close();
        }
    }

    /**
     * Looks up a row by row key.
     *
     * @param tableName table name
     * @param rowKey    row key
     */
    public static Result getResult(String tableName, String rowKey) throws IOException {
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        try {
            Result result = table.get(new Get(Bytes.toBytes(rowKey)));
            printResult(result, false);
            return result;
        } finally {
            table.close();
        }
    }

    /**
     * Scans the whole table without restricting the row key range.
     *
     * @param tableName table name
     */
    public static void getResultScannAll(String tableName) throws IOException {
        printScan(tableName, new Scan());
    }

    /**
     * Scans the table between a start and a stop row key.
     *
     * @param tableName table name
     */
    public static void getResultScann(String tableName, String start_rowkey, String stop_rowkey)
            throws IOException {
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(start_rowkey));
        scan.setStopRow(Bytes.toBytes(stop_rowkey));
        printScan(tableName, scan);
    }

    /**
     * Reads a single column of a row.
     *
     * @param tableName table name
     * @param rowKey    row key
     */
    public static void getResultByColumn(String tableName, String rowKey, String familyName,
            String columnName) throws IOException {
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        try {
            Get get = new Get(Bytes.toBytes(rowKey));
            // Fetch only the column identified by this family and qualifier.
            get.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(columnName));
            printResult(table.get(get), false);
        } finally {
            table.close();
        }
    }

    /**
     * Updates a single column of a row.
     *
     * @param tableName  table name
     * @param rowKey     row key
     * @param familyName column family name
     * @param columnName column name
     * @param value      new value
     */
    public static void updateTable(String tableName, String rowKey, String familyName,
            String columnName, String value) throws IOException {
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        try {
            Put put = new Put(Bytes.toBytes(rowKey));
            put.add(Bytes.toBytes(familyName), Bytes.toBytes(columnName), Bytes.toBytes(value));
            table.put(put);
            System.out.println("update table Success!");
        } finally {
            table.close();
        }
    }

    /**
     * Reads multiple versions of a column.
     *
     * @param tableName  table name
     * @param rowKey     row key
     * @param familyName column family name
     * @param columnName column name
     */
    public static void getResultByVersion(String tableName, String rowKey, String familyName,
            String columnName) throws IOException {
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        try {
            Get get = new Get(Bytes.toBytes(rowKey));
            get.addColumn(Bytes.toBytes(familyName), Bytes.toBytes(columnName));
            get.setMaxVersions(5);
            printResult(table.get(get), false);
        } finally {
            table.close();
        }
    }

    /**
     * Deletes the given column.
     *
     * @param tableName  table name
     * @param rowKey     row key
     * @param familyName column family name
     * @param columnName column name
     */
    public static void deleteColumn(String tableName, String rowKey, String familyName,
            String columnName) throws IOException {
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        try {
            Delete deleteColumn = new Delete(Bytes.toBytes(rowKey));
            deleteColumn.deleteColumns(Bytes.toBytes(familyName), Bytes.toBytes(columnName));
            table.delete(deleteColumn);
            System.out.println(familyName + ":" + columnName + " is deleted!");
        } finally {
            table.close();
        }
    }

    /**
     * Deletes the given row.
     *
     * @param tableName table name
     * @param rowKey    row key
     */
    public static void deleteAllColumn(String tableName, String rowKey) throws IOException {
        HTable table = new HTable(conf, Bytes.toBytes(tableName));
        try {
            table.delete(new Delete(Bytes.toBytes(rowKey)));
            System.out.println("all columns are deleted!");
        } finally {
            table.close();
        }
    }

    /**
     * Deletes the whole table.
     *
     * @param tableName table name
     */
    public static void deleteTable(String tableName) throws IOException {
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            admin.disableTable(tableName);
            admin.deleteTable(tableName);
            System.out.println(tableName + " is deleted!");
        } finally {
            admin.close();
        }
    }
}
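The range scan in getResultScann follows the HBase rule that the start row is included and the stop row is excluded. The sketch below imitates that rule on a sorted in-memory map in plain Java. RangeScanSketch and its scan method are hypothetical names, not HBase classes, and the sketch assumes ASCII row keys so that Java string ordering agrees with HBase's byte ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model (hypothetical, not HBase code) of a scan over sorted row keys. */
class RangeScanSketch {

    /** Returns the row keys in [startRow, stopRow): start inclusive, stop exclusive. */
    static List<String> scan(TreeMap<String, String> table, String startRow, String stopRow) {
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> blog2 = new TreeMap<>();
        blog2.put("rowkey1", "Head First HBase");
        blog2.put("rowkey2", "Head First HBase");
        blog2.put("rowkey3", "Head First HBase");

        System.out.println(scan(blog2, "rowkey1", "rowkey3")); // prints [rowkey1, rowkey2]
        System.out.println(scan(blog2, "rowkey4", "rowkey5")); // prints []
    }
}
```

The second call mirrors the commented-out getResultScann("blog2", "rowkey4", "rowkey5") in main: with only rowkey1 to rowkey3 inserted, that range matches nothing.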

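HBase API overview

Credit: "HBase总结(十一)hbase Java API 介绍及使用示例" (HBase Summary, Part 11: Introduction to the hbase Java API with Usage Examples)

How the main classes map onto the HBase data model

| Java class | HBase data model |
| --- | --- |
| HBaseAdmin, HBaseConfiguration | Database |
| HTable | Table |
| HTableDescriptor | Column family |
| Put, Get, Scanner | Column qualifier |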

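1. HBaseConfiguration

Class: org.apache.hadoop.hbase.HBaseConfiguration

Purpose: configures HBase

| Return type | Method | Description |
| --- | --- | --- |
| void | addResource(Path file) | Adds a resource from the file at the given path |
| void | clear() | Clears all properties that have been set |
| String | get(String name) | Gets the value of the named property |
| boolean | getBoolean(String name, boolean defaultValue) | Gets the property as a boolean; returns the default value if the property is not a valid boolean |
| void | set(String name, String value) | Sets the value of the named property |
| void | setBoolean(String name, boolean value) | Sets a boolean property |

Usage example:

Configuration hconfig = HBaseConfiguration.create();

hconfig.set("hbase.zookeeper.property.clientPort","2181");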

This call sets "hbase.zookeeper.property.clientPort" to port 2181. Typically you create the configuration object first and then call its other methods.
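The default-value behaviour of getBoolean described in the table can be illustrated with a small stand-in written in plain Java. BooleanPropertySketch is a hypothetical class backed by a map, not the real configuration class, and the property name example.flag is invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in (hypothetical, not the Hadoop class) for string-valued configuration properties. */
class BooleanPropertySketch {
    private final Map<String, String> props = new HashMap<>();

    void set(String name, String value) {
        props.put(name, value);
    }

    String get(String name) {
        return props.get(name);
    }

    /** Returns the parsed boolean, or defaultValue when the property is missing or not a boolean. */
    boolean getBoolean(String name, boolean defaultValue) {
        String raw = props.get(name);
        if (raw == null) {
            return defaultValue;
        }
        String v = raw.trim();
        if (v.equalsIgnoreCase("true")) {
            return true;
        }
        if (v.equalsIgnoreCase("false")) {
            return false;
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        BooleanPropertySketch conf = new BooleanPropertySketch();
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        conf.set("example.flag", "true"); // invented property name
        System.out.println(conf.getBoolean("example.flag", false)); // prints true
        // "2181" is not a boolean, so the default comes back.
        System.out.println(conf.getBoolean("hbase.zookeeper.property.clientPort", false)); // prints false
    }
}
```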

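2. HBaseAdmin

Class: org.apache.hadoop.hbase.client.HBaseAdmin

Purpose: provides an interface for managing the table metadata of an HBase database. Its methods cover creating and deleting tables, listing tables, enabling and disabling tables, and adding or removing column families.

| Return type | Method | Description |
| --- | --- | --- |
| void | addColumn(String tableName, HColumnDescriptor column) | Adds a column family to an existing table |
| static void | checkHBaseAvailable(HBaseConfiguration conf) | Static method; checks whether HBase is running |
| void | createTable(HTableDescriptor desc) | Creates a table; synchronous |
| void | deleteTable(byte[] tableName) | Deletes an existing table |
| void | enableTable(byte[] tableName) | Enables a table |
| void | disableTable(byte[] tableName) | Disables a table |
| HTableDescriptor[] | listTables() | Lists all user-space tables |
| void | modifyTable(byte[] tableName, HTableDescriptor htd) | Modifies the table schema; asynchronous, so it may take some time |
| boolean | tableExists(String tableName) | Checks whether a table exists |

Usage example:

HBaseAdmin admin = new HBaseAdmin(config);

admin.disableTable("tablename");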

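3. HTableDescriptor

Class: org.apache.hadoop.hbase.HTableDescriptor

Purpose: holds the name of a table and its column families

| Return type | Method | Description |
| --- | --- | --- |
| void | addFamily(HColumnDescriptor) | Adds a column family |
| HColumnDescriptor | removeFamily(byte[] column) | Removes a column family |
| byte[] | getName() | Gets the table name |
| byte[] | getValue(byte[] key) | Gets the value of an attribute |
| void | setValue(String key, String value) | Sets the value of an attribute |

Usage example:

HTableDescriptor htd = new HTableDescriptor(table);

htd.addFamily(new HColumnDescriptor("family"));

In this example an HColumnDescriptor instance is used to add the column family "family" to the HTableDescriptor.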

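4. HColumnDescriptor

Class: org.apache.hadoop.hbase.HColumnDescriptor

Purpose: maintains information about a column family, such as the number of versions and the compression settings. It is typically used when creating a table or adding a column family to one. Once created, a column family cannot be modified directly; it has to be deleted and re-created. Deleting a column family also deletes the data stored in it.

| Return type | Method | Description |
| --- | --- | --- |
| byte[] | getName() | Gets the column family name |
| byte[] | getValue(byte[] key) | Gets the value of the given attribute |
| void | setValue(String key, String value) | Sets the value of the given attribute |

Usage example:

HTableDescriptor htd = new HTableDescriptor(tablename);

HColumnDescriptor col = new HColumnDescriptor("content");

htd.addFamily(col);

This example adds a column family named content.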

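5. HTable

Class: org.apache.hadoop.hbase.client.HTable

Purpose: communicates directly with an HBase table. This class is not thread-safe for update operations.

| Return type | Method | Description |
| --- | --- | --- |
| boolean | checkAndPut(byte[] row, byte[] family, byte[] qualifier, byte[] value, Put put) | Atomically checks whether row/family/qualifier holds the given value and, if so, applies the Put |
| void | close() | Releases all resources and flushes any updates pending in the internal buffer |
| boolean | exists(Get get) | Checks whether the columns specified by the Get exist in the table |
| Result | get(Get get) | Gets the values of selected cells of the given row |
| byte[][] | getEndKeys() | Gets the end key of each region of the currently open table |
| ResultScanner | getScanner(byte[] family) | Gets a scanner over the given column family |
| HTableDescriptor | getTableDescriptor() | Gets the HTableDescriptor of the current table |
| byte[] | getTableName() | Gets the table name |
| static boolean | isTableEnabled(HBaseConfiguration conf, String tableName) | Checks whether the table is enabled |
| void | put(Put put) | Writes values to the table |

Usage example: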

HTable table = new HTable(conf, Bytes.toBytes(tablename));

ResultScanner scanner =  table.getScanner(family);
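The check-then-write idea behind checkAndPut can be sketched in plain Java with an atomic map. CheckAndPutSketch is a hypothetical toy, not the HBase implementation. It assumes a single cell per key and writes to the same cell it checks, whereas the real method applies a whole Put to the row. As in HBase, a null expected value means "only if the cell does not exist yet".

```java
import java.util.concurrent.ConcurrentHashMap;

/** Toy model (hypothetical, not HBase code) of an atomic check-and-put on a single cell. */
class CheckAndPutSketch {
    private final ConcurrentHashMap<String, String> cells = new ConcurrentHashMap<>();

    /**
     * Writes newValue only if the cell currently holds expected.
     * A null expected value means the cell must not exist yet.
     * Returns true if the write was applied.
     */
    boolean checkAndPut(String cellKey, String expected, String newValue) {
        if (expected == null) {
            return cells.putIfAbsent(cellKey, newValue) == null;
        }
        return cells.replace(cellKey, expected, newValue);
    }

    String get(String cellKey) {
        return cells.get(cellKey);
    }

    public static void main(String[] args) {
        CheckAndPutSketch table = new CheckAndPutSketch();
        String cell = "rowkey1/author:name";
        System.out.println(table.checkAndPut(cell, null, "nicholas"));  // prints true
        System.out.println(table.checkAndPut(cell, "lee", "bin"));      // prints false
        System.out.println(table.checkAndPut(cell, "nicholas", "bin")); // prints true
        System.out.println(table.get(cell));                            // prints bin
    }
}
```

Both putIfAbsent and the three-argument replace are atomic on ConcurrentHashMap, so two concurrent callers cannot both pass the same check.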

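6. Put

Class: org.apache.hadoop.hbase.client.Put

Purpose: performs an insert on a single row

| Return type | Method | Description |
| --- | --- | --- |
| Put | add(byte[] family, byte[] qualifier, byte[] value) | Adds the given column and value to the Put |
| Put | add(byte[] family, byte[] qualifier, long ts, byte[] value) | Adds the given column, value and timestamp to the Put |
| byte[] | getRow() | Gets the row of the Put |
| RowLock | getRowLock() | Gets the row lock of the Put |
| long | getTimeStamp() | Gets the timestamp of the Put |
| boolean | isEmpty() | Checks whether the familyMap is empty |
| Put | setTimeStamp(long timeStamp) | Sets the timestamp of the Put |

Usage example:

HTable table = new HTable(conf,Bytes.toBytes(tablename));

Put p = new Put(brow); // create a Put for the given row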

p.add(family,qualifier,value);

table.put(p);

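7. Get

Class: org.apache.hadoop.hbase.client.Get

Purpose: retrieves information about a single row

| Return type | Method | Description |
| --- | --- | --- |
| Get | addColumn(byte[] family, byte[] qualifier) | Restricts the Get to the column with the given family and qualifier |
| Get | addFamily(byte[] family) | Restricts the Get to all columns of the given family |
| Get | setTimeRange(long minStamp, long maxStamp) | Restricts the Get to versions whose timestamps fall in [minStamp, maxStamp) |
| Get | setFilter(Filter filter) | Sets a server-side filter to apply when the Get runs |

Usage example: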

HTable table = new HTable(conf, Bytes.toBytes(tablename));

Get g = new Get(Bytes.toBytes(row));

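8. Result

Class: org.apache.hadoop.hbase.client.Result

Purpose: stores the single-row value returned by a Get or Scan. Its methods return values directly or as various Map structures (key-value pairs).

| Return type | Method | Description |
| --- | --- | --- |
| boolean | containsColumn(byte[] family, byte[] qualifier) | Checks whether the given column exists |
| NavigableMap<byte[],byte[]> | getFamilyMap(byte[] family) | Gets the qualifier-to-value pairs of the given column family |
| byte[] | getValue(byte[] family, byte[] qualifier) | Gets the latest value of the given column |

9. ResultScanner

Class: interface (org.apache.hadoop.hbase.client.ResultScanner)

Purpose: the client-side interface for retrieving values

| Return type | Method | Description |
| --- | --- | --- |
| void | close() | Closes the scanner and releases the resources allocated to it |
| Result | next() | Gets the next row |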
