http://www.cnblogs.com/skyl/p/4807793.html

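Comparison operators: CompareFilter.CompareOp
A comparison operator defines the relation to test. The available values are:

  • EQUAL  equal to
  • GREATER  greater than
  • GREATER_OR_EQUAL  greater than or equal to
  • LESS  less than
  • LESS_OR_EQUAL  less than or equal to
  • NOT_EQUAL  not equal to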

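Comparators: ByteArrayComparable
A comparator determines how the target is matched. The following subclasses are available:

  • BinaryComparator  matches the complete byte array
  • BinaryPrefixComparator  matches a byte array prefix
  • BitComparator  rarely used
  • NullComparator  rarely used
  • RegexStringComparator  matches a regular expression
  • SubstringComparator  matches a substring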

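1. Combining filters: FilterList (the class itself is not available in the Shell)
A FilterList represents a chain of filters that is applied to the target data set. The filters in the list are combined with AND (FilterList.Operator.MUST_PASS_ALL) or OR (FilterList.Operator.MUST_PASS_ONE). In the Shell, the filter language combines filters with the AND and OR keywords instead.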

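// Imports shared by all examples in this post
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.util.Bytes;

// Combine filters: fetch every row whose age is between 15 and 30
private static void scanFilter() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // AND: a row must pass every filter in the list
    FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    // >= 15 (values are compared as bytes, not as numbers)
    SingleColumnValueFilter filter1 = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(),
            CompareOp.GREATER_OR_EQUAL, "15".getBytes());
    // <= 30
    SingleColumnValueFilter filter2 = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(),
            CompareOp.LESS_OR_EQUAL, "30".getBytes());
    filterList.addFilter(filter1);
    filterList.addFilter(filter2);

    Scan scan = new Scan();
    // Set the filter
    scan.setFilter(filterList);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}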
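
One caveat about the age range above: the values are stored and compared as raw bytes, so "15".getBytes() is compared lexicographically rather than numerically. The sketch below is plain Java with no HBase dependency. compareBytes and inRange are hypothetical helpers that mimic an unsigned, byte-by-byte comparison (which is, as far as I know, how BinaryComparator orders byte arrays); they are not part of the HBase API.

```java
import java.nio.charset.StandardCharsets;

public class LexicographicAgeDemo {
    // Hypothetical helper: unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Mimics MUST_PASS_ALL over ">= low" and "<= high" on string-encoded values.
    static boolean inRange(String value, String low, String high) {
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        return compareBytes(v, low.getBytes(StandardCharsets.UTF_8)) >= 0
                && compareBytes(v, high.getBytes(StandardCharsets.UTF_8)) <= 0;
    }

    public static void main(String[] args) {
        // "2" passes the byte-wise range check even though 2 is not between 15 and 30.
        for (String age : new String[] {"18", "24", "38", "2"}) {
            System.out.println(age + " -> " + inRange(age, "15", "30"));
        }
    }
}
```

If ages can have different numbers of digits, a common workaround is to zero-pad them to a fixed width when writing (for example "002"), so that byte order and numeric order agree.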

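2. Column value filter: SingleColumnValueFilter
Tests a column value for equality (CompareOp.EQUAL), inequality (CompareOp.NOT_EQUAL), or a one-sided range (for example CompareOp.GREATER). Constructors:
2.1. The comparison value is a byte array (not supported in the Shell?)
SingleColumnValueFilter(byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value)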

// SingleColumnValueFilter example
private static void scanFilter01() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Rows whose info:age equals "18"
    SingleColumnValueFilter scvf = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(),
            CompareOp.EQUAL, "18".getBytes());
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}

2.2. The comparison value is a comparator (ByteArrayComparable)
SingleColumnValueFilter(byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, ByteArrayComparable comparator)

// SingleColumnValueFilter example 2: RegexStringComparator
private static void scanFilter02() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Compare the value against a regular expression: RegexStringComparator
    // Matches info:age values in which some character is followed by '4' (for example "24")
    RegexStringComparator comparator = new RegexStringComparator(".4");
    // Note the different fourth argument
    SingleColumnValueFilter scvf = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(),
            CompareOp.EQUAL, comparator);
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):032:0> scan 'users',{FILTER=>"SingleColumnValueFilter('info','age',=,'regexstring:.4')"}
ROW          COLUMN+CELL
 xiaoming01  column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=info:age, timestamp=1441998917568, value=24
 xiaoming02  column=info:age, timestamp=1441998917594, value=24
 xiaoming03  column=info:age, timestamp=1441998919607, value=24
3 row(s) in 0.0130 seconds
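
A note on the pattern used here: to my knowledge RegexStringComparator reports a match when the pattern is found anywhere in the value, so '.4' means "some character followed by 4", which is not quite the same as "ends with 4". The sketch below illustrates the difference using only java.util.regex. RegexFindDemo and found are hypothetical names, and the find-style behaviour is an assumption about the comparator rather than something shown in this post.

```java
import java.util.regex.Pattern;

public class RegexFindDemo {
    // Find-style matching: true if the pattern occurs anywhere in the value.
    static boolean found(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        String[] values = {"24", "4", "140", "38"};
        for (String v : values) {
            System.out.println(v + "  '.4' -> " + found(".4", v)
                    + "   '4$' -> " + found("4$", v));
        }
    }
}
```

With the sample values, '.4' accepts "24" and "140" but rejects "4", while the anchored pattern '4$' accepts "24" and "4" and rejects "140". If "ends with 4" is the intent, an anchored pattern is probably the safer choice.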

// SingleColumnValueFilter example 3: SubstringComparator
private static void scanFilter03() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Test whether a substring occurs in the value (case-insensitive): SubstringComparator
    // Select the rows whose age value contains '4'
    SubstringComparator comparator = new SubstringComparator("4");
    // Note the different fourth argument
    SingleColumnValueFilter scvf = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(),
            CompareOp.EQUAL, comparator);
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):033:0> scan 'users',{FILTER=>"SingleColumnValueFilter('info','age',=,'substring:4')"}
ROW          COLUMN+CELL
 xiaoming01  column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=info:age, timestamp=1441998917568, value=24
 xiaoming02  column=info:age, timestamp=1441998917594, value=24
 xiaoming03  column=info:age, timestamp=1441998919607, value=24
3 row(s) in 0.0180 seconds

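3. Column name filters
HBase stores its data internally as key-value pairs, so a column name filter tests whether a given column name (ColumnFamily:Qualifier) exists in a row, just as the previous section tested column values.

3.1. FamilyFilter: filter by column family
FamilyFilter(CompareFilter.CompareOp familyCompareOp, ByteArrayComparable familyComparator)

Notes:
1. If you are looking for one known column family, scan.addFamily(family) is more efficient than a filter.
2. HBase's support for multiple column families was still immature at the time of writing, so this filter is currently of limited use.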

// FamilyFilter: filter by column family
private static void scanFilter04() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Column family equal to 'address'
    // FamilyFilter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator("address".getBytes()));
    // Column families starting with 'add'
    FamilyFilter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryPrefixComparator("add".getBytes()));
    Scan scan = new Scan();
    scan.setFilter(familyFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):021:0> scan 'users',{FILTER=>"FamilyFilter(=,'binaryprefix:add')"}
ROW          COLUMN+CELL
 xiaoming    column=address:city, timestamp=1441997498965, value=hangzhou
 xiaoming    column=address:contry, timestamp=1441997498911, value=china
 xiaoming    column=address:province, timestamp=1441997498939, value=zhejiang
 xiaoming01  column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
 zhangyifei  column=address:city, timestamp=1441997499108, value=jieyang
 zhangyifei  column=address:contry, timestamp=1441997499077, value=china
 zhangyifei  column=address:province, timestamp=1441997499093, value=guangdong
 zhangyifei  column=address:town, timestamp=1441997500711, value=xianqiao
3 row(s) in 0.0400 seconds

3.2. QualifierFilter: filter by column qualifier
QualifierFilter(CompareFilter.CompareOp op, ByteArrayComparable qualifierComparator)

Note: this filter is likely to be used more often than FamilyFilter.

// QualifierFilter: filter by column qualifier
private static void scanFilter05() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Qualifier equal to 'age'
    // QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator("age".getBytes()));
    // Qualifier starting with 'age' (including 'age' itself)
    // QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryPrefixComparator("age".getBytes()));
    // Qualifier containing 'age' (including 'age' itself)
    // QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new SubstringComparator("age"));
    // Qualifier matching the regular expression '.ge'
    QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new RegexStringComparator(".ge"));
    Scan scan = new Scan();
    scan.setFilter(qualifierFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):020:0> scan 'users',{FILTER=>"QualifierFilter(=,'regexstring:.ge')"}
ROW          COLUMN+CELL
 xiaoming    column=info:age, timestamp=1441997971945, value=38
 xiaoming01  column=info:age, timestamp=1441998917568, value=24
 xiaoming02  column=info:age, timestamp=1441998917594, value=24
 xiaoming03  column=info:age, timestamp=1441998919607, value=24
 zhangyifei  column=info:age, timestamp=1442247255446, value=18
5 row(s) in 0.0460 seconds

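3.3. ColumnPrefixFilter: filter by column name prefix (QualifierFilter can do the same)
ColumnPrefixFilter(byte[] prefix)
Note: the same column name can appear in several column families; this filter returns the matching columns from all of them.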

// ColumnPrefixFilter example
private static void scanFilter06() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Match every column whose name starts with 'ag'
    ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter("ag".getBytes());
    Scan scan = new Scan();
    scan.setFilter(columnPrefixFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):018:0> scan 'users',{FILTER=>"ColumnPrefixFilter('ag')"}
ROW          COLUMN+CELL
 xiaoming    column=info:age, timestamp=1441997971945, value=38
 xiaoming01  column=info:age, timestamp=1441998917568, value=24
 xiaoming02  column=info:age, timestamp=1441998917594, value=24
 xiaoming03  column=info:age, timestamp=1441998919607, value=24
 zhangyifei  column=info:age, timestamp=1442247255446, value=18
5 row(s) in 0.0280 seconds

3.4. MultipleColumnPrefixFilter: filter by several column name prefixes
MultipleColumnPrefixFilter behaves much like ColumnPrefixFilter but accepts several prefixes.

// MultipleColumnPrefixFilter example
private static void scanFilter07() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Match every column whose name starts with 'a' or 'c' (two-dimensional byte array)
    byte[][] prefixes = new byte[][] {"a".getBytes(), "c".getBytes()};
    MultipleColumnPrefixFilter multipleColumnPrefixFilter = new MultipleColumnPrefixFilter(prefixes);
    Scan scan = new Scan();
    scan.setFilter(multipleColumnPrefixFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):017:0> scan 'users',{FILTER=>"MultipleColumnPrefixFilter('a','c')"}
ROW          COLUMN+CELL
 xiaoming    column=address:city, timestamp=1441997498965, value=hangzhou
 xiaoming    column=address:contry, timestamp=1441997498911, value=china
 xiaoming    column=info:age, timestamp=1441997971945, value=38
 xiaoming    column=info:company, timestamp=1441997498889, value=alibaba
 xiaoming01  column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=info:age, timestamp=1441998917568, value=24
 xiaoming02  column=info:age, timestamp=1441998917594, value=24
 xiaoming03  column=info:age, timestamp=1441998919607, value=24
 zhangyifei  column=address:city, timestamp=1441997499108, value=jieyang
 zhangyifei  column=address:contry, timestamp=1441997499077, value=china
 zhangyifei  column=info:age, timestamp=1442247255446, value=18
 zhangyifei  column=info:company, timestamp=1441997499039, value=alibaba
5 row(s) in 0.0430 seconds

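3.5. ColumnRangeFilter: filter by column range (not row range)

  1. Returns a range of columns. For example, if a row has a million columns but you only want the ones whose names fall between bbbb and dddd.
  2. Introduced in HBase 0.92.
  3. The same column name can appear in several column families; this filter returns the matching columns from all of them.

Constructor:
ColumnRangeFilter(byte[] minColumn, boolean minColumnInclusive, byte[] maxColumn, boolean maxColumnInclusive)
Parameters:

  • minColumn - lower bound of the column range; if null, there is no lower bound
  • minColumnInclusive - whether the range includes minColumn
  • maxColumn - upper bound of the column range; if null, there is no upper bound
  • maxColumnInclusive - whether the range includes maxColumn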
// ColumnRangeFilter example
private static void scanFilter08() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Match every column from names starting with 'a' up to, but not including, 'c'
    ColumnRangeFilter columnRangeFilter = new ColumnRangeFilter("a".getBytes(), true, "c".getBytes(), false);
    Scan scan = new Scan();
    scan.setFilter(columnRangeFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):016:0> scan 'users',{FILTER=>"ColumnRangeFilter('a',true,'c',false)"}
ROW          COLUMN+CELL
 xiaoming    column=info:age, timestamp=1441997971945, value=38
 xiaoming    column=info:birthday, timestamp=1441997498851, value=1987-06-17
 xiaoming01  column=info:age, timestamp=1441998917568, value=24
 xiaoming02  column=info:age, timestamp=1441998917594, value=24
 xiaoming03  column=info:age, timestamp=1441998919607, value=24
 zhangyifei  column=info:age, timestamp=1442247255446, value=18
 zhangyifei  column=info:birthday, timestamp=1441997498990, value=1987-4-17
5 row(s) in 0.0340 seconds
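
The inclusive and exclusive bounds can be tried out without a cluster. The sketch below models the range selection with a sorted set of qualifier names; ColumnRangeDemo and columnsInRange are hypothetical names, the qualifier list is illustrative, and String ordering stands in for byte ordering (the two agree for plain ASCII names).

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

public class ColumnRangeDemo {
    // Hypothetical helper: selects the qualifiers that fall inside the range,
    // mirroring the four constructor arguments of ColumnRangeFilter.
    static NavigableSet<String> columnsInRange(NavigableSet<String> qualifiers,
            String min, boolean minInclusive, String max, boolean maxInclusive) {
        return qualifiers.subSet(min, minInclusive, max, maxInclusive);
    }

    public static void main(String[] args) {
        NavigableSet<String> qualifiers = new TreeSet<>(Arrays.asList(
                "age", "birthday", "city", "company", "country", "province", "town"));
        // ['a', 'c'): names that sort at or after "a" and strictly before "c"
        System.out.println(columnsInRange(qualifiers, "a", true, "c", false));
        // ['c', 'd'): every name starting with 'c'
        System.out.println(columnsInRange(qualifiers, "c", true, "d", false));
    }
}
```

The first call selects age and birthday, which matches the columns in the Shell output above; the second selects city, company and country.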

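4. Row key filter: RowFilter
When you need a range of rows selected by row key, setting startRow and stopRow on the Scan is more efficient. However, startRow and stopRow can only match the leading characters of the row key, not characters in the middle. For more complex row key conditions, use RowFilter.
Constructor: RowFilter(CompareFilter.CompareOp rowCompareOp, ByteArrayComparable rowComparator)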

// RowFilter example
private static void scanFilter09() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Match every row whose row key contains '01'
    RowFilter rowFilter = new RowFilter(CompareOp.EQUAL, new SubstringComparator("01"));
    Scan scan = new Scan();
    scan.setFilter(rowFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):013:0> scan 'users',{FILTER=>"RowFilter(=,'substring:01')"}
ROW          COLUMN+CELL
 xiaoming01  column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming01  column=info:age, timestamp=1441998917568, value=24
1 row(s) in 0.0190 seconds
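
For the simpler case where the condition is a row key prefix, the startRow/stopRow approach mentioned at the start of this section works if the stop row is derived from the prefix. The sketch below shows one common way to compute it, assuming the start row is inclusive and the stop row exclusive. PrefixStopRowDemo and stopRowForPrefix are hypothetical names, not HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixStopRowDemo {
    // Hypothetical helper: the smallest row key that sorts after every key
    // starting with the prefix. An empty array means "no upper bound".
    static byte[] stopRowForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--; // trailing 0xFF bytes cannot be incremented, so drop them
        }
        if (i < 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "xiaoming0".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        // Scanning [xiaoming0, xiaoming1) covers xiaoming01, xiaoming02, xiaoming03
        System.out.println(new String(stop, StandardCharsets.UTF_8));
    }
}
```

A condition such as "row key contains 01" cannot be expressed this way, which is where RowFilter with a SubstringComparator or RegexStringComparator comes in.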

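5. PageFilter (not supported in the Shell?)
Sets a page size in rows and returns a result set of that size.
Note that this filter cannot guarantee that the number of rows returned is at most the page size. The filter is applied separately on each region server, so it only guarantees that each region returns no more rows than the page size.
Constructor: PageFilter(long pageSize)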

// PageFilter example
private static void scanFilter10() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Start at row key "xiaoming" (inclusive) and fetch 3 rows
    PageFilter pageFilter = new PageFilter(3L);
    Scan scan = new Scan();
    scan.setStartRow("xiaoming".getBytes());
    scan.setFilter(pageFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}

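Note: because this filter cannot guarantee that the number of rows returned is at most the page size, a better way to fetch a fixed number of rows is ResultScanner.next(int nbRows):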

// Modified version of the demo above
private static void scanFilter11() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Start at row key "xiaoming" (inclusive) and fetch 3 rows
    // PageFilter pageFilter = new PageFilter(3L);
    Scan scan = new Scan();
    scan.setStartRow("xiaoming".getBytes());
    // scan.setFilter(pageFilter);
    ResultScanner rs = ht.getScanner(scan);
    // Ask the scanner for 3 rows
    for (Result result : rs.next(3)) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
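
The per-region behaviour of PageFilter can be modelled without a cluster. The sketch below is only a toy model under stated assumptions: each inner list stands for the rows of one region (the split shown is invented), the page limit is applied to each region independently, and the client then caps the merged result itself. PageLimitDemo, scanWithPerRegionLimit and firstN are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PageLimitDemo {
    // Models a page limit that is evaluated independently in every region.
    static List<String> scanWithPerRegionLimit(List<List<String>> regions, int pageSize) {
        List<String> out = new ArrayList<>();
        for (List<String> region : regions) {
            out.addAll(region.subList(0, Math.min(pageSize, region.size())));
        }
        return out;
    }

    // Client-side cap, playing the role of ResultScanner.next(n).
    static List<String> firstN(List<String> rows, int n) {
        return new ArrayList<>(rows.subList(0, Math.min(n, rows.size())));
    }

    public static void main(String[] args) {
        // Invented region split for illustration only.
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("xiaoming", "xiaoming01", "xiaoming02", "xiaoming03"),
                Arrays.asList("zhangyifei"));
        List<String> merged = scanWithPerRegionLimit(regions, 3);
        System.out.println(merged.size() + " rows: " + merged);   // more than the page size
        System.out.println(firstN(merged, 3));                    // exactly three rows
    }
}
```

In this model a page size of 3 still yields four rows across two regions, which is why the cap is applied on the client in the modified demo.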

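6. SkipFilter (the class itself is not available in the Shell)
Applies the wrapped filter to every column of a row; if any single column fails the condition, the whole row is filtered out. In the Shell, the filter language provides a SKIP keyword for this purpose.
Constructor: SkipFilter(Filter filter)

For example, suppose every column in a row holds the weight of a different item. In a real scenario all of these values must be greater than zero, and we want to drop every row that contains a 0 in any column. Combining ValueFilter with SkipFilter achieves this:
scan.setFilter(new SkipFilter(new ValueFilter(CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes(0)))));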
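
When adapting this line, keep in mind that the comparator only matches if its bytes use the same encoding as the stored cells. Bytes.toBytes(0) produces a 4-byte big-endian int, whereas a table that stores its values as strings (as the users table in this post does) would hold the ASCII digit instead. The sketch below shows the difference with the standard library only; ValueEncodingDemo, intBytes and stringBytes are hypothetical names, and intBytes is intended to mirror what Bytes.toBytes(int) returns.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ValueEncodingDemo {
    // 4-byte big-endian encoding of an int (ByteBuffer is big-endian by default).
    static byte[] intBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Encoding of the same number written as text.
    static byte[] stringBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(intBytes(0)));      // [0, 0, 0, 0]
        System.out.println(Arrays.toString(stringBytes("0"))); // [48]
        System.out.println(Arrays.equals(intBytes(0), stringBytes("0"))); // false
    }
}
```

So a BinaryComparator built from Bytes.toBytes(0) would never equal a cell written as the string "0"; the comparator value has to be produced the same way the data was written.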

// SkipFilter example
private static void scanFilter12() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Skip every row that has any cell whose value equals "24"
    SkipFilter skipFilter = new SkipFilter(
            new ValueFilter(CompareOp.NOT_EQUAL, new BinaryComparator("24".getBytes())));
    Scan scan = new Scan();
    scan.setFilter(skipFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}

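7. Utility: FirstKeyOnlyFilter
This filter returns only the first cell of each row, which makes it an efficient way to count rows. Beyond that it probably has little practical use.
Constructor: public FirstKeyOnlyFilter()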

// FirstKeyOnlyFilter example
private static void scanFilter13() throws IOException, UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Return only the first cell of each row
    FirstKeyOnlyFilter firstKeyOnlyFilter = new FirstKeyOnlyFilter();
    Scan scan = new Scan();
    scan.setFilter(firstKeyOnlyFilter);
    ResultScanner rs = ht.getScanner(scan);
    int i = 0;
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
            i++;
        }
    }
    // Print the total row count (one cell is returned per row)
    System.out.println(i);
    ht.close();
}
hbase(main):009:0> scan 'users',{FILTER=>'FirstKeyOnlyFilter()'}
ROW          COLUMN+CELL
 xiaoming    column=address:city, timestamp=1441997498965, value=hangzhou
 xiaoming01  column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
 xiaoming02  column=info:age, timestamp=1441998917594, value=24
 xiaoming03  column=info:age, timestamp=1441998919607, value=24
 zhangyifei  column=address:city, timestamp=1441997499108, value=jieyang
5 row(s) in 0.0240 seconds

Reposted from: https://www.cnblogs.com/davidwang456/p/8303056.html
