http://www.icaijing.com/hot/article4940159/

Phoenix Secondary Indexes: What You Need to Know (Part 2)

Author: 中兴大数据 (ZTE Big Data) | Published: 2015-7-30 03:31:18

Index Configuration

Common Configuration

hbase-site.xml

hbase.regionserver.wal.codec
org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
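
In hbase-site.xml, the name/value pair above takes the usual property form. The fragment below is a minimal sketch, assuming a standard HBase configuration file; merge the property into the existing configuration element rather than replacing the file.

```xml
<configuration>
  <!-- WAL codec that lets index updates be written to, and replayed from, the WAL -->
  <property>
    <name>hbase.regionserver.wal.codec</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
  </property>
</configuration>
```

HBase reads hbase-site.xml at startup, so the region servers will most likely need a restart before the change takes effect.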

Global Index Configuration

hbase-site.xml properties

Supported on HBase 0.98.4+ and Phoenix 4.3.1+ only.

hbase.region.server.rpc.scheduler.factory.class
org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory
Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates

hbase.rpc.controllerfactory.class
org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory
Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates
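
Written out as an hbase-site.xml fragment, the two global index settings above might look like the following. This is a sketch only: the property names and values are the ones listed above, and the comments are explanatory additions.

```xml
<configuration>
  <!-- RPC scheduler factory: separate queues for index and metadata updates -->
  <property>
    <name>hbase.region.server.rpc.scheduler.factory.class</name>
    <value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
  </property>
  <!-- RPC controller factory paired with the scheduler above -->
  <property>
    <name>hbase.rpc.controllerfactory.class</name>
    <value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
  </property>
</configuration>
```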

Local Index Configuration

hbase-site.xml properties

hbase.master.loadbalancer.class
org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer

hbase.coprocessor.master.classes
org.apache.phoenix.hbase.index.master.IndexMasterObserver

hbase.coprocessor.regionserver.classes
org.apache.hadoop.hbase.regionserver.LocalIndexMerger
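
The three local index settings above can be sketched as an hbase-site.xml fragment. The placement hints in the comments (master versus region server) are an assumption inferred from the property names, not something the list above states.

```xml
<configuration>
  <!-- Assumed to apply on the HBase master: index-aware load balancer -->
  <property>
    <name>hbase.master.loadbalancer.class</name>
    <value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
  </property>
  <!-- Assumed to apply on the HBase master: master-side index observer -->
  <property>
    <name>hbase.coprocessor.master.classes</name>
    <value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
  </property>
  <!-- Assumed to apply on each region server: local index merger -->
  <property>
    <name>hbase.coprocessor.regionserver.classes</name>
    <value>org.apache.hadoop.hbase.regionserver.LocalIndexMerger</value>
  </property>
</configuration>
```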

Index Configuration Tuning

hbase-site.xml properties

index.builder.threads.max
Default: 10

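Number of threads used to build index table updates from a primary table update.
Raising this value helps overcome the bottleneck of reading the row state from the Region. Set too high, however, the HRegion itself becomes the bottleneck because it has to handle too many concurrent scan requests, and general thread-swapping overhead sets in.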

index.builder.threads.keepalivetime
Default: 60

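Time, in seconds, after which threads in the index builder thread pool expire.
Unused threads are released as soon as this time elapses, and no core threads are retained. This lets the pool drop threads when the load is lower than expected.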

index.writer.threads.max
Default: 10

Number of threads used to write index updates to the index tables.
It should roughly correspond to the number of index tables.

index.writer.threads.keepalivetime
Default: 60

Analogous to index.builder.threads.keepalivetime.

hbase.htable.threads.max
Default: 2,147,483,647

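Maximum number of threads an index table can use for writes.
Increasing this value allows more concurrent index updates and raises overall throughput.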

hbase.htable.threads.keepalivetime
Default: 60

Analogous to index.builder.threads.keepalivetime.

index.tablefactory.cache.size
Default: 10

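Number of index tables kept in cache.
Increasing this value avoids having to recreate the index HTable on each index write, but the larger the value, the greater the memory pressure.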
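
As a starting point for tuning, the thread pool and cache settings can be declared explicitly in hbase-site.xml. The sketch below simply restates the defaults listed above. Any other value is workload-specific and would need to be measured on the cluster in question.

```xml
<configuration>
  <!-- Threads that build index updates from primary table updates (listed default) -->
  <property>
    <name>index.builder.threads.max</name>
    <value>10</value>
  </property>
  <!-- Threads that write index updates; roughly one per index table (listed default) -->
  <property>
    <name>index.writer.threads.max</name>
    <value>10</value>
  </property>
  <!-- Index tables kept in cache; larger values cost more memory (listed default) -->
  <property>
    <name>index.tablefactory.cache.size</name>
    <value>10</value>
  </property>
</configuration>
```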

org.apache.phoenix.regionserver.index.priority.min
Default: 1000

Value specifying the bottom (inclusive) of the range in which the index priority may lie.

org.apache.phoenix.regionserver.index.priority.max
Default: 1050

Value specifying the top (exclusive) of the range in which the index priority may lie.
Higher priorities within the index min/max range do not mean that updates are processed sooner.

org.apache.phoenix.regionserver.index.handler.count
Default: 30

Number of threads to use when serving index write requests for global index maintenance.
The actual number of threads, however, is dictated by Max(number of call queues, handler count), where the number of call queues is determined by the standard HBase configuration. To further tune the queues, you can adjust the standard RPC queue length parameters (currently, there are no special knobs for the index queues), specifically ipc.server.max.callqueue.length and ipc.server.callqueue.handler.factor. See the HBase Reference Guide for more details.
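
The index RPC settings above can likewise be written as an hbase-site.xml fragment. This sketch uses only the default values listed above and is meant as a template to adjust, not as a recommended configuration.

```xml
<configuration>
  <!-- Bottom of the index priority range (inclusive) -->
  <property>
    <name>org.apache.phoenix.regionserver.index.priority.min</name>
    <value>1000</value>
  </property>
  <!-- Top of the index priority range (exclusive) -->
  <property>
    <name>org.apache.phoenix.regionserver.index.priority.max</name>
    <value>1050</value>
  </property>
  <!-- Handler threads serving index write requests for global indexes -->
  <property>
    <name>org.apache.phoenix.regionserver.index.handler.count</name>
    <value>30</value>
  </property>
</configuration>
```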

Other Features

Phoenix Subqueries

IN and NOT IN subqueries
Example

SELECT ItemName
FROM Items
WHERE ItemID IN
(SELECT ItemID
FROM Orders
WHERE Date >= to_date('2013/09/02'));

EXISTS and NOT EXISTS subqueries
Example

SELECT ItemName
FROM Items i
WHERE EXISTS
(SELECT *
FROM Orders
WHERE Date >= to_date('2013/09/02')
AND ItemID = i.ItemID);

Semi-joins, anti-joins, and JOIN
Example

SELECT d.dept_id, e.dept_id, e.name
FROM DEPT d
JOIN EMPL e ON e.dept_id = d.dept_id;

Supported JOIN types:
INNER
LEFT OUTER
RIGHT OUTER

Comparison operators
Example

SELECT ID, Name
FROM Contest
WHERE Score >
(SELECT avg(Score)
FROM Contest)
ORDER BY Score DESC;

ANY/SOME/ALL operators

Example

SELECT OrderID
FROM Orders
WHERE quantity >= ANY
(SELECT max(quantity)
FROM Orders
GROUP BY ItemID);
