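Introduction to ClickHouse

What is ClickHouse

1. An open-source, column-oriented database management system

2. Supports linear scaling

3. Simple and convenient to use

4. Highly reliable

5. Fault tolerant (supports asynchronous multi-master replication and can be deployed across multiple data centers; downtime of a single node or an entire data center does not affect the system's read and write availability)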

ClickHouse architecture and storage

The ClickHouse architecture has not been made public (as of this writing).

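ClickHouse characteristics

ClickHouse is meant for analyzing clean, well-structured, immutable events or logs. It is recommended to put each such stream into a single wide fact table with pre-joined dimensions.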

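ClickHouse use cases

Some examples of viable applications:

  • Web and app analytics
  • Advertising networks and RTB
  • Telecommunications
  • E-commerce and finance
  • Information security
  • Monitoring and telemetry
  • Time series
  • Business intelligence
  • Online games
  • Internet of Things

Unsuitable scenarios

  • Transactional workloads (OLTP)
  • Key-value access with a high request rate
  • Blob or document storage
  • Over-normalized data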

Installing ClickHouse

Single-node installation

Check whether the system supports ClickHouse

Run the command:

grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"

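If the output is "SSE 4.2 supported", you can continue with the installation. If it is "SSE 4.2 not supported":

Unfortunately, your CPU does not support the SSE 4.2 instruction set, and you will have to find a workaround on your own.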
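The same check can be wrapped in a small function so it can be reused in an install script. This is only a sketch: the function name has_sse42 and the optional file argument are made up for illustration and are not part of ClickHouse.

```shell
# Report whether a cpuinfo-style file advertises SSE 4.2.
# The path is a parameter (hypothetical convenience) so the check can also be
# run against a copy of /proc/cpuinfo taken from another machine.
has_sse42() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    # -w matches sse4_2 only as a whole word in the flags line.
    grep -qw sse4_2 "$cpuinfo"
}

if has_sse42; then
    echo "SSE 4.2 supported"
else
    echo "SSE 4.2 not supported"
fi
```

Returning an exit status instead of printing lets a script stop early, for example with has_sse42 || exit 1.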

Fetch the repo file

curl -s https://packagecloud.io/install/repositories/altinity/clickhouse/script.rpm.sh | sudo bash

Or create the file altinity_clickhouse.repo by hand (yum reads repo files from /etc/yum.repos.d/).

For CentOS 6, insert the following content:

[altinity_clickhouse]

name=altinity_clickhouse

baseurl=https://packagecloud.io/altinity/clickhouse/el/6/$basearch

repo_gpgcheck=1

gpgcheck=0

enabled=1

gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey

sslverify=1

sslcacert=/etc/pki/tls/certs/ca-bundle.crt

metadata_expire=300

[altinity_clickhouse-source]

name=altinity_clickhouse-source

baseurl=https://packagecloud.io/altinity/clickhouse/el/6/SRPMS

repo_gpgcheck=1

gpgcheck=0

enabled=1

gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey

sslverify=1

sslcacert=/etc/pki/tls/certs/ca-bundle.crt

metadata_expire=300

For CentOS 7:

[altinity_clickhouse]

name=altinity_clickhouse

baseurl=https://packagecloud.io/altinity/clickhouse/el/7/$basearch

repo_gpgcheck=1

gpgcheck=0

enabled=1

gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey

sslverify=1

sslcacert=/etc/pki/tls/certs/ca-bundle.crt

metadata_expire=300

[altinity_clickhouse-source]

name=altinity_clickhouse-source

baseurl=https://packagecloud.io/altinity/clickhouse/el/7/SRPMS

repo_gpgcheck=1

gpgcheck=0

enabled=1

gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey

sslverify=1

sslcacert=/etc/pki/tls/certs/ca-bundle.crt

metadata_expire=300
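The CentOS 6 and CentOS 7 files above differ only in the version number in baseurl. As a sketch, the content could be generated from one template; the function name altinity_repo is hypothetical, and the settings are copied from the listings above.

```shell
# Print the repo file content for a given EL major version (6 or 7).
altinity_repo() {
    local el="$1" section suffix
    for section in altinity_clickhouse altinity_clickhouse-source; do
        if [ "$section" = altinity_clickhouse ]; then
            suffix='$basearch'   # kept literal: yum expands it, not the shell
        else
            suffix='SRPMS'
        fi
        cat <<EOF
[$section]
name=$section
baseurl=https://packagecloud.io/altinity/clickhouse/el/$el/$suffix
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/altinity/clickhouse/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

EOF
    done
}

altinity_repo 7
```

The output can be redirected into the repo file, for example altinity_repo 7 > altinity_clickhouse.repo.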

yum list 'clickhouse*'

yum -y install 'clickhouse*'

Multi-node installation

Install ClickHouse on every machine, then make the following changes on each machine.

Edit the hosts file

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.3.251 host1

192.168.3.252 host2

192.168.3.247 host3
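Since every machine needs the same entries, a quick check helps catch a node that was missed. A minimal sketch, assuming the host names host1 to host3 from the listing above; the function name missing_hosts is made up for illustration.

```shell
# Print every host name from the argument list that has no entry
# in the given hosts file.
missing_hosts() {
    local hosts_file="$1"; shift
    local name
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column
        # after the address; awk exits 0 only when it was found.
        awk -v h="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file" || echo "$name"
    done
}

missing_hosts /etc/hosts host1 host2 host3
```

An empty output means all three names are present on that machine.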


Create the file metrika.xml

Create the file under /etc:

cd /etc

vi metrika.xml

Adjust the following content to your environment and paste it into metrika.xml

<yandex>
    <!-- Cluster definition: three shards, one replica each -->
    <clickhouse_remote_servers>
        <perftest_3shards_1replicas>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.3.247</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <!-- internal_replication belongs to the shard, not the replica -->
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.3.252</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.3.251</host>
                    <port>9000</port>
                </replica>
            </shard>
        </perftest_3shards_1replicas>
    </clickhouse_remote_servers>
    <zookeeper-servers>
        <node index="1">
            <host>192.168.3.251</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <!-- Per-host value: use the address of the node this file is installed on -->
    <macros>
        <replica>192.168.3.252</replica>
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
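The macros section is the one part of this file that differs between machines. A sketch of generating it per node, assuming each node uses its own address as the replica name, as the sample above does; the function name macros_block is hypothetical.

```shell
# Print a <macros> block for one node; the address is passed in by the caller.
macros_block() {
    local replica_addr="$1"
    cat <<EOF
<macros>
    <replica>$replica_addr</replica>
</macros>
EOF
}

# One block per node listed in the cluster definition above.
for addr in 192.168.3.251 192.168.3.252 192.168.3.247; do
    macros_block "$addr"
done
```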

Edit the config.xml file under /etc/clickhouse-server

<!-- Listen specified host. use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere. -->

<!-- <listen_host>::</listen_host> -->

<listen_host>::1</listen_host>

<listen_host>192.168.3.252</listen_host>

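Using ClickHouse

Basic usage

Start

/etc/init.d/clickhouse-server start

Command line: clickhouse-client -h <host> -u <user> --password <password>

The defaults are usually enough: run clickhouse-client with no options to enter the client.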

DML (data manipulation language)

The statements below assume the funtest table created in the DDL section that follows.

insert into funtest values(3,'xiaoming',22,'2017-11-09')

insert into funtest values(32,'xiaolan',33,'2017-11-08')

insert into funtest values(35,'xiaotong',33,'2017-11-07')

insert into funtest values(4,'xiaohuang',33,'2017-11-08')

insert into funtest values(44,'xiaolvas',34,'2017-11-05')

insert into funtest values(6,'xiaohuanasg',32,'2017-11-28')

select *  from funtest

select *  from funtest order by id

select * from funtest order by  id desc

select avg(age)  from funtest

select count(name) from funtest

select age from funtest group by age

select round(age/3) FROM funtest

select cast('2015-12-22' as date) from funtest

select cast('2015-12-22' as date)+30 from funtest

select stddev_samp(age) FROM funtest

select upper('hhh') from funtest

select upper(name) from funtest

select abs(-1) from funtest

select * FROM funtest where times =cast('2015-12-22' as date)

select max(age) from funtest

select case when name ='xiaoming' then concat(name,'dddd') else 'ddddfdfdfdf' end  from funtest

select substring(name,1,3) from funtest

select rand() from funtest

DDL(data definition language)

create table funtest(id UInt32, name String ,age UInt32,times Date)ENGINE=Log

drop table funtest

alter table ontime_all add COLUMN name String;

Performance testing

The performance test code is as follows

Get the data

for s in `seq 1987 2017`
do
    for m in `seq 1 12`
    do
        echo http://transtats.bts.gov/PREZIP/On_Time_On_Time_Performance_${s}_${m}.zip >> a.lst
    done
done
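The loop above only writes the URLs to a.lst; it does not download anything. A sketch of the same list as a reusable function with a quick count check follows. The function name ontime_urls is made up for illustration.

```shell
# Print one download URL per month for the given range of years.
ontime_urls() {
    local first_year="$1" last_year="$2" y m
    for y in $(seq "$first_year" "$last_year"); do
        for m in $(seq 1 12); do
            echo "http://transtats.bts.gov/PREZIP/On_Time_On_Time_Performance_${y}_${m}.zip"
        done
    done
}

# 31 years x 12 months should give 372 URLs.
ontime_urls 1987 2017 | wc -l
```

Redirecting the output to a.lst gives the same file as before, which could then be fetched with wget -i a.lst.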

Unzip and load into the ClickHouse database

for i in *.zip; do echo $i; unzip -cq $i '*.csv' | sed 's/\.00//g' | clickhouse-client  --query="INSERT INTO ontime_test FORMAT CSVWithNames"; done
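The sed step in that pipeline strips ".00" so that values such as 15.00 can be loaded into integer columns. A small sketch of what it does to one row; the sample row and the function name strip_dot00 are made up for illustration.

```shell
# Same substitution as in the loading pipeline: drop every ".00" occurrence.
strip_dot00() {
    sed 's/\.00//g'
}

printf '%s\n' '"2017-01-01",15.00,-3.00,"AA"' | strip_dot00
# prints: "2017-01-01",15,-3,"AA"
```

Note that the pattern is not anchored to the end of a number, so a value like 1.005 would become 15. It is worth confirming the input never contains such values before relying on this step.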

Create the Hive table

CREATE TABLE ontime

(

Year int,

Quarter int,

Month int,

DayofMonth int,

DayOfWeek int,

FlightDate Date,

UniqueCarrier String,

AirlineID int,

Carrier String,

TailNum String,

FlightNum String,

OriginAirportID int,

OriginAirportSeqID int,

OriginCityMarketID int,

Origin String,

OriginCityName String,

OriginState String,

OriginStateFips String,

OriginStateName String,

OriginWac int,

DestAirportID int,

DestAirportSeqID int,

DestCityMarketID int,

Dest String,

DestCityName String,

DestState String,

DestStateFips String,

DestStateName String,

DestWac int,

CRSDepTime int,

DepTime int,

DepDelay int,

DepDelayMinutes int,

DepDel15 int,

DepartureDelayGroups String,

DepTimeBlk String,

TaxiOut int,

WheelsOff int,

WheelsOn int,

TaxiIn int,

CRSArrTime int,

ArrTime int,

ArrDelay int,

ArrDelayMinutes int,

ArrDel15 int,

ArrivalDelayGroups int,

ArrTimeBlk String,

Cancelled int,

CancellationCode String,

Diverted int,

CRSElapsedTime int,

ActualElapsedTime int,

AirTime int,

Flights int,

Distance int,

DistanceGroup int,

CarrierDelay int,

WeatherDelay int,

NASDelay int,

SecurityDelay int,

LateAircraftDelay int,

FirstDepTime String,

TotalAddGTime String,

LongestAddGTime String,

DivAirportLandings String,

DivReachedDest String,

DivActualElapsedTime String,

DivArrDelay String,

DivDistance String,

Div1Airport String,

Div1AirportID int,

Div1AirportSeqID int,

Div1WheelsOn String,

Div1TotalGTime String,

Div1LongestGTime String,

Div1WheelsOff String,

Div1TailNum String,

Div2Airport String,

Div2AirportID int,

Div2AirportSeqID int,

Div2WheelsOn String,

Div2TotalGTime String,

Div2LongestGTime String,

Div2WheelsOff String,

Div2TailNum String,

Div3Airport String,

Div3AirportID int,

Div3AirportSeqID int,

Div3WheelsOn String,

Div3TotalGTime String,

Div3LongestGTime String,

Div3WheelsOff String,

Div3TailNum String,

Div4Airport String,

Div4AirportID int,

Div4AirportSeqID int,

Div4WheelsOn String,

Div4TotalGTime String,

Div4LongestGTime String,

Div4WheelsOff String,

Div4TailNum String,

Div5Airport String,

Div5AirportID int,

Div5AirportSeqID int,

Div5WheelsOn String,

Div5TotalGTime String,

Div5LongestGTime String,

Div5WheelsOff String,

Div5TailNum String

)row format delimited

fields terminated by ','

stored as textfile;

load data inpath '/data' into table ontime;

Change the Hive storage format to ORC

Comparison test against Spark

Create the ClickHouse local table

CREATE TABLE ontime

(

Year UInt16,

Quarter UInt8,

Month UInt8,

DayofMonth UInt8,

DayOfWeek UInt8,

FlightDate Date,

UniqueCarrier FixedString(7),

AirlineID Int32,

Carrier FixedString(2),

TailNum String,

FlightNum String,

OriginAirportID Int32,

OriginAirportSeqID Int32,

OriginCityMarketID Int32,

Origin FixedString(5),

OriginCityName String,

OriginState FixedString(2),

OriginStateFips String,

OriginStateName String,

OriginWac Int32,

DestAirportID Int32,

DestAirportSeqID Int32,

DestCityMarketID Int32,

Dest FixedString(5),

DestCityName String,

DestState FixedString(2),

DestStateFips String,

DestStateName String,

DestWac Int32,

CRSDepTime Int32,

DepTime Int32,

DepDelay Int32,

DepDelayMinutes Int32,

DepDel15 Int32,

DepartureDelayGroups String,

DepTimeBlk String,

TaxiOut Int32,

WheelsOff Int32,

WheelsOn Int32,

TaxiIn Int32,

CRSArrTime Int32,

ArrTime Int32,

ArrDelay Int32,

ArrDelayMinutes Int32,

ArrDel15 Int32,

ArrivalDelayGroups Int32,

ArrTimeBlk String,

Cancelled UInt8,

CancellationCode FixedString(1),

Diverted UInt8,

CRSElapsedTime Int32,

ActualElapsedTime Int32,

AirTime Int32,

Flights Int32,

Distance Int32,

DistanceGroup UInt8,

CarrierDelay Int32,

WeatherDelay Int32,

NASDelay Int32,

SecurityDelay Int32,

LateAircraftDelay Int32,

FirstDepTime String,

TotalAddGTime String,

LongestAddGTime String,

DivAirportLandings String,

DivReachedDest String,

DivActualElapsedTime String,

DivArrDelay String,

DivDistance String,

Div1Airport String,

Div1AirportID Int32,

Div1AirportSeqID Int32,

Div1WheelsOn String,

Div1TotalGTime String,

Div1LongestGTime String,

Div1WheelsOff String,

Div1TailNum String,

Div2Airport String,

Div2AirportID Int32,

Div2AirportSeqID Int32,

Div2WheelsOn String,

Div2TotalGTime String,

Div2LongestGTime String,

Div2WheelsOff String,

Div2TailNum String,

Div3Airport String,

Div3AirportID Int32,

Div3AirportSeqID Int32,

Div3WheelsOn String,

Div3TotalGTime String,

Div3LongestGTime String,

Div3WheelsOff String,

Div3TailNum String,

Div4Airport String,

Div4AirportID Int32,

Div4AirportSeqID Int32,

Div4WheelsOn String,

Div4TotalGTime String,

Div4LongestGTime String,

Div4WheelsOff String,

Div4TailNum String,

Div5Airport String,

Div5AirportID Int32,

Div5AirportSeqID Int32,

Div5WheelsOn String,

Div5TotalGTime String,

Div5LongestGTime String,

Div5WheelsOff String,

Div5TailNum String

) ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192)

Create the distributed table

CREATE TABLE ontimetest AS ontime ENGINE = Distributed(perftest_3shards_1replicas, default, ontime, rand())

Note:

Create both the local table and the distributed table on every node.
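Running the same statement on every node by hand is easy to get wrong, so a loop can help. A minimal sketch, assuming the host names host1 to host3 from the hosts file; the function name run_on_all_hosts and the CH_CLIENT variable are hypothetical, while --host and --query are standard clickhouse-client options.

```shell
# Run one statement on each host in turn.
# CH_CLIENT (hypothetical) overrides the client binary, which allows a dry run.
run_on_all_hosts() {
    local ddl="$1"; shift
    local host
    for host in "$@"; do
        ${CH_CLIENT:-clickhouse-client} --host "$host" --query "$ddl"
    done
}

# Dry run: with CH_CLIENT=echo the arguments are printed instead of executed.
CH_CLIENT=echo run_on_all_hosts \
    "CREATE TABLE ontimetest AS ontime ENGINE = Distributed(perftest_3shards_1replicas, default, ontime, rand())" \
    host1 host2 host3
```

Dropping the CH_CLIENT=echo prefix would send the statement to each node for real; the local ontime table has to exist on a node before the distributed table is created there.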

Reposted from: https://www.cnblogs.com/tsxylhs/p/7837707.html
