The New InfluxDB Storage Engine: Time Structured Merge Tree

by Paul Dix | Oct 7, 2015 | InfluxDB

For more than a year we’ve been talking about potentially making a storage engine purpose-built for our use case of time series data. Today I’m excited to announce that we have the first version of our new storage engine available in a nightly build for testing. We’re calling it the Time Structured Merge Tree or TSM Tree for short.

In our early testing we’ve seen up to a 45x improvement in disk space usage over 0.9.4. We’ve also written 10,000,000,000 (10B) data points, divided over 100k unique series and written in 5k point batches, at a sustained rate greater than 300,000 points per second on an EC2 c3.8xlarge instance. Put another way, the new engine uses up to 98% less disk space to store the same data as 0.9.4, with no reduction in query performance.

In this post I’ll talk a little bit about the new engine and give pointers to more detailed write-ups and instructions on how to get started with testing it out.

Columnar storage and unlimited fields

The new storage engine uses a columnar format, which means that having multiple fields won’t negatively affect query performance. For this engine we’ve also lifted the limitation on the number of fields you can have in a measurement. For instance, you could have MySQL as the thing you’re measuring and represent each of the few hundred different metrics that you gather from MySQL as a separate field.

Even though the engine isn’t optimized for updates, the new columnar format also means that it’s possible to update single fields without touching the other fields for a given data point.
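As a rough illustration of the idea (a toy in-memory model, not the engine’s actual layout), a columnar store keeps each field in its own column, so a query or an update touches only the column it names. The class name `ColumnarSeries` and the MySQL field names below are made up for the example.

```python
from collections import defaultdict


class ColumnarSeries:
    """Toy column store: one independent {timestamp: value} column per field."""

    def __init__(self):
        # field name -> {timestamp: value}
        self.columns = defaultdict(dict)

    def write(self, timestamp, fields):
        # Each field lands in its own column; fields are never stored together.
        for name, value in fields.items():
            self.columns[name][timestamp] = value

    def read(self, field, start, end):
        # A query touches only the column for the requested field.
        column = self.columns[field]
        return [(t, column[t]) for t in sorted(column) if start <= t <= end]


store = ColumnarSeries()
store.write(10, {"queries": 120, "threads_connected": 8, "slow_queries": 0})
store.write(20, {"queries": 135, "threads_connected": 9, "slow_queries": 1})

# Updating one field rewrites only that column's entry for the point.
store.write(20, {"slow_queries": 2})

print(store.read("slow_queries", 0, 30))
print(store.read("queries", 0, 30))
```

Adding a few hundred more fields to this model would add columns, but reading `queries` would still scan only the `queries` column.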

Compression

We use multiple compression techniques, which vary depending on the data type of the field and the precision of the timestamps. Timestamp precision matters because timestamps can be represented down to nanosecond scale. For timestamps we use delta encoding and scaling, followed by compression with simple8b or run-length encoding, falling back to no compression if the deltas are too large. Timestamps with small, regular deltas compress best. For instance, we can get great compression on nanosecond timestamps if they’re each only 10ns apart, and we’d achieve the same level of compression for second-precision timestamps that are 10s apart.
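The timestamp pipeline can be sketched roughly as follows. This is a simplified model under stated assumptions, not the engine’s code: scaling is modeled as dividing out the largest power of ten shared by all deltas, the simple8b step is represented only by a size check (`MAX_PACKED`, assuming 60 usable bits per value), and the tagged tuples are a hypothetical output format.

```python
MAX_PACKED = 1 << 60  # assumption: simple8b-style packing holds values below 2**60


def encode_timestamps(timestamps):
    """Return a tagged tuple describing a compact form of integer timestamps."""
    if len(timestamps) < 2:
        return ("raw", list(timestamps))
    first = timestamps[0]
    # Delta encoding: keep the first value, then differences between neighbors.
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]

    # Scaling: divide out the largest power of ten common to every delta.
    scale = 1
    while scale < 10**12 and all(d % (scale * 10) == 0 for d in deltas):
        scale *= 10
    scaled = [d // scale for d in deltas]

    # Run-length encoding: perfectly regular deltas collapse to one value + count.
    if all(d == scaled[0] for d in scaled):
        return ("rle", first, scale, scaled[0], len(scaled))
    # Small non-negative deltas would go on to integer packing (not shown here).
    if all(0 <= d < MAX_PACKED for d in scaled):
        return ("packed", first, scale, scaled)
    # Deltas too large (or negative): fall back to storing raw values.
    return ("raw", list(timestamps))


def decode_timestamps(encoded):
    """Invert encode_timestamps."""
    kind = encoded[0]
    if kind == "raw":
        return list(encoded[1])
    if kind == "rle":
        _, first, scale, delta, count = encoded
        return [first + i * delta * scale for i in range(count + 1)]
    _, first, scale, scaled = encoded
    out = [first]
    for d in scaled:
        out.append(out[-1] + d * scale)
    return out


nanos = [1_000_000_000 + 10 * i for i in range(5)]   # nanosecond precision, 10ns apart
seconds = [1_444_176_000 + 10 * i for i in range(5)]  # second precision, 10s apart
irregular = [0, 10, 30, 60]

print(encode_timestamps(nanos))
print(encode_timestamps(seconds))
print(encode_timestamps(irregular))
```

In this model both regular series reduce to the same shape (a start value, a scale of 10, a delta of 1, and a count), which is why precision by itself does not hurt compression when the spacing is regular.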

We use the same encoding for floats described in Facebook’s Gorilla paper (which XORs each value with the one before it), bits for booleans, delta encoding for integers, and Snappy compression for strings. We’re also considering adding dictionary-style compression for strings, which is very efficient if you have repeated strings.
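To make the per-type ideas concrete, here is a minimal sketch of the first step of each scheme. It is illustrative only: the Gorilla paper’s float scheme XORs each value with its predecessor and then stores only the meaningful bits, and this example stops at the XOR; the function names are invented for the example, and the Snappy step for strings is omitted.

```python
import struct


def float_bits(x):
    """Reinterpret a float as its 64-bit IEEE 754 integer pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stream(values):
    """Keep the first value's bits, then XOR each value with its predecessor."""
    bits = [float_bits(v) for v in values]
    return [bits[0]] + [b ^ a for a, b in zip(bits, bits[1:])]


def pack_bools(flags):
    """Pack booleans into bytes, one bit each, most significant bit first."""
    out = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def delta_encode(ints):
    """Keep the first integer, then store differences between neighbors."""
    return [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]


xored = xor_stream([12.0, 12.0, 12.5])
print([hex(x) for x in xored])                 # repeated value XORs to zero
print(pack_bools([True, False, True, True]))   # four flags fit in one byte
print(delta_encode([100, 102, 101, 105]))      # small deltas instead of full values
```

A repeated float XORs to zero and a nearby float leaves only a few set bits, which is what lets a later bit-level encoder store such values very compactly.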

Depending on the shape of your data, the total size for storage including all tag metadata starts at about 2 bytes per point on the low end and grows for more random data. We found that random floats with second-level precision in series sampled every 10 seconds take about 3 bytes per point. For reference, Graphite’s Whisper storage uses 12 bytes per point. Real-world data will probably look a bit better, since there are often repeated values or small deltas.
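A quick back-of-the-envelope calculation shows what these per-point figures mean at scale. Applying them to a 10B point dataset is an assumption made for illustration; the post does not say what the data in the write test looked like, so treat the output as rough estimates only.

```python
POINTS = 10_000_000_000          # 10B points, used here only as a round number
SAMPLE_INTERVAL_S = 10           # one sample every 10 seconds
BYTES_PER_POINT = {
    "TSM, low end": 2,
    "TSM, random floats": 3,
    "Whisper": 12,
}


def storage_gb(points, bytes_per_point):
    """Total size in decimal gigabytes."""
    return points * bytes_per_point / 1e9


def bytes_per_series_day(bytes_per_point, interval_s=SAMPLE_INTERVAL_S):
    """Bytes one series accumulates per day at a fixed sampling interval."""
    return (86_400 // interval_s) * bytes_per_point


for name, bpp in BYTES_PER_POINT.items():
    print(f"{name}: {storage_gb(POINTS, bpp):.0f} GB for 10B points, "
          f"{bytes_per_series_day(bpp)} bytes per series per day")
```

At 3 bytes per point the same data takes a quarter of the space it would at 12 bytes per point.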

LSM Tree similarities

The new engine has similarities with LSM Trees (like LevelDB and Cassandra’s underlying storage). It has a write ahead log, index files that are read only, and it occasionally performs compactions to combine index files. We’re calling it a Time Structured Merge Tree because the index files keep contiguous blocks of time and the compactions merge those blocks into larger blocks of time.

Compression of the data improves as the index files are compacted. Once a shard becomes cold for writes it will be compacted into as few files as possible, which yields the best compression.
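The write path described in this section can be modeled in a few lines. This is a toy sketch with everything held in memory: the class `TinyTSM`, the `wal_limit` parameter, and the tuple-based "files" are hypothetical stand-ins, whereas the real engine keeps its write ahead log and index files on disk.

```python
class TinyTSM:
    """Toy model: a write ahead log, read-only index files, and compaction."""

    def __init__(self, wal_limit=4):
        self.wal = []              # recent (timestamp, value) writes
        self.files = []            # immutable "index files": sorted tuples of points
        self.wal_limit = wal_limit

    def write(self, timestamp, value):
        self.wal.append((timestamp, value))
        if len(self.wal) >= self.wal_limit:
            self.flush()

    def flush(self):
        # Turn the log into a read-only file; the last write per timestamp wins.
        if self.wal:
            self.files.append(tuple(sorted(dict(self.wal).items())))
            self.wal = []

    def compact(self):
        # Merge all files into one larger contiguous block of time.
        merged = {}
        for f in self.files:       # newer files override older ones
            merged.update(f)
        self.files = [tuple(sorted(merged.items()))] if merged else []

    def time_ranges(self):
        # (min timestamp, max timestamp) covered by each file.
        return [(f[0][0], f[-1][0]) for f in self.files]


db = TinyTSM(wal_limit=4)
for i in range(8):
    db.write(i * 10, float(i))

before = db.time_ranges()
db.compact()
after = db.time_ranges()
print(before, "->", after)
```

Before compaction the model holds two files covering adjacent blocks of time; afterwards it holds one file covering the whole range, and a larger contiguous block gives the encoders more neighboring values to work with.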

Originally from: https://www.influxdata.com/new-storage-engine-time-structured-merge-tree/

Reposted at: https://www.cnblogs.com/bonelee/p/6794747.html
