An Analysis of Curator, a ZooKeeper Client Library

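  • Preface
  • Overview
  • ZooKeeper guarantees
    • Understanding ZooKeeper's sequential consistency
  • Pitfalls we hit with the ZooKeeper client
  • Curator connection guarantees
  • Connection state monitoring and retry mechanism
  • Instance management
  • Recipes
    • Basic operations
    • Watches
    • Implemented recipes
      • Elections
      • Locks
      • Counters
      • Caches
      • Nodes/Watchers
      • Queues
      • Transactions
    • Tech notes
  • References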

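Preface

The author works mainly in C++, but anyone in the internet industry sooner or later has to deal with a registry built on a distributed consensus protocol. At the author's company that registry is ZooKeeper. We started out using the raw client API and ran into many problems; having learned the hard way, we decided to study Curator and build a C++ client along the same lines. What follows are the author's notes from reading Curator's code and design documents.

This article does not go deep into the Java source, for two reasons: the author's Java is limited, and the goal of reading Curator's code was only to learn its design ideas as inspiration for a C++ library.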

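Overview

ZooKeeper is designed for consistency rather than high availability: it uses the ZAB protocol to order all writes, which is why it is so often chosen for service registries, configuration centers, and distributed locks. Reads, however, are served by whichever server the client is connected to and may be stale, so from a reader's point of view ZooKeeper is only eventually consistent, while many real applications need strong consistency.

The official documentation describes why Curator exists: ZooKeeper is a very low level system that requires users to do a lot of housekeeping. See: Zookeeper FAQ. The Curator Framework is designed to hide as much of the details/tedium of this housekeeping as is possible.

There are two good open-source clients that wrap ZooKeeper's native API: zkClient and Curator. The latter was open-sourced by Netflix, is now an Apache Software Foundation project, and is also the client used by the Spring ecosystem.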

ZooKeeper guarantees

According to the official ZooKeeper documentation, ZooKeeper provides the following guarantees:

  • Sequential Consistency - Updates from a client will be applied in the order that they were sent.
  • Atomicity - Updates either succeed or fail. No partial results.
  • Single System Image - A client will see the same view of the service regardless of the server that it connects to. i.e., a client will never see an older view of the system even if the client fails over to a different server with the same session. If a client has already seen newer data and then tries to reconnect to a follower that still holds older data, that follower refuses the connection (the client's zxid is higher than the follower's).
  • Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
  • Timeliness - The clients view of the system is guaranteed to be up-to-date within a certain time bound.

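In the author's experience, ZooKeeper is in practice only an eventually consistent distributed system, and it has historically shipped bugs that violate its consensus guarantees, for example "expired ephemeral node reappears after ZK leader change", where an ephemeral node still exists after its session has expired.

Understanding ZooKeeper's sequential consistency

The ZooKeeper Programmer's Guide says: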

Sometimes developers mistakenly assume one other guarantee that ZooKeeper does not in fact make. This is:
Simultaneously Consistent Cross-Client Views
ZooKeeper does not guarantee that at every instance in time, two different clients will have identical views of ZooKeeper data. Due to factors like network delays, one client may perform an update before another client gets notified of the change. Consider the scenario of two clients, A and B. If client A sets the value of a znode /a from 0 to 1, then tells client B to read /a, client B may read the old value of 0, depending on which server it is connected to. If it is important that Client A and Client B read the same value, Client B should call the sync() method from the ZooKeeper API before it performs its read.
So, ZooKeeper by itself doesn’t guarantee that changes occur synchronously across all servers, but ZooKeeper primitives can be used to construct higher level functions that provide useful client synchronization.

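In other words, ZooKeeper does not guarantee that a value read from any one server is the latest; it only guarantees that updates are applied on that server in order. To read the latest value, call sync() (zoo_async in the C client) before the get.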
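
To make the stale read concrete, here is a toy Java model. It is not ZooKeeper or Curator code: ToyLeader and ToyFollower are hypothetical classes that only mimic a follower lagging behind the leader. The sketch shows a plain read returning the old value of /a and a read after sync() returning the new one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (assumption, not the real protocol): the leader keeps an ordered log of writes to /a.
class ToyLeader {
    private final List<Integer> log = new ArrayList<>();

    void write(int value) { log.add(value); }
    int committedCount() { return log.size(); }
    int valueAt(int index) { return log.get(index); }
}

// A follower applies the leader's log strictly in order, but possibly late.
class ToyFollower {
    private final ToyLeader leader;
    private int applied = 0;   // how many log entries have been applied locally
    private int value = 0;     // local copy of /a, initially 0

    ToyFollower(ToyLeader leader) { this.leader = leader; }

    // Apply at most one pending entry, mimicking replication lag.
    void replicateOne() {
        if (applied < leader.committedCount()) {
            value = leader.valueAt(applied);
            applied++;
        }
    }

    // Plain read: answered from local state, so it may be stale.
    int read() { return value; }

    // sync(): catch up with everything the leader has committed so far.
    void sync() {
        while (applied < leader.committedCount()) {
            replicateOne();
        }
    }
}
```

With a real client the same pattern is a sync on the path followed by the read; the toy only illustrates why the order matters.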

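Pitfalls we hit with the ZooKeeper client

  1. ZooKeeper session handling
    • We ignored the connecting event: after the heartbeat between client and server timed out, the leader-election service was not taken offline promptly, and we ended up with two leaders.
    • Several threads handled the connection state, so several sets of ZooKeeper threads ended up connecting to the servers.
    • An unreasonable timeout made the client reconnect too frequently and overwhelmed the ZooKeeper servers.
    • When every ZooKeeper server is reset (all server state wiped), the client never receives an expired event, and the client I had written did not create a new session either. Every session created by the old clients became unusable, and they were stuck reconnecting forever without success.
  2. Multithreaded races
    • ZooKeeper's own do_completion thread invokes the watcher callbacks and raced with business threads, causing core dumps.
  3. Synchronous API
    • The synchronous API has no timeout; if a ZooKeeper server is in a bad state, the thread calling it hangs.
    • The API we exposed to business code was poorly designed: calling the synchronous version during initialization caused a deadlock.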
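
One way to avoid the hung-caller problem from item 3 is to run the blocking call on a worker thread and wait for it with a deadline. The following is a minimal sketch using only the Java standard library; BoundedCall and callWithTimeout are hypothetical names, and the Callable stands in for whatever synchronous ZooKeeper call is being wrapped.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds how long the caller waits for a blocking call.
class BoundedCall {
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);   // a stuck call must not keep the JVM alive
        return t;
    });

    // Returns the call's result, or throws TimeoutException after timeoutMs.
    static <T> T callWithTimeout(Callable<T> call, long timeoutMs) throws Exception {
        Future<T> future = POOL.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);   // interrupt the worker; the caller is free again
            throw e;
        }
    }
}
```

The caller regains control after the deadline even if the underlying call never returns; whether the worker itself can be interrupted depends on the wrapped call.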

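Curator connection guarantees

Curator monitors every ZooKeeper connection and wraps every operation in a retry mechanism, so it can guarantee that:

  1. Every Curator operation (create, get, sync, etc.) waits until the ZooKeeper connection is established before it runs.
  2. Every Curator operation handles ZooKeeper session loss and expiration correctly through the retry mechanism.
  3. If the current session is lost, a Curator operation keeps retrying until it succeeds, as long as the retry policy allows.
  4. Every Curator client handles ZooKeeper connection problems in a sensible way.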
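
The retry behaviour described above can be sketched as a loop with exponential backoff. This is not Curator's implementation: RetryLoop.callWithRetry is a hypothetical helper written against the Java standard library, and it retries on any exception, whereas Curator itself is more selective about which errors are retried.

```java
import java.util.concurrent.Callable;

// Hypothetical helper illustrating "retry until success or the policy gives up".
class RetryLoop {
    // Runs op once and then up to maxRetries more times, sleeping base, 2*base, 4*base, ... ms between tries.
    static <T> T callWithRetry(Callable<T> op, int maxRetries, long baseSleepMs) throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (InterruptedException e) {
                throw e;   // never retry an interrupt
            } catch (Exception e) {
                last = e;
                if (attempt < maxRetries) {
                    // Cap the shift so the sleep time cannot overflow.
                    Thread.sleep(baseSleepMs << Math.min(attempt, 10));
                }
            }
        }
        throw last;   // the policy gave up: surface the last failure
    }
}
```

A bounded retry count keeps a permanently broken connection from blocking the caller forever, which is the trade-off a "retry forever" policy gives up.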

Connection state monitoring and retry mechanism

ConnectionStateListener

Curator will set the LOST state when it believes that the ZooKeeper session has expired. ZooKeeper connections have a session. When the session expires, clients must take appropriate action. In Curator, this is complicated by the fact that Curator internally manages the ZooKeeper connection. Curator will set the LOST state when any of the following occurs: a) ZooKeeper returns a Watcher.Event.KeeperState.Expired or KeeperException.Code.SESSIONEXPIRED; b) Curator closes the internally managed ZooKeeper instance; c) The session timeout elapses during a network partition. It is possible to get a RECONNECTED state after this but you should still consider any locks, etc. as dirty/unstable.

checkSessionExpiration: if no event is received from the ZooKeeper server within a certain period, the current session is considered expired.

Instance management

Class constructor

    /**
     * @param ensembleProvider    the ensemble provider (supplies the connection string)
     * @param sessionTimeoutMs    session timeout (the session timeout we configure)
     * @param connectionTimeoutMs connection timeout (give up if not connected within this time)
     * @param watcher             default watcher or null
     * @param retryPolicy         the retry policy to use (e.g. retry forever)
     */
    public CuratorZookeeperClient(EnsembleProvider ensembleProvider, int sessionTimeoutMs, int connectionTimeoutMs,
            Watcher watcher, RetryPolicy retryPolicy) {
        this(new DefaultZookeeperFactory(), ensembleProvider, sessionTimeoutMs, connectionTimeoutMs, watcher, retryPolicy, false);
    }

ConnectionStateListener is invoked on every state transition.

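Recipes

Curator ships different recipes for different purposes, each bundling many low-level ZooKeeper operations. Leader election, for example, first creates the election path and then creates and watches the election znodes.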

Most Curator recipes will autocreate parent nodes of paths given to the recipe as CreateMode

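Basic operations

Curator uses a fluent-style API and offers both synchronous and asynchronous (BackgroundCallback) variants.

Watches

Listeners registered with addListener do not need to be registered again.

ZooKeeper natively supports event notification through registered Watchers, but the developer has to re-register them over and over (a Watcher is registered once and fires once). Cache is Curator's wrapper around event watching; it can be seen as a local cached view of the watched data, and it takes care of re-registering the watch for the developer. Curator provides three kinds of Watcher (Cache) for observing node changes. They can observe the following state changes:

  • ZooKeeper goes down: type=CONNECTION_SUSPENDED, followed after a while by type=CONNECTION_LOST
  • ZooKeeper restarts: type=CONNECTION_RECONNECTED, data=null
  • A child node is updated: type=CHILD_UPDATED
  • A child node is removed: type=CHILD_REMOVED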
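
The re-registration that a Cache hides from the developer can be illustrated with a toy model. ToyZnode and ToyNodeCache below are hypothetical classes, not the ZooKeeper or Curator API: the znode consumes each watch when it fires, and the cache re-reads the data and leaves a fresh watch every time it is notified.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a znode with ZooKeeper-style one-shot watches (assumption, not the real API).
class ToyZnode {
    private String data = "";
    private List<Runnable> watchers = new ArrayList<>();

    // Read the data and optionally leave a one-shot watch.
    String getData(Runnable watcher) {
        if (watcher != null) {
            watchers.add(watcher);
        }
        return data;
    }

    void setData(String newData) {
        data = newData;
        List<Runnable> fired = watchers;
        watchers = new ArrayList<>();   // every watch is consumed before it fires
        for (Runnable w : fired) {
            w.run();
        }
    }
}

// Keeps a local view current by re-arming the watch each time it fires.
class ToyNodeCache {
    private final ToyZnode node;
    private String cached;
    private int updates = 0;

    ToyNodeCache(ToyZnode node) {
        this.node = node;
        refresh();
    }

    // Re-read the data and register the next one-shot watch in the same call.
    private void refresh() {
        cached = node.getData(() -> {
            updates++;
            refresh();
        });
    }

    String current() { return cached; }
    int updateCount() { return updates; }
}
```

A watcher registered directly on ToyZnode fires only for the first change, while ToyNodeCache keeps following every change; that difference is what the Cache recipes take off the developer's hands.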

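Implemented recipes

Curator provides a range of recipes that upper-layer business code can use directly.

Elections

  • LeaderSelector: the current node remains leader for as long as takeLeadership does not return. It is actually built on InterProcessMutex.
  • LeaderLatch: once a leader has been elected, it does not give up leadership unless a client goes down and triggers a new election.

Locks

Distributed locks.

Counters

Because ZooKeeper writes are forwarded to the leader while reads can be served by any follower, the author is not sure whether these counters can suffer from stale reads.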
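
One common way to keep a counter correct despite stale reads is an optimistic, versioned update: the write carries the version that was read, and the store rejects it if the value has changed since (ZooKeeper's versioned setData behaves this way). The sketch below uses hypothetical classes ToyVersionedNode and ToyCounter; whether Curator's counter recipes work exactly like this is not verified here.

```java
// Toy stand-in for a znode holding a number plus a version (assumption, not the real API).
class ToyVersionedNode {
    private long value = 0;
    private int version = 0;

    static final class Snapshot {
        final long value;
        final int version;

        Snapshot(long value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    synchronized Snapshot read() {
        return new Snapshot(value, version);
    }

    // Conditional write: succeeds only if nobody has written since expectedVersion was read.
    synchronized boolean setIfVersion(long newValue, int expectedVersion) {
        if (expectedVersion != version) {
            return false;   // the caller's read was stale
        }
        value = newValue;
        version++;
        return true;
    }
}

class ToyCounter {
    // Read, add one, write back conditionally; start over if the read turned out to be stale.
    static long increment(ToyVersionedNode node) {
        while (true) {
            ToyVersionedNode.Snapshot seen = node.read();
            long next = seen.value + 1;
            if (node.setIfVersion(next, seen.version)) {
                return next;
            }
        }
    }
}
```

In this scheme a stale read costs an extra round trip but can never produce a lost update, because the stale version makes the write fail.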

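Caches

There are caches at different levels: node, path, and tree. Each registers watchers, and when a node changes Curator updates the cache promptly.

Nodes/Watchers

This mainly covers creating persistent nodes and the watchers that go with them.

Queues

ZooKeeper sequential nodes can themselves be used as a queue.

Transactions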

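Tech notes

  1. All watcher events should be handled on a single thread, and any shared resource they touch should be locked on that thread (the ZooKeeper library should do this itself on its own thread).
  2. Take the session life cycle seriously. If the session has expired, reconnect with a new session, and every operation tied to the expired session should fail. Ephemeral nodes are bound to their session: once the session expires, they are gone.
  3. The session id and password can be saved and reused the next time a connection is created.
  4. ZooKeeper is a poor fit for message queues, because:
    • ZooKeeper limits each znode's data to 1 MB.
    • Too many children under one znode severely hurts performance.
    • Large znodes also hurt performance.
    • Large znodes can make a ZooKeeper server restart take 10-15 minutes.
    • ZooKeeper keeps its entire data set in memory, so it cannot store very much.
  5. Prefer driving the ZooKeeper client from a single thread rather than concurrently; there are too many critical-section and race problems otherwise.
  6. Curator session life cycle management:
    • CONNECTED: received when the connection is established for the first time.
    • READ_ONLY: the connection is in read-only mode.
    • SUSPENDED: the connection is currently down (KeeperState.Disconnected was received, i.e. Curator is not connected to any ZooKeeper server). Leader election, distributed locks, and similar operations should pause on SUSPENDED until reconnection succeeds. Curator officially recommends treating SUSPENDED as a complete connection loss, which means assuming that all ephemeral nodes you registered are gone as soon as SUSPENDED arrives.
    • LOST: emitted in the following cases:
      • Curator receives an EXPIRED event from the ZooKeeper server.
      • Curator closes the current ZooKeeper session itself.
      • Curator concludes that the server has already expired the session. Since Curator 3.x, Curator runs its own timer: if no reconnect follows a SUSPENDED event within a timeout (the negotiated session timeout by default), Curator assumes the session has timed out on the server side and moves to LOST.
    • RECONNECTED: the connection was re-established.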
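
Notes 1 and 5 both come down to funnelling all event handling through one thread. The following is a minimal sketch with the Java standard library; SerialEventDispatcher is a hypothetical class, and the String events stand in for watcher or connection-state events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical dispatcher: every event is handled on one dedicated thread, in submission order.
class SerialEventDispatcher {
    private final ExecutorService eventThread = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "zk-events");
        t.setDaemon(true);
        return t;
    });

    // State owned by the event thread; no lock is needed because only that thread touches it.
    private final List<String> handled = new ArrayList<>();

    // May be called from any thread; the handling itself runs on the event thread.
    void dispatch(String event) {
        eventThread.execute(() -> handled.add(event));
    }

    // Reads the state on the event thread too, so it never races with a handler.
    List<String> snapshot() throws InterruptedException, ExecutionException {
        Callable<List<String>> copy = () -> new ArrayList<>(handled);
        return eventThread.submit(copy).get();
    }

    void close() {
        eventThread.shutdown();
    }
}
```

Because the state is confined to one thread, business threads never share memory with the callback thread, which removes the kind of race described in the pitfalls section.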

Curator's guidance on when the LOST state is entered:

When Curator receives a KeeperState.Disconnected message it changes its state to SUSPENDED (see TN12, errors, etc.). As always, our recommendation is to treat SUSPENDED as a complete connection loss. Exit all locks, leaders, etc. That said, since 3.x, Curator tries to simulate session expiration by starting an internal timer when KeeperState.Disconnected is received. If the timer expires before the connection is repaired, Curator changes its state to LOST and injects a session end into the managed ZooKeeper client connection. The duration of the timer is set to the value of the “negotiated session timeout” by calling ZooKeeper#getSessionTimeout().
The astute reader will realize that setting the timer to the full value of the session timeout may not be the correct value. This is due to the fact that the server closes the connection when 2/3 of a session have already elapsed. Thus, the server may close a session well before Curator’s timer elapses. This is further complicated by the fact that the client has no way of knowing why the connection was closed. There are at least three possible reasons for a client connection to close:

  • The server has not received a heartbeat within 2/3 of a session
  • The server crashed
  • Some kind of general TCP error which causes a connection to fail

In situation 1, the correct value for Curator’s timer is 1/3 of a session - i.e. Curator should switch to LOST if the connection is not repaired within 1/3 of a session as 2/3 of the session has already lapsed from the server’s point of view. In situations 2 and 3 however, Curator’s timer should be the full value of the session (possibly plus some “slop” value). In truth, there is no way to completely emulate in the client the session timing as managed by the ZooKeeper server. So, again, our recommendation is to treat SUSPENDED as complete connection loss.

By default Curator uses 100% of the session timeout as the SUSPENDED-to-LOST interval, but users can configure it to 33% of the session timeout to cover situation 1 described above.
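
The SUSPENDED-to-LOST timer can be modelled as a small state machine. This is a sketch under stated assumptions, not Curator's code: ConnectionStateModel is a hypothetical class, time is passed in explicitly instead of using a real timer, and the percentage parameter plays the role of the 100% or 33% setting discussed above.

```java
// Hypothetical model of the client-side session expiration timer.
class ConnectionStateModel {
    enum State { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final long lostAfterMs;
    private State state = State.CONNECTED;
    private long suspendedAtMs;

    // percent = 100 waits a full session timeout; percent = 33 waits roughly a third of it.
    ConnectionStateModel(int negotiatedSessionTimeoutMs, int percent) {
        this.lostAfterMs = (long) negotiatedSessionTimeoutMs * percent / 100;
    }

    // Disconnected was received: start the timer.
    void onDisconnected(long nowMs) {
        if (state == State.CONNECTED || state == State.RECONNECTED) {
            state = State.SUSPENDED;
            suspendedAtMs = nowMs;
        }
    }

    // Called periodically: declare the session lost once the timer has run out.
    void tick(long nowMs) {
        if (state == State.SUSPENDED && nowMs - suspendedAtMs >= lostAfterMs) {
            state = State.LOST;
        }
    }

    // The server reported the session as expired.
    void onExpired() {
        state = State.LOST;
    }

    // The connection came back, either in time or after LOST.
    void onReconnected(long nowMs) {
        if (state == State.SUSPENDED || state == State.LOST) {
            state = State.RECONNECTED;
        }
    }

    State state() {
        return state;
    }
}
```

With a 30 s session timeout, a 10 s outage leaves the 100% model in SUSPENDED but already moves the 33% model to LOST, which is exactly the trade-off the quoted note describes.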

References

  1. 基于Apache Curator框架的ZooKeeper使用详解
  2. Zookeeper客户端Curator使用详解
  3. ZooKeeper和Curator相关经验总结
  4. Welcome to Curator
  5. Curator Error Handling
  6. Recipes: the use cases Curator supports, such as elections, counters, and inter-process locks
  7. 基于Zookeeper实现的分布式互斥锁 - InterProcessMutex
  8. how to properly recreate ephemeral nodes and reset watches after a session expiry
